Architecture Overview
This document describes the architecture of router-hosts, a Rust CLI tool for managing DNS host entries on routers and servers.
System Overview
router-hosts uses a client-server architecture:
- Server runs on the target machine (router, server, container) and manages a configurable hosts file via event-sourced storage
- Client runs on a workstation and connects via gRPC over TLS with mutual authentication
flowchart TB
subgraph Client
CLI[router-hosts CLI]
end
subgraph Server
GRPC[gRPC Server]
ES[Event Store]
SNAP[Snapshot Store]
PROJ[Host Projection]
end
subgraph Storage
SQLite[(SQLite)]
PG[(PostgreSQL)]
Duck[(DuckDB)]
end
CLI -->|mTLS| GRPC
GRPC --> ES
GRPC --> SNAP
GRPC --> PROJ
ES --> SQLite
ES --> PG
ES --> Duck
See docs/plans/2025-12-01-router-hosts-v1-design.md for complete design specification.
Workspace Structure
Six crates in a Cargo workspace:
router-hosts-common
Shared library containing:
- Protocol buffer definitions and generated code
- Validation logic (IP addresses, hostnames)
- Shared types and utilities
router-hosts-storage
Storage abstraction layer:
- Storage trait defining EventStore, SnapshotStore, and HostProjection
- SQLite backend (default, lightweight, embedded, single-file)
- PostgreSQL backend (multi-instance, cloud deployments, connection pooling)
- DuckDB backend (embedded, feature-rich analytics) - requires separate binary
- Shared test suite for backend compliance (42 tests)
router-hosts
Main binary (client and server modes):
- Client mode (default): CLI interface using clap, gRPC client wrapper, command handlers
- Server mode: gRPC service implementation, storage integration, hosts file generation with atomic writes, post-edit hook execution, Prometheus metrics
- Mode selection: runs in server mode when the first argument is "server", otherwise client mode
- Includes SQLite and PostgreSQL backends
router-hosts-duckdb
Variant binary with DuckDB support:
- Same functionality as router-hosts
- Includes all three storage backends (SQLite, PostgreSQL, DuckDB)
- Larger binary size due to DuckDB dependencies
router-hosts-operator
Kubernetes operator for automated DNS registration:
- Watches Traefik IngressRoute, IngressRouteTCP, and custom HostMapping CRDs
- Automatically registers/updates host entries with router-hosts server
- Leader election for high availability
- Health endpoints for Kubernetes probes
- See Operator Documentation for details
router-hosts-e2e
End-to-end acceptance tests:
- Docker-based integration tests with real mTLS
- 10 tests across 4 scenario files covering CRUD, auth, disaster recovery
Key Design Decisions
Event Sourcing
- All changes stored as immutable events in the storage backend
- Current state reconstructed from event log (CQRS pattern)
- Complete audit trail and time-travel query capability
- Optimistic concurrency via event versions
See docs/architecture/event-sourcing.md for detailed event sourcing documentation.
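The replay and optimistic-concurrency ideas above can be sketched with a small in-memory log. This is an illustration only: `HostEvent`, `EventLog`, `AppendError`, and the single log-wide version counter are assumptions made for the example, not the project's actual types (a real store would typically version each aggregate separately).

```rust
use std::collections::BTreeMap;

/// Hypothetical event type; the real project defines its own events.
#[derive(Debug, Clone, PartialEq)]
enum HostEvent {
    Created { id: String, ip: String, hostname: String },
    IpChanged { id: String, ip: String },
    Deleted { id: String },
}

/// Hypothetical projected state for one host.
#[derive(Debug, Clone, PartialEq)]
struct Host {
    ip: String,
    hostname: String,
}

#[derive(Debug, PartialEq)]
enum AppendError {
    VersionMismatch { expected: u64, actual: u64 },
}

/// Append-only log; events are never modified after being pushed.
#[derive(Default)]
struct EventLog {
    events: Vec<HostEvent>,
}

impl EventLog {
    /// In this sketch the version is simply the number of events appended so far.
    fn version(&self) -> u64 {
        self.events.len() as u64
    }

    /// Append only if the caller saw the latest version (optimistic concurrency).
    fn append(&mut self, expected: u64, event: HostEvent) -> Result<u64, AppendError> {
        let actual = self.version();
        if expected != actual {
            return Err(AppendError::VersionMismatch { expected, actual });
        }
        self.events.push(event);
        Ok(self.version())
    }

    /// Rebuild state by replaying the first `upto` events.
    /// Passing an older version gives a time-travel view.
    fn project(&self, upto: u64) -> BTreeMap<String, Host> {
        let mut hosts = BTreeMap::new();
        for event in self.events.iter().take(upto as usize) {
            match event {
                HostEvent::Created { id, ip, hostname } => {
                    hosts.insert(id.clone(), Host { ip: ip.clone(), hostname: hostname.clone() });
                }
                HostEvent::IpChanged { id, ip } => {
                    if let Some(host) = hosts.get_mut(id) {
                        host.ip = ip.clone();
                    }
                }
                HostEvent::Deleted { id } => {
                    hosts.remove(id);
                }
            }
        }
        hosts
    }
}
```

A writer that read version 1 and then calls `append(1, ...)` succeeds only if nobody else appended in between; otherwise it gets `VersionMismatch` and can re-read and retry.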
Streaming APIs
- All multi-item operations use gRPC streaming (not arrays/lists)
- `ListHosts`, `SearchHosts`, `ExportHosts` - server streaming
- `ImportHosts` - bidirectional streaming
- Better memory efficiency and flow control
Request/Response Messages
- All gRPC methods use dedicated request/response types
- Never bare parameters - enables API evolution without breaking changes
Atomic /etc/hosts Updates
- Generate to `.tmp` file -> fsync -> atomic rename
- Original file unchanged on failure
- Post-edit hooks run after success/failure
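The write, fsync, rename sequence can be sketched with the standard library alone. The function names and the `<target>.tmp` naming are assumptions for illustration; the project's actual implementation may differ, and a production version would likely also fsync the parent directory so the rename itself is durable.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::{Path, PathBuf};

/// Write the new contents to the temp file and flush them to disk.
fn write_tmp(tmp_path: &Path, contents: &str) -> io::Result<()> {
    let mut file = File::create(tmp_path)?;
    file.write_all(contents.as_bytes())?;
    // fsync: make sure the bytes are on disk before the rename exposes them.
    file.sync_all()
}

/// Replace `target` with `contents` via a sibling temp file.
/// On any failure the original target is left untouched.
fn write_atomically(target: &Path, contents: &str) -> io::Result<()> {
    // Assumed naming: "<target>.tmp" in the same directory, so the
    // rename never crosses a filesystem boundary.
    let mut tmp_name = target.as_os_str().to_owned();
    tmp_name.push(".tmp");
    let tmp_path = PathBuf::from(tmp_name);

    // rename() replaces the destination atomically on POSIX filesystems.
    let result = write_tmp(&tmp_path, contents).and_then(|()| fs::rename(&tmp_path, target));

    if result.is_err() {
        // Best-effort cleanup of the partial temp file.
        let _ = fs::remove_file(&tmp_path);
    }
    result
}
```

Readers of the target path see either the old file or the complete new one, never a half-written file, which is why the temp file lives next to the target rather than in a separate temp directory.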
Versioning
- Storage backend stores snapshots of /etc/hosts at points in time
- Configurable retention (max count and max age)
- Rollback creates snapshot before restoring old version
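A retention policy with both limits could be applied roughly as follows. `Snapshot`, `Retention`, and `snapshots_to_prune` are hypothetical names, and the sketch assumes a snapshot is pruned when it exceeds either limit; the source does not say how the two limits combine.

```rust
use std::time::Duration;

/// Hypothetical snapshot metadata: an id plus its age at pruning time.
#[derive(Debug, Clone, PartialEq)]
struct Snapshot {
    id: u64,
    age: Duration,
}

/// Hypothetical policy holding the two limits named above.
struct Retention {
    max_count: usize,
    max_age: Duration,
}

/// Return the ids of snapshots to delete: anything older than `max_age`,
/// plus the oldest ones beyond `max_count`. Ids come back newest-first.
fn snapshots_to_prune(mut snapshots: Vec<Snapshot>, policy: &Retention) -> Vec<u64> {
    // Sort newest first, so the first `max_count` entries are keep candidates.
    snapshots.sort_by_key(|s| s.age);
    snapshots
        .iter()
        .enumerate()
        .filter(|(index, s)| *index >= policy.max_count || s.age > policy.max_age)
        .map(|(_, s)| s.id)
        .collect()
}
```

With `max_count = 2` and `max_age` of one hour, four snapshots aged 10 s, 20 s, 30 s, and 2 h would keep the two newest and prune the other two.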
Security
- TLS with mutual authentication (client certs) is mandatory
- No fallback to insecure connections
- Server validates client certificates against configured CA
Observability
Prometheus Metrics
The server exposes Prometheus metrics on a configurable HTTP endpoint:
- Request metrics: `router_hosts_requests_total`, `router_hosts_request_duration_seconds`
- Storage metrics: `router_hosts_storage_operations_total`, `router_hosts_storage_duration_seconds`
- Host metrics: `router_hosts_hosts_entries`
- Hook metrics: `router_hosts_hook_executions_total`, `router_hosts_hook_duration_seconds`
See Operations Guide for configuration.
Health Endpoints
- Server: `Liveness`, `Readiness`, and `Health` RPCs within `HostsService` for monitoring probes
- Operator: HTTP endpoints at `/healthz` (liveness) and `/readyz` (readiness)
Configuration
Server Configuration
Server requires:
- hosts_file_path setting (no default) - prevents accidental overwrites
- TLS certificate paths
- Storage backend: SQLite (default), PostgreSQL URL, or DuckDB path
- Optional: retention policy, hooks, metrics endpoint, timeout settings
Client Configuration
- Config file optional (CLI args override)
- Server address and TLS cert paths
Storage Layer
- Storage trait in `router-hosts-storage` abstracts database operations
- Available backends:
  - SQLite (default): Lightweight embedded, single file, wide compatibility
  - PostgreSQL: Multi-instance deployments, connection pooling, cloud-ready
  - DuckDB: Embedded, single file, feature-rich analytics (requires `router-hosts-duckdb` binary)
- Default path: `~/.local/share/router-hosts/hosts.db` (XDG-compliant)
- Use in-memory mode for tests: `SqliteStorage::new(":memory:")`
- Shared test suite validates any `Storage` implementation (42 tests)
Validation
All validation logic lives in router-hosts-common/src/validation.rs:
- IPv4/IPv6 address validation
- Hostname validation (DNS compliance)
- Duplicate detection happens at database level
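As a rough illustration of the address and hostname checks, the sketch below uses the standard library's `IpAddr` parser and the usual RFC 1123 label rules. The function names and error strings are assumptions; the real rules in `validation.rs` may be stricter or looser (for example, this sketch rejects a trailing dot).

```rust
use std::net::IpAddr;

/// Parse an IPv4 or IPv6 address using the standard library.
fn validate_ip(input: &str) -> Result<IpAddr, String> {
    input
        .parse::<IpAddr>()
        .map_err(|e| format!("invalid IP address {:?}: {}", input, e))
}

/// Check a hostname against common RFC 1123 rules: total length 1..=253,
/// each label 1..=63 characters of [A-Za-z0-9-], no leading or trailing hyphen.
fn validate_hostname(name: &str) -> Result<(), String> {
    if name.is_empty() || name.len() > 253 {
        return Err(format!("hostname length {} is outside 1..=253", name.len()));
    }
    for label in name.split('.') {
        let valid_chars = label.bytes().all(|b| b.is_ascii_alphanumeric() || b == b'-');
        if label.is_empty()
            || label.len() > 63
            || !valid_chars
            || label.starts_with('-')
            || label.ends_with('-')
        {
            return Err(format!("invalid label {:?} in hostname {:?}", label, name));
        }
    }
    Ok(())
}
```

Returning the parsed `IpAddr` rather than a bare `bool` lets callers keep the typed value instead of re-parsing the string later.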
Error Handling
Map domain errors to appropriate gRPC status codes:
- INVALID_ARGUMENT - validation failures
- ALREADY_EXISTS - duplicates
- NOT_FOUND - missing entries/snapshots
- ABORTED - concurrent write conflicts (version mismatch)
- PERMISSION_DENIED - TLS auth failures
Include detailed error context in response messages.
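The mapping above could look something like the following. To keep the example free of external crates, `Code` is a local stand-in for the gRPC status codes, and `DomainError` with its variants is hypothetical; in a tonic-based server the pair would typically be turned into `tonic::Status::new(code, message)`.

```rust
/// Hypothetical domain error type; variant names are illustrative only.
#[derive(Debug)]
enum DomainError {
    Validation(String),
    Duplicate(String),
    NotFound(String),
    VersionConflict { expected: u64, actual: u64 },
    Unauthorized(String),
}

/// Local stand-in for the gRPC status codes listed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Code {
    InvalidArgument,
    AlreadyExists,
    NotFound,
    Aborted,
    PermissionDenied,
}

/// Map a domain error to a (code, message) pair, keeping the error
/// context in the message so clients can see what went wrong.
fn to_status(err: &DomainError) -> (Code, String) {
    match err {
        DomainError::Validation(msg) => {
            (Code::InvalidArgument, format!("validation failed: {}", msg))
        }
        DomainError::Duplicate(what) => {
            (Code::AlreadyExists, format!("already exists: {}", what))
        }
        DomainError::NotFound(what) => (Code::NotFound, format!("not found: {}", what)),
        DomainError::VersionConflict { expected, actual } => (
            Code::Aborted,
            format!("version conflict: expected version {}, found {}", expected, actual),
        ),
        DomainError::Unauthorized(reason) => (
            Code::PermissionDenied,
            format!("TLS authentication failed: {}", reason),
        ),
    }
}
```

Keeping the mapping in one exhaustive `match` means the compiler flags any new domain error variant that has not been assigned a status code.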
Testing Strategy
- Unit tests: Mock filesystem for /etc/hosts operations
- Integration tests: Use in-memory DuckDB, self-signed certs
- Storage tests: Shared test suite in `router-hosts-storage/tests/common/` (42 tests)
- Any new storage backend must pass all tests via `run_all_tests(&storage).await`
- E2E tests: Docker containers with real mTLS (10 tests across 4 scenarios)
- No real file system writes in tests (use tempfiles or mocks)
/etc/hosts Format
Generated file includes:
- Header comment with metadata (timestamp, entry count)
- Sorted entries (by IP, then hostname)
- Hostname aliases (sorted alphabetically after canonical hostname)
- Inline comments from entry metadata
- Tags shown as [tag1, tag2] in comments
Example:
# Generated by router-hosts
# Last updated: 2025-11-28 20:45:32 UTC
# Entry count: 42
192.168.1.10 server.local srv web # Main server [prod]
192.168.1.20 nas.home.local # NAS storage [homelab]
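A renderer producing this layout might look like the sketch below. The `Entry` struct and its field names are assumptions, the timestamp is passed in as a preformatted string, and sorting uses the numeric ordering of `std::net::IpAddr` (the source only says "by IP", so lexical ordering is also possible).

```rust
use std::net::IpAddr;

/// Hypothetical in-memory entry; field names are assumptions.
struct Entry {
    ip: IpAddr,
    hostname: String,
    aliases: Vec<String>,
    comment: Option<String>,
    tags: Vec<String>,
}

/// Render the header plus one line per entry, sorted by IP then hostname,
/// with aliases sorted alphabetically after the canonical hostname.
fn render_hosts(mut entries: Vec<Entry>, timestamp: &str) -> String {
    entries.sort_by(|a, b| (a.ip, &a.hostname).cmp(&(b.ip, &b.hostname)));

    let mut out = String::new();
    out.push_str("# Generated by router-hosts\n");
    out.push_str(&format!("# Last updated: {}\n", timestamp));
    out.push_str(&format!("# Entry count: {}\n", entries.len()));

    for entry in &mut entries {
        entry.aliases.sort();
        let mut names = vec![entry.hostname.clone()];
        names.extend(entry.aliases.iter().cloned());
        let mut line = format!("{} {}", entry.ip, names.join(" "));

        // Inline comment: free-text comment first, then tags as [tag1, tag2].
        let mut trailer = Vec::new();
        if let Some(comment) = &entry.comment {
            trailer.push(comment.clone());
        }
        if !entry.tags.is_empty() {
            trailer.push(format!("[{}]", entry.tags.join(", ")));
        }
        if !trailer.is_empty() {
            line.push_str(&format!(" # {}", trailer.join(" ")));
        }

        out.push_str(&line);
        out.push('\n');
    }
    out
}
```

Sorting inside the renderer keeps the output deterministic regardless of the order in which the projection returns entries, which makes diffs between snapshots meaningful.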
Hostname Aliases
Full support for hostname aliases per hosts(5) format.
CLI Usage
# Add host with aliases (--alias is repeatable)
router-hosts host add --ip 192.168.1.10 --hostname server.local \
--alias srv --alias web
# Update aliases (replaces all)
router-hosts host update <id> --alias primary --alias backup
# Clear all aliases
router-hosts host update <id> --clear-aliases
# Import with alias conflict override
router-hosts host import --file hosts.txt --conflict-mode strict --force
Key Behaviors
- Aliases are sorted alphabetically in all output for deterministic results
- Search matches both canonical hostname and aliases (case-insensitive)
- Validation rejects an alias that matches the canonical hostname or duplicates another alias
- CSV format: aliases are semicolon-separated (e.g., `srv;web;api`)
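The behaviors listed above can be sketched with three small helpers. All function names are hypothetical, and two details are assumptions not stated in the source: search is treated as a substring match, and alias comparison during validation is case-insensitive.

```rust
use std::collections::HashSet;

/// Split a semicolon-separated alias field (e.g. "srv;web;api")
/// into a sorted list, ignoring empty segments.
fn parse_alias_field(field: &str) -> Vec<String> {
    let mut aliases: Vec<String> = field
        .split(';')
        .map(str::trim)
        .filter(|a| !a.is_empty())
        .map(String::from)
        .collect();
    aliases.sort();
    aliases
}

/// Case-insensitive match against the canonical hostname or any alias.
fn matches_query(hostname: &str, aliases: &[String], query: &str) -> bool {
    let needle = query.to_ascii_lowercase();
    std::iter::once(hostname)
        .chain(aliases.iter().map(String::as_str))
        .any(|name| name.to_ascii_lowercase().contains(&needle))
}

/// Reject an alias equal to the canonical hostname or repeated in the list.
fn validate_aliases(hostname: &str, aliases: &[String]) -> Result<(), String> {
    let canonical = hostname.to_ascii_lowercase();
    let mut seen = HashSet::new();
    for alias in aliases {
        let key = alias.to_ascii_lowercase();
        if key == canonical {
            return Err(format!("alias {:?} matches the canonical hostname", alias));
        }
        if !seen.insert(key) {
            return Err(format!("duplicate alias {:?}", alias));
        }
    }
    Ok(())
}
```

For example, `parse_alias_field("web;srv;api")` yields `api`, `srv`, `web` in that order, and a query of `WEB` matches a host whose aliases include `web`.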
API Notes
- `UpdateHostRequest` uses `AliasesUpdate` wrapper message for aliases
- `None` = preserve existing, `Some(vec![])` = clear, `Some(values)` = replace
- Same pattern used for tags via `TagsUpdate` wrapper
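The wrapper exists because a proto3 repeated field cannot distinguish "not sent" from "sent empty". A minimal sketch of the three-way semantics follows; the `values` field name and the `apply_alias_update` helper are assumptions for illustration, not the generated code.

```rust
/// Hypothetical Rust shape of the wrapper message; the field name is assumed.
#[derive(Debug, Clone, PartialEq)]
struct AliasesUpdate {
    values: Vec<String>,
}

/// Apply an optional alias update to the current alias list:
/// - `None` preserves the existing aliases
/// - `Some` with an empty list clears them
/// - `Some` with values replaces them (sorted for deterministic output)
fn apply_alias_update(current: Vec<String>, update: Option<AliasesUpdate>) -> Vec<String> {
    match update {
        None => current,
        Some(AliasesUpdate { mut values }) => {
            values.sort();
            values
        }
    }
}
```

This mirrors the CLI: omitting `--alias` leaves aliases alone, `--clear-aliases` corresponds to the empty wrapper, and repeated `--alias` flags replace the whole list.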
Rust Best Practices
Error Handling
- Use `Result<T, E>` for fallible operations (never `panic!` in library code)
- Use `thiserror` for custom error types with good error messages
- Use `anyhow` for application-level error handling
- Propagate errors with `?` operator, not `.unwrap()` or `.expect()`
- Only use `.expect()` in tests or when invariant is guaranteed by type system
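As a standard-library-only sketch of these rules, the example below writes by hand the `Display`, `Error`, and `From` impls that `thiserror` would normally derive. `ConfigError` and `parse_port` are hypothetical and exist only to show `?` propagation.

```rust
use std::fmt;
use std::num::ParseIntError;

/// Hypothetical library-level error type.
#[derive(Debug)]
enum ConfigError {
    MissingKey(&'static str),
    InvalidPort(ParseIntError),
}

impl fmt::Display for ConfigError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ConfigError::MissingKey(key) => write!(f, "missing config key: {}", key),
            ConfigError::InvalidPort(source) => write!(f, "invalid port: {}", source),
        }
    }
}

impl std::error::Error for ConfigError {}

/// Lets `?` convert a `ParseIntError` into `ConfigError` automatically.
impl From<ParseIntError> for ConfigError {
    fn from(source: ParseIntError) -> Self {
        ConfigError::InvalidPort(source)
    }
}

/// Parse an optional port string without panicking on bad input.
fn parse_port(raw: Option<&str>) -> Result<u16, ConfigError> {
    let text = raw.ok_or(ConfigError::MissingKey("port"))?;
    let port = text.trim().parse::<u16>()?; // converted via the From impl
    Ok(port)
}
```

Callers get a typed error they can match on, and nothing in the function can panic on malformed input.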
Type Safety
- Use newtypes for domain concepts: `struct HostId(String)` not bare `String`
- Use builder pattern for complex constructors
- Leverage Rust's type system to make invalid states unrepresentable
- Use `#[non_exhaustive]` for public enums that might grow
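A newtype along the lines of `HostId` might be sketched as follows. The non-empty check and the `describe` helper are assumptions for illustration; the project's real ID format and constructor are not specified here.

```rust
use std::fmt;

/// Newtype wrapper so a host ID cannot be confused with any other string.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct HostId(String);

impl HostId {
    /// Validating constructor: the only way to build a `HostId`.
    /// The non-empty rule is an assumed example of an invariant.
    fn new(raw: &str) -> Result<Self, String> {
        if raw.trim().is_empty() {
            return Err("host id must not be empty".to_string());
        }
        Ok(HostId(raw.to_string()))
    }

    fn as_str(&self) -> &str {
        &self.0
    }
}

impl fmt::Display for HostId {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(&self.0)
    }
}

/// Taking `&HostId` and `&str` as distinct types means the compiler
/// rejects calls that swap the two arguments.
fn describe(id: &HostId, hostname: &str) -> String {
    format!("{} ({})", hostname, id)
}
```

Because the inner field is private to the module, code elsewhere has to go through `HostId::new`, so an empty ID is unrepresentable past that boundary.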
Async Patterns
- Use `tokio::task::spawn_blocking` for CPU-bound or blocking work; reserve `tokio::spawn` for concurrent async tasks
- Use `tokio::select!` carefully (ensure all branches are cancel-safe)
- Avoid holding locks across `.await` points
- Use `#[tokio::test]` for async tests
Performance
- Use `&str` for read-only string data, `String` for owned
- Prefer `&[T]` over `&Vec<T>` in function parameters
- Use `Cow<'_, str>` when you might need to own or borrow
- Avoid unnecessary clones - use references when possible
- Use `Arc<T>` for shared ownership across threads
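Two of these guidelines are easy to show in a few lines. Both functions below are hypothetical examples rather than project code: one returns `Cow<'_, str>` so the common case allocates nothing, the other takes a slice instead of `&Vec<T>`.

```rust
use std::borrow::Cow;

/// Lowercase a hostname only when needed. Already-lowercase input is
/// returned as a borrow, so no allocation happens in the common case.
fn normalize_hostname(input: &str) -> Cow<'_, str> {
    if input.bytes().any(|b| b.is_ascii_uppercase()) {
        Cow::Owned(input.to_ascii_lowercase())
    } else {
        Cow::Borrowed(input)
    }
}

/// Accepts any slice, so callers can pass a `Vec`, an array, or a sub-slice.
fn count_local(hostnames: &[String]) -> usize {
    hostnames.iter().filter(|h| h.ends_with(".local")).count()
}
```

A caller that only reads the result can treat the `Cow` as a `&str`; one that needs ownership can call `.into_owned()` and pays for the copy only then.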
Memory Safety
- Minimize `unsafe` code (justify each use with SAFETY comment)
- Use `#[must_use]` for types/functions where ignoring return is likely a bug
- Prefer stack allocation over heap when possible
Code Organization
- Keep functions small (< 50 lines)
- Maximum cyclomatic complexity of 10 per function
- Use modules to organize related functionality
- Public APIs should be minimal and well-documented
Documentation
- All public items must have doc comments (`///`)
- Include examples in doc comments for non-trivial APIs
- Use `//!` module-level docs to explain module purpose
- Document panics, errors, and safety requirements
Modern Rust Features (Edition 2021+)
Use these patterns:
- if let chains: if let Some(x) = opt && x > 5 { } (requires Rust 1.88+ and edition 2024)
- let else: let Some(x) = opt else { return }
- impl Trait in function signatures for clarity
- async fn in traits (stable since Rust 1.75; async-trait is still needed for dyn dispatch)
- Const generics where applicable
Avoid:
- .clone() on Arc<T> without understanding ref counting
- Rc<RefCell<T>> in async code (not Send)
- String allocations in hot paths
- Excessive trait bounds (use where clauses for readability)
Dependencies
Core dependencies (see Cargo.toml for versions):
- tonic + prost - gRPC/protobuf
- tonic-build + protobuf-src - protobuf code generation with bundled protoc
- duckdb - embedded database
- tokio - async runtime
- clap - CLI parsing
- serde + toml - config
- rustls - TLS
- tracing - logging
- proptest - property-based testing
Note on Protocol Buffers: The project uses protobuf-src to provide a bundled
Protocol Buffers compiler (protoc), eliminating the need for system installation.
This makes the build self-contained and portable across development environments.
Dependency Management
Philosophy:
- Minimize dependencies (each dependency is a liability)
- Prefer well-maintained crates with recent updates
- Run cargo-audit regularly to check for security advisories
- Pin versions in Cargo.lock (committed for binaries)
Workspace Dependencies:
- All dependency versions defined in workspace Cargo.toml
- Individual crates use workspace = true references
- Keep dependencies up-to-date (check monthly)
Security: