ADR-003: Clock Authority & Time Synchronization
Decision record for clock authority, time synchronization, and fallback policies across nodes and central.
Metadata
- ADR ID: ADR-003
- Title: Clock Authority and Time Synchronization Governance
- Status: Proposed
- Date: 2026-03-15 (proposed)
- Owner: Operations/Hardware lead
- Target Decision Date: 2026-04-10
- Relates to: System-Gaps-Deferred#time-synchronization-and-clock-authority
Problem / Context
The architecture specifies that nodes acquire GNSS timing from a receiver (Node-Hardware-Interface) and the central aggregator synchronizes clocks across the system. However, several questions remain unresolved:
- Primary authority: When node and central disagree on time, who wins? (node GNSS, central NTP, or hybrid consensus?)
- Accuracy targets: What maximum clock offset between nodes and central is acceptable? (1 ms, 1 sec, 10 sec?)
- Degradation handling: If GNSS is unavailable (urban canyon, jamming), what's the fallback?
- Central sync mechanism: NTP, PTP, or custom protocol?
- Timestamp consistency: Are all frame timestamps node-side GNSS, or do we add central-side timestamps?
- Audit trail: Can we reconstruct which clock authority was in effect when each frame was decoded?
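To make the timestamp-consistency and audit-trail questions concrete, here is a minimal sketch of a per-frame record that carries both a node-side timestamp and an optional central-side receive timestamp, tagged with the clock authority in effect. All names (ClockAuthority, FrameTimestamp, the field names) are hypothetical and not taken from the existing architecture documents; the sketch only illustrates what an answer might need to store, not a chosen design.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ClockAuthority(Enum):
    """Hypothetical labels for the source disciplining the node clock."""
    NODE_GNSS = "node_gnss"
    CENTRAL_NTP = "central_ntp"
    FREE_RUNNING = "free_running"  # no external reference available


@dataclass(frozen=True)
class FrameTimestamp:
    frame_id: int
    node_time_ns: int                         # node clock when the frame was decoded
    authority: ClockAuthority                 # authority in effect at decode time
    central_rx_time_ns: Optional[int] = None  # set only if central also stamps frames

    def apparent_skew_ns(self) -> Optional[int]:
        """Central receive time minus node time; includes network latency."""
        if self.central_rx_time_ns is None:
            return None
        return self.central_rx_time_ns - self.node_time_ns
```

Storing the authority with every frame, rather than in a separate log, would let an auditor answer the question per frame without correlating two data sources, at the cost of a few extra bytes per record.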
Current State
- Node-Hardware-Interface mentions GNSS receiver but doesn't specify timing responsibility
- System-CentralAggregator mentions "synchronization" but no protocol
- No fallback defined for GNSS outage
- Existing prototypes use system clock (no GNSS or NTP)
Why This Matters
- Telemetry accuracy: Rocket trajectory reconstruction depends on precise timing
- Multi-target synchronization: If two targets' timestamps drift, relative DoA estimates degrade
- Compliance: Audit logs must have trustworthy timestamps
- Security: Time is used in certificate validation (ADR-001); incorrect time breaks authentication
Deferred Decision Options
Option A: Central NTP Authority (Simple)
Approach: Central aggregator runs NTP server. All nodes (and frontend) sync to central via NTP. Node GNSS used only for mission-critical fields (geographic position), not timing.
Pros:
- Single source of truth (central NTP)
- Proven technology (NTP is decades old, stable)
- Self-contained: works in air-gapped environments (no internet needed)
- Simple to debug: time comes from one place
Cons:
- Central is single point of failure for timing (if central dies, node clock drifts)
- Requires network connectivity (if node is isolated, loses sync)
- Less accurate than GNSS (NTP typically reaches about 1 ms on a LAN and tens of ms over a WAN; GNSS ~1-10 µs)
Implementation:
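- Central: chrony NTP server (systemd-timesyncd is an SNTP client only and cannot serve time), bound to the internal network interface rather than 127.0.0.1 so that nodes can reach it
- Node: ntpdate or chrony client pointing to central IP
- Sync interval: every 60 seconds (with jitter to avoid thundering herd)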
- Fallback: if NTP sync fails for >5 mins, node enters degraded timing mode (warn operator)
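The sync interval and fallback rule above can be sketched as follows. This is an illustration under stated assumptions, not a definitive implementation: the ±10% jitter fraction and the names next_sync_delay and SyncMonitor are hypothetical, and in practice chrony manages its own polling, so logic like this would live in a supervising process that watches sync health.

```python
import random
from dataclasses import dataclass

SYNC_INTERVAL_S = 60.0       # nominal interval from this ADR
JITTER_FRACTION = 0.1        # assumption: +/-10% uniform jitter
DEGRADED_AFTER_S = 5 * 60.0  # ADR: degraded mode after >5 min without sync


def next_sync_delay(rng: random.Random) -> float:
    """Return the nominal interval with uniform jitter applied."""
    jitter = rng.uniform(-JITTER_FRACTION, JITTER_FRACTION)
    return SYNC_INTERVAL_S * (1.0 + jitter)


@dataclass
class SyncMonitor:
    """Tracks the last successful sync against a monotonic clock."""
    last_success_s: float

    def record_success(self, now_s: float) -> None:
        self.last_success_s = now_s

    def is_degraded(self, now_s: float) -> bool:
        # Strictly greater than the threshold, matching ">5 mins".
        return (now_s - self.last_success_s) > DEGRADED_AFTER_S
```

A caller would pass time.monotonic() as now_s rather than wall-clock time, since the wall clock is the very thing that may be wrong when sync has failed.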
Option B: Node GNSS Authority (Distributed)
Approach: Each node is responsible for own GNSS timing. Central collects timestamps from nodes, trusts them. No central NTP server.
Pros:
- Highly accurate: GNSS timing is excellent (~1-10 µs)
- Decoupled: node doesn't depend on central for timing
- Natural for rocket telemetry: GNSS is already on board for position
- Survives network partition: node keeps accurate time independently
Cons:
- Requires GNSS receiver on all nodes (hardware cost + startup delay)
- GNSS is not available indoors or in dense urban areas (urban canyon, launch pad under roof)
- Multi-target sync depends on each target having good GNSS lock (hard to guarantee)
- Mis-synchronized clocks are harder to debug (no single reference)
Implementation:
- Node: Read GNSS receiver at 1 Hz; update system clock via adjtime()
- Node: Export clock confidence metric (e.g., "GNSS locked 95% of time")
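One way the clock confidence metric could be computed is as the fraction of recent 1 Hz samples during which the receiver reported a lock. The sketch below assumes a sliding window of the last hour (3600 samples); the class name GnssLockTracker and the window size are hypothetical choices for illustration, and how "locked" is determined from the receiver is left to the caller.

```python
from collections import deque


class GnssLockTracker:
    """Fraction of recent samples (one per GNSS read) that had a lock."""

    def __init__(self, window_samples: int = 3600) -> None:
        if window_samples <= 0:
            raise ValueError("window_samples must be positive")
        # deque with maxlen drops the oldest sample automatically.
        self._samples = deque(maxlen=window_samples)

    def record(self, locked: bool) -> None:
        self._samples.append(bool(locked))

    def lock_fraction(self) -> float:
        """Return a value in [0.0, 1.0]; 0.0 when no samples exist yet."""
        if not self._samples:
            return 0.0
        return sum(self._samples) / len(self._samples)
```

Reporting 0.0 before any sample arrives is a conservative choice: a node that has just booted makes no claim of confidence until it has actually observed a lock.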