ADR-0009 — Autonomy Under Uncertainty¶

Status: Accepted
Date: 2026-06-04

Context¶

ADR-0008 freezes the mechanism layer for perception failure: a typed Estimate[T] envelope, a Validity ladder, and a closed catalog of PerceptionMode values with quantitative entry criteria. It deliberately stops short of saying what the vehicle should do in each mode.

ADR-0000 commits the project to a single behavioral principle: the system must know when it knows, know when it does not know, and alter its behavior accordingly. The architecture red-team review (§2.6) is sharper: without a documented degradation policy, the GPS-denied autonomy claim is marketing.

A behavioral policy under uncertainty has to resolve, at minimum, four questions:

What does the vehicle do when it enters each perception mode? (Hover? RTL? Land? Hand to pilot?)
Who has authority — autonomous controller, pre-recorded recovery routine, human pilot, safety supervisor?
How does the mission layer reason about uncertainty when planning, not only when reacting?
What invariants are non-negotiable even under degraded perception (kill switch, no-fly geometry, max altitude)?

These are policy decisions with a different change rate than the mechanism. They will be revisited as the project moves from PyBullet to Gazebo+PX4 to hardware. Putting them in a separate ADR keeps the mechanism stable while the policy evolves.

This ADR fixes the policy for Phases 1–6. Phases 7+ (real hardware) may add stricter rules without breaking this contract; they may not loosen it.

Decision¶

Project Ghost adopts a three-tier autonomy stack in which perception modes (ADR-0008) map to authority transitions, mission-level uncertainty reasoning is explicit, and a small set of safety invariants is enforced unconditionally.

1. Tiered autonomy¶

Tier	Name	Authority over actuators	When active
T0	Safety supervisor	Veto on every command; cannot generate motion of its own except `STOP`/`KILL`.	Always.
T1	Reflex layer	Closed-loop stabilization, attitude/rate control, dead reckoning hold. Deterministic, no perception input.	Always running; takes command when T2/T3 yield.
T2	Reactive layer	Behaviors per perception mode (hover, slow ascend, blind RTL). Consumes `Estimate[T]` but uses no global map.	Engaged in `DEGRADED` modes per §2.
T3	Deliberative layer	Mission planning, SLAM consumer, frontier exploration. Trusts `VALID` estimates.	Engaged only in `NOMINAL` mode.

Authority flows downward when uncertainty rises. T3 yields to T2 when leaving NOMINAL. T2 yields to T1 in PERCEPTION_DEAD. T0 may pre-empt any tier at any time.

2. Behavioral response per perception mode¶

The mapping below is the canonical policy. Modules implementing reactive behaviors MUST consult this table; deviations require an ADR amendment.

`PerceptionMode`	Active tier	Behavior
`NOMINAL`	T3	Mission plan executes normally.
`LOW_TEXTURE`	T2	Reduce horizontal speed to `low_texture_max_speed_mps` (default 0.5). Continue mission if path is collision-free per last known map; otherwise hover and request replan.
`LOW_LIGHT`	T2	Stop forward motion. Hold position with inflated covariance. Emit `RECOVERY_REQUESTED` event. After `low_light_recovery_timeout_ms` (default 5000), ascend slowly at `slow_ascend_mps` (default 0.3) toward `recovery_altitude_m` (default 2.0 above takeoff) seeking texture.
`IMU_SATURATION`	T1 + T0	T2/T3 suspended for `imu_recovery_hold_ms` (default 200). T1 reduces commanded body rates to zero. T0 inspects: if saturation persists > `imu_kill_threshold_ms` (default 1000), escalate to `KILL`.
`VIO_LOST`	T2	Hover on dead reckoning for `dr_hover_window_ms` (default 3000). If VIO does not recover, attempt blind RTL toward takeoff coordinate using IMU integration with growing covariance. If covariance crosses `dr_abort_covariance`, switch to controlled descent.
`MAP_AMBIGUOUS`	T2	Reject loop closures while ambiguous. Continue VO-only navigation. Do not commit map updates. Replan to revisit a previously confident landmark if available.
`PERCEPTION_DEAD`	T1 + T0	All deliberative and reactive control suspended. T1 commands controlled descent at `dead_descent_mps` (default 0.5). T0 monitors altitude; cuts thrust at `kill_altitude_m` (default 0.3 above takeoff).

Every mapping above is a default. Scenario configs (configs/missions/*.yaml) may override numeric thresholds but may not change the active tier or the qualitative behavior.

3. Uncertainty-aware mission planning (T3)¶

The deliberative layer treats uncertainty as a planning input, not only a runtime signal.

Plans are scored against an expected information gain term and a risk term. The risk term integrates expected validity along the planned path, using a forward uncertainty model documented in docs/specs/mission.md.
The planner rejects plans whose worst-case predicted covariance exceeds mission_max_covariance_norm over a contiguous segment longer than mission_max_blind_segment_m (default 5).
When uncertainty is unavoidable (e.g. crossing a textureless area is required to reach the goal), the planner inserts an explicit active perception sub-goal: slow ascent, yaw scan, or revisit of a known landmark before committing.
Goals carry an uncertainty_budget: the planner records, accepts, and exposes the maximum tolerated covariance for each goal. Mission code may not silently exceed the budget.

These mechanisms exist from Phase 5 onward; Phases 1–4 satisfy the contract trivially (no planner, no mission).

4. Safety invariants (T0)¶

The safety supervisor enforces invariants that no tier may violate, regardless of perception state. They are evaluated on every actuator command before transmission.

Invariant	Default	Override
`max_altitude_m`	30.0 (sim), 5.0 (hardware)	Scenario config; hardware floor `≤ scenario.max_altitude`.
`geofence_polygon`	Scenario-defined	Must be present for any non-`empty_room` scenario.
`max_battery_age_s`	1200	Hardware-only; ignored in sim.
`command_freshness_ms`	200	Same as ADR-0001 `command_timeout_ns`.
`nan_inf_reject`	Always on	Cannot be disabled.
`kill_on_loss_of_comms_ms`	None in sim; 5000 on hardware	Cannot be loosened on hardware.

Any violation produces a SAFETY_VIOLATION event with severity CRITICAL, an immediate STOP or KILL command per a small lookup table, and a persistent telemetry record.

5. Human override¶

When a pilot input channel is present (manual flight, Phase 1; HIL, Phase 8; hardware, Phase 9), pilot input is treated as a T2-level authority by default and is upgradable to T0 by an explicit kill or RTL command.

Pilot input never re-enables a tier the safety supervisor has disabled.
Pilot input cannot disarm safety invariants from §4.
A pilot may force the vehicle into a worse mode (e.g. demand aggressive maneuver under LOW_TEXTURE); the system complies but logs a PILOT_OVERRIDE_DEGRADED event with the active mode and validity at the time.

6. Honesty obligations¶

This section makes explicit a set of practices that follow from the principle "know when you do not know":

No estimator may publish validity == VALID when its input covariance is missing or its innovation gate is failing. The default in ambiguity is DEGRADED.
No planner may consume an Estimate without explicitly reading validity. Static analysis (Phase 5 task) will enforce this.
Every RECOVERY_REQUESTED event must specify which producer asked for recovery and why. Anonymous recovery requests are rejected.
Replays surface mode history as a first-class telemetry channel (/perception/mode). Any review of an incident starts there.

Consequences¶

Positive.

A single document maps every perception failure mode to a behavior. Reviewers can audit the table directly.
Authority transitions are explicit and testable. T0 is the only tier with veto power; this is the structural reason hardware deployment will not require rewriting upper layers.
Mission planning that is honest about uncertainty is now a contract, not an aspiration.
Safety invariants are centralized; hardware can tighten them without touching reactive or deliberative code.
The pilot model is realistic about how humans actually interact with semi-autonomous systems: they intervene under stress, often making things worse. The system records this rather than denying it.

Negative.

The table in §2 is opinionated. Some choices (e.g. blind RTL after VIO_LOST) will need empirical revision. Each revision is an ADR amendment, which is friction. Justified because silent policy drift is worse.
Active perception in mission planning (§3) is non-trivial to implement; it is deferred to Phase 5+ but the contract is fixed now.
The pilot override semantics (§5) constrain UX design: the system cannot offer a "trust me, disable safety" mode. This is intentional.
Hardware-only invariants (battery age, comms loss kill) create a divergence between sim and hardware behaviors. The divergence is bounded and documented, not hidden.

Alternatives considered¶

A. Single monolithic autonomy layer. No T1/T2/T3 split; one controller does everything with a state machine inside. Rejected: experience in PX4, ArduPilot, and academic stacks shows that mixing reflex with deliberation makes both untestable. The tiered structure is widely validated.

B. Behavior trees as the policy language. Adopt a BT framework (py_trees, BehaviorTree.CPP) for §2. Rejected for Phase 1: introduces a runtime dependency and a debugger surface ahead of need. May be revisited at Phase 5 if the table in §2 outgrows hand-written dispatch. The contract here would survive that change.

C. Continuous-mode controller (no discrete modes). Mode-free MPC with uncertainty in the cost function. Rejected: not auditable, not interpretable to a pilot, hard to replay-debug. May coexist with this ADR inside a single mode (e.g. NOMINAL) but does not replace the discrete dispatch.

D. Pilot as T0. Promote the human pilot above the safety supervisor. Rejected: real pilots make wrong calls under stress (this is the whole reason for autonomy). Safety supervisor invariants are a property of the airframe and its environment, not of pilot intent.

E. Defer policy to "when we have hardware". Phase 0 punts on the policy, builds the mechanism, and writes the ADR in Phase 8. Rejected: every Phase 1–7 design decision that touches actuator commands, mission planning, or estimator output depends on knowing what the policy will be. Deferring leaks ad-hoc policy into every layer.

F. Learn the policy from data (RL). A learned controller chooses the response per mode. Rejected as the primary policy: not safe to deploy on hardware, not interpretable, breaks replay determinism. Acceptable as an opt-in sub-policy within a mode, when its outputs flow through the same actuator boundary and T0 invariants.