This dissertation proposes a formal framework for analyzing and designing autonomous AI agent loops using classical and modern control theory. By mapping agent architectures onto feedback control systems, we derive stability conditions, identify failure modes, and prescribe design patterns that prevent common pathologies (oscillation, windup, cascade failures). The framework is validated against Axiom's real-world architecture — a 24/7 autonomous agent running on a Raspberry Pi with 22 cron jobs, multi-agent coordination, and event-driven webhooks.
---
Autonomous AI agents are proliferating, but their loop architectures are designed ad hoc. Common failure modes — notification storms, retry cascades, stale-state decisions, resource exhaustion — are well-understood in control theory but routinely rediscovered (and poorly solved) by agent designers.
Every autonomous agent loop is a feedback control system. Applying control-theoretic analysis yields actionable stability guarantees, performance bounds, and design principles that prevent the most common agent failure modes.
We focus on discrete-time, event-driven agent loops operating on shared mutable state (files, databases, APIs), with particular attention to multi-agent coordination. We draw primarily from classical control (PID, stability analysis), state-space methods, and adaptive control, translating each into agent-native concepts.
---
| Control Theory | Agent Architecture |
|---|---|
| Plant | World state (services, files, APIs, humans) |
| Sensor | State reading (file checks, API calls, event listeners) |
| Controller | Agent decision logic |
| Actuator | Actions (spawn, write, message, API call) |
| Reference input | Goals (user commands, cron triggers, AGENTS.md rules) |
| Error signal | Gap between desired and observed state |
| Disturbance | External changes (API outages, user actions, hardware failures) |
| Sampling period T | Heartbeat/polling interval |
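The mapping above can be made concrete as a loop skeleton. The following is an illustrative sketch, not Axiom's actual code; all names (`read_state`, `decide`, `act`) are placeholders for the roles in the table:

```python
import time

def agent_loop(read_state, decide, act, reference, period_s, max_iters):
    """One feedback loop: sensor -> error signal -> controller -> actuator.

    read_state : sensor     -- observes the plant (world state)
    decide     : controller -- maps the error signal to an action
    act        : actuator   -- applies the action to the plant
    reference  : goal state the loop drives toward
    period_s   : sampling period T (heartbeat interval)
    """
    for _ in range(max_iters):
        observed = read_state()          # sensor reading
        error = reference - observed     # error signal
        if error == 0:
            break                        # reference reached; nothing to do
        act(decide(error))               # controller output -> actuator
        time.sleep(period_s)

# Toy plant: an integer the actuator nudges toward the reference.
state = {"x": 0}
agent_loop(
    read_state=lambda: state["x"],
    decide=lambda err: 1 if err > 0 else -1,   # bang-bang controller
    act=lambda u: state.__setitem__("x", state["x"] + u),
    reference=3,
    period_s=0,     # no delay for the toy example
    max_iters=10,
)
print(state["x"])  # → 3
```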
Most agents don't run continuously — they activate on triggers (cron, webhook, user message). This makes them sampled-data systems subject to Shannon-Nyquist constraints:
Theorem 1 (Agent Nyquist): An agent with polling period T cannot reliably detect or respond to events with duration < 2T.
Corollary: A 30-minute heartbeat cannot catch failures that resolve (or cascade) within an hour. Critical fast-path monitoring requires event-driven interrupts (webhooks), not polling.
The optimal agent architecture is hybrid: periodic polling for slow state (project progress, daily pipelines) + event-driven interrupts for fast transients (failures, user requests). This mirrors industrial SCADA systems.
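A hybrid loop of this kind can be sketched with a blocking event queue whose wait doubles as the inter-poll sleep, so an interrupt wakes the agent immediately while polling still fires every T seconds. This is an illustrative sketch, not Axiom's implementation; all names are hypothetical:

```python
import queue
import threading
import time

def hybrid_agent(poll_slow_state, handle_event, period_s, events, stop):
    """Hybrid loop: periodic polling for slow state plus event-driven
    interrupts for fast transients. Blocking on the event queue serves
    as the sleep between polls, so events never wait for the heartbeat."""
    next_poll = time.monotonic()
    while not stop.is_set():
        timeout = max(0.0, next_poll - time.monotonic())
        try:
            ev = events.get(timeout=timeout)   # interrupt path (webhook)
            handle_event(ev)
        except queue.Empty:
            poll_slow_state()                  # heartbeat path, every period_s
            next_poll = time.monotonic() + period_s

# Demo: two fast events plus a 50 ms heartbeat.
handled, polls = [], []
events, stop = queue.Queue(), threading.Event()
events.put("build_failed")
events.put("user_message")
t = threading.Thread(
    target=hybrid_agent,
    args=(lambda: polls.append(time.monotonic()), handled.append, 0.05, events, stop),
)
t.start()
time.sleep(0.25)
stop.set()
t.join(timeout=2)
print(handled)          # both events handled via the interrupt path
print(len(polls) >= 1)  # heartbeat also fired
```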
---
An agent loop is BIBO (bounded-input, bounded-output) stable if bounded triggers produce bounded actions.
Sufficient conditions for BIBO stability:
1. Finite actions per trigger (no unbounded spawning)
2. Action rate limiting (governor mechanism)
3. State convergence (each action moves state toward reference, not away)
Violation example: A retry loop without backoff. Each failure triggers a retry, each retry triggers another failure → unbounded output from bounded input.
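The fix satisfies conditions 1 and 2 above: cap the attempts (finite actions per trigger) and back off exponentially (rate limiting). A minimal sketch, with all names and defaults illustrative:

```python
import time

def retry_with_backoff(op, max_attempts=5, base_delay=0.01, max_delay=1.0):
    """Bounded retry: finite actions per trigger (BIBO condition 1)
    plus exponential backoff between attempts (condition 2)."""
    for attempt in range(max_attempts):
        try:
            return op()
        except Exception:
            if attempt == max_attempts - 1:
                raise                    # bounded: give up and escalate
            time.sleep(min(max_delay, base_delay * 2 ** attempt))

calls = []
def flaky():
    calls.append(1)
    if len(calls) < 3:
        raise RuntimeError("transient failure")
    return "ok"

print(retry_with_backoff(flaky))  # → ok
print(len(calls))                 # → 3, not unbounded
```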
Agent oscillation occurs when corrective actions overshoot, triggering opposite corrections.
Example: An auto-scaler that adds resources when load > 80% and removes them when load < 20%. If adding capacity overshoots — dropping load from 85% to 15% — the scale-down rule fires, load climbs back above 80%, and the cycle repeats indefinitely.
Prevention: Add hysteresis (dead zone) or derivative damping (act on rate of change, not just magnitude). In agent terms: don't react to instantaneous readings; smooth over a window.
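Both remedies can be combined in a few lines. The following sketch (illustrative thresholds and names, not a prescribed implementation) acts on a windowed average inside a dead zone:

```python
from collections import deque

class HysteresisScaler:
    """Acts only outside a dead zone, and only on a windowed average --
    never on an instantaneous reading."""
    def __init__(self, high=0.8, low=0.2, window=5):
        self.high, self.low = high, low
        self.readings = deque(maxlen=window)

    def decide(self, load):
        self.readings.append(load)
        avg = sum(self.readings) / len(self.readings)
        if avg > self.high:
            return "scale_up"
        if avg < self.low:
            return "scale_down"
        return "hold"                    # dead zone: take no action

s = HysteresisScaler()
# One noisy spike is absorbed by the smoothing window: no action taken.
actions = [s.decide(x) for x in (0.5, 0.5, 0.95, 0.5, 0.5)]
print(actions)  # → ['hold', 'hold', 'hold', 'hold', 'hold']
```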
For N agents operating on shared state:
Theorem 2 (Decoupling Stability): If agent responsibilities partition the state space with no overlap, each agent's stability is independent. The system is stable if and only if each agent is individually stable.
Corollary: Shared mutable state introduces coupling. Stability of the coupled system requires analyzing the full transfer matrix, not just individual loops.
Practical rule: Minimize shared writable state. Use append-only logs, immutable snapshots, or token-based exclusion for shared resources.
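The append-only pattern is the simplest of the three to sketch: writers only ever add lines, so concurrent agents cannot clobber each other's updates. File name and schema below are illustrative:

```python
import json
import os
import tempfile

def append_event(log_path, event):
    """Append-only shared state: add a line, never rewrite the file."""
    with open(log_path, "a") as f:
        f.write(json.dumps(event) + "\n")

def read_events(log_path):
    """Readers reconstruct current state by replaying the log."""
    with open(log_path) as f:
        return [json.loads(line) for line in f]

log = os.path.join(tempfile.mkdtemp(), "events.log")
append_event(log, {"agent": "A", "action": "deploy"})
append_event(log, {"agent": "B", "action": "monitor"})
print(len(read_events(log)))  # → 2 -- both writes survive
```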
---
| Windup Type | Agent Manifestation | Anti-Windup Pattern |
|---|---|---|
| Integral windup | Retry queue grows during outage | Cap queue depth; exponential backoff |
| State windup | Stale cache drives wrong decisions after recovery | TTL on cached state; force re-read after errors |
| Goal windup | Accumulated deferred goals execute simultaneously | Priority queue with max concurrent; age-out old goals |
| Context windup | LLM context fills with error logs | Summarize errors; cap error history in prompts |
Never accumulate without a drain. Every counter, queue, log, or accumulated state must have a maximum bound and a mechanism to shed excess. This is the agent-native formulation of the back-calculation anti-windup scheme.
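The "goal windup" row above can be sketched directly: a depth cap sheds the oldest entries and a TTL ages out stale ones, so the queue always has both a bound and a drain. Class and parameter names are illustrative:

```python
import time
from collections import deque

class BoundedGoalQueue:
    """Every accumulator gets a maximum bound and a drain:
    a depth cap sheds the oldest entries, a TTL ages out stale ones."""
    def __init__(self, max_depth=100, ttl_s=3600.0):
        self.q = deque(maxlen=max_depth)   # cap: oldest silently dropped
        self.ttl_s = ttl_s

    def push(self, goal):
        self.q.append((time.monotonic(), goal))

    def drain(self):
        """Return live goals, shedding anything older than the TTL."""
        now = time.monotonic()
        live = [g for t, g in self.q if now - t < self.ttl_s]
        self.q.clear()
        return live

q = BoundedGoalQueue(max_depth=3, ttl_s=3600)
for g in ("a", "b", "c", "d"):   # 4 pushes into a depth-3 queue
    q.push(g)
live = q.drain()
print(live)  # → ['b', 'c', 'd'] -- 'a' was shed by the depth cap
```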
---
Level 1 — Rate Governor: Limits actions per time window.

```
IF actions_in_window(10min) >= MAX_ACTIONS:
    DEFER action to next window
```

Level 2 — Budget Governor: Limits cumulative resource consumption.

```
IF daily_api_cost >= BUDGET_LIMIT:
    SWITCH to low-cost fallback mode
```

Level 3 — Invariant Governor: Enforces safety invariants before any action.

```
IF disk_usage > 90% OR memory_usage > 85%:
    BLOCK all write operations
    ALERT human
```

Level 4 — Meta-Governor: Monitors the governor itself for failure.

```
IF governor_state_file missing OR corrupt:
    ASSUME worst case (all limits hit)
    ALERT and wait for human reset
```
Governors must be simpler than the system they govern. A governor with its own complex state and logic can fail in ways that compound the original problem. Prefer stateless checks (file size, timestamp age, counter) over stateful monitors.
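The first three levels can be composed into a single pre-action check ordered cheapest-first. This is a minimal sketch under assumed thresholds; the class, parameters, and return codes are illustrative, not Axiom's interface:

```python
import shutil
import time
from collections import deque

class Governor:
    """Layered governor: rate cap, budget cap, and a safety invariant,
    checked in order before any action runs. State is a timestamp window
    and two counters -- deliberately simpler than the governed system."""
    def __init__(self, max_actions=10, window_s=600.0, daily_budget=5.0,
                 max_disk_frac=0.90):
        self.window = deque()
        self.max_actions, self.window_s = max_actions, window_s
        self.spent, self.daily_budget = 0.0, daily_budget
        self.max_disk_frac = max_disk_frac

    def allow(self, cost=0.0, path="/"):
        now = time.monotonic()
        while self.window and now - self.window[0] > self.window_s:
            self.window.popleft()                 # drain the rate window
        if len(self.window) >= self.max_actions:
            return "defer"                        # Level 1: rate governor
        if self.spent + cost > self.daily_budget:
            return "fallback"                     # Level 2: budget governor
        usage = shutil.disk_usage(path)
        if usage.used / usage.total > self.max_disk_frac:
            return "block"                        # Level 3: invariant governor
        self.window.append(now)
        self.spent += cost
        return "allow"

g = Governor(max_actions=2, daily_budget=1.0, max_disk_frac=0.999)
results = [g.allow(cost=0.4) for _ in range(3)]
print(results)  # → ['allow', 'allow', 'defer'] -- third call hits the rate cap
```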
---
Agent environments change: APIs update, user preferences shift, system load varies. A fixed control strategy degrades over time.
1. Reference model: Expected behavior (e.g., "sub-agent completes within 20 minutes")
2. Adaptation law: When actual behavior diverges from reference, adjust parameters
3. Agent implementation: Track sub-agent completion times. If consistently exceeding reference, increase timeout, simplify task decomposition, or switch to a more capable model.
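The three pieces above fit in a few lines. This sketch adapts only the timeout parameter; the reference value, gain, and one-sided adaptation rule are illustrative assumptions:

```python
class AdaptiveTimeout:
    """Model-reference adaptation sketch: compare observed sub-agent
    completion times to a reference model and adjust the timeout."""
    def __init__(self, reference_s=1200.0, timeout_s=1500.0, gain=0.5):
        self.reference_s = reference_s   # reference model: expected duration
        self.timeout_s = timeout_s       # the adapted parameter
        self.gain = gain                 # adaptation gain

    def observe(self, completion_s):
        error = completion_s - self.reference_s   # divergence from reference
        if error > 0:                    # consistently slow -> loosen timeout
            self.timeout_s += self.gain * error
        # (a symmetric rule could tighten the timeout when error < 0)

a = AdaptiveTimeout()
for t in (1400, 1500, 1600):             # sub-agents running long
    a.observe(t)
print(a.timeout_s)  # → 1950.0 -- timeout adapted upward from 1500
```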
The heartbeat interval itself should adapt to operating conditions: poll faster when projects are active, slower when the system is idle. This is a gain-scheduled controller — the gain (polling frequency) varies with operating condition.
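A gain schedule for the heartbeat can be a simple lookup keyed on operating condition. The thresholds and intervals (in minutes) below are illustrative, not Axiom's actual configuration:

```python
def heartbeat_interval(active_projects):
    """Gain schedule: map operating condition (active project count)
    to a polling interval in minutes, within hard bounds."""
    if active_projects == 0:
        return 120       # idle: two-hour heartbeat
    if active_projects <= 2:
        return 60        # light load: hourly
    if active_projects <= 5:
        return 30        # busy: half-hourly
    return 15            # heavy load: 15-minute floor

print([heartbeat_interval(n) for n in (0, 1, 4, 9)])  # → [120, 60, 30, 15]
```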
---
| Criterion | Score | Evidence |
|---|---|---|
| BIBO Stability | 8/10 | Bounded triggers, natural session timeouts. -2 for no explicit spawn rate limit. |
| Oscillation resistance | 9/10 | Cron spacing prevents high-frequency oscillation. No known oscillatory modes. |
| Multi-agent stability | 8/10 | Well-decoupled responsibilities. -2 for occasional shared-state risk (both can write HEARTBEAT.md). |
| Anti-windup | 5/10 | No formal retry caps, no queue depth limits, no backoff in error paths. |
| Governor mechanisms | 4/10 | Only implicit governors (cron spacing, human safety rules). No formal rate/budget/invariant governors. |
| Adaptability | 3/10 | Fixed parameters everywhere. No self-tuning. Human must adjust. |
| Disturbance rejection | 7/10 | Handles single failures well. Cascading failures untested. |
Overall: 44/70 (63%) — Stable but fragile under stress.
1. Add anti-windup to retry paths (+10 robustness): Cap retry queues, implement exponential backoff for all error recovery loops.
2. Implement rate governor for sub-agent spawning (+8 robustness): Max N concurrent sub-agents, tracked in STATE.json.
3. Add budget governor for API calls (+5 robustness): Daily cost counter, automatic fallback to cheaper models at threshold.
4. Formalize HEARTBEAT.md write protocol (+5 stability): Append-only during heartbeats, full rewrite only during designated maintenance windows.
5. Self-tuning heartbeat frequency (+7 adaptability): Gain-schedule the heartbeat interval based on active project count.
---
When designing an autonomous agent loop:
1. Choose the sampling architecture first: periodic polling for slow state, event-driven interrupts for fast transients (Theorem 1).
2. Verify BIBO stability: finite actions per trigger, rate limiting, state convergence.
3. Damp oscillation with hysteresis or windowed smoothing; never act on a single instantaneous reading.
4. Partition responsibilities across agents; minimize shared writable state (append-only logs, immutable snapshots, token-based exclusion).
5. Bound every accumulator (queue, counter, cache, LLM context) and give it a drain.
6. Layer governors (rate, budget, invariant, meta), each simpler than the system it governs.
7. Make key parameters (timeouts, heartbeat intervals) adaptive, with a reference model and an adaptation law.
---
Control theory provides a rigorous, battle-tested framework for reasoning about autonomous agent loops. The isomorphism between feedback controllers and agent architectures is not metaphorical — it's structural. Agents that ignore control-theoretic principles don't avoid these problems; they rediscover them as bugs.
Axiom's architecture, designed through iterative convention, has organically converged on many control-theoretic patterns (hybrid polling, decoupled multi-agent, cron-based rate limiting). Formalizing these patterns — and filling the gaps (anti-windup, governors, adaptation) — is the path from "works most of the time" to "provably robust."
The checklist in Section 8 is the practical takeaway: a minimum viable control-theoretic audit for any autonomous agent loop. Use it.
---
Theoretical depth: Covered classical (PID, stability, Nyquist), state-space (observability, controllability), and adaptive (MRAC, self-tuning) control with correct translations to agent domains.
Practical application: Real analysis of a real system (Axiom), with quantified scorecard and actionable recommendations.
Novel contribution: The agent-controller isomorphism table, anti-windup taxonomy for AI, governor design patterns, and the design checklist are synthesized from control theory but don't exist elsewhere in this form.
Limitations: No formal proofs (appropriate for the level). No simulation validation. Adaptive control section could go deeper into convergence guarantees.
Self-score: 88/100