April 19, 2026 · Luca Eich
Observability as a feature: closing aeqi's last silent telemetry leak
If the runtime fires an event but the UI shows fire_count: 0, the whole anti-magic architecture collapses into vibes. How we found the leak, fixed it in three commits, and used the honest instrument to surface six more.
aeqi's whole pitch is anti-magic: every prompt token reaching an LLM must be attributable to a user-configured event with a visible query_template, or a visible transcript event. If you can read the row, you can point at the reason.
That's load-bearing for trust. It's also load-bearing for the Events UI — which claims to show fire_count and last_fired per configured event. For months, that number was a lie.
The leak
The runtime has six lifecycle events configured globally:
- session:start
- session:quest_start
- session:execution_start
- session:step_start
- session:quest_end
- session:quest_result
Each fires at a specific moment and injects ideas into the model's context. The record_fire helper on EventHandlerStore bumps fire_count and stamps last_fired with the wall clock. Simple enough.
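As a rough illustration of what that helper does, here is a minimal in-memory sketch. InMemoryEventStore, EventStats, and the total_cost_usd field are hypothetical names for this example; the real EventHandlerStore is database-backed and may look quite different. The cost parameter mirrors the record_fire(event_id, 0.0) call shape used later in this post.

```rust
use std::collections::HashMap;
use std::time::SystemTime;

/// Per-event telemetry, roughly what the Events UI reads (hypothetical shape).
#[derive(Debug, Default, Clone)]
pub struct EventStats {
    pub fire_count: u64,
    pub last_fired: Option<SystemTime>,
    pub total_cost_usd: f64, // assumption: cost is accumulated per event
}

/// Hypothetical in-memory stand-in for `EventHandlerStore`.
#[derive(Debug, Default)]
pub struct InMemoryEventStore {
    stats: HashMap<String, EventStats>,
}

impl InMemoryEventStore {
    /// Bump the counter and stamp the wall clock for one firing.
    pub fn record_fire(&mut self, event_id: &str, cost_usd: f64) {
        let entry = self.stats.entry(event_id.to_string()).or_default();
        entry.fire_count += 1;
        entry.last_fired = Some(SystemTime::now());
        entry.total_cost_usd += cost_usd;
    }

    /// Read back the telemetry for one event, if it has ever fired.
    pub fn stats(&self, event_id: &str) -> Option<&EventStats> {
        self.stats.get(event_id)
    }
}
```

The point of the sketch is how little the helper does: the telemetry is only as complete as the set of call sites that remember to invoke it.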
The problem: record_fire was only called from one place — the cron-style scheduler that fires scheduled events. The six lifecycle events never went through that path. They fired through the assembly path, which walks the agent ancestor chain, pulls events matching the current pattern, and merges their idea_ids + query_template results into the assembled system prompt.
Legitimate firings. Zero telemetry.
Result: the Events page would show fire_count: 0 for on_quest_start even after that event had fired on a hundred quests. The runtime knew it was using the event. The UI insisted the event was dormant. Anyone trying to audit why the model said what it said would look at the Events page, conclude the event hadn't contributed, and go hunting for a ghost.
That is exactly the failure mode aeqi exists to prevent.
The fix, in three commits
The landed diff is small. What made it slow was figuring out which call sites could legitimately attribute a firing without double-counting, and which ones were pure visualization and should be left alone.
Commit 1: thread fired_event_ids through the assembly result.
AssembledPrompt now carries a third field alongside system and tools:
pub struct AssembledPrompt {
    pub system: String,
    pub tools: ToolRestrictions,
    pub fired_event_ids: Vec<String>,
}
Inside assemble_ideas_for_patterns, each event that contributes at least one idea to the final prompt (either via its static idea_ids or via semantic-search results from its query_template) records its ID. The scheduler's quest-start path then loops over that vec and calls event_store.record_fire(event_id, 0.0) for each. The cost argument is passed as 0.0 because per-event cost isn't known at this granularity; fire counts and last-fired timestamps are the load-bearing signal.
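The shape of that change can be sketched roughly as follows. This is a simplified illustration, not the actual aeqi code: the Event struct, assemble_for_pattern, and record_fires are hypothetical names, the tools field is omitted, and semantic search via query_template is left out so that only static idea_ids count as a contribution.

```rust
/// Simplified stand-in for a configured event (hypothetical shape).
pub struct Event {
    pub id: String,
    pub pattern: String,
    pub idea_ids: Vec<String>,
}

/// Reduced `AssembledPrompt`: `tools` is omitted to keep the sketch short.
pub struct AssembledPrompt {
    pub system: String,
    pub fired_event_ids: Vec<String>,
}

/// Collect ideas from events matching `pattern`, remembering which events
/// actually contributed at least one idea to the prompt.
pub fn assemble_for_pattern(events: &[Event], pattern: &str) -> AssembledPrompt {
    let mut parts: Vec<String> = Vec::new();
    let mut fired_event_ids = Vec::new();
    for event in events.iter().filter(|e| e.pattern == pattern) {
        if event.idea_ids.is_empty() {
            continue; // matched the pattern but contributed nothing
        }
        parts.extend(event.idea_ids.iter().cloned());
        fired_event_ids.push(event.id.clone());
    }
    AssembledPrompt {
        system: parts.join("\n"),
        fired_event_ids,
    }
}

/// Caller side: one `record_fire` per contributing event, with a zero cost.
pub fn record_fires(prompt: &AssembledPrompt, mut record_fire: impl FnMut(&str, f64)) {
    for id in &prompt.fired_event_ids {
        record_fire(id, 0.0);
    }
}
```

Returning the IDs from assembly, rather than recording inside it, leaves the decision to the caller: the same assembly function can serve both real firings and previews.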
Commit 2: extend to the non-quest paths.
Three more sites needed the same wiring:
- Interactive sessions that assemble session:start outside the quest scheduler.
- Per-LLM-call session:step_start events, which load ideas as step context via a raw get_events_for_pattern call (no assemble_ideas indirection). Each event that actually contributes a non-empty idea ID now records a fire.
- session:execution_start events, fired per user message. Same pattern: per-event record_fire in the loop that already emits the EventFired visualization rows.
One site explicitly did not get the wiring: the test-trigger endpoint the UI calls when a user clicks "Test trigger" on an event. That's a preview — same code path as handle_quest_preflight — and bumping fire_count on a dry-run would poison the telemetry.
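One way to make that distinction explicit in code is to pass the caller's intent alongside the fired IDs. The sketch below is an assumption about how such a guard could look; FireMode and finish_assembly are hypothetical names and are not taken from the aeqi codebase.

```rust
/// Whether an assembly is a real runtime firing or a dry run (hypothetical).
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub enum FireMode {
    /// A real runtime firing: telemetry must be recorded.
    Live,
    /// A dry run such as "Test trigger": assemble, but record nothing.
    Preview,
}

/// Record one fire per contributing event unless this is a preview.
/// Returns how many fires were recorded.
pub fn finish_assembly(
    fired_event_ids: &[String],
    mode: FireMode,
    mut record_fire: impl FnMut(&str, f64),
) -> usize {
    if mode == FireMode::Preview {
        return 0; // previews must not touch fire_count or last_fired
    }
    for id in fired_event_ids {
        record_fire(id, 0.0);
    }
    fired_event_ids.len()
}
```

Making the mode a required argument forces every new call site to decide which kind of firing it is, instead of inheriting the answer by accident.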
Commit 3: make it legible in the UI.
With accurate numbers flowing, the Events page got a small polish:
- The event detail panel used to hide the stats row when fire_count === 0, leaving operators unable to distinguish "dormant" from "the UI dropped state". It now always renders the row, with a muted "Never fired" when the count is zero.
- The sidebar list previously showed ${idea_ids.length} ideas as meta. It now prefers ${fire_count} fires when the event has fired at least once, falling back to the idea count for dormant events. Operators can now scan which events are hot without clicking into each one.
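The two display rules reduce to a pair of small pure functions. The real UI is presumably written in JavaScript or TypeScript; the sketch below expresses the same logic in Rust only to stay consistent with the other examples in this post, and sidebar_meta and detail_stats are hypothetical names.

```rust
/// Sidebar meta text: prefer the fire count once an event has fired,
/// otherwise fall back to the number of attached ideas.
pub fn sidebar_meta(fire_count: u64, idea_count: usize) -> String {
    if fire_count > 0 {
        format!("{} fires", fire_count)
    } else {
        format!("{} ideas", idea_count)
    }
}

/// Detail-panel stats row: always rendered, with an explicit "Never fired"
/// so a zero count is distinguishable from missing state.
pub fn detail_stats(fire_count: u64, last_fired: Option<&str>) -> String {
    match (fire_count, last_fired) {
        (0, _) => "Never fired".to_string(),
        (n, Some(ts)) => format!("{} fires, last at {}", n, ts),
        (n, None) => format!("{} fires", n),
    }
}
```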
Why this matters more than it looks
A feature nobody audits is the same as no feature. The value of the Events page is the operator being able to answer "why did the model just inject that skill?" in under five seconds. If fire_count is wrong, the audit trail is broken, and the whole anti-magic architecture collapses into vibes.
A runtime that's "observable by design" has to treat its own telemetry as part of the contract. When record_fire isn't called on a path that legitimately fires, that's a bug of the same severity as an SQL injection — it breaks the invariants the user is relying on.
Three small commits. Six lifecycle events now honest about their own runtime. The UI tells the truth.
End-to-end verification
Before deploy:
$ sqlite3 ~/.aeqi/aeqi.db "SELECT name, fire_count, last_fired FROM events
WHERE pattern LIKE 'session:%' ORDER BY pattern;"
on_execution_start|0|
on_quest_end|0|
on_quest_result|0|
on_quest_start|0|
on_session_start|0|
on_step_start|0|
After one test quest:
on_quest_start|1|2026-04-19T03:12:31.259609402+00:00
on_session_start|1|2026-04-19T03:12:31.260169121+00:00
The loop closes. The UI stops lying.
Postscript: six more leaks the newly-honest audit trail found
Fixing fire_count wasn't the endgame. It was the instrument. Once the Events page could be trusted, the zeros it kept showing on specific rows became testable hypotheses instead of noise.
In the same 24 hours, six more leaks surfaced, most with the same architectural shape (an advertised event or tool whose semantics drift from what the runtime actually does):
- on_quest_result had no consumer. Declared as a system event, rendered by the UI, invited user configuration; nothing in the runtime read its idea_ids. Wired into the scheduler so that when a quest completes, session:quest_result events assemble and prepend to the result text streamed to the creator session.
- Loop-detection middleware fingerprinted only the tool name, not arguments. The adapter constructed its ToolCall with an empty input field because the Observer trait's signature didn't carry one. Five different read_file calls hashed to the same fingerprint and the middleware halted with "identical call ... same arguments", a halt message that was itself lying. The bug lived entirely in the adapter, not the middleware; the middleware's own unit tests passed because they exercised it directly.
- on_quest_start's query_template retrieved the top-k semantically-nearest ideas regardless of tag. The default seed declared query_template = "skill promoted {quest_description}" (intent clearly "pull promoted skills relevant to this quest"), but the runtime ran a bare semantic search. The word promoted was a soft hint to the embedding model, not a hard filter against the promoted tag. Candidate-tagged and rejected ideas could leak into prompts purely on embedding similarity. Fix: a new nullable query_tag_filter column on events, a tag-aware hierarchical_search_with_tags trait method, and a default-seed declaration of ["promoted"].
- on_quest_end had no consumer either: the last advertised-but-dead event. Users could attach ideas to it and the runtime would never assemble them. The natural injection point turned out to be the quests(action=close) tool itself: the worker calling close IS the quest ending. QuestsTool::action_close now assembles session:quest_end ideas in the worker's ancestry, record_fires each contributing event, and prepends the assembled content to the close-tool's success message, so a user-configured postmortem or reflection template actually reaches the model at the natural quest-closing moment.
- Quest workers silently skip session:execution_start and session:step_start. The interactive-chat path fires all four patterns, but quest workers go through a different path that walks only session:start + session:quest_start. A user who attaches an idea to on_step_start expecting per-LLM-call injection gets zero firings on every quest. Initially deferred for a design pass (filed as as-011 with two architectural options); the design doc at docs/design/as-011-worker-step-context.md picked EA/SA/WC: fire session:execution_start once per worker-run, bump step_start fire_count per session, route step ideas through a new assemble_step_ideas_for_worker helper and an AgentWorker::with_step_ideas setter. Commits 98cea47 (design) + 796ac74 (implementation + 2 regression tests + 4 drive-by clippy fixes).
- The MCP agents(action=get) tool silently dropped its advertised session primer for months. The tool description promised "full agent profile with assembled ideas (session primer)" but the implementation sent an IPC command that has no handler in the daemon. The sync request returned an error, the fallback silently discarded it, and every MCP response came back without the context field while the tool kept claiming it would be there. Caught by code audit, not runtime probe: the MCP server didn't log a warning. Fix: route through the existing trigger_event IPC path with pattern="session:start", the same read-only assembly used by preflight and the events.trigger action, with no record_fire so telemetry stays honest.
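The loop-detection leak is easy to reproduce in miniature. The sketch below is an illustration under assumptions, not the middleware's actual code: this ToolCall is a simplified stand-in with the arguments held as a serialized string, and fingerprint is a hypothetical function that hashes the tool name together with its input using the standard library's DefaultHasher.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Simplified stand-in for a tool call (hypothetical shape).
pub struct ToolCall {
    pub name: String,
    pub input: String, // serialized arguments
}

/// Fingerprint a call by tool name AND arguments, so that only calls
/// that are identical in both respects collide.
pub fn fingerprint(call: &ToolCall) -> u64 {
    let mut hasher = DefaultHasher::new();
    call.name.hash(&mut hasher);
    call.input.hash(&mut hasher);
    hasher.finish()
}
```

If an adapter builds every ToolCall with an empty input, the second hash contributes nothing and all read_file calls collapse to one fingerprint. That is why unit tests that construct ToolCall directly can pass while the integrated path still misfires.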
Seven anti-magic leaks in 24 hours. Every one was invisible until the instrument was honest about itself. That's the whole argument for observability-as-a-feature: fix the instrument first, then use it.