Architecture
How Maida works: OTel span schema, local storage, viewer API, UI, guardrails, integrations, and loop detection. These pieces provide the behavioral evidence used by baselines, assertions, diffs, and downstream reliability workflows. For the full public contract, see the Trace format reference.
Span schema
Maida stores completed runs as OpenTelemetry-compatible spans. Every stored span is a JSON object with a common set of top-level fields:
| Field | Type | Description |
|---|---|---|
| trace_id | 32-hex-character string | OTel trace ID for the run |
| span_id | 16-hex-character string | OTel span ID |
| parent_span_id | 16-hex-character string or null | Parent span ID; null for the run root span |
| name | string | Label (run name, tool name, model name, etc.) |
| kind | string | OTel span kind such as INTERNAL or CLIENT |
| start_time | ISO8601 UTC with microseconds | Span start time |
| end_time | ISO8601 UTC with microseconds or null | Span end time |
| duration_ms | integer or null | Duration if applicable |
| attributes | object | Redacted/truncated span attributes |
| events | array | In-span events |
| status_code | string | OK, ERROR, or UNSET |
| status_description | string | Error description when present |
The root span represents the whole run. Child spans represent LLM calls, tool calls, state updates, loop warnings, and errors.
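To make the field list concrete, here is a minimal sketch of one span record and a shape check for a few of its fields. Every value is invented for illustration, the trailing `Z` on the timestamps is an assumption about the exact ISO8601 rendering, and `check_span` is a hypothetical helper rather than part of Maida.

```python
import json
import re

# A made-up child span; every value is illustrative, not real Maida output.
span = {
    "trace_id": "0af7651916cd43dd8448eb211c80319c",
    "span_id": "b7ad6b7169203331",
    "parent_span_id": "00f067aa0ba902b7",
    "name": "search",
    "kind": "INTERNAL",
    "start_time": "2024-01-01T12:00:00.000000Z",  # exact suffix is an assumption
    "end_time": "2024-01-01T12:00:00.250000Z",
    "duration_ms": 250,
    "attributes": {},
    "events": [],
    "status_code": "OK",
    "status_description": "",
}

HEX32 = re.compile(r"^[0-9a-f]{32}$")
HEX16 = re.compile(r"^[0-9a-f]{16}$")


def check_span(record):
    """Hypothetical helper: return the names of fields whose shape looks wrong."""
    problems = []
    if not HEX32.match(record.get("trace_id", "")):
        problems.append("trace_id")
    if not HEX16.match(record.get("span_id", "")):
        problems.append("span_id")
    parent = record.get("parent_span_id")
    if parent is not None and not HEX16.match(parent):
        problems.append("parent_span_id")
    if record.get("status_code") not in {"OK", "ERROR", "UNSET"}:
        problems.append("status_code")
    return problems


# Round-trip through JSON, since each stored span is a JSON object.
line = json.dumps(span)
print(check_span(json.loads(line)))
```

The check covers only the ID and status fields; timestamps and attributes are left unchecked because their exact encoding is not specified here.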
Derived event view
Some Maida workflows still need a flat event list. spans_to_events() projects the OTel span tree into event-like records such as RUN_START, LLM_CALL, TOOL_CALL, STATE_UPDATE, ERROR, LOOP_WARNING, and RUN_END. Baselines, assertions, diffs, and the viewer can use this projected view without changing the storage contract back to event files.
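The idea of projecting a span tree into a flat list can be sketched as follows. This is not the implementation of spans_to_events(): the function name `project_spans_sketch`, the `event_type` attribute key, and the shape of the output records are all assumptions made for illustration.

```python
def project_spans_sketch(spans, type_key="event_type"):
    """Flatten spans into event-like dicts ordered by start time.

    Illustrative only: type_key is an invented attribute name, and the
    real projection may derive event types differently.
    """
    events = []
    # The root span is the one without a parent; it brackets the run.
    root = next((s for s in spans if s["parent_span_id"] is None), None)
    if root is not None:
        events.append(
            {"event_type": "RUN_START", "name": root["name"], "ts": root["start_time"]}
        )
    children = [s for s in spans if s["parent_span_id"] is not None]
    # Timestamps in one fixed ISO8601 format sort correctly as strings.
    for s in sorted(children, key=lambda s: s["start_time"]):
        events.append(
            {
                "event_type": s["attributes"].get(type_key, "UNKNOWN"),
                "name": s["name"],
                "ts": s["start_time"],
            }
        )
    if root is not None and root["end_time"] is not None:
        events.append(
            {"event_type": "RUN_END", "name": root["name"], "ts": root["end_time"]}
        )
    return events


demo_spans = [
    {"parent_span_id": "aa", "name": "search", "start_time": "2024-01-01T00:00:02.000000Z",
     "end_time": "2024-01-01T00:00:03.000000Z", "attributes": {"event_type": "TOOL_CALL"}},
    {"parent_span_id": None, "name": "my-run", "start_time": "2024-01-01T00:00:00.000000Z",
     "end_time": "2024-01-01T00:00:04.000000Z", "attributes": {}},
    {"parent_span_id": "aa", "name": "gpt-4", "start_time": "2024-01-01T00:00:01.000000Z",
     "end_time": "2024-01-01T00:00:02.000000Z", "attributes": {"event_type": "LLM_CALL"}},
]
demo_events = project_spans_sketch(demo_spans)
print([e["event_type"] for e in demo_events])
```

The point of the sketch is that the flat view is derived on read, so the stored spans stay the single source of truth.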
Storage layout
- Base directory: ~/.maida/ (or MAIDA_DATA_DIR).
- Per run: runs/<trace_id_hex>/
  - meta.json - Run metadata: trace_id, run_name, started_at, ended_at, duration_ms, status, counts (llm_calls, tool_calls, errors, loop_warnings).
  - spans.jsonl - Append-only OTel span records; one span JSON object per line.
meta.json may be created with status: "running" while child spans are exported. When the root span ends, meta.json is overwritten with final status, counts, end time, and duration.
The CLI and some JSON payloads still use the user-facing term run_id for compatibility. In current storage, that value resolves to the OTel trace_id; short prefixes are resolved with trace ID prefix matching.
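As a rough sketch of how this layout can be read from a script, the snippet below resolves a short prefix to a run directory and loads its spans. The helper names are hypothetical, and rejecting ambiguous prefixes is an assumption about how prefix matching behaves, not documented Maida behavior.

```python
import json
import os
import tempfile
from pathlib import Path


def data_dir():
    """Base directory: MAIDA_DATA_DIR if set, otherwise ~/.maida/."""
    return Path(os.environ.get("MAIDA_DATA_DIR", Path.home() / ".maida"))


def resolve_trace_id(prefix, base=None):
    """Hypothetical resolver: return the one run directory name matching prefix."""
    runs = (base or data_dir()) / "runs"
    matches = sorted(
        p.name for p in runs.iterdir() if p.is_dir() and p.name.startswith(prefix)
    )
    if len(matches) != 1:
        # Assumption: zero or multiple matches are treated as an error.
        raise LookupError(f"{len(matches)} runs match prefix {prefix!r}")
    return matches[0]


def load_spans(trace_id, base=None):
    """Read spans.jsonl for a run: one JSON object per non-empty line."""
    path = (base or data_dir()) / "runs" / trace_id / "spans.jsonl"
    with path.open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


# Demo against a throwaway directory so nothing in ~/.maida/ is touched.
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    run_dir = base / "runs" / "0af7651916cd43dd8448eb211c80319c"
    run_dir.mkdir(parents=True)
    (run_dir / "spans.jsonl").write_text('{"name": "demo"}\n', encoding="utf-8")
    full_id = resolve_trace_id("0af7", base)
    spans = load_spans(full_id, base)

print(full_id, spans)
```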
Viewer API
The local server (FastAPI) exposes:
| Endpoint | Description |
|---|---|
| GET /api/runs | List recent runs (metadata only). |
| GET /api/runs/{trace_id} | Run metadata (meta.json). |
| GET /api/runs/{trace_id}/spans | Span array plus the compatibility events projection for the run. |
| GET /api/runs/{trace_id}/paths | Local filesystem paths for the run (run_dir, meta_json, spans_jsonl). |
| GET /api/runs/{trace_id}/rename | Validate that a run can be renamed. |
| POST /api/runs/{trace_id}/rename | Rename a run (body: {"run_name": "..."}, updates meta.json). |
| DELETE /api/runs/{trace_id} | Delete a run directory and its contents (returns 204). |
| GET / | Static UI (maida/ui_static/index.html). |
Default bind: 127.0.0.1:8712. The UI fetches runs and span data from these endpoints and renders a timeline.
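The endpoints can also be called from a script. The sketch below uses only the Python standard library and assumes a viewer is already running on the default bind; the helper names are hypothetical, and the JSON response shapes are not specified here, so the code returns whatever the server sends.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8712"  # default bind from the text above


def api_url(*parts, base=BASE_URL):
    """Build an /api/... URL, e.g. api_url("runs", trace_id, "spans")."""
    return base + "/api/" + "/".join(parts)


def get_json(url, timeout=5.0):
    """GET a URL and decode the JSON body (response shape is not assumed)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def delete_run(trace_id, base=BASE_URL, timeout=5.0):
    """DELETE a run and return the HTTP status (204 on success per the table)."""
    req = urllib.request.Request(api_url("runs", trace_id, base=base), method="DELETE")
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status


print(api_url("runs"))
print(api_url("runs", "0af7651916cd43dd8448eb211c80319c", "spans"))
```

With a viewer running, `get_json(api_url("runs"))` would fetch the run list; the network calls are deliberately left out of the demo so the snippet runs without a server.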
UI overview
- Multi-file static UI (HTML, JS, CSS); no build step. Served from maida/ui_static/.
- Loads run list from /api/runs; when a run is selected (or run_id/run in query), loads /api/runs/{trace_id}/spans.
- Flat timeline: the UI renders the compatibility events projection in chronological order. Each event can be expanded with payload/meta shown as formatted JSON.
- Span data available: the /spans response also includes raw OTel span records for consumers that need trace/span hierarchy.
- LOOP_WARNING events are displayed prominently.
Guardrails
Guardrails are opt-in limits that stop a run before it burns more time, tokens, or tool calls than you intended. They are runtime safety limits and evidence capture tools, not the post-run policy gate.
Available guardrails: stop_on_loop, stop_on_loop_min_repetitions, max_llm_calls, max_tool_calls, max_events, max_duration_s. All default to disabled.
Behavior when a guardrail triggers:
- The triggering evidence is recorded using the existing projected event view
- LoopAbort or GuardrailExceeded is raised
- ERROR is recorded with guardrail, threshold, and actual
- RUN_END (status="error") finalizes the projected run view
- The exception propagates to the caller
Configuration precedence (highest wins): function args (@trace(...), traced_run(...)) > env vars > project YAML > user YAML > defaults.
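The precedence chain can be pictured as layered dictionaries where the first layer that sets a key wins. This is a sketch of the rule only, not Maida's loader: `resolve_guardrails` is a hypothetical name, and representing "disabled" as `False` or `None` is an assumption.

```python
from collections import ChainMap

# Assumption: "disabled" is modelled as False for flags and None for limits.
DEFAULTS = {
    "stop_on_loop": False,
    "stop_on_loop_min_repetitions": None,
    "max_llm_calls": None,
    "max_tool_calls": None,
    "max_events": None,
    "max_duration_s": None,
}


def resolve_guardrails(func_args=None, env=None, project_yaml=None, user_yaml=None):
    """Merge layers so that function args > env > project YAML > user YAML > defaults."""
    layers = [func_args, env, project_yaml, user_yaml]
    # Drop None values so an unset entry falls through to the next layer.
    cleaned = [
        {k: v for k, v in (layer or {}).items() if v is not None} for layer in layers
    ]
    # ChainMap looks keys up left to right, so earlier layers take priority.
    return dict(ChainMap(*cleaned, DEFAULTS))


resolved = resolve_guardrails(
    func_args={"max_llm_calls": 5},
    env={"max_llm_calls": 10, "max_tool_calls": 20},
    user_yaml={"max_tool_calls": 3, "max_events": 100},
)
print(resolved["max_llm_calls"], resolved["max_tool_calls"], resolved["max_events"])
```

In the demo, the function argument beats the environment value for max_llm_calls, the environment beats user YAML for max_tool_calls, and max_events comes from user YAML because no higher layer sets it.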
See Guardrails for usage examples and Configuration reference for all settings.
Live-refresh viewer
The UI supports automatic polling so you can start maida view once and re-run your agent without manually refreshing.
- Run list sidebar: polls GET /api/runs every 3 seconds (configurable via poll_runs URL param, 1-60s). New runs appear automatically; removed runs are cleared from the sidebar.
- Timeline: when the current run has status: "running", span data polls every 2 seconds from /api/runs/{trace_id}/spans (configurable via poll_events URL param, 1-60s). Polling stops when the run finishes.
- Visibility gating: polling pauses when the browser tab is not visible (Page Visibility API) and resumes when you switch back.
- Visual indicator: runs with status: "running" show a pulsing dot in the sidebar.
Integration architecture
Maida adapters are thin translation layers that hook into a framework's callbacks and record LLM/tool evidence into the active Maida run. They do not introduce new trace types.
| Integration | Module | Hook mechanism |
|---|---|---|
| LangChain / LangGraph | maida.integrations.langchain | Callback handler (on_llm_start/on_tool_start) |
| OpenAI Agents SDK | maida.integrations.openai_agents | Tracing processor (GenerationSpanData, FunctionSpanData, HandoffSpanData) |
| CrewAI | maida.integrations.crewai | Execution hooks (before/after_llm_call, before/after_tool_call) |
Integration lifecycle: maida._integration_utils provides _invoke_run_enter / _invoke_run_exit callbacks that adapters register with. This ensures adapters activate only when an explicit Maida run is active.
Guardrails with integrations: when a guardrail fires inside a framework callback, adapters raise _MaidaAbortSignal (a BaseException subclass) to bypass the framework's except Exception error handling and stop execution immediately. The lifecycle layer unwraps that signal so user code sees LoopAbort or GuardrailExceeded.
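The reason a BaseException subclass is needed can be shown with plain Python. In the sketch below, `DemoGuardrailExceeded` and `DemoAbortSignal` are stand-ins for the classes named above, and `framework_invoke` imitates a framework that wraps callbacks in `except Exception`; none of this is Maida's actual code.

```python
class DemoGuardrailExceeded(Exception):
    """Stand-in for the user-facing guardrail error."""


class DemoAbortSignal(BaseException):
    """Stand-in for the internal signal; carries the real error as .cause."""

    def __init__(self, cause):
        super().__init__(str(cause))
        self.cause = cause


def framework_invoke(callback):
    """Imitates a framework that swallows ordinary exceptions from callbacks."""
    try:
        callback()
    except Exception:
        return "swallowed"
    return "ok"


def plain_callback():
    raise DemoGuardrailExceeded("max_tool_calls")


def signalling_callback():
    raise DemoAbortSignal(DemoGuardrailExceeded("max_tool_calls"))


def run_with_lifecycle(callback):
    """Imitates the lifecycle layer: unwrap the signal into the user-facing error."""
    try:
        return framework_invoke(callback)
    except DemoAbortSignal as signal:
        raise signal.cause from None


# An ordinary Exception never reaches the caller.
swallowed = run_with_lifecycle(plain_callback)

# The BaseException subclass passes through `except Exception` and is unwrapped.
try:
    run_with_lifecycle(signalling_callback)
    surfaced = None
except DemoGuardrailExceeded as exc:
    surfaced = str(exc)

print(swallowed, surfaced)
```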
All integrations are optional dependencies; the core package does not depend on any framework. See Integrations for usage details.
Loop detection
- Input: A sliding window of the last N projected events (default N=12; MAIDA_LOOP_WINDOW).
- Signature: Each event is reduced to a string: for LLM_CALL -> "LLM_CALL:"+model, for TOOL_CALL -> "TOOL_CALL:"+tool_name, else event_type.
- Rule: Look for a contiguous block of signatures that repeats K times (default K=3; MAIDA_LOOP_REPETITIONS) at the end of the window. If found, emit one LOOP_WARNING per distinct pattern per run (deduplicated by pattern + repetitions).
- Payload: pattern (e.g. "LLM_CALL:gpt-4 -> TOOL_CALL:search"), repetitions, window_size, evidence_event_ids.
Detection uses no ML. It is purely pattern-based on event type and name, which gives quick feedback on repetitive agent behavior.
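The rule above can be sketched in a few lines. This is an illustration under assumptions, not Maida's detector: the event dict keys (`event_type`, `model`, `tool_name`), the function names, and the choice to try the shortest repeating block first are all invented, and `evidence_event_ids` is omitted from the payload.

```python
def signature(event):
    """Reduce an event to its signature string as described above."""
    etype = event["event_type"]
    if etype == "LLM_CALL":
        return "LLM_CALL:" + event["model"]
    if etype == "TOOL_CALL":
        return "TOOL_CALL:" + event["tool_name"]
    return etype


def find_loop(events, window=12, repetitions=3):
    """Return a payload if a block repeats `repetitions` times at the window's end."""
    sigs = [signature(e) for e in events[-window:]]
    n = len(sigs)
    # Try block sizes from shortest to longest (an assumption of this sketch).
    for size in range(1, n // repetitions + 1):
        block = sigs[n - size:]
        if all(
            sigs[n - (i + 1) * size: n - i * size] == block for i in range(repetitions)
        ):
            return {
                "pattern": " -> ".join(block),
                "repetitions": repetitions,
                "window_size": window,
            }
    return None


def warn_once(events, seen, window=12, repetitions=3):
    """Emit at most one warning per (pattern, repetitions) pair per run."""
    hit = find_loop(events, window, repetitions)
    if hit is None:
        return None
    key = (hit["pattern"], hit["repetitions"])
    if key in seen:
        return None
    seen.add(key)
    return hit


step = [
    {"event_type": "LLM_CALL", "model": "gpt-4"},
    {"event_type": "TOOL_CALL", "tool_name": "search"},
]
looping = step * 3
seen_patterns = set()
first = warn_once(looping, seen_patterns)
second = warn_once(looping, seen_patterns)
print(first, second)
```

The second call returns nothing because the same pattern and repetition count were already reported for this run, which mirrors the deduplication described in the rule.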