The Agent Run Timeline: Building a Replay UI
A timeline you can scrub. The web component, the data model, and the keyboard shortcuts that turn an opaque run into something a junior SRE can debug.
The UI shape
The replay UI is a horizontal timeline. Each step is a rectangle sized by its duration and coloured by its type (model call, tool call, or decision). Click a step to see its full payload, press the right arrow to jump to the next step, and press slash to search across steps. The run_id and the focused step are persisted in the URL, so sharing a link shares a specific moment in the run.
- Horizontal timeline. Each step as a rectangle; sized by duration; coloured by type.
- Click for payload. Full step detail revealed on click; the inspection surface.
- Right-arrow and slash navigation. Next step, search; the keyboard primitives.
- URL state persistence. run_id and focused step; share specific moments via link.
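The URL persistence described above can be sketched as a pair of pure functions. This is a minimal illustration, not the article's implementation: the query parameter name `step`, the `TimelineUrlState` shape, and both function names are assumptions; only `run_id` comes from the text.

```typescript
// Hypothetical shape of the state kept in the URL.
interface TimelineUrlState {
  runId: string;
  step: number | null; // index of the focused step, or null when nothing is focused
}

// Write the state into the query string of an existing URL.
function encodeState(base: string, state: TimelineUrlState): string {
  const url = new URL(base);
  url.searchParams.set("run_id", state.runId);
  if (state.step === null) {
    url.searchParams.delete("step");
  } else {
    url.searchParams.set("step", String(state.step));
  }
  return url.toString();
}

// Read the state back. Returns null when run_id is missing;
// a missing or non-numeric step falls back to "no focused step".
function decodeState(href: string): TimelineUrlState | null {
  const params = new URL(href).searchParams;
  const runId = params.get("run_id");
  if (!runId) return null;
  const raw = params.get("step");
  const step = raw !== null && /^\d+$/.test(raw) ? Number(raw) : null;
  return { runId, step };
}
```

In a browser, one option is to call `history.replaceState(null, "", encodeState(location.href, state))` whenever the focus changes, so scrubbing updates the address bar without adding a history entry for every step.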
Data model
The data model is a flat array of steps, each stored with start_ms, duration_ms, type, and payload; a flat array sorts and renders trivially. Decisions are first-class steps: “decided to call tool X with args Y” is a step the operator can scrub to. Errors are highlighted with a red bar so that tool failures, hallucination flags, and refusals surface visually.
- Flat array of steps. start_ms, duration_ms, type, payload; sorts and renders trivially.
- Decisions as steps. “Decided to call X with Y” is a first-class step the operator can scrub to.
- Red-bar errors. Tool failures, hallucination flags, refusals; surfaced visually.
- Per-step payload. Full context per step; supports deep investigation.
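As a rough sketch of how a flat array of steps could turn into timeline rectangles: the field names `start_ms`, `duration_ms`, `type`, and `payload` come from the text, while the `error` flag, the type strings, the colour values, the minimum width, and the `layout` function are assumptions made for illustration.

```typescript
type StepType = "model_call" | "tool_call" | "decision";

interface Step {
  start_ms: number;
  duration_ms: number;
  type: StepType;
  payload: unknown;
  error?: boolean; // assumed flag: tool failure, hallucination flag, or refusal
}

interface Rect {
  x: number;         // left edge in pixels, relative to the first step
  width: number;     // pixels, proportional to duration
  colour: string;    // fill chosen by step type
  hasError: boolean; // the renderer draws the red bar when this is true
}

// Hypothetical palette; any three distinguishable colours would do.
const COLOURS: Record<StepType, string> = {
  model_call: "#4c78a8",
  tool_call: "#f58518",
  decision: "#54a24b",
};

const MIN_WIDTH_PX = 2; // keep very short steps clickable

function layout(steps: Step[], pxPerMs: number): Rect[] {
  if (steps.length === 0) return [];
  // Sort a copy so the caller's array is left untouched.
  const sorted = [...steps].sort((a, b) => a.start_ms - b.start_ms);
  const origin = sorted[0].start_ms;
  return sorted.map((s) => ({
    x: (s.start_ms - origin) * pxPerMs,
    width: Math.max(MIN_WIDTH_PX, s.duration_ms * pxPerMs),
    colour: COLOURS[s.type],
    hasError: s.error === true,
  }));
}
```

Because a decision is just another element with `type: "decision"`, it gets a rectangle and a scrub position like any model or tool call, with no special casing in the layout.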
Performance considerations
Performance matters for long runs. A run with hundreds of steps needs virtualisation: render only the visible region and compute the rest on demand. Payload bodies can be large, so lazy-load them; the timeline list then stays fast even when individual payloads are heavy. Cache aggressively, because replay is read-only on a finished run.
- Virtualise long runs. Render only the visible region; the rest is computed on demand.
- Lazy-load payloads. Bodies can be large; timeline list stays fast.
- Aggressive caching. Read-only on finished run; cache the entire timeline structure.
- Per-run cache invalidation. Run id is the cache key; loading a different run never serves another run's cached entries.
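One way to sketch the virtualisation and lazy-loading ideas above is shown below. It assumes the steps are already sorted by `start_ms` and do not overlap in time; runs with parallel steps would need an interval structure instead. `visibleRange`, `PayloadCache`, and the `overscan` parameter are hypothetical names, not part of the article's code.

```typescript
interface Span {
  start_ms: number;
  duration_ms: number;
}

// Index of the first step whose start_ms is >= t (binary search lower bound).
function lowerBound(steps: Span[], t: number): number {
  let lo = 0;
  let hi = steps.length;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (steps[mid].start_ms < t) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

// Half-open [first, end) index range to render for the window [fromMs, toMs).
// One extra step on the left covers a step that straddles the window edge;
// overscan adds a few more on each side to soften scrolling.
function visibleRange(steps: Span[], fromMs: number, toMs: number, overscan = 2): [number, number] {
  const first = Math.max(0, lowerBound(steps, fromMs) - 1 - overscan);
  const end = Math.min(steps.length, lowerBound(steps, toMs) + overscan);
  return [first, end];
}

// Lazy payload loading with memoisation, keyed by run id and step index.
// A finished run is read-only, so entries never need refreshing.
// Note: a rejected promise stays cached here; real code would likely evict it.
class PayloadCache<T> {
  private readonly entries = new Map<string, Promise<T>>();
  private readonly load: (runId: string, index: number) => Promise<T>;

  constructor(load: (runId: string, index: number) => Promise<T>) {
    this.load = load;
  }

  get(runId: string, index: number): Promise<T> {
    const key = `${runId}:${index}`;
    let hit = this.entries.get(key);
    if (hit === undefined) {
      hit = this.load(runId, index);
      this.entries.set(key, hit);
    }
    return hit;
  }
}
```

With this split, the timeline only ever holds the lightweight `start_ms` and `duration_ms` fields for every step, and a payload is fetched the first time its step is opened.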
Keyboard shortcuts
Treat the UI like a developer tool. The right and left arrows move to the next and previous step, shift+arrow jumps 10 steps, and home and end go to the first and last step. Slash focuses search, and enter on a result jumps to that step. Esc closes the detail panel, and question mark shows the shortcut reference.
- Arrow navigation. Right/left for next/previous; shift+arrow jumps 10; home/end for first/last.
- Slash for search. Focus search; enter on result jumps to step.
- Esc closes detail. The standard escape; consistent with developer tools.
- Question mark for help. Shortcut reference visible; supports learnability.
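The shortcut table above maps naturally onto a small pure function from (state, key) to new state, which keeps the bindings testable without a DOM. The `ViewState` fields and the `onKey` name are assumptions for this sketch; the key strings are the standard `KeyboardEvent.key` values.

```typescript
// Hypothetical view state for the replay UI.
interface ViewState {
  focused: number; // index of the focused step
  detailOpen: boolean;
  searchFocused: boolean;
  helpOpen: boolean;
}

// The subset of KeyboardEvent this sketch needs.
interface KeyInput {
  key: string;
  shiftKey: boolean;
}

const SHIFT_JUMP = 10; // shift+arrow jumps 10 steps

function onKey(state: ViewState, ev: KeyInput, stepCount: number): ViewState {
  const last = Math.max(0, stepCount - 1);
  // Move the focus by delta, clamped to the valid index range.
  const move = (delta: number): ViewState => ({
    ...state,
    focused: Math.min(last, Math.max(0, state.focused + delta)),
  });
  const jump = ev.shiftKey ? SHIFT_JUMP : 1;

  switch (ev.key) {
    case "ArrowRight":
      return move(jump);
    case "ArrowLeft":
      return move(-jump);
    case "Home":
      return { ...state, focused: 0 };
    case "End":
      return { ...state, focused: last };
    case "/":
      return { ...state, searchFocused: true };
    case "Escape":
      return { ...state, detailOpen: false };
    case "?":
      return { ...state, helpOpen: true };
    default:
      return state; // unbound keys leave the state unchanged
  }
}
```

A real handler would probably skip this function while focus is inside the search input, so that typing a slash or question mark into the query does not trigger a shortcut.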
Export and share
Three export formats cover the main needs. JSON export gives the whole run as a downloadable file, useful for offline analysis and bug reports. Markdown export gives a narrative version with each step’s key info, useful for postmortem writeups. A direct link is a URL that opens the timeline at a specific step, useful for chat conversations about specific moments.
- JSON export. Whole run downloadable; offline analysis and bug reports.
- Markdown export. Narrative version with key info per step; postmortem writeups.
- Direct link. URL opens timeline at specific step; chat conversations.
- Per-format intended use. Each format targets one consumer: JSON for offline analysis, Markdown for postmortems, links for chat.
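A minimal sketch of the JSON and Markdown exporters follows. The article does not specify the Markdown layout or the JSON envelope, so the one-line-per-step format, the `{ run_id, steps }` wrapper, and the function names are all assumptions.

```typescript
interface ExportStep {
  start_ms: number;
  duration_ms: number;
  type: string;
  payload: unknown;
}

// Whole run as a pretty-printed JSON string, ready to offer as a download.
function toJson(runId: string, steps: ExportStep[]): string {
  return JSON.stringify({ run_id: runId, steps }, null, 2);
}

// Narrative version: one numbered line per step with its key info.
// Offsets are relative to the first step so the reader sees elapsed time.
function toMarkdown(runId: string, steps: ExportStep[]): string {
  const lines: string[] = [`# Run ${runId}`, ""];
  const origin = steps.length > 0 ? steps[0].start_ms : 0;
  steps.forEach((s, i) => {
    lines.push(`${i + 1}. **${s.type}** at +${s.start_ms - origin} ms, took ${s.duration_ms} ms`);
  });
  return lines.join("\n");
}
```

The Markdown version deliberately leaves payloads out: a postmortem reader mostly needs the sequence and timings, and the JSON export or a direct link covers the full detail.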