The Agent Audit Log: What Goes In, What Comes Out
Auditors will ask. The audit-log schema that satisfies SOC2, PCI, and your own future investigation, with retention policy and access-control notes.
What goes in
Three event types belong in the audit log: actions, approvals, and refusals. Every action records who (agent identity), what (action type and arguments), when (timestamp), why (a summarised reasoning trace), and where (environment and resource). Every approval records the proposal, the approver’s identity, the approval reason, and the timestamp. Every refusal records what the agent refused to do, why, and which guardrail triggered the refusal. Each event is a complete, replayable record.
- Action events. Who, what, when, why, where; the five-W audit row.
- Approval events. Proposal, approver, reason, timestamp; the human-decision trail.
- Refusal events. Refused action, reason, triggering guardrail; the safety-net record.
- Per-event replayability. Each event captures enough to reproduce the situation.
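The three event types above can be sketched as plain records. This is a minimal illustration, not a prescribed schema: the class names, field names, and the JSON-lines serialisation are assumptions chosen for the example.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


def _utc_now() -> str:
    """ISO 8601 timestamp in UTC."""
    return datetime.now(timezone.utc).isoformat()


@dataclass(frozen=True)
class ActionEvent:
    agent_id: str            # who: agent identity
    action_type: str         # what: action type
    args: dict               # what: arguments
    reasoning_summary: str   # why: summarised reasoning trace
    environment: str         # where: environment
    resource: str            # where: resource
    timestamp: str = field(default_factory=_utc_now)  # when
    event_type: str = "action"


@dataclass(frozen=True)
class ApprovalEvent:
    proposal: str
    approver_id: str
    approval_reason: str
    timestamp: str = field(default_factory=_utc_now)
    event_type: str = "approval"


@dataclass(frozen=True)
class RefusalEvent:
    refused_action: str
    reason: str
    guardrail: str           # which guardrail triggered the refusal
    timestamp: str = field(default_factory=_utc_now)
    event_type: str = "refusal"


def to_log_line(event) -> str:
    """Serialise one event as a single JSON line with stable key order."""
    return json.dumps(asdict(event), sort_keys=True)
```

Freezing the dataclasses means an event cannot be reassigned field by field after creation, which matches the idea that each record is written once and then only read.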
Retention policy
Retention is compliance-driven and tiered. SOC2 audits typically expect at least 1 year of logs, and many teams keep 3 years. PCI DSS requires at least 12 months of audit log history with the most recent three months immediately available, and other rules covering financial actions can require longer. Hot logs (90 days) are queryable from the dashboard, warm logs (1 year) via API, and cold logs (3 years) sit in object storage. Document the policy: “this log type is retained for X years per Y compliance framework” is a sentence that belongs in your audit binder.
- SOC2: 1 year minimum. Many teams keep 3 years; PCI DSS also sets a 12-month minimum, and financial actions may need longer.
- Hot 90 days. Queryable from the dashboard; the active surface.
- Warm 1 year. Queryable via API; the recent history.
- Cold 3 years. Object storage; compliance and deep history.
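As a rough sketch of the tiering above, the function below maps an event's age to a tier. The tier boundaries come from the text; the function name, the "expired" label, and treating a year as 365 days are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone

HOT_LIMIT = timedelta(days=90)        # queryable from the dashboard
WARM_LIMIT = timedelta(days=365)      # queryable via API
COLD_LIMIT = timedelta(days=3 * 365)  # object storage


def retention_tier(event_time: datetime, now: datetime) -> str:
    """Return 'hot', 'warm', 'cold', or 'expired' for an event's age."""
    age = now - event_time
    if age < timedelta(0):
        raise ValueError("event_time is in the future")
    if age <= HOT_LIMIT:
        return "hot"
    if age <= WARM_LIMIT:
        return "warm"
    if age <= COLD_LIMIT:
        return "cold"
    return "expired"
```

In practice the move between tiers is usually handled by the storage layer's lifecycle rules rather than application code; a function like this is mostly useful for routing a query to the right backend.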
Immutability
Immutability is the audit log’s defining property. Writes are append-only, which the log infrastructure (Loki, Cloud Logging, S3 with object lock) supports. Any modification would itself be logged in a meta-log, but in practice the original log line is never modified and corrections are appended as new entries. What satisfies an auditor is being able to verify that entries from 18 months ago are unaltered; object-store object locks plus checksums provide this.
- Append-only writes. Loki, Cloud Logging, S3 object lock; use the infrastructure.
- Corrections as new entries. Original line never modified; corrections append.
- Auditor verification. 18-month-old entry verified unaltered via object-lock plus checksums.
- Per-line cryptographic anchor. Checksums plus object-lock; supports forensic verification.
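One way to picture the per-line checksum is a small hash chain, sketched below with an in-memory list standing in for the real log store. Chaining each checksum to the previous entry goes slightly beyond what the text describes and is an assumption of this example, as are the function and field names. Object lock on the storage side is not modelled here.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash: str, payload: dict) -> str:
    """SHA-256 over the previous hash plus a canonical JSON form of the payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_entry(log: list, payload: dict) -> dict:
    """Append a new entry; existing entries are never rewritten."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    entry = {
        "payload": payload,
        "prev_hash": prev_hash,
        "hash": _digest(prev_hash, payload),
    }
    log.append(entry)
    return entry


def verify_log(log: list):
    """Return the index of the first altered entry, or None if the chain is intact."""
    prev_hash = GENESIS
    for index, entry in enumerate(log):
        expected = _digest(prev_hash, entry["payload"])
        if entry["prev_hash"] != prev_hash or entry["hash"] != expected:
            return index
        prev_hash = entry["hash"]
    return None
```

A correction is then just another call to `append_entry` whose payload points at the entry it corrects, for example `{"event_type": "correction", "corrects": 0, "note": "..."}`, leaving the original line untouched.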
Access control
Access control protects the log’s integrity. Read access is scoped by team: the team that owns the agent reads its logs, and cross-team reads require a documented purpose. Only the agent service itself has write access; even admins do not write directly. All reads are recorded in a separate append-only read-audit log, so “who looked at what audit data when” is itself audited.
- Read access by team. Team that owns agent reads its logs; cross-team needs documented purpose.
- No human write access. Even admins don’t write directly; only the agent service.
- Read-audit log separate. Append-only; tracks who read what audit data when.
- Per-access cryptographic check. Every access requires authenticated service-account credentials, so reads and writes are attributed correctly.
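The access rules above can be sketched as two small checks. This is an illustrative outline only: the names are hypothetical, authentication is assumed to have happened upstream, and recording denied attempts alongside successful reads is an assumption of this example.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class ReadRequest:
    reader: str                    # authenticated identity (assumed verified upstream)
    reader_team: str
    owning_team: str               # team that owns the agent whose logs are requested
    purpose: Optional[str] = None  # documented purpose for cross-team reads


def authorize_read(req: ReadRequest, read_audit_log: list) -> bool:
    """Allow same-team reads, or cross-team reads that state a purpose.

    Every attempt, allowed or not, is appended to the read-audit log.
    """
    same_team = req.reader_team == req.owning_team
    has_purpose = bool(req.purpose and req.purpose.strip())
    allowed = same_team or has_purpose
    read_audit_log.append({
        "reader": req.reader,
        "reader_team": req.reader_team,
        "owning_team": req.owning_team,
        "purpose": req.purpose,
        "allowed": allowed,
        "at": datetime.now(timezone.utc).isoformat(),
    })
    return allowed


def authorize_write(principal: str, agent_service_account: str) -> bool:
    """Only the agent service account may write; admins are not an exception."""
    return principal == agent_service_account
```

Passing the read-audit log in as an argument keeps the sketch testable; a real deployment would point it at its own append-only store, separate from the main audit log.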
Common audit queries
Three queries dominate audit usage. “All actions taken on resource R in the last 30 days” is the standard incident-investigation query. “All approvals by approver A” is the standard approval-pattern audit. “All refusals in the last week, grouped by reason” is the operational query that tells you what your guardrails are catching.
- Actions on resource R. 30 days; the incident-investigation query.
- Approvals by approver A. Approval-pattern audit; surfaces unusual patterns.
- Refusals grouped by reason. Operational query; tells you what guardrails caught.
- Queries stored as views. Commit each query as a saved view so it stays runnable and the team stays fluent in the audit data.
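The three queries can be sketched against an in-memory SQLite table using Python's standard `sqlite3` module. The table layout, column names, and view names are assumptions for illustration; a real audit log would likely live in a log store with its own query language.

```python
import sqlite3

SCHEMA = """
CREATE TABLE audit_events (
    id         INTEGER PRIMARY KEY,
    event_type TEXT NOT NULL,   -- 'action', 'approval', or 'refusal'
    actor      TEXT NOT NULL,   -- agent identity or approver identity
    resource   TEXT,
    reason     TEXT,
    ts         TEXT NOT NULL    -- UTC, 'YYYY-MM-DD HH:MM:SS'
);
CREATE VIEW recent_actions AS
    SELECT * FROM audit_events
    WHERE event_type = 'action' AND ts >= datetime('now', '-30 days');
CREATE VIEW approvals AS
    SELECT * FROM audit_events WHERE event_type = 'approval';
CREATE VIEW weekly_refusals_by_reason AS
    SELECT reason, COUNT(*) AS n FROM audit_events
    WHERE event_type = 'refusal' AND ts >= datetime('now', '-7 days')
    GROUP BY reason;
"""


def open_audit_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def record(conn, event_type, actor, resource, reason, age_days=0):
    """Insert a sample event timestamped age_days before now (demo helper)."""
    conn.execute(
        "INSERT INTO audit_events (event_type, actor, resource, reason, ts) "
        "VALUES (?, ?, ?, ?, datetime('now', ?))",
        (event_type, actor, resource, reason, f"-{age_days} days"),
    )


def actions_on_resource(conn, resource):
    """All actions on one resource in the last 30 days."""
    return conn.execute(
        "SELECT actor, ts FROM recent_actions WHERE resource = ? ORDER BY ts",
        (resource,),
    ).fetchall()


def approvals_by(conn, approver):
    """All approvals made by one approver."""
    return conn.execute(
        "SELECT resource, reason FROM approvals WHERE actor = ? ORDER BY ts",
        (approver,),
    ).fetchall()


def refusals_by_reason(conn):
    """Refusals in the last week, grouped by reason, most frequent first."""
    return conn.execute(
        "SELECT reason, n FROM weekly_refusals_by_reason ORDER BY n DESC, reason"
    ).fetchall()
```

Views cannot take parameters, so the resource and approver filters are applied on top of the views rather than baked into them.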