RESOURCE LIBRARY

Everything we've written for SREs.

Blog posts, guides, glossary, use cases, the product tour, cheat sheets, and the things we wish someone had handed us when we were on call. The library is growing as we publish; if you can't find something, ask.

Browse by format
Cheat sheets
Buyer's guides
Topic guides
Category

AI SRE & Agentic SRE

The two pillar guides to AI-driven reliability: one on AI SRE and one on Agentic SRE, the architecture for autonomous operations.

2 pillar guides
Detect & Resolve

AIOps & incident response

AIOps, AI incident response, incident management, root cause analysis, and self-healing infrastructure.

5 guides
Metrics & Practice

SRE metrics & practices

MTTR, SLOs, golden signals, alert fatigue, on-call, runbooks, postmortems, chaos engineering, and toil.

9 guides
Delivery & Platform

DevOps & platform engineering

DevOps, DevOps automation, platform engineering, CI/CD, infrastructure as code, SRE, capacity planning, and cloud cost optimization.

8 guides
AI Systems

Reliability for AI systems

For teams shipping AI in production: the AI engineer's guide to production reliability and LLMOps.

2 guides
Webinars & talks
We're not scheduling webinars yet. The first ones will land here when we do. In the meantime, the 60-second product tour walks through the core Nova workflow.
eBooks & analyst reports
No long-form eBooks or analyst placements yet. We'll publish a Founder's Guide to Agentic SRE this quarter. For something specific, email product@novaaiops.com.