SLA
Service-level agreement — a contractual promise of a target metric, often availability, with consequences for missing it.
An SLA (service-level agreement) is a written contract between a service provider and its customer that commits to a measurable level of service over a defined period, with defined consequences if the commitment is missed.
A typical SaaS SLA promises a monthly uptime target (commonly 99.9%, 99.95%, or 99.99%) and offers service credits, usually a percentage of the monthly fee, when actual uptime falls below that target. Read the SLA closely: most SLAs explicitly exclude scheduled maintenance, force majeure, customer-side issues, and many forms of partial degradation.
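To make the percentages concrete, the sketch below converts an uptime target into the downtime it permits and maps a measured uptime onto a service credit. The function names, the 30-day month, and the credit tiers are all assumptions made up for illustration; a real SLA defines its own measurement window, exclusions, and credit schedule.

```python
MINUTES_PER_DAY = 24 * 60


def allowed_downtime_minutes(target_pct: float, days: int = 30) -> float:
    """Minutes of downtime an uptime target permits over a window of `days` days."""
    return days * MINUTES_PER_DAY * (1 - target_pct / 100)


def measured_uptime_pct(downtime_minutes: float, days: int = 30) -> float:
    """Uptime percentage for a window of `days` days with the given downtime."""
    total = days * MINUTES_PER_DAY
    return 100 * (total - downtime_minutes) / total


# Hypothetical credit schedule: (uptime floor in %, credit as % of monthly fee).
# Tiers are checked from the highest floor down.
CREDIT_TIERS = [(99.9, 0), (99.0, 10), (95.0, 25)]
MAX_CREDIT_PCT = 50  # assumed credit when uptime falls below every tier


def service_credit_pct(uptime_pct: float) -> int:
    """Credit owed, as a percentage of the monthly fee, under the made-up tiers."""
    for floor, credit in CREDIT_TIERS:
        if uptime_pct >= floor:
            return credit
    return MAX_CREDIT_PCT


for target in (99.9, 99.95, 99.99):
    minutes = allowed_downtime_minutes(target)
    print(f"{target}% allows {minutes:.2f} minutes of downtime per 30-day month")

# Two hours of downtime in a 30-day month lands in the second tier.
uptime = measured_uptime_pct(120)
print(f"uptime {uptime:.3f}% gives a credit of {service_credit_pct(uptime)}%")
```

Under these assumptions a 99.9% target leaves roughly 43 minutes of downtime per month, and each additional nine cuts that allowance by a factor of ten.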
An SLA is a customer-facing artifact. The internal metric a team operates against is the SLO (service-level objective); the underlying measurement is the SLI (service-level indicator). Most teams set the SLO tighter than the SLA so that missing the SLO is a learning event, not a contract event.
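The relationship between the three terms can be sketched as a simple comparison: the SLI is the measured value, and it is checked against the SLO first and the SLA second. The function name, the status labels, and the default thresholds (an SLO of 99.95% sitting above an SLA of 99.9%) are hypothetical values chosen only for this example.

```python
def classify_month(sli_pct: float, slo_pct: float = 99.95, sla_pct: float = 99.9) -> str:
    """Compare a measured SLI against an internal SLO and a contractual SLA."""
    if slo_pct < sla_pct:
        # The sketch assumes the internal objective is at least as strict as the contract.
        raise ValueError("SLO should not be looser than the SLA")
    if sli_pct >= slo_pct:
        return "ok"          # internal objective met
    if sli_pct >= sla_pct:
        return "slo_miss"    # learning event: objective missed, contract still met
    return "sla_breach"      # contract event: consequences such as credits apply


for measured in (99.97, 99.92, 99.50):
    print(measured, classify_month(measured))
```

The gap between the two thresholds is what gives the team room to react before a miss becomes a contractual matter.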
uptime
The proportion of time a service is reachable and responding correctly, usually expressed as a percentage over a window.
five nines
99.999% availability, which allows about 5.26 minutes of downtime per year. Often cited as the aspirational target for critical infrastructure.
MTTR
Mean time to recovery — the average elapsed time from an incident’s start to its resolution. A core reliability KPI.
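As a rough illustration of the MTTR definition, the sketch below averages the start-to-resolution time over a list of incidents. The function name and the incident timestamps are invented for the example; in practice, teams also differ on what counts as an incident's start (first failure versus first alert), which changes the result.

```python
from datetime import datetime, timedelta


def mean_time_to_recovery(incidents):
    """Average of (resolved - started) over a list of (started, resolved) pairs."""
    if not incidents:
        raise ValueError("MTTR is undefined when there are no incidents")
    total = sum((resolved - started for started, resolved in incidents), timedelta())
    return total / len(incidents)


# Hypothetical incident log: (started, resolved)
incidents = [
    (datetime(2024, 1, 3, 9, 0), datetime(2024, 1, 3, 9, 30)),      # 30 minutes
    (datetime(2024, 1, 17, 14, 0), datetime(2024, 1, 17, 15, 30)),  # 90 minutes
    (datetime(2024, 2, 2, 22, 15), datetime(2024, 2, 2, 22, 30)),   # 15 minutes
]

mttr = mean_time_to_recovery(incidents)
print(f"MTTR: {mttr.total_seconds() / 60:.0f} minutes")
```

Because it is a mean, a single long outage can dominate the figure, so it is often read alongside the individual incident durations.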