downtime
Any period during which a service is unavailable or degraded for end users. The inverse of uptime.
Downtime is any period during which a service is unavailable, unreachable, or degraded for end users. It’s the complement of uptime: a service with 99.9% monthly uptime can be down for up to about 43 minutes 49 seconds in an average-length month (0.1% of roughly 30.44 days).
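The 99.9% figure can be checked with a little arithmetic. The sketch below is illustrative only: the function name `downtime_budget` and the choice of an average Gregorian month (365.2425 / 12 days) as the default window are assumptions, not something this glossary defines.

```python
from datetime import timedelta

# Assumption: an "average month" is 1/12 of a mean Gregorian year.
AVERAGE_MONTH = timedelta(days=365.2425 / 12)


def downtime_budget(uptime_percent: float,
                    window: timedelta = AVERAGE_MONTH) -> timedelta:
    """Return the maximum downtime allowed by an uptime percentage over a window."""
    if not 0.0 <= uptime_percent <= 100.0:
        raise ValueError("uptime_percent must be between 0 and 100")
    # Downtime is whatever share of the window is not uptime.
    return window * ((100.0 - uptime_percent) / 100.0)


if __name__ == "__main__":
    print(downtime_budget(99.9))                       # average-length month
    print(downtime_budget(99.9, timedelta(days=30)))   # fixed 30-day window
```

The result depends on the window: over a fixed 30-day window the same 99.9% target allows 43 minutes 12 seconds, so it is worth stating which window a percentage refers to.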
Operators distinguish between total downtime (the service returns errors or doesn’t respond at all) and partial downtime (some users or some features are affected — a single region, a single feature, a single dependency). Most monitoring tools report total downtime by default; surfacing partial downtime usually requires multi-region checks and per-feature health endpoints.
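One way to tell the two apart, sketched under assumptions: suppose each check (one per region, feature, or dependency) reports a simple pass/fail result. The `Status` enum, the `classify` function, and the check names below are hypothetical and do not describe any particular monitoring tool.

```python
from enum import Enum


class Status(Enum):
    UP = "up"
    PARTIAL = "partial downtime"
    TOTAL = "total downtime"


def classify(checks: dict[str, bool]) -> Status:
    """Classify service state from per-check results (True means the check passed)."""
    if not checks:
        raise ValueError("at least one check result is required")
    failures = sum(1 for ok in checks.values() if not ok)
    if failures == 0:
        return Status.UP
    if failures == len(checks):
        # Every region/feature check failed: nothing is reachable.
        return Status.TOTAL
    # Some checks pass and some fail: only part of the service is affected.
    return Status.PARTIAL


if __name__ == "__main__":
    # Hypothetical check names: "<region>/<feature>".
    results = {"us-east/login": True, "eu-west/login": False, "us-east/search": True}
    print(classify(results).value)
```

With a single check from a single location, every failure looks like total downtime and every partial failure elsewhere looks like uptime, which is why the distinction needs several vantage points.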
Downtime has a cost. The PingPane downtime cost calculator translates a dollar-per-hour revenue rate into an estimated cost per incident; it’s useful when justifying investment in better monitoring or on-call processes.
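As a rough illustration of that kind of estimate, the sketch below uses the simplest linear model: revenue rate multiplied by incident duration, optionally scaled by the share of users affected. This is an assumption for illustration, not the PingPane calculator's actual formula; `incident_cost` and `affected_fraction` are hypothetical names.

```python
from datetime import timedelta


def incident_cost(revenue_per_hour: float,
                  duration: timedelta,
                  affected_fraction: float = 1.0) -> float:
    """Estimate lost revenue for one incident with a simple linear model."""
    if revenue_per_hour < 0:
        raise ValueError("revenue_per_hour must be non-negative")
    if not 0.0 <= affected_fraction <= 1.0:
        raise ValueError("affected_fraction must be between 0 and 1")
    # Dividing two timedeltas yields a float: the duration in hours.
    hours = duration / timedelta(hours=1)
    return revenue_per_hour * hours * affected_fraction


if __name__ == "__main__":
    # Total downtime for 45 minutes at $12,000/hour.
    print(incident_cost(12_000, timedelta(minutes=45)))
    # Partial downtime: same incident, a quarter of users affected.
    print(incident_cost(12_000, timedelta(minutes=45), affected_fraction=0.25))
```

A linear model ignores effects such as deferred purchases, SLA credits, and support load, so the result is best read as an order-of-magnitude figure.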
uptime
The proportion of time a service is reachable and responding correctly, usually expressed as a percentage over a window.
incident
A discrete event during which a service was unavailable or degraded, with a defined start, updates, and resolution.
MTTR
Mean time to recovery — the average elapsed time from an incident’s start to its resolution. A core reliability KPI.