RTO vs RPO: differences, examples and how to actually measure them

RTO and RPO are the two numbers that define every DR project. We explain what they are, how to calculate them and why in practice they are almost always under-declared.

TL;DR

RTO is how long you can afford to be down. RPO is how much recent work, measured in time, you can afford to lose. Two different numbers, very different costs, almost always confused. Let us walk through worked examples for an 80-employee company.

The definitions you need

  • RTO (Recovery Time Objective) — the maximum acceptable time from disaster to return to operations.
  • RPO (Recovery Point Objective) — the maximum amount of data the business is willing to lose, expressed in time (usually minutes or hours).

Both come from a Business Impact Analysis (BIA), which estimates, for each critical process, the cost of being down (which drives RTO) and the cost of redoing lost work (which drives RPO).

Concrete examples

Take three typical processes at an SMB with ERP, mail and file server.

Case 1 — Invoicing ERP

  • Cost of downtime: ~€1,200/hour in lost revenue.
  • Cost of data loss: every unrecorded order must be re-entered manually. ~30 minutes per order.
  • Target numbers: RTO 30 minutes, RPO 15 minutes.
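
As a rough sketch of the arithmetic behind Case 1, the snippet below turns the target numbers into a worst-case exposure per incident. The €1,200/hour and the 30 minutes per order come from the case; the order rate (orders_per_hour) and the function names are hypothetical, chosen only for illustration.

```python
def downtime_cost(cost_per_hour: float, rto_minutes: float) -> float:
    """Lost revenue if the outage lasts exactly as long as the RTO allows."""
    return cost_per_hour * rto_minutes / 60


def reentry_minutes(orders_per_hour: float, rpo_minutes: float,
                    minutes_per_order: float) -> float:
    """Manual re-entry effort if every order inside the RPO window is lost."""
    lost_orders = orders_per_hour * rpo_minutes / 60
    return lost_orders * minutes_per_order


# Case 1 figures: EUR 1,200/hour, RTO 30 min, RPO 15 min, 30 min per order.
# orders_per_hour=8 is an assumed value, not taken from the case.
print(downtime_cost(1200, 30))     # 600.0 (EUR of lost revenue)
print(reentry_minutes(8, 15, 30))  # 60.0 (minutes of manual re-entry)
```

Running the same two functions per process gives a first, crude BIA table to argue about.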

Case 2 — Email

  • Cost of downtime: communication stall with customers. Hard to quantify but high.
  • Cost of data loss: every lost email is potential commercial risk.
  • Target numbers: RTO 15 minutes, RPO 5 minutes.

Case 3 — Document file server

  • Cost of downtime: medium, people can work offline.
  • Cost of data loss: high if signed documents are involved.
  • Target numbers: RTO 60 minutes, RPO 60 minutes.

Mistake #1: low RTO without the budget

It is easy to claim "RTO 5 minutes, RPO 1 minute" at sales time. In practice it costs 4-6× more than an RTO/RPO of 30/15 minutes. Getting from tens of minutes down to a few minutes or less requires:

  • continuous replication (CDP) instead of interval snapshots;
  • automatic failover, not manual;
  • dedicated link between sites (not the Internet).

The first step is understanding the true cost of downtime per process. Done properly, a 30-minute RTO is enough for 90% of SMBs.
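
To make that first step concrete, here is a minimal sketch of how much revenue a tighter RTO saves per incident, using the downtime cost from Case 1. It assumes every outage lasts the full RTO; the function name is hypothetical.

```python
def saving_per_incident(cost_per_hour: float, rto_from_min: float,
                        rto_to_min: float) -> float:
    """Revenue saved per incident by tightening the RTO.

    Assumes the outage lasts exactly as long as the RTO in both scenarios.
    """
    return cost_per_hour * (rto_from_min - rto_to_min) / 60


# Case 1 downtime cost (EUR 1,200/hour), moving from a 30-minute to a 5-minute RTO.
print(saving_per_incident(1200, 30, 5))  # 500.0
```

That saving only justifies a solution costing several times more if the number of incidents per year is high enough, which is exactly what the BIA should establish.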

Mistake #2: confusing RPO and backup frequency

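A backup every four hours does not mean an RPO of four hours. It means that, in the worst case, you lose up to four hours plus the time each backup takes to complete and reach the DR site: a failure just before the next copy lands leaves you with the previous one. On an average day the loss is smaller; on a bad day, with a slow or failed transfer, it is larger.

If your contract promises an RPO of 15 minutes, the backup interval plus the time to replicate each copy to the DR site must fit within 15 minutes: for example, a copy every 10 minutes that lands at the DR site within 5.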
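
A minimal sketch of the worst-case reasoning: if a failure hits just before the newest copy finishes arriving at the DR site, the only usable copy is the previous one, taken a full interval earlier. The function names and the 20-minute propagation time are assumptions for illustration.

```python
def worst_case_rpo(interval_min: float, propagation_min: float) -> float:
    """Worst-case data loss in minutes.

    The failure happens just before the latest copy lands at the DR site,
    so the usable copy is one interval older than the one in flight.
    """
    return interval_min + propagation_min


def meets_rpo(interval_min: float, propagation_min: float,
              target_rpo_min: float) -> bool:
    """True if the schedule can honour the target RPO even in the worst case."""
    return worst_case_rpo(interval_min, propagation_min) <= target_rpo_min


print(worst_case_rpo(240, 20))  # 260 (a 4-hour backup with 20 min propagation)
print(meets_rpo(15, 15, 15))    # False
print(meets_rpo(10, 5, 15))     # True
```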

How to measure real RTO and RPO

The honest way is a drill. A simple procedure:

  1. Schedule a 2-hour test window.
  2. Power down (in test) the target system.
  3. Run the DR runbook, timing every step.
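  4. Record the total time from power-down to the service being usable again (= real RTO).
  5. Check the timestamp of the most recent data present in the recovered system; the gap between it and the moment of power-down is the real RPO.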

Document it. If the numbers do not meet objectives, you have a documented problem (and an action plan).
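
The measurements from such a drill reduce to two time differences. The sketch below uses made-up timestamps and the Case 1 targets (RTO 30 minutes, RPO 15 minutes); the variable names are hypothetical and the timestamps would come from your own drill log.

```python
from datetime import datetime


def minutes_between(start: datetime, end: datetime) -> float:
    """Elapsed minutes from start to end."""
    return (end - start).total_seconds() / 60


# Hypothetical drill log entries.
powered_down = datetime(2024, 5, 14, 10, 0)           # step 2: system switched off
service_restored = datetime(2024, 5, 14, 10, 42)      # runbook finished, service usable
last_record_recovered = datetime(2024, 5, 14, 9, 48)  # newest data found after recovery

real_rto = minutes_between(powered_down, service_restored)       # downtime actually measured
real_rpo = minutes_between(last_record_recovered, powered_down)  # data window actually lost

targets = {"rto": 30, "rpo": 15}  # Case 1 objectives, in minutes
print(real_rto, real_rto <= targets["rto"])  # 42.0 False
print(real_rpo, real_rpo <= targets["rpo"])  # 12.0 True
```

In this invented run the RPO objective is met and the RTO objective is missed by 12 minutes, which is the kind of gap the action plan should address.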

The "zero RPO" trap

Zero RPO (synchronous replication) exists, but:

  • requires latency < 5 ms between sites;
  • requires dedicated bandwidth, not the Internet;
  • doubles infrastructure cost.

In Italy "zero RPO" is realistic between sites in the same datacentre or between nearby cities with dark fibre. For distributed cloud DR, the honest minimum is ~5-second RPO with block-based CDP.
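
A back-of-the-envelope sketch of why distance matters for synchronous replication: every write waits for the remote site to acknowledge it, so it pays at least one round trip. The snippet assumes light in optical fibre covers roughly 200 km per millisecond and treats the 5 ms figure as a round-trip budget, which the list above does not specify; the function name is hypothetical.

```python
FIBRE_KM_PER_MS = 200.0  # approximate propagation speed of light in optical fibre


def round_trip_ms(fibre_km: float) -> float:
    """Propagation-only round-trip time over a fibre path of the given length.

    Real links add switching, protocol and storage latency on top of this.
    """
    return 2 * fibre_km / FIBRE_KM_PER_MS


print(round_trip_ms(50))   # 0.5 (nearby cities)
print(round_trip_ms(600))  # 6.0 (already over a 5 ms budget before any equipment delay)
```

Fibre routes are usually longer than the straight-line distance, so the practical limit is tighter than this calculation suggests.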

FAQ

Should RTO and RPO be equal?

No. They are independent dimensions. A system can have a low RTO (boots fast) and a high RPO (loses some hours), e.g. a batch reporting system.

Who formally approves RTO and RPO?

Senior management, ideally with the CFO or COO signing. They are business numbers, not technical ones.

How long are defined RTO/RPO valid?

Typically 12-18 months. Review them sooner after process changes, M&A or the introduction of new systems.


For the RTO calculation method, read "How to calculate your real RTO". For snapshots vs continuous replication, see "Snapshot vs continuous replication".
