Updated on 17 Mar, 2026

Back-of-the-envelope calculations are rough, quick estimates used to gauge how much infrastructure, bandwidth, or storage your system might need before doing detailed planning.

Think of it as engineering intuition: you don't need precise numbers, just enough to make decisions and sanity-check feasibility.

It's basically:

“Can this system handle the load?”

“Roughly how big/fast/expensive will this be?”

We use simple math + assumptions to get a ballpark answer.

Why It's Important

When scaling systems to millions of users:

  • You need to know how many servers you might need.
  • You need to know how much bandwidth or storage to provision.
  • You need to know if your architecture can handle peak loads.

BoE estimations allow you to answer these questions quickly, without waiting for detailed load testing or simulations.

Power of Two

In computer systems, most resources – memory, storage, network buffers – are organized in powers of two:

1, 2, 4, 8, 16, 32, 64, 128,…

This is because computers are binary machines: memory and storage blocks are naturally allocated in sizes of 2^n.

Why Powers of Two Are Useful in BoE

  1. Rough estimation is easier
    1. Instead of calculating exact numbers, rounding to the nearest power of two gives a good approximation.
    2. Example: If you need 3,000 users per DB shard, estimate 4,096 users per shard -> simplifies planning.
  2. Memory alignment
    1. Most RAM and disk blocks are sized in powers of two (e.g., 4 KB pages).
    2. Estimating in powers of two avoids underestimating resource needs.
  3. Network and bandwidth calculations
    1. Buffers, packet sizes, and throughput often scale in powers of two.
  4. Server and cluster scaling
    1. Horizontal scaling often doubles capacity (2, 4, 8, 16 servers) -> fits naturally with powers of two.

Example:

Suppose your back-of-the-envelope calculation shows you need 7 web servers.

  • Using powers of two, you round up to 8 servers.
  • This makes load balancing and partitioning simpler, and keeps a spare for failover.
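The rounding rule above is easy to mechanize. A minimal Python sketch (the helper name `next_power_of_two` is ours):

```python
def next_power_of_two(n: int) -> int:
    """Smallest power of two that is >= n."""
    p = 1
    while p < n:
        p *= 2
    return p

print(next_power_of_two(3000))  # 4096 (users per shard)
print(next_power_of_two(7))     # 8 (web servers)
```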

Quick Reference of Powers of Two

| Power | Exact Value | Human-Friendly Size | Approximate Count |
| --- | --- | --- | --- |
| 2^0 | 1 | 1 byte | - |
| 2^10 | 1,024 | ~1 KB | Thousands |
| 2^20 | 1,048,576 | ~1 MB | Millions |
| 2^30 | 1,073,741,824 | ~1 GB | Billions |
| 2^40 | 1,099,511,627,776 | ~1 TB | Trillions |
| 2^50 | 1,125,899,906,842,624 | ~1 PB | Quadrillions |

Latency Numbers

“Latency is the time taken by a request to receive a response.”

For example: when you click a link on a website, it might take some time, say around 500 milliseconds in total, before the page starts loading. That 500 milliseconds is the latency.

Network latency = Request Time + Response Time.

Latency numbers are the mental cheat sheet for back-of-the-envelope calculations in system design. They help us quickly estimate how long operations take across different layers (CPU -> memory -> disk -> network).

The classic reference is from Jeff Dean (Google), often called:

“Numbers Every Programmer Should Know”

These are just approximate numbers:

Here, ns = nano second, µs = microsecond, ms = millisecond
1 ns = 10^-9 seconds
1 µs = 10^-6 seconds
1 ms = 10^-3 seconds = 1,000 µs = 1,000,000 ns

CPU & Memory

L1, L2 and L3

  • L1 cache access: ~1 ns
  • L2 cache: ~4 ns
  • L3 cache: ~10-20 ns

They are usually built onto the microprocessor chip.

Main memory (RAM): ~100 ns

It takes around 100 ns to read data from main memory. Redis is an in-memory data store, so a read inside Redis is a memory access (~100 ns), though any network round trip to the Redis server adds latency on top of that.

Accessing RAM is ~100x slower than L1 cache.

Storage:

  • SSD read: ~100 microseconds (0.1 ms)
  • HDD seek: ~5-10 milliseconds

An HDD seek is ~100,000x slower than a RAM access.

Network:

  • Same data center: ~0.5 ms
  • Cross-region (same continent): ~10-15 ms
  • Intercontinental: ~100-200 ms

Network latency dominates distributed systems.

System Operations:

  • Mutex lock/unlock: ~100 ns
  • Context switch: ~1-5 microseconds
  • System call: ~1 microsecond
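These numbers compose: the back-of-the-envelope latency of an operation is roughly the sum of its steps. A small sketch using the approximate figures above (the dictionary values are the rounded cheat-sheet numbers, not measurements):

```python
# Approximate latencies in nanoseconds, taken from the cheat sheet above
LATENCY_NS = {
    "l1_cache": 1,
    "ram": 100,
    "mutex": 100,
    "ssd_read": 100_000,            # ~100 µs
    "hdd_seek": 10_000_000,         # ~10 ms
    "same_dc_round_trip": 500_000,  # ~0.5 ms
    "cross_region": 15_000_000,     # ~15 ms
}

def estimate_ms(steps):
    """Sum the latencies of the given steps, returned in milliseconds."""
    return sum(LATENCY_NS[s] for s in steps) / 1_000_000

# Cache miss -> same-DC call to a cache server -> fall back to SSD:
print(estimate_ms(["ram", "same_dc_round_trip", "ssd_read"]))  # ~0.6 ms
```

Notice how the network hop dominates the total, which is the point of the mental model below.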

Mental Model

Think in orders of magnitude jumps:

Cache → RAM    → SSD     → Network
~1 ns → ~100 ns → ~100 µs → ~0.5-100 ms

Memory is nanoseconds, disk is microseconds, and the network is milliseconds.

Real World Analogy

  • L1 cache: reaching into your pocket
  • RAM: grabbing something from your desk
  • SSD: walking to another room
  • Network: flying to another country

Now it feels right:

  • CPU is instant
  • Disk is slow
  • Network is painfully slow

Queries Per Second (QPS)

The number of queries (read or write) a system (usually a database) receives per second.

It's one of the most important metrics in capacity planning because it tells you how much load your database or API needs to handle.

Read queries: Select

Write queries: Insert, Update, and Delete

QPS = Total Requests / Time (in seconds)

Example 1: From daily users

  • 10 million users/day, also known as DAU (Daily Active Users)
  • Each makes 10 requests/day

Total requests/day = 100 million

Now convert to seconds:

  • 1 day = 86,400 seconds
QPS = 100,000,000 / 86,400
    ≈ 1,157, i.e. roughly 1,200 QPS
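The same arithmetic as a tiny Python helper (the function name is ours):

```python
def average_qps(dau: int, requests_per_user_per_day: int) -> float:
    """Average queries per second implied by daily active users."""
    seconds_per_day = 24 * 3600  # 86,400
    return dau * requests_per_user_per_day / seconds_per_day

print(round(average_qps(10_000_000, 10)))  # 1157, i.e. roughly 1,200 QPS
```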

Example 2: Peak QPS (very important)

Traffic is NOT uniform.

Use a peak factor (2x-5x):

Peak QPS = Average QPS * peak factor (2 to 5)

So:

  • Avg QPS = 1,200
  • Peak ≈ 2,400-6,000 QPS
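In code, the peak factor is just a multiplier (a sketch; the 2x-5x range is a rule of thumb, not a law):

```python
def peak_qps(avg_qps: float, peak_factor: float = 2.0) -> float:
    """Rule-of-thumb peak load: average QPS times a 2x-5x factor."""
    return avg_qps * peak_factor

print(peak_qps(1200, 2))  # 2400.0
print(peak_qps(1200, 5))  # 6000.0
```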

Read vs Write QPS

Split traffic:

Example:

  • 1,000 QPS total
  • 80% reads, 20% writes

Read QPS = 800

Write QPS = 200
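Splitting by read ratio, sketched below (the 80/20 split is an assumption you would tune per workload):

```python
def split_qps(total_qps: float, read_ratio: float) -> tuple[float, float]:
    """Split total QPS into (read QPS, write QPS) by the read ratio."""
    reads = total_qps * read_ratio
    return reads, total_qps - reads

print(split_qps(1000, 0.8))  # (800.0, 200.0)
```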

Storage Units

When building systems that scale to millions of users, you need to estimate how much storage your data will occupy. To do that, it's essential to understand storage units and how they scale.

Common Storage Units

| Unit | Abbreviation | Size in Bytes |
| --- | --- | --- |
| Byte | B | 1 byte |
| Kilobyte | KB | 1 KB = 1,024 B (~10³ B) |
| Megabyte | MB | 1 MB = 1,024 KB (~10⁶ B) |
| Gigabyte | GB | 1 GB = 1,024 MB (~10⁹ B) |
| Terabyte | TB | 1 TB = 1,024 GB (~10¹² B) |
| Petabyte | PB | 1 PB = 1,024 TB (~10¹⁵ B) |
| Exabyte | EB | 1 EB = 1,024 PB (~10¹⁸ B) |
| Zettabyte | ZB | 1 ZB = 1,024 EB (~10²¹ B) |
| Yottabyte | YB | 1 YB = 1,024 ZB (~10²⁴ B) |

Basic formula

Storage = Number of items * Size per item

Then scale it over time:

Total Storage = Daily Storage * Number of days

Estimate data per action

Example:

  • One photo = 2 MB
  • One message = 1 KB
  • One video = 50 MB

Don't forget replication

Most systems store multiple copies:

  • 3 replicas (common)

    Actual Storage = Raw Storage * Replication Factor
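Putting the formulas together (a sketch; the photo size and 3x replication are the assumptions stated above):

```python
MB = 1024 ** 2
PB = 1024 ** 5

def total_storage_bytes(items_per_day: int, bytes_per_item: int,
                        days: int, replication: int = 3) -> int:
    """Raw storage over a period, multiplied by the replication factor."""
    return items_per_day * bytes_per_item * days * replication

# e.g. 1 million 2 MB photos per day, kept for a year, with 3 replicas:
yearly = total_storage_bytes(1_000_000, 2 * MB, 365)
print(f"{yearly / PB:.1f} PB")  # ~2 PB
```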

Back of the envelope estimation of Instagram

Assumptions

  1. Monthly Active users: ~2 Billion

    60% of these users use Instagram daily: ~1.2 Billion (DAU: Daily Active Users)

  2. A user views the feed (1 feed request = 1 query)
    or posts a photo/video/reel (1 post request = 1 query)
  3. A user checks the feed 30 times a day (average)
  4. Estimate for 5 years
  5. A user publishes 1 picture/reel a day (average)

Feed View QPS

Feed requests per second on Instagram:

Feed QPS = (DAU * 30 views) / 86,400 seconds
         = (1.2B * 30) / 86,400
         ≈ 420K/sec

Peak QPS

Peak QPS = 2 * Average QPS
         = 2 * ~420K
         ≈ 840K/sec

Upload QPS

Upload QPS = (1.2B * 1) / 86,400
           ≈ 14K/sec
Peak Upload QPS = 2 * ~14K
                ≈ 28K/sec
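The two QPS estimates above can be reproduced in a few lines (the constants are the assumptions from this section):

```python
DAU = 1_200_000_000          # daily active users (assumption above)
SECONDS_PER_DAY = 24 * 3600  # 86,400

feed_qps = DAU * 30 / SECONDS_PER_DAY   # 30 feed checks per user per day
upload_qps = DAU * 1 / SECONDS_PER_DAY  # 1 post per user per day

print(f"feed: ~{feed_qps:,.0f}/s, peak ~{feed_qps * 2:,.0f}/s")
print(f"upload: ~{upload_qps:,.0f}/s, peak ~{upload_qps * 2:,.0f}/s")
```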

Storage Estimate:

Assumptions:

  • 20% of users upload video
    • A single video is 50MB (Average)
  • 80% of users upload photo
    • A single photo is 1MB (Average)

Photos/day:

photos/day = 1.2B * 80% ≈ 1B
storage    = 1B photos * 1 MB
           = 2^30 * 2^20 bytes
           = 2^50 bytes
           ≈ 1 PB

Videos/day:

videos/day = 1.2B * 20% ≈ 0.25B
storage    = 0.25B videos * 50 MB
           = (0.25 * 2^30) * (50 * 2^20) bytes
           = 2^50 * 12.5 bytes
           ≈ 12 PB

Total:

= Photos/day + Videos/day
= 1PB + 12PB
= 13 PB

This much storage is needed for a single day of photos and videos.

Calculate for 5 years:

13 PB * 365 * 5 = 23,725 PB
≈ 24,000 PB = 24 * 10^3 PB
≈ 24 * 2^10 * 2^50 bytes
= 24 * 2^60 bytes
= 24 EB (Exabytes)
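A quick Python check of the same arithmetic, using the rounded ~1 PB and ~12 PB daily figures from above:

```python
daily_pb = 1 + 12               # photos (~1 PB/day) + videos (~12 PB/day)
five_year_pb = daily_pb * 365 * 5
print(five_year_pb)             # 23725 -> round to ~24,000 PB, i.e. ~24 EB
```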