Batching in System Design

Updated on 30 Jun, 2026 · 16 mins read · 266 views

When engineers think about scaling systems, they usually think about:

caching,
sharding,
load balancing,
queues,
replication,
distributed systems.

But one of the simplest and most powerful scalability techniques is often overlooked:

Batching

Batching appears everywhere in large-scale systems:

databases,
distributed queues,
stream processing,
APIs,
machine learning pipelines,
logging systems,
search indexing,
analytics systems.

When fixed per-operation overhead dominates the cost of each request, batching alone can improve throughput by 10x-100x.

What Is Batching?

Batching means:

Combining multiple operations into a single larger operation.

Instead of processing requests one by one:

Request 1 -> Process
Request 2 -> Process
Request 3 -> Process

We combine them:

[Request 1, Request 2, Request 3]
              ↓
  Single Processing Unit

The core idea:

amortize overhead,
reduce round trips,
improve resource utilization.

Why Batching Improves Performance

Most systems have fixed overhead per operation.

Examples:

network handshake,
TCP setup,
serialization,
disk seek,
database transaction setup,
thread scheduling,
lock acquisition.

If each request pays the full overhead independently, systems become inefficient.

Batching spreads overhead across many operations.

Simple Example: Database Writes

Imagine writing 1 million rows individually.

Without batching:

INSERT row1
INSERT row2
INSERT row3
...

Each write incurs:

network latency,
transaction cost,
logging overhead,
disk synchronization.

Now batch them:

INSERT INTO events VALUES
(...),
(...),
(...);

One transaction.

One network round trip.

One commit.

Massive throughput improvement.

This is why systems like:

Kafka consumers,
analytics pipelines,
ETL jobs,
search indexing systems

heavily rely on batching.
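The two write patterns above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not a benchmark: the `events` table layout, the function names, and the batch size of 1000 are assumptions made for the example, and the actual speedup depends on the database and its durability settings.

```python
import sqlite3

INSERT_SQL = "INSERT INTO events (id, payload) VALUES (?, ?)"

def insert_one_by_one(conn, rows):
    # One statement and one commit per row: every row pays the full overhead.
    for row in rows:
        conn.execute(INSERT_SQL, row)
        conn.commit()

def insert_batched(conn, rows, batch_size=1000):
    # One executemany call and one commit per batch of rows.
    for start in range(0, len(rows), batch_size):
        chunk = rows[start:start + batch_size]
        conn.executemany(INSERT_SQL, chunk)
        conn.commit()

def make_db():
    # Hypothetical schema used only for this sketch.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")
    return conn

rows = [(i, f"event-{i}") for i in range(2500)]

conn = make_db()
insert_batched(conn, rows)  # 3 commits instead of 2500
count = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
```

Both functions store the same rows; the batched version simply pays the commit cost once per chunk rather than once per row.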

The Core Tradeoff

Batching improves throughput.

But batching increases latency.

This is the fundamental tradeoff.

Without batching

Lower latency
Worse throughput

With batching

Higher throughput
Increased waiting time

This tradeoff appears everywhere in distributed systems.

Throughput vs Latency

Suppose processing one request costs:

5ms fixed overhead
1ms actual work

Processing 100 requests individually:

100 x (5 + 1) = 600 ms

Batching 100 requests:

Actual work = 1ms x 100 = 100ms
Fixed overhead = 5ms

Fixed overhead + actual work:
5ms + 100ms = 105ms

Huge throughput gain: roughly 5.7x less total time.

But:

the first request may wait while the batch fills.

This is why batching must be carefully tuned.
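The arithmetic above can be written as a small cost model. This is a sketch under the same simplifying assumption as the example: each batch pays the fixed overhead once, and the per-request work is unchanged by batching. The function name and parameters are made up for illustration.

```python
def total_time_ms(n_requests, batch_size, fixed_overhead_ms=5.0, work_ms=1.0):
    """Total processing time when each batch pays the fixed overhead once."""
    n_batches = -(-n_requests // batch_size)  # ceiling division
    return n_batches * fixed_overhead_ms + n_requests * work_ms

individual = total_time_ms(100, batch_size=1)    # 100 batches of one request
batched = total_time_ms(100, batch_size=100)     # one batch of 100 requests
medium = total_time_ms(100, batch_size=10)       # ten batches of 10 requests
```

Trying intermediate batch sizes, such as 10, shows that most of the gain arrives early: the overhead term shrinks quickly while the work term stays constant.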

Common Types of Batching

1 Time-Based Batching

Process requests every fixed interval.

Example:

Flush every 100ms

Common in:

logging systems,
metrics pipelines,
telemetry systems.

Advantages

predictable timing,
simple implementation.

Drawbacks

adds fixed latency,
inefficient during low traffic.

2 Size-Based Batching

Flush when batch reaches a threshold.

Example:

Flush after 1000 events

Common in:

Kafka producers,
bulk database inserts,
search indexing.

Advantages

high efficiency,
maximizes throughput.

Drawbacks

under low traffic, a partial batch may wait indefinitely.
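A minimal sketch of size-based batching over a finite input follows. The `batches` helper is a hypothetical name, not part of any library. Because the input here ends, the final partial batch is emitted when the iterator is exhausted; on a live stream there is no such end, which is exactly where the indefinite wait comes from.

```python
from itertools import islice

def batches(items, size):
    """Yield lists of up to `size` items; only the last batch may be smaller."""
    it = iter(items)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch

result = list(batches(range(7), 3))
# result is [[0, 1, 2], [3, 4, 5], [6]]
```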

3 Hybrid Batching

Most production systems use both:

Flush when either condition is met, whichever comes first:

– batch size reaches 1000

– 100 ms passes

This balances:

latency,
throughput.

Very common in real systems.
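One way to combine the two triggers is sketched below. The class and method names are hypothetical. The sketch assumes the timeout is measured from the arrival of the oldest buffered item, and that some external timer calls `poll()` periodically. The clock is injectable so the behavior can be checked without sleeping.

```python
import time

class HybridBatcher:
    """Flushes when the batch is full or the oldest item has waited too long."""

    def __init__(self, flush_fn, max_size=1000, max_wait_s=0.1,
                 clock=time.monotonic):
        self.flush_fn = flush_fn
        self.max_size = max_size
        self.max_wait_s = max_wait_s
        self.clock = clock
        self.buffer = []
        self.first_item_at = None

    def add(self, item):
        if not self.buffer:
            self.first_item_at = self.clock()  # start the wait timer
        self.buffer.append(item)
        if len(self.buffer) >= self.max_size:  # size trigger
            self.flush()

    def poll(self):
        # Call periodically (for example from a timer) to enforce the timeout.
        if self.buffer and self.clock() - self.first_item_at >= self.max_wait_s:
            self.flush()

    def flush(self):
        if self.buffer:
            batch, self.buffer = self.buffer, []
            self.flush_fn(batch)

# Usage with a fake clock so the example is deterministic.
flushed = []
now = [0.0]
batcher = HybridBatcher(flushed.append, max_size=3, max_wait_s=0.1,
                        clock=lambda: now[0])
for i in range(3):
    batcher.add(i)      # third add hits the size limit and flushes
batcher.add(3)          # starts a new, partial batch
now[0] = 0.2            # simulate 200 ms passing
batcher.poll()          # timeout trigger flushes the partial batch
```

The size limit bounds memory and caps batch cost under heavy traffic, while the timeout bounds how long any single item can wait under light traffic.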

Batching in APIs

Many APIs support batch operations.

Instead of:

GET /user/1
GET /user/2
GET /user/3

Use:

POST /users/batch
{
	"ids": [1, 2, 3]
}

Benefits:

fewer network calls,
lower total latency (one round trip instead of many),
reduced server load.

Google APIs, GraphQL, and internal microservices frequently use this pattern.
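A framework-free sketch of what a handler behind such a batch endpoint might do is shown below. Everything here is hypothetical: the in-memory `USERS` store, the `get_users_batch` function, the response shape, and the `MAX_BATCH` limit are assumptions for illustration. A cap on batch size is a common safeguard so that one request cannot ask for an unbounded amount of work.

```python
# Hypothetical in-memory user store standing in for a database.
USERS = {
    1: {"id": 1, "name": "Ada"},
    2: {"id": 2, "name": "Grace"},
    3: {"id": 3, "name": "Linus"},
}

MAX_BATCH = 100  # assumed limit on ids per request

def get_users_batch(body):
    """Handle a batch lookup; returns (status_code, response_body)."""
    ids = body.get("ids", [])
    if len(ids) > MAX_BATCH:
        return 400, {"error": f"at most {MAX_BATCH} ids per request"}
    # One pass over the ids instead of one request per id.
    users = [USERS[i] for i in ids if i in USERS]
    missing = [i for i in ids if i not in USERS]
    return 200, {"users": users, "missing": missing}

status, response = get_users_batch({"ids": [1, 2, 99]})
```

Reporting missing ids separately lets the caller distinguish a partial result from a failed request, which a single-item endpoint would signal with a 404.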

Real World Analogy

Delivery Truck

Let's consider a delivery truck example.

Instead of a courier delivering one package at a time:

Warehouse → House A
Warehouse → House B
Warehouse → House C

the company waits until the truck is reasonably full and then delivers many packages in one trip:

Warehouse → Multiple houses in one route

That is batching.

Why this helps

Without batching

More fuel
More trips
More time
More cost

With batching

Better efficiency
Lower overhead
Higher throughput

Exactly like systems:

fewer DB calls
fewer network requests
fewer disk writes

Tradeoff

If the company waits too long to fill the truck:

Higher efficiency
BUT slower delivery

This is the same tradeoff in system design:

Bigger Batch	Smaller Batch
Better throughput	Lower latency
More efficient	Faster response
More memory	Less memory

Washing Machine

You don't wash:

1 sock at a time

You wait for:

a full load

Efficient:

water
electricity
detergent

But waiting too long delays clean clothes.

Airport Shuttle

The shuttle leaves when either:

enough passengers arrive, or

the waiting time expires

This perfectly explains:

hybrid batching
timeout flushing

System equivalent:

Process when:
100 requests collected
OR
5 seconds elapsed
