When engineers think about scaling systems, they usually think about:
- caching,
- sharding,
- load balancing,
- queues,
- replication,
- distributed systems.
But one of the simplest and most powerful scalability techniques is often overlooked:
Batching
Batching appears everywhere in large-scale systems:
- databases,
- distributed queues,
- stream processing,
- APIs,
- machine learning pipelines,
- logging systems,
- search indexing,
- analytics systems.
When fixed per-operation overhead dominates the cost of each request, batching alone can improve throughput by 10x-100x.
What Is Batching?
Batching means:
Combining multiple operations into a single larger operation.
Instead of processing requests one by one:
Request 1 -> Process
Request 2 -> Process
Request 3 -> Process
We combine them:
[Request 1, Request 2, Request 3]
↓
Single Processing Unit
The core idea:
- amortize overhead,
- reduce round trips,
- improve resource utilization.
Why Batching Improves Performance
Most systems have fixed overhead per operation.
Examples:
- network handshake,
- TCP setup,
- serialization,
- disk seek,
- database transaction setup,
- thread scheduling,
- lock acquisition.
If each request pays the full overhead independently, systems become inefficient.
Batching spreads overhead across many operations.
Simple Example: Database Writes
Imagine writing 1 million rows individually.
Without batch:
INSERT row1
INSERT row2
INSERT row3
...
Each write incurs:
- network latency,
- transaction cost,
- logging overhead,
- disk synchronization.
Now batch them:
INSERT INTO events VALUES
(...),
(...),
(...);
One transaction.
One network round trip.
One commit.
Massive throughput improvement.
This is why systems like:
- Kafka consumers,
- analytics pipelines,
- ETL jobs,
- search indexing systems
heavily rely on batching.
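As a rough illustration, here is a minimal sketch of the two write patterns using Python's built-in sqlite3 module. The events table, its columns, and the helper names are assumptions made for this example. Note that executemany still runs the statement once per row; what is batched here is the transaction and the commit. An in-memory database also has no network round trips, so this shows the shape of the pattern rather than the full gain.

```python
import sqlite3

INSERT_SQL = "INSERT INTO events (id, payload) VALUES (?, ?)"

def make_conn():
    # Hypothetical schema, used only for this illustration.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")
    return conn

def insert_individually(conn, rows):
    # Unbatched: every row pays for its own transaction and commit.
    for row in rows:
        conn.execute(INSERT_SQL, row)
        conn.commit()

def insert_batched(conn, rows):
    # Batched: all rows share one transaction; "with conn" commits once on success.
    with conn:
        conn.executemany(INSERT_SQL, rows)

def count_rows(conn):
    return conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]

rows = [(i, f"event-{i}") for i in range(1000)]

slow_conn = make_conn()
insert_individually(slow_conn, rows)

fast_conn = make_conn()
insert_batched(fast_conn, rows)
```

Both versions store the same 1000 rows. On a disk-backed or remote database, the batched version typically finishes much sooner because it commits once instead of 1000 times.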
The Core Tradeoff
Batching improves throughput.
But batching increases latency.
This is the fundamental tradeoff.
Without batching
- Lower latency
- Worse throughput
With batching
- Higher throughput
- Increased waiting time
This tradeoff appears everywhere in distributed systems.
Throughput vs Latency
Suppose processing one request costs:
5ms fixed overhead
1ms actual work
Processing 100 requests individually:
100 x (5 + 1) = 600ms
Batching 100 requests:
fixed overhead = 5ms
actual work = 1ms x 100 = 100ms
fixed overhead + actual work = 5ms + 100ms = 105ms
Huge throughput gain: roughly 5.7x.
But:
- the first request may wait while the batch fills.
This is why batching must be carefully tuned.
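The arithmetic above can be turned into a small cost model for trying other batch sizes. This is a simplified sketch: the function name and parameters are made up for illustration, and it assumes the fixed overhead is paid exactly once per batch.

```python
def total_time_ms(n_requests, batch_size,
                  fixed_overhead_ms=5.0, work_per_request_ms=1.0):
    """Total processing time when requests are grouped into batches."""
    # Ceiling division: the last batch may be only partially full.
    n_batches = -(-n_requests // batch_size)
    return n_batches * fixed_overhead_ms + n_requests * work_per_request_ms

unbatched = total_time_ms(100, batch_size=1)    # 100 batches of 1 request
batched = total_time_ms(100, batch_size=100)    # 1 batch of 100 requests
speedup = unbatched / batched
```

In this model the gain shrinks as the fixed overhead gets smaller relative to the actual work, which is one reason batch sizes need tuning per workload.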
Common Types of Batching
1 Time-Based Batching
Process requests every fixed interval.
Example:
Flush every 100ms
Common in:
- logging systems,
- metrics pipelines,
- telemetry systems.
Advantages
- predictable timing,
- simple implementation.
Drawbacks
- adds fixed latency,
- inefficient during low traffic.
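A time-based batcher might look roughly like the sketch below. The TimeBatcher class and its flush_fn callback are hypothetical names. The current time is passed in explicitly so the behavior is easy to follow; a real system would more likely drive tick from a timer.

```python
class TimeBatcher:
    """Flushes whatever is buffered once interval_ms has passed since the last flush."""

    def __init__(self, interval_ms, flush_fn):
        self.interval_ms = interval_ms
        self.flush_fn = flush_fn
        self.buffer = []
        self.last_flush_ms = 0

    def add(self, item, now_ms):
        self.buffer.append(item)
        self.tick(now_ms)

    def tick(self, now_ms):
        # Nothing to do until the interval has elapsed.
        if now_ms - self.last_flush_ms < self.interval_ms:
            return
        self.last_flush_ms = now_ms
        if self.buffer:
            self.flush_fn(self.buffer)
            self.buffer = []

flushed = []
batcher = TimeBatcher(interval_ms=100, flush_fn=flushed.append)
batcher.add("a", now_ms=0)
batcher.add("b", now_ms=50)
batcher.add("c", now_ms=120)   # interval elapsed: a, b, c go out together
batcher.add("d", now_ms=150)
batcher.tick(now_ms=250)       # interval elapsed again: d goes out alone
```

The second flush carries a single item, which is the low-traffic inefficiency listed above.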
2 Size-Based Batching
Flush when batch reaches a threshold.
Example:
Flush after 1000 events
Common in:
- Kafka producers,
- bulk database inserts,
- search indexing.
Advantages
- high efficiency,
- maximizes throughput.
Drawbacks
- under low traffic, a partial batch may wait indefinitely.
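For comparison, here is a minimal size-based sketch. SizeBatcher and flush_fn are hypothetical names, and the threshold is set to 3 instead of 1000 to keep the example readable.

```python
class SizeBatcher:
    """Flushes only when the buffer reaches max_size items."""

    def __init__(self, max_size, flush_fn):
        self.max_size = max_size
        self.flush_fn = flush_fn
        self.buffer = []

    def add(self, item):
        self.buffer.append(item)
        if len(self.buffer) >= self.max_size:
            self.flush_fn(self.buffer)
            self.buffer = []

flushed = []
batcher = SizeBatcher(max_size=3, flush_fn=flushed.append)
for event in range(7):
    batcher.add(event)

# Two full batches were flushed; the seventh event is still waiting.
pending = batcher.buffer
```

The leftover event stays in the buffer until two more arrive, which is the indefinite wait described in the drawback.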
3 Hybrid Batching
Most production systems use both:
Flush when:
– batch size reaches 1000
OR
– 100 ms passes
This balances:
- latency,
- throughput.
Very common in real systems.
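A hybrid policy could be sketched as follows, again with hypothetical names (HybridBatcher, flush_fn) and an explicit clock. This version measures the wait from the arrival of the oldest buffered item, which is one reasonable choice among several.

```python
class HybridBatcher:
    """Flushes when the batch is full OR the oldest item has waited max_wait_ms."""

    def __init__(self, max_size, max_wait_ms, flush_fn):
        self.max_size = max_size
        self.max_wait_ms = max_wait_ms
        self.flush_fn = flush_fn
        self.buffer = []
        self.first_item_at_ms = None  # arrival time of the oldest buffered item

    def add(self, item, now_ms):
        if not self.buffer:
            self.first_item_at_ms = now_ms
        self.buffer.append(item)
        self.tick(now_ms)

    def tick(self, now_ms):
        if not self.buffer:
            return
        full = len(self.buffer) >= self.max_size
        expired = now_ms - self.first_item_at_ms >= self.max_wait_ms
        if full or expired:
            self.flush_fn(self.buffer)
            self.buffer = []
            self.first_item_at_ms = None

flushed = []
batcher = HybridBatcher(max_size=3, max_wait_ms=100, flush_fn=flushed.append)
batcher.add("a", now_ms=0)
batcher.add("b", now_ms=10)
batcher.add("c", now_ms=20)    # size limit reached: flush a, b, c
batcher.add("d", now_ms=30)
batcher.tick(now_ms=130)       # d has waited 100 ms: flush it alone
```

The size limit keeps throughput high under load, while the timeout caps how long any single item can wait when traffic is low.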
Batching in APIs
Many APIs support batch operations.
Instead of:
GET /user/1
GET /user/2
GET /user/3
Use:
POST /users/batch
{
  "ids": [1, 2, 3]
}
Benefits:
- fewer network calls,
- lower total latency (one round trip instead of three),
- reduced server load.
Google APIs, GraphQL, and internal microservices frequently use this pattern.
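To make the round-trip saving concrete, the sketch below models the two access patterns with an in-memory stand-in for the server. FakeUserService and its methods are invented for this example and do not correspond to any real API; the counter simply records how many calls a client would make.

```python
class FakeUserService:
    """Stand-in for a remote API; counts round trips instead of doing network I/O."""

    def __init__(self, users):
        self.users = users
        self.round_trips = 0

    def get_user(self, user_id):
        # Models GET /user/{id}: one round trip per user.
        self.round_trips += 1
        return self.users[user_id]

    def get_users_batch(self, ids):
        # Models POST /users/batch: one round trip for all ids.
        self.round_trips += 1
        return [self.users[i] for i in ids]

service = FakeUserService({1: "ada", 2: "bob", 3: "cy"})

one_by_one = [service.get_user(i) for i in (1, 2, 3)]
trips_unbatched = service.round_trips

service.round_trips = 0
batched = service.get_users_batch([1, 2, 3])
trips_batched = service.round_trips
```

Both calls return the same users, but the batch endpoint does so in one round trip instead of three.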
Real World Analogy
Delivery Truck
Consider a delivery truck.
Instead of a courier delivering one package at a time:
Warehouse → House A
Warehouse → House B
Warehouse → House C
the company waits until the truck is reasonably full and then delivers many packages in one trip:
Warehouse → Multiple houses in one route
That is batching.
Why this helps
Without batching
- More fuel
- More trips
- More time
- More cost
With batching
- Better efficiency
- Lower overhead
- Higher throughput
Exactly like systems:
- fewer DB calls
- fewer network requests
- fewer disk writes
Tradeoff
If the company waits too long to fill the truck:
Higher efficiency
BUT slower delivery
This is the same tradeoff in system design:
| Bigger Batch | Smaller Batch |
|---|---|
| Better throughput | Lower latency |
| More efficient | Faster response |
| More memory | Less memory |
Washing Machine
You don't wash:
- 1 sock at a time
You wait for:
- a full load
Efficient:
- water
- electricity
- detergent
But waiting too long delays clean clothes.
Airport Shuttle
Shuttle leaves when:
- enough passengers arrive
OR
- waiting time expires
This perfectly explains:
- hybrid batching
- timeout flushing
System equivalent:
Process when:
100 requests collected
OR
5 seconds elapsed