WebRTC Is Not a Protocol
Many beginners think:
HTTP → Protocol
TCP → Protocol
UDP → Protocol
WebRTC → Protocol
Wrong.
WebRTC is not a protocol.
WebRTC is a framework composed of many protocols and technologies.
Think of it like a car.
When someone says:
Car
they are not referring to one component.
A car contains:
Engine
Transmission
Brakes
Fuel System
Steering
Electronics
working together.
Similarly, WebRTC contains:
Media Capture
Peer Connectivity
NAT Traversal
Media Transport
Encryption
Congestion Control
working together.
When we say:
WebRTC
we are referring to an entire communication platform.
The Core Problem WebRTC Solves
Before looking at architecture, let's revisit the goal.
Alice wants to communicate with Bob.
The communication may involve:
Audio
Video
Data
Screen Sharing
Files
The system must:
- Capture media
- Discover peers
- Traverse NAT
- Establish security
- Transport media
- Adapt to network conditions
Every component inside WebRTC exists because one of these requirements exists.
A Bird's-Eye View of the Architecture
At the highest level, a WebRTC application looks like this:
+----------------------------------+
|        Application Layer         |
|    React / Angular / Vue / JS    |
+----------------------------------+
+----------------------------------+
|           WebRTC APIs            |
+----------------------------------+
+----------------------------------+
|          WebRTC Engine           |
+----------------------------------+
+----------------------------------+
|          Network Layer           |
+----------------------------------+
+----------------------------------+
|             Internet             |
+----------------------------------+
Let's understand each layer.
Layer 1: Application Layer
This is our code.
Examples:
Google Meet
Zoom Web
Discord
Custom Application
This layer decides:
- when a call starts
- who joins a room
- which camera to use
- which microphone to use
For example:
joinMeeting()
leaveMeeting()
muteMicrophone()
All of this belongs to the application.
Notice something important:
The application itself does not handle:
- RTP packets
- NAT traversal
- codecs
- encryption
WebRTC handles those.
Layer 2: WebRTC APIs
The browser exposes APIs.
These APIs allow applications to interact with the WebRTC engine.
The three most important APIs are:
MediaStream
RTCPeerConnection
RTCDataChannel
Everything in WebRTC revolves around these three concepts.
Think of them as the public interface to the communication engine.
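Since these three APIs are the entry point to everything else, an application may want to confirm they exist before offering a call. Here is a minimal sketch of such a check. The helper name detectWebRtcSupport and the shape of the object it returns are assumptions made for this example; only MediaStream, RTCPeerConnection, and RTCDataChannel are real browser globals.

```javascript
// Hypothetical helper: reports which of the three core WebRTC
// constructors exist on a given global object (for example `window`).
function detectWebRtcSupport(globalObject) {
  const has = (name) => typeof globalObject[name] === "function";

  const support = {
    mediaStream: has("MediaStream"),
    peerConnection: has("RTCPeerConnection"),
    dataChannel: has("RTCDataChannel"),
  };

  // Only offer calls when all three APIs are available.
  support.complete =
    support.mediaStream && support.peerConnection && support.dataChannel;

  return support;
}

// In a browser: detectWebRtcSupport(window)
```

Passing the global object in as a parameter keeps the check easy to exercise outside a browser.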
Layer 3: WebRTC Engine
This is where the magic happens.
Inside the browser exists a sophisticated communication stack.
Most developers never see it.
But it is doing enormous amounts of work.
The WebRTC engine contains:
Media Engine
ICE Engine
DTLS Engine
SRTP Engine
Codec Engine
Congestion Controller
Network Monitor
This layer is responsible for solving the hard problems.
Layer 4: Network Layer
Eventually all communication becomes packets.
Those packets travel using:
UDP
TCP
IP
and eventually across the Internet.
Understanding the Three Primary APIs
MediaStream
Imagine you turn on your camera.
The browser must represent that media somehow.
The representation is:
MediaStream
Think of a MediaStream as:
A container of media sources.
Example:
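// Ask the user for camera and microphone access.
// (await must run inside an async function or a module.)
const stream = await navigator.mediaDevices.getUserMedia({
  video: true,
  audio: true
});
Once the user grants permission, the browser returns a MediaStream.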
Understanding the Stream Concept
A MediaStream acts like a collection of tracks.
Think:
MediaStream
|
+---- Video Track
|
+---- Audio Track
The actual media originates from tracks.
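To see this collection for yourself, you can list the tracks inside a stream. The sketch below assumes a helper named describeTracks, which is made up for this example; getTracks(), kind, label, and enabled are standard members of MediaStream and MediaStreamTrack.

```javascript
// Hypothetical helper: summarizes every track inside a MediaStream.
function describeTracks(stream) {
  return stream.getTracks().map((track) => ({
    kind: track.kind,       // "audio" or "video"
    label: track.label,     // device name reported by the browser
    enabled: track.enabled, // false means the track outputs silence or black frames
  }));
}

// In a browser, after getUserMedia():
//   console.table(describeTracks(stream));
```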
MediaStreamTrack
A MediaStreamTrack represents a single source of media.
Examples:
Camera
Microphone
Screen Share
Every media source becomes a track.
MediaStream
|
+---- Camera Track
|
+---- Microphone Track
Why Tracks Exist
Suppose you are in a meeting.
You click:
Mute Microphone
What happens?
The video should continue.
Only audio should stop.
This is possible because audio and video are independent tracks.
You can disable one without affecting the other.
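A mute button can be sketched in a few lines. The function name setMicrophoneMuted is an assumption for this example; getAudioTracks() and the enabled property are part of the standard MediaStream and MediaStreamTrack interfaces.

```javascript
// Hypothetical helper: mutes or unmutes only the audio tracks of a stream.
// Video tracks are never touched, so the camera keeps running.
function setMicrophoneMuted(stream, muted) {
  const audioTracks = stream.getAudioTracks();
  for (const track of audioTracks) {
    // A disabled audio track keeps existing but produces silence.
    track.enabled = !muted;
  }
  return audioTracks.length; // number of tracks that were updated
}

// In a browser: setMicrophoneMuted(stream, true);
```

Setting enabled to false leaves the track in place, so unmuting later does not require asking for the microphone again.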
Example: Screen Sharing
Suppose you start sharing your screen.
Now your stream may look like:
MediaStream
|
+---- Screen Track
or:
MediaStream
|
+---- Camera Track
|
+---- Screen Track
|
+---- Audio Track
The architecture remains consistent.
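Because a screen share is just another track, adding it to an existing stream is a small operation. This is a rough sketch: the helper addScreenTrack is hypothetical, while getVideoTracks(), addTrack(), and navigator.mediaDevices.getDisplayMedia() are standard browser APIs.

```javascript
// Hypothetical helper: copies the screen video track from one stream
// into another stream that already holds camera and microphone tracks.
function addScreenTrack(targetStream, screenStream) {
  const [screenTrack] = screenStream.getVideoTracks();
  if (!screenTrack) {
    return null; // the user shared nothing, so there is no track to add
  }
  targetStream.addTrack(screenTrack);
  return screenTrack;
}

// Browser usage (inside an async function, triggered by a user gesture):
//   const screen = await navigator.mediaDevices.getDisplayMedia({ video: true });
//   addScreenTrack(cameraStream, screen);
```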
RTCPeerConnection
Everything eventually revolves around:
new RTCPeerConnection()
Most beginners think:
RTCPeerConnection is a connection
This is technically true.
But it's much more useful to think:
RTCPeerConnection is a communication engine.
Because internally it contains numerous subsystems.
What Problems Must RTCPeerConnection Solve?
The connection system must:
Find Routes
Cross NAT
Encrypt Traffic
Send Media
Monitor Quality
Handle Packet Loss
Adapt Bitrate
That's a lot of work.
RTCPeerConnection orchestrates all of it.
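From the application's side, handing media to this engine takes very little code. The sketch below assumes a helper called attachStream, which is invented for this example; addTrack(track, stream) is the standard RTCPeerConnection method and returns an RTCRtpSender for each track.

```javascript
// Hypothetical helper: gives every track of a stream to a peer connection.
// The connection then takes over encoding, encryption, and transport.
function attachStream(peerConnection, stream) {
  return stream
    .getTracks()
    .map((track) => peerConnection.addTrack(track, stream));
}

// In a browser:
//   const pc = new RTCPeerConnection();
//   const senders = attachStream(pc, stream);
```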
A Conceptual View
Think of it like:
RTCPeerConnection
|
+---- ICE
|
+---- STUN
|
+---- TURN
|
+---- DTLS
|
+---- SRTP
|
+---- RTP
|
+---- Congestion Control
The browser hides this complexity behind one API.
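The only part of this machinery the application usually configures is the list of STUN and TURN servers. Here is a minimal sketch of building that configuration. The helper buildRtcConfiguration and the server addresses are assumptions for illustration; the iceServers array with urls, username, and credential fields is the standard shape accepted by the RTCPeerConnection constructor.

```javascript
// Hypothetical helper: builds the configuration object passed to
// `new RTCPeerConnection(config)`.
function buildRtcConfiguration({ stunUrls = [], turn = null } = {}) {
  const iceServers = [];

  if (stunUrls.length > 0) {
    iceServers.push({ urls: stunUrls });
  }

  if (turn) {
    // TURN servers normally require credentials.
    iceServers.push({
      urls: turn.urls,
      username: turn.username,
      credential: turn.credential,
    });
  }

  return { iceServers };
}

// Browser usage (server addresses are placeholders):
//   const pc = new RTCPeerConnection(
//     buildRtcConfiguration({ stunUrls: ["stun:stun.example.org:3478"] })
//   );
```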
The ICE Engine
One subsystem inside RTCPeerConnection is ICE.
Recall our networking problems.
Private IPs
NAT
Firewalls
ICE exists to discover usable network paths.
It asks:
Can we connect directly?
Should we use TURN?
Which address works?
Which route is fastest?
ICE is effectively the networking brain of WebRTC.
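Each possible path shows up as an ICE candidate, and the candidate's type tells you what kind of path it is. The sketch below reads that type from a candidate line. The helper classifyCandidate and the wording of the descriptions are assumptions for this example; the typ field and the values host, srflx, prflx, and relay come from the ICE candidate format.

```javascript
// Plain-language meaning of each ICE candidate type (wording is ours).
const CANDIDATE_TYPES = new Map([
  ["host", "local address on this machine"],
  ["srflx", "public address learned via STUN"],
  ["prflx", "address discovered during connectivity checks"],
  ["relay", "address on a TURN relay"],
]);

// Hypothetical helper: extracts the `typ` field from a candidate line.
function classifyCandidate(candidateLine) {
  const match = /\btyp\s+(\w+)/.exec(candidateLine);
  if (!match) {
    return null; // not a recognizable candidate line
  }
  const type = match[1];
  return { type, meaning: CANDIDATE_TYPES.get(type) || "unknown" };
}

// In a browser:
//   pc.onicecandidate = (event) => {
//     if (event.candidate) {
//       console.log(classifyCandidate(event.candidate.candidate));
//     }
//   };
```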
The Security Engine
WebRTC requires encryption.
Not optional.
Mandatory.
This responsibility belongs to:
DTLS
SRTP
which we will study in detail later.
For now remember:
Every media packet is encrypted before transmission.
The Media Engine
Once connectivity exists, media must be transported.
The media engine handles:
Audio Transport
Video Transport
Synchronization
Packetization
Depacketization
This engine works continuously during a call.
The Codec Engine
Raw media is enormous.
Consider:
1920 x 1080 video
30 FPS
Raw bandwidth requirements would be absurdly high.
Therefore media must be compressed.
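To put a number on "absurdly high", here is a quick back-of-the-envelope calculation. It assumes 24 bits per pixel (8-bit RGB), which is an assumption made for this example; cameras often deliver other pixel formats, so treat the result as an order of magnitude. The function name rawVideoBitsPerSecond is also made up for illustration.

```javascript
// Hypothetical helper: bits per second needed for uncompressed video.
// bitsPerPixel defaults to 24 (8-bit RGB), an assumption for this sketch.
function rawVideoBitsPerSecond(width, height, fps, bitsPerPixel = 24) {
  return width * height * bitsPerPixel * fps;
}

const rawBits = rawVideoBitsPerSecond(1920, 1080, 30);
console.log((rawBits / 1e9).toFixed(2) + " Gbit/s"); // prints "1.49 Gbit/s"
```

Roughly 1.5 Gbit/s for a single 1080p stream is far beyond what a typical home connection can upload.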
The codec engine performs:
Encoding
Decoding
Compression
Decompression
without which real-time video would be impractical.
RTCDataChannel
Many developers associate WebRTC exclusively with audio and video.
This is a mistake.
WebRTC can also transport arbitrary data.
For example:
Messages
Files
Game State
Collaborative Edits
Cursor Positions
This capability is provided by:
RTCDataChannel
Why Data Channels Matter
Imagine building:
Multiplayer games
Collaborative Whiteboards
Remote Desktop Systems
You need more than audio and video.
You need structured data.
Data channels provide a peer-to-peer mechanism for that communication.
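As an illustration, here is a rough sketch of sending cursor positions over a data channel. The helper names openCursorChannel and sendCursorPosition, the channel label "cursor", and the JSON message shape are all assumptions for this example; createDataChannel(), its ordered and maxRetransmits options, readyState, send(), and onmessage are standard.

```javascript
// Hypothetical helper: opens a channel tuned for frequent, disposable updates.
function openCursorChannel(peerConnection, onCursor) {
  const channel = peerConnection.createDataChannel("cursor", {
    ordered: false,    // a late cursor position is useless, so skip ordering
    maxRetransmits: 0, // do not resend lost updates
  });
  // Messages arriving from the remote peer are parsed and handed to the app.
  channel.onmessage = (event) => onCursor(JSON.parse(event.data));
  return channel;
}

// Hypothetical helper: sends one position, but only once the channel is open.
function sendCursorPosition(channel, x, y) {
  if (channel.readyState !== "open") {
    return false;
  }
  channel.send(JSON.stringify({ x, y }));
  return true;
}
```

For chat messages or files you would normally keep the default reliable, ordered delivery instead of the options chosen here.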
The Complete Architectural Picture
We can now refine our earlier diagram.
Application
|
|
V
+----------------------+
| MediaStream          |
| MediaStreamTrack     |
| RTCPeerConnection    |
| RTCDataChannel       |
+----------------------+
|
V
+----------------------+
| ICE                  |
| STUN                 |
| TURN                 |
| DTLS                 |
| SRTP                 |
| RTP                  |
| Codecs               |
+----------------------+
|
V
UDP / TCP
|
V
Internet
