Meta · System Design

Design WhatsApp Messaging

Frequency: 90/100
Scale: 2B users, 100B messages/day, 50M concurrent connections

Problem Statement

Design the core messaging infrastructure for WhatsApp. Two billion users send text, images, and voice messages. Messages must reach online recipients with a P99 latency under 500ms, be delivered reliably over low-bandwidth mobile connections, and remain end-to-end encrypted across all message types.

Requirements Clarification

Functional:

  • Send and receive messages in 1:1 and group chats (up to 1024 members)
  • Support text, images, video, voice messages, and documents
  • Delivery receipts: sent (server received), delivered (device received), read (user opened)
  • Offline delivery: messages queue when recipient is offline and deliver on reconnect
  • End-to-end encryption: server never holds plaintext

Non-Functional:

  • 2B registered users, 500M DAU
  • 100B messages/day: ~1.16M messages/second on average, with peaks of roughly 5x that (~5.8M messages/second)
  • P99 delivery latency < 500ms for online recipients
  • Works on 2G networks (200kbps, 500ms RTT)
  • Messages durable: no loss after server acknowledgment

Scale Estimation

The system handles 100B messages/day. Typical sizes: 1KB for text, 50KB for a voice message, 500KB for an image. Text dominates by count; media dominates by bytes. Media storage: assuming 20% of messages are images at 500KB compressed, that is 20B * 500KB = 10PB added to storage per day. Media is stored separately from message metadata: the message record holds a reference, not the bytes.
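The estimate can be checked with a short back-of-envelope script. This is a sketch under stated assumptions: the 20% image share, the 500KB image size, and the 5x peak factor are taken from the text, decimal units (1PB = 10^15 bytes) are assumed, and the function and key names are illustrative only.

```python
# Back-of-envelope scale estimate. All inputs are assumptions from the text.
MESSAGES_PER_DAY = 100_000_000_000   # 100B messages/day
SECONDS_PER_DAY = 24 * 60 * 60
PEAK_FACTOR = 5                      # peak traffic relative to average
IMAGE_SHARE_PERCENT = 20             # assumed share of messages that are images
IMAGE_BYTES = 500_000                # 500KB per compressed image (decimal KB)
BYTES_PER_PETABYTE = 10**15          # decimal petabyte (assumption)


def estimate() -> dict:
    """Return average/peak message rates and daily image storage growth."""
    avg_rate = MESSAGES_PER_DAY / SECONDS_PER_DAY
    images_per_day = MESSAGES_PER_DAY * IMAGE_SHARE_PERCENT // 100
    image_bytes_per_day = images_per_day * IMAGE_BYTES
    return {
        "avg_msgs_per_sec": avg_rate,
        "peak_msgs_per_sec": avg_rate * PEAK_FACTOR,
        "images_per_day": images_per_day,
        "image_petabytes_per_day": image_bytes_per_day / BYTES_PER_PETABYTE,
    }
```

Changing `IMAGE_SHARE_PERCENT` or `IMAGE_BYTES` shows how sensitive the storage figure is to the media mix, which is usually the point an interviewer wants to see reasoned about.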

Connection Model

Each device maintains a persistent TCP connection to a connection server. WhatsApp uses a custom, compressed binary protocol that evolved from XMPP. Why persistent TCP: on mobile networks, a TCP handshake alone costs 200-400ms, and on 2G a single round trip is closer to 500ms. Re-establishing a connection for every message is prohibitive on such links. The persistent connection is kept alive with periodic pings (roughly every 60 seconds).
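The keepalive behavior can be sketched as a small liveness tracker on the connection server. This is an illustrative sketch, not WhatsApp's implementation: the 60-second interval comes from the text, while the two-missed-pings threshold and the `HeartbeatTracker` name are assumptions.

```python
from typing import Dict, List

PING_INTERVAL_SECONDS = 60      # from the text (approximate)
MISSED_PINGS_BEFORE_DROP = 2    # assumption: tolerate one lost ping


class HeartbeatTracker:
    """Tracks when each connection was last heard from (hypothetical helper)."""

    def __init__(self) -> None:
        self._last_seen: Dict[str, float] = {}

    def touch(self, conn_id: str, now: float) -> None:
        """Record a ping (or any other traffic) on a connection."""
        self._last_seen[conn_id] = now

    def expired(self, now: float) -> List[str]:
        """Return connections silent for longer than the allowed window."""
        limit = PING_INTERVAL_SECONDS * MISSED_PINGS_BEFORE_DROP
        return sorted(
            conn_id
            for conn_id, seen in self._last_seen.items()
            if now - seen > limit
        )
```

A connection server would call `touch` on every inbound frame and periodically close whatever `expired` returns, so dead sockets on flaky mobile links do not accumulate.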

Connection servers are routing nodes that are stateless with respect to messages: they hold the socket but not message state. A mapping service (backed by Redis) maps each user ID to the connection server currently handling that user's socket. When a message arrives for user B, the system looks up B's connection server and routes the message there.
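The lookup-and-route step can be sketched as follows. An in-memory dictionary stands in for the Redis-backed mapping service, and `SessionRegistry`, `route`, and the returned strings are hypothetical names chosen for illustration.

```python
from typing import Dict, Optional


class SessionRegistry:
    """In-memory stand-in for the user ID -> connection server mapping."""

    def __init__(self) -> None:
        self._server_by_user: Dict[str, str] = {}

    def register(self, user_id: str, server_id: str) -> None:
        """Called when a device opens a socket on a connection server."""
        self._server_by_user[user_id] = server_id

    def unregister(self, user_id: str, server_id: str) -> None:
        """Remove the mapping only if it still points at this server.

        This guards against a stale disconnect wiping out a newer
        registration made after the user reconnected elsewhere.
        """
        if self._server_by_user.get(user_id) == server_id:
            del self._server_by_user[user_id]

    def lookup(self, user_id: str) -> Optional[str]:
        return self._server_by_user.get(user_id)


def route(registry: SessionRegistry, recipient_id: str) -> str:
    """Decide where a message for recipient_id should go."""
    server_id = registry.lookup(recipient_id)
    if server_id is None:
        return "enqueue_offline"      # no live socket: fall back to the queue
    return f"forward:{server_id}"     # hand off to the server holding the socket
```

The conditional delete in `unregister` is a design choice worth calling out: without it, a delayed disconnect event from an old server could make an online user look offline.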

Message Delivery Flow

  1. Sender writes message to the server. Server acknowledges with a server-side message ID. Sender displays single tick.
  2. Server routes message to recipient's connection server.
  3. Recipient's device receives the message and sends a delivery acknowledgment.
  4. Server marks the message as delivered, notifies sender. Sender displays double tick.
  5. Recipient opens the conversation. Device sends a read receipt.
  6. Server notifies sender. Sender displays blue double tick.
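The steps above amount to a small state machine per message. The sketch below is one possible model, assuming receipts can arrive duplicated or out of order and that a read receipt implies delivery; `Status`, `TICKS`, and `apply_receipt` are illustrative names, not WhatsApp's.

```python
from enum import IntEnum


class Status(IntEnum):
    """Receipt states in the order they are reached."""
    SENT = 1        # server persisted the message
    DELIVERED = 2   # recipient device acknowledged it
    READ = 3        # recipient opened the conversation


# What the sender's UI shows for each state (from the flow above).
TICKS = {
    Status.SENT: "single tick",
    Status.DELIVERED: "double tick",
    Status.READ: "blue double tick",
}


def apply_receipt(current: Status, receipt: Status) -> Status:
    """Advance the state monotonically.

    Duplicate or late receipts never move the state backwards, and a
    READ receipt that overtakes DELIVERED is accepted directly
    (assumption: reading implies delivery).
    """
    return max(current, receipt)
```

Because the transition is monotonic, the server can store the latest status and replay it to a sender who was offline when the receipt arrived, without tracking every intermediate event.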

For offline recipients, messages persist in a server-side queue with a TTL of 30 days. On reconnect, the device fetches its queue and sends a delivery acknowledgment for each message as it is processed.
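A minimal sketch of such a queue is shown below. It assumes messages are removed only after the device acknowledges them, so a dropped connection mid-fetch does not lose anything; the `OfflineQueue` class and the injectable clock are hypothetical, and a production system would use a durable store rather than process memory.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict, List, Tuple

TTL_SECONDS = 30 * 24 * 60 * 60  # 30-day TTL from the text

Entry = Tuple[float, str, bytes]  # (enqueued_at, message_id, ciphertext)


class OfflineQueue:
    """Per-user queue of undelivered messages (in-memory stand-in)."""

    def __init__(self, now: Callable[[], float] = time.time) -> None:
        self._now = now
        self._pending: Dict[str, Deque[Entry]] = defaultdict(deque)

    def enqueue(self, user_id: str, message_id: str, ciphertext: bytes) -> None:
        self._pending[user_id].append((self._now(), message_id, ciphertext))

    def fetch(self, user_id: str) -> List[Tuple[str, bytes]]:
        """Return unexpired messages in arrival order, without removing them."""
        cutoff = self._now() - TTL_SECONDS
        queue = self._pending[user_id]
        while queue and queue[0][0] < cutoff:
            queue.popleft()  # drop messages older than the TTL
        return [(message_id, body) for _, message_id, body in queue]

    def ack(self, user_id: str, message_id: str) -> None:
        """Remove a message once the device has acknowledged delivery."""
        remaining = (e for e in self._pending[user_id] if e[1] != message_id)
        self._pending[user_id] = deque(remaining)
```

Separating `fetch` from `ack` gives at-least-once delivery: a message may be sent twice after a crash, so the client is assumed to deduplicate by message ID.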

Group Messaging

Group chats with up to 1024 members introduce a fan-out problem. For a 1024-member group, one message triggers up to 1024 individual deliveries.

WhatsApp uses a fan-out-on-write approach for groups: when a message is sent, the server writes one message record and then asynchronously delivers to each member's connection server via a message queue. Members who are offline get queued delivery. This decouples the sender's acknowledgment from the delivery fan-out: the sender gets their acknowledgment after the server persists the message, not after all 1024 deliveries complete.
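The decoupling described above can be sketched with a single-process model: `send` persists one record and enqueues one fan-out job, and a separate worker expands the job into per-member deliveries. The `GroupFanout` class, its in-memory store, and the `deliver` callback are assumptions made for illustration.

```python
from collections import deque
from typing import Callable, Deque, Dict, List


class GroupFanout:
    """Fan-out-on-write sketch: one stored record, many queued deliveries."""

    def __init__(self, members_by_group: Dict[str, List[str]]) -> None:
        self.members_by_group = members_by_group
        self.records: Dict[str, dict] = {}   # stand-in for the message store
        self.jobs: Deque[str] = deque()      # stand-in for the message queue

    def send(self, group_id: str, sender_id: str,
             message_id: str, ciphertext: bytes) -> str:
        """Persist once, enqueue one job, and acknowledge immediately."""
        self.records[message_id] = {
            "group_id": group_id,
            "sender_id": sender_id,
            "ciphertext": ciphertext,
        }
        self.jobs.append(message_id)
        return message_id  # the sender's ack does not wait for deliveries

    def run_worker(self, deliver: Callable[[str, str], None]) -> int:
        """Expand queued jobs into per-member deliveries; return the count."""
        deliveries = 0
        while self.jobs:
            message_id = self.jobs.popleft()
            record = self.records[message_id]
            for member in self.members_by_group[record["group_id"]]:
                if member != record["sender_id"]:
                    deliver(member, message_id)
                    deliveries += 1
        return deliveries
```

In a real deployment `run_worker` would run on separate consumers, and `deliver` would either forward to the member's connection server or enqueue for offline delivery.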

End-to-End Encryption

WhatsApp uses the Signal Protocol. Each device generates its own public/private key pairs, and the server stores only the public keys. The sender's device uses the recipient's public keys to establish a shared session, encrypts each message with keys derived from that session, and only the recipient's device can decrypt it.

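The server cannot read message content. Delivery receipts are metadata, not content, and are not end-to-end encrypted, so the server can act on them. For group messages, the simplest model has the sender encrypt the message separately for each recipient: at 1024 members, that is 1024 client-side encryption operations before the message is sent, which completes in under 100ms on modern hardware. WhatsApp's published design reduces this cost with Sender Keys: the sender distributes a group key to each member over the pairwise encrypted sessions, then encrypts each subsequent message once with that key, and the server fans out the single ciphertext.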

Media Handling

Media is not sent through the message pipeline. The sender generates a random encryption key, encrypts the media on the device, uploads the ciphertext to object storage (a distributed blob store), and receives a URL. The message record, itself end-to-end encrypted, contains the URL and the key, not the bytes. The recipient downloads the ciphertext directly from the blob store using the URL and decrypts it locally with the key. The server never holds unencrypted media or the key needed to decrypt it.

This design keeps message delivery latency independent of media size. From the message pipeline's perspective, a message that references a 1GB video is the same small record as a 1KB text message; only the separate upload and download scale with the file.
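The pointer-in-message pattern can be sketched as below. Encryption itself is out of scope here: the sketch assumes the client has already encrypted the media with a key it generated, and only models the upload, the small pointer record, and an integrity check on download. `BlobStore`, `MediaPointer`, the `blob://` URL scheme, and the SHA-256 digest field are all illustrative assumptions.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict


class BlobStore:
    """In-memory stand-in for the object store; it only ever sees ciphertext."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        # Hypothetical content-addressed URL scheme, for illustration only.
        url = "blob://" + hashlib.sha256(data).hexdigest()
        self._blobs[url] = data
        return url

    def get(self, url: str) -> bytes:
        return self._blobs[url]


@dataclass(frozen=True)
class MediaPointer:
    """What travels inside the end-to-end encrypted message."""
    url: str
    media_key: bytes           # generated on the sender's device
    ciphertext_sha256: str     # lets the recipient detect tampering


def build_pointer(store: BlobStore, encrypted_media: bytes,
                  media_key: bytes) -> MediaPointer:
    """Upload ciphertext and build the small record sent via messaging."""
    url = store.put(encrypted_media)
    digest = hashlib.sha256(encrypted_media).hexdigest()
    return MediaPointer(url=url, media_key=media_key, ciphertext_sha256=digest)


def fetch_and_verify(store: BlobStore, pointer: MediaPointer) -> bytes:
    """Download ciphertext and check it matches the digest in the pointer."""
    data = store.get(pointer.url)
    if hashlib.sha256(data).hexdigest() != pointer.ciphertext_sha256:
        raise ValueError("media ciphertext does not match expected digest")
    return data  # the caller would now decrypt locally with media_key
```

The pointer has the same size whether the blob is ten bytes or a gigabyte, which is why media size does not affect message delivery latency.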

Interview Tip

Most candidates describe a standard messaging system without addressing two WhatsApp-specific constraints: the Signal Protocol encryption model (which means the server cannot read content, and group fan-out requires per-member encryption), and the 2G mobile network constraint (which explains persistent TCP connections and compressed binary protocols). Interviewers at Meta will also probe the delivery receipt state machine: exactly when each tick appears, what server state tracks it, and how the system handles the case where the sender goes offline before receiving the read receipt back.