Meta · System Design

Design Instagram's News Feed

Frequency: 95/100
Scale: 500M DAU, 10k write QPS, 100k read QPS

Problem Statement

Design a news feed system for Instagram. Users can post photos and videos. Each user sees a ranked feed of posts from people they follow.

Requirements Clarification

Functional:

  • Users can create posts (images, short videos)
  • Users follow other users
  • Users see a feed of posts from followed accounts, ranked by recency and relevance
  • Feed updates in near real-time

Non-Functional:

  • 500M DAU, 10k write QPS, 100k read QPS
  • Feed load P99 < 200ms
  • Posts must appear in online followers' feeds within 5 seconds of publishing
  • 99.99% availability
  • 7-day post retention minimum; long-tail content stored indefinitely

Scale Estimation

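500M DAU. Assume each user opens the app 3 times per day: 1.5B feed loads per day, ~17k feed reads per second on average. Each user posts once per week on average: ~71M posts per day, ~830 posts per second. Both averages sit below the stated targets of 100k read QPS and 10k write QPS, which can be read as peak capacity with headroom. Write QPS is low; read QPS dominates by roughly 20x. Design for read throughput.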
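
The estimate can be reproduced directly from the stated inputs. The sketch below is a back-of-envelope helper, not a capacity model: the function name and parameters are illustrative, and it yields daily averages only, with no peak factor.

```python
SECONDS_PER_DAY = 86_400


def estimate_load(dau: int, opens_per_day: float, posts_per_week: float) -> dict:
    """Average (not peak) request rates implied by the stated assumptions."""
    feed_loads_per_day = dau * opens_per_day
    posts_per_day = dau * posts_per_week / 7
    return {
        "feed_loads_per_day": feed_loads_per_day,
        "feed_reads_per_sec": feed_loads_per_day / SECONDS_PER_DAY,
        "posts_per_day": posts_per_day,
        "posts_per_sec": posts_per_day / SECONDS_PER_DAY,
    }


# Inputs taken from the estimate: 500M DAU, 3 opens per day, 1 post per week.
load = estimate_load(dau=500_000_000, opens_per_day=3, posts_per_week=1)
for name, value in load.items():
    print(f"{name}: {value:,.0f}")
```

Changing any one input (for example, 10 opens per day) shows how quickly the read rate moves relative to the write rate.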

Core Design Challenge: Feed Generation

The problem is not storing posts. It is assembling a personalised, ranked list of posts from the accounts a user follows, at 17k reads per second, in under 200ms.

Fan-Out on Read (Pull)

When a user opens their feed, query all followed accounts for recent posts, merge the results, and rank them. Simple to implement. Expensive at scale: a user following 500 accounts requires 500 database lookups per feed load. At 17k reads per second, this is 8.5M database operations per second. Unworkable.
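
A minimal sketch of the pull model, assuming a hypothetical in-memory store (POSTS_BY_AUTHOR) in place of the real database. It shows where the cost comes from: one lookup per followed account, followed by a k-way merge of the per-author timelines.

```python
import heapq
from itertools import islice

# Hypothetical stand-in for the posts store:
# author_id -> list of (timestamp, post_id), newest first.
POSTS_BY_AUTHOR = {
    "alice": [(300, "a3"), (100, "a1")],
    "bob": [(250, "b2"), (50, "b1")],
    "carol": [(400, "c1")],
}


def pull_feed(followed, limit=20):
    """Fan-out on read: one lookup per followed account, then a k-way merge."""
    lookups = 0
    timelines = []
    for author in followed:
        lookups += 1  # each followed account costs one store lookup
        timelines.append(POSTS_BY_AUTHOR.get(author, []))
    # With reverse=True, heapq.merge expects every input sorted largest-first,
    # which matches the newest-first timelines above.
    merged = heapq.merge(*timelines, reverse=True)
    feed = [post_id for _, post_id in islice(merged, limit)]
    return feed, lookups


feed, lookups = pull_feed(["alice", "bob", "carol"])
print(feed, lookups)  # ['c1', 'a3', 'b2', 'a1', 'b1'] 3
```

With 500 followed accounts the loop performs 500 lookups per feed load, which is the source of the 8.5M operations per second at 17k reads per second.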

Fan-Out on Write (Push)

When a user posts, write a feed entry to every follower's feed cache. Feed loads become single-key cache lookups. Read cost is O(1). Write cost is O(followers): a celebrity with 50M followers triggers 50M cache writes per post. At 500ms per batch of 1,000 writes, processed serially, that is 50,000 batches and ~7 hours to fan out a single post, against a 5-second delivery target. Not acceptable.
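
The fan-out time follows from simple arithmetic. The helper below is an illustrative sketch; the workers parameter is an assumption added here to show the effect of parallelism and does not appear in the estimate above.

```python
import math


def fan_out_seconds(followers: int, batch_size: int,
                    seconds_per_batch: float, workers: int = 1) -> float:
    """Time to write one post to every follower's feed cache."""
    batches = math.ceil(followers / batch_size)
    # Assumption: workers split the batches evenly and run in lockstep.
    rounds = math.ceil(batches / workers)
    return rounds * seconds_per_batch


serial = fan_out_seconds(50_000_000, batch_size=1000, seconds_per_batch=0.5)
parallel = fan_out_seconds(50_000_000, batch_size=1000, seconds_per_batch=0.5,
                           workers=100)
print(f"serial: {serial / 3600:.1f} h, 100 workers: {parallel:.0f} s")
```

Under these assumptions, even 100 perfectly parallel workers need about 250 seconds for one celebrity post, far outside a 5-second delivery window.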

Hybrid (Production Approach)

Instagram uses a threshold-based hybrid: push for posters with fewer than 10,000 followers, pull for posters at or above that count. The threshold is tunable. There is one refinement for celebrities: when a celebrity posts, only the top-N followers (sorted by engagement score) receive a push entry. The rest receive a pull-merged result when they load the feed. This caps the worst-case fan-out at a manageable size while keeping feed loads fast for the common case.
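
The decision can be sketched as a small planning function. This is an illustration under stated assumptions: the Follower type, the engagement field, and the TOP_N value are hypothetical (the text gives no value for N), and only the 10,000-follower threshold comes from the description above.

```python
from dataclasses import dataclass

PUSH_THRESHOLD = 10_000  # posters below this follower count push to everyone
TOP_N = 3                # hypothetical cap for celebrity posts (tiny for the demo)


@dataclass
class Follower:
    user_id: str
    engagement: float  # hypothetical engagement score; higher = more engaged


def plan_fan_out(followers, push_threshold=PUSH_THRESHOLD, top_n=TOP_N):
    """Return (push_targets, pull_count) for one new post."""
    if len(followers) < push_threshold:
        # Common case: every follower gets a feed-cache entry.
        return [f.user_id for f in followers], 0
    # Celebrity case: push only to the most engaged followers.
    ranked = sorted(followers, key=lambda f: f.engagement, reverse=True)
    push = [f.user_id for f in ranked[:top_n]]
    return push, len(followers) - len(push)


small = [Follower("u1", 0.2), Follower("u2", 0.9), Follower("u3", 0.5)]
celebrity = [Follower(f"u{i}", engagement=i) for i in range(10_000)]

print(plan_fan_out(small))      # (['u1', 'u2', 'u3'], 0)
print(plan_fan_out(celebrity))  # (['u9999', 'u9998', 'u9997'], 9997)
```

Sorting the full follower list per post is itself expensive at celebrity scale; a real system would more plausibly keep the top-N set precomputed.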

High-Level Architecture

Post service: Accepts uploads, stores media blobs in object storage (S3), writes post metadata to a sharded PostgreSQL cluster (sharded by user_id). Publishes a PostCreated event to Kafka.

Feed fan-out worker: Consumes PostCreated events from Kafka. For each event, looks up the poster's followers from a follow graph service (backed by a graph database or a denormalised Redis sorted set). If the poster is below the follower threshold, writes a feed entry to the feed cache (Redis) for every follower. For celebrity posts, pushes only to the top-N most engaged followers, records the post in an activity log, and relies on pull at read time for everyone else.

Feed read service: On feed load, fetches the pre-computed feed from Redis. If the user follows any celebrities, fetches their recent posts separately and merges client-side or in a server-side aggregation step. Applies ranking (recency + engagement score) and returns the top 20 posts.
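
A minimal sketch of the server-side merge step, assuming both inputs arrive as (post_id, timestamp) pairs. The function name is hypothetical, and plain timestamp ordering stands in for the ranking step.

```python
def read_feed(precomputed, celebrity_posts, page_size=20):
    """Merge pushed feed entries with pulled celebrity posts, newest first."""
    newest = {}
    for post_id, ts in list(precomputed) + list(celebrity_posts):
        # De-duplicate: a post can arrive through both the push and pull paths.
        newest[post_id] = max(ts, newest.get(post_id, ts))
    ranked = sorted(newest.items(), key=lambda item: item[1], reverse=True)
    return [post_id for post_id, _ in ranked[:page_size]]


pushed = [("p1", 100), ("p2", 90)]
pulled = [("c1", 95), ("p1", 100)]  # p1 also shows up on the pull path
page = read_feed(pushed, pulled)
print(page)  # ['p1', 'c1', 'p2']
```

De-duplication matters in the hybrid design because a highly engaged follower of a celebrity may hold a pushed entry for a post that the pull path also returns.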

Media serving: Images and videos are served from object storage via a CDN. The feed service returns post metadata including CDN URLs; the client fetches media directly from the CDN, never through the feed service.

Ranking

Simple reverse-chronological ranking is the baseline. Production Instagram weights posts by: poster relationship strength (close friends, family), content type preference (video vs photo), predicted engagement probability (from a lightweight ML model scoring each candidate post). The ranking model runs during feed assembly, not post ingestion.
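
One way to combine these signals is a weighted sum computed at feed-assembly time. The sketch below is purely illustrative: the weights, the 6-hour recency half-life, and the field names are assumptions, and p_engage stands in for the output of the ML model rather than implementing it.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    post_id: str
    age_hours: float
    relationship: float   # 0..1, relationship strength with the poster
    type_affinity: float  # 0..1, preference for this content type
    p_engage: float       # 0..1, predicted engagement probability


# Hypothetical weights; real values would come from training and experiments.
WEIGHTS = {"recency": 0.3, "relationship": 0.3, "type_affinity": 0.1, "p_engage": 0.3}
HALF_LIFE_HOURS = 6.0  # assumed: the recency signal halves every 6 hours


def score(c: Candidate) -> float:
    recency = 0.5 ** (c.age_hours / HALF_LIFE_HOURS)
    return (WEIGHTS["recency"] * recency
            + WEIGHTS["relationship"] * c.relationship
            + WEIGHTS["type_affinity"] * c.type_affinity
            + WEIGHTS["p_engage"] * c.p_engage)


def rank(candidates, k=20):
    """Score candidates at read time and return the top k."""
    return sorted(candidates, key=score, reverse=True)[:k]


fresh_stranger = Candidate("A", age_hours=0, relationship=0.1,
                           type_affinity=0.1, p_engage=0.1)
older_friend = Candidate("B", age_hours=6, relationship=0.9,
                         type_affinity=0.5, p_engage=0.8)
ranked_ids = [c.post_id for c in rank([fresh_stranger, older_friend])]
print(ranked_ids)  # ['B', 'A']
```

Here a six-hour-old post from a close connection outranks a brand-new post from a weak one, which is the behaviour that pure reverse-chronological ordering cannot express.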

Feed Cache Schema

Each user's feed in Redis is a sorted set: member = post_id, score = timestamp. Feed loads use ZREVRANGE, which returns members by rank in descending score order, so the newest posts come first (on Redis 6.2+ the equivalent is ZRANGE with the REV option). New posts from fan-out are ZADD operations. The sorted set is capped at 1000 entries per user: after each write, ZREMRANGEBYRANK trims the lowest-ranked (oldest) entries. Users who haven't opened the app in 30 days have their feed cache evicted; their next open triggers a cold feed build from the database.
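
The schema's behaviour can be modelled without a Redis server. The class below is an in-memory stand-in, not Redis client code; comments name the Redis command each step corresponds to. The FeedCache name and the small cap used in the demo are assumptions.

```python
class FeedCache:
    """In-memory stand-in for one Redis sorted set per user."""

    def __init__(self, max_entries=1000):
        self.max_entries = max_entries
        self._feeds = {}  # user_id -> {post_id: timestamp}

    def add(self, user_id, post_id, timestamp):
        feed = self._feeds.setdefault(user_id, {})
        feed[post_id] = timestamp  # like ZADD: insert, or update the score
        overflow = len(feed) - self.max_entries
        if overflow > 0:
            # Like ZREMRANGEBYRANK key 0 -(max_entries + 1):
            # drop the lowest-scored (oldest) members beyond the cap.
            for old in sorted(feed, key=feed.get)[:overflow]:
                del feed[old]

    def load(self, user_id, count=20):
        feed = self._feeds.get(user_id, {})
        # Like ZREVRANGE key 0 count-1: highest score (newest) first.
        return sorted(feed, key=feed.get, reverse=True)[:count]


cache = FeedCache(max_entries=3)  # tiny cap so the trim is visible
for ts in range(1, 6):
    cache.add("user42", f"p{ts}", ts)

print(cache.load("user42"))  # ['p5', 'p4', 'p3']
```

A user with no cached feed gets an empty list from load, which is the signal for the cold feed build described above.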

Interview Tip

Interviewers at Meta specifically probe the fan-out problem. Most candidates state the hybrid approach correctly but cannot quantify when to switch strategies or explain the feed cache eviction policy. The follow-up question is almost always: "what happens when a user with 100M followers posts?" Walk through the fan-out worker, the follower threshold, the Kafka consumption lag under heavy load, and how you'd prioritise which followers receive the push entry first (highest engagement score first). That level of operational detail is what separates L5 from L6 answers.