Google · System Design

Design Google Maps

Frequency: 88/100 · Scale: 1B users, 30M places, real-time routing for 20M concurrent navigations

Problem Statement

Design Google Maps: a system that stores the world's road network, renders map tiles at any zoom level, computes optimal routes between two points, and provides live traffic updates to 20M concurrent navigating users.

Requirements Clarification

Functional:

  • Display map tiles at any zoom level globally
  • Point-of-interest (POI) search: find restaurants, hospitals, businesses by name or category near a location
  • Turn-by-turn navigation: compute the fastest route between two points
  • Real-time traffic: reflect current road conditions in routing decisions
  • ETA: predict arrival time accounting for traffic

Non-Functional:

  • 1B monthly active users
  • Map tile serving: P99 < 100ms globally
  • Route computation: P99 < 2 seconds for cross-city routes
  • Traffic updates reflected in routing within 60 seconds of data collection
  • 99.99% availability for navigation (people rely on it while driving)

Map Data: The Foundation

The world's road network is modeled as a weighted directed graph. Nodes are intersections. Edges are road segments with attributes: distance, speed limit, road type (highway vs residential), turn restrictions. The full graph contains roughly 400 million nodes and 1 billion edges for global coverage.
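A minimal sketch of the graph model described above, as a plain adjacency list. The class and field names here are illustrative, not Google's actual schema; the key idea is that an edge weight is travel time derived from distance and speed, so it can later be re-derived from live traffic speeds.

```python
from dataclasses import dataclass, field

@dataclass
class RoadSegment:
    to_node: int            # destination intersection id
    distance_m: float       # segment length in meters
    speed_limit_kmh: float  # free-flow speed
    road_type: str          # e.g. "highway", "residential"

@dataclass
class RoadGraph:
    # adjacency list: intersection id -> outgoing road segments
    adj: dict = field(default_factory=dict)

    def add_edge(self, frm, seg):
        self.adj.setdefault(frm, []).append(seg)

    def travel_time_s(self, seg, speed_kmh=None):
        # edge weight = distance / speed; the speed may come from
        # live traffic instead of the posted limit
        speed = speed_kmh if speed_kmh is not None else seg.speed_limit_kmh
        return seg.distance_m / (speed * 1000 / 3600)

g = RoadGraph()
g.add_edge(1, RoadSegment(to_node=2, distance_m=500,
                          speed_limit_kmh=50, road_type="residential"))
```

Swapping the speed argument per segment is what makes traffic-aware routing an edge-weight update rather than a structural change to the graph.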

Raw map data comes from multiple sources: satellite imagery (Google's own fleet), Street View cameras, government GIS databases, and user contributions (Google Map Maker). Processing this into a navigable graph is a continuous pipeline, not a one-time operation.

Map Tile Rendering

The map is divided into tiles at each zoom level. At zoom 0, the entire earth is one tile. Each zoom level doubles the resolution in both dimensions, quadrupling the tile count: zoom 14 (city-block level) contains 4^14 ≈ 268 million tiles, and the deepest level, zoom 20, contains 4^20 ≈ 1.1 trillion on its own. Summed across all zoom levels, the total is roughly 1.5 trillion tiles.
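These counts follow directly from the tiling scheme. A small sketch using the standard Web Mercator "slippy map" indexing (the convention used by most public tile systems; Google's internal scheme may differ):

```python
import math

def lat_lon_to_tile(lat_deg, lon_deg, zoom):
    # At zoom z the world is a 2^z x 2^z grid; map a coordinate to
    # its (x, y) tile index under the Web Mercator projection.
    n = 2 ** zoom
    x = int((lon_deg + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat_deg)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    return x, y

def tiles_at_zoom(zoom):
    # each zoom level quadruples the tile count
    return 4 ** zoom
```

`tiles_at_zoom(14)` gives the 268 million figure above, and the CDN cache key for a tile is naturally `(zoom, x, y)`.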

Pre-rendering all tiles is impractical. Google uses a combination of pre-rendered tiles for popular zoom levels and regions, and on-demand rendering for less-frequently accessed areas. Pre-rendered tiles are stored in object storage and served via a global CDN. Cache hit rates exceed 95% for zoom levels 10-15 in populated areas.

For zoom levels above 10, tiles are served as vector tiles rather than raster images. Vector tiles contain geometric primitives (lines, polygons, labels) instead of pixel data, and the client renders them locally. This cuts bandwidth by 4-5x relative to raster tiles, which matters most for mobile users.

Routing: Dijkstra Is Not Enough

Naive Dijkstra on a graph with 1 billion edges takes 30-60 seconds; Google Maps computes routes in under 2 seconds. Closing that gap requires several optimizations:

Bidirectional Dijkstra: Search simultaneously from source and destination; the two frontiers meet somewhere in the middle. On road networks the explored region grows roughly quadratically with search radius, so two half-radius searches settle about half as many nodes as one full search, a useful constant-factor win, but nowhere near enough on its own.
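A toy-scale sketch of bidirectional Dijkstra on adjacency lists, using the standard stopping rule (halt once the sum of the two frontier heads can no longer beat the best meeting point found so far). A production router would layer this under the preprocessing techniques below.

```python
import heapq

def bidirectional_dijkstra(adj, radj, src, dst):
    # adj: forward edges {u: [(v, w), ...]}; radj: the same edges reversed.
    # Returns the shortest-path cost from src to dst (inf if unreachable).
    if src == dst:
        return 0.0
    INF = float("inf")
    dist = [{src: 0.0}, {dst: 0.0}]     # tentative distances, per direction
    pq = [[(0.0, src)], [(0.0, dst)]]   # one priority queue per direction
    graphs = [adj, radj]
    best = INF                           # best s-t cost via any meeting node
    while pq[0] and pq[1]:
        # stop: no unsettled pair of frontiers can improve on `best`
        if pq[0][0][0] + pq[1][0][0] >= best:
            break
        side = 0 if pq[0][0][0] <= pq[1][0][0] else 1
        d, u = heapq.heappop(pq[side])
        if d > dist[side].get(u, INF):
            continue                     # stale queue entry
        for v, w in graphs[side].get(u, []):
            nd = d + w
            if nd < dist[side].get(v, INF):
                dist[side][v] = nd
                heapq.heappush(pq[side], (nd, v))
                meet = dist[1 - side].get(v)
                if meet is not None:     # frontiers touched at v
                    best = min(best, nd + meet)
    return best

adj = {1: [(2, 1.0), (3, 4.0)], 2: [(3, 1.0), (4, 5.0)], 3: [(4, 1.0)]}
radj = {}
for u, edges in adj.items():
    for v, w in edges:
        radj.setdefault(v, []).append((u, w))
```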

Contraction Hierarchies (CH): A preprocessing step that adds shortcut edges representing pre-computed optimal paths between important nodes (major intersections, highway on/off ramps). At query time, routing mostly uses shortcuts and only expands detail in the source and destination neighborhoods. After preprocessing, route queries on the global road graph complete in milliseconds.

Graph partitioning: The global graph is partitioned into regions. Most routes stay within 1-3 partitions. Cross-partition routing uses precomputed boundary crossing costs. Each region fits in memory on one server.
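The cross-partition query shape can be sketched as a minimum over exit/entry boundary nodes. Everything here is a hypothetical interface invented for illustration: `dist_within` stands in for a local in-memory search inside one partition, and `boundary_cost` for the offline-precomputed boundary-to-boundary table.

```python
def cross_partition_route_cost(src, dst, src_part, dst_part,
                               boundaries, boundary_cost, dist_within):
    # best = min over (exit, entry) boundary pairs of:
    #   local(src -> exit) + precomputed(exit -> entry) + local(entry -> dst)
    best = float("inf")
    for b_out in boundaries[src_part]:
        for b_in in boundaries[dst_part]:
            cost = (dist_within(src_part, src, b_out)
                    + boundary_cost[(b_out, b_in)]
                    + dist_within(dst_part, b_in, dst))
            best = min(best, cost)
    return best
```

The two `dist_within` calls touch only the source and destination partitions, which is what lets each routing server hold just its own region in memory.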

Real-Time Traffic

Traffic data comes from two sources: historical speed data (average speeds by road segment, hour of day, day of week) and real-time probe data (anonymized GPS traces from users with navigation active).

The traffic pipeline: GPS traces arrive as events, are matched to road segments via map-matching algorithms (Hidden Markov Models), and aggregated into current speed estimates per segment. The graph edge weights (travel time = distance / speed) are updated every 60 seconds.

At 20M concurrent navigating users, GPS traces arrive at roughly 20M events/second (one per second per user). This data flows through a stream processing system (Pub/Sub into Dataflow) that performs map-matching and aggregates speed estimates. The updated segment speeds are written to a distributed key-value store and pushed to the routing servers.
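The aggregation step can be sketched as a tumbling 60-second window over already-map-matched probes. This is a single-process toy, assuming probes arrive with a `segment_id` attached; a real pipeline (e.g. Dataflow) would keep this state distributed and keyed by segment.

```python
from collections import defaultdict

class SpeedAggregator:
    """Tumbling-window mean speed per road segment."""

    def __init__(self, window_s=60.0):
        self.window_s = window_s
        self.window_start = 0.0
        self.sums = defaultdict(float)   # segment_id -> sum of probe speeds
        self.counts = defaultdict(int)   # segment_id -> probe count

    def add_probe(self, ts, segment_id, speed_kmh):
        # Returns the closed window's aggregates when the window rolls
        # over, else None.
        out = None
        if ts - self.window_start >= self.window_s:
            out = self.flush()
            self.window_start = ts
        self.sums[segment_id] += speed_kmh
        self.counts[segment_id] += 1
        return out

    def flush(self):
        # mean speed per segment for the closed window; these are the
        # values written to the KV store and pushed to routing servers
        means = {s: self.sums[s] / self.counts[s] for s in self.sums}
        self.sums.clear()
        self.counts.clear()
        return means
```

Because the output is just new per-segment speeds, applying it to the routing graph is an incremental edge-weight update, not a recompute.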

ETA Prediction

ETA is not simply distance / current speed. Traffic conditions change during the trip. Google's ETA model simulates the route time-slice by time-slice: the first 5 minutes of the route use current traffic, the next 10 minutes use predicted traffic (from historical patterns adjusted for current deviations), and so on. A gradient-boosted model trained on historical trip data predicts ETA with median error under 5% for routes under 2 hours.
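The time-slice simulation above can be illustrated in a few lines. The `speed_at` function is an assumption of this sketch: it stands in for the model that returns current traffic for near-term slices and predicted traffic for later ones.

```python
def simulate_eta(segments, speed_at, depart_t=0.0):
    # segments: list of (segment_id, distance_m) along the route.
    # speed_at(segment_id, t): km/h expected on that segment at clock
    # time t. Walk the route, advancing the clock segment by segment,
    # so each segment is evaluated at the time we actually reach it.
    t = depart_t
    for seg_id, distance_m in segments:
        speed_kmh = speed_at(seg_id, t)
        t += distance_m / (speed_kmh * 1000 / 3600)
    return t - depart_t
```

The key property is that later segments see the (predicted) traffic of the time at which the driver arrives there, not the traffic at departure.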

Interview Tip

Routing algorithm depth is where this question separates L5 from L6 answers. Most candidates know Dijkstra. Interviewers expect you to recognize that Dijkstra is insufficient at this scale and describe at least one optimization (bidirectional search is the minimum; Contraction Hierarchies is the answer that demonstrates real depth). The second probe is real-time traffic: how do you get 20M GPS traces per second into the routing graph within 60 seconds? The expected answer involves a stream processing pipeline, map-matching, and incremental graph weight updates, not a full graph recompute every 60 seconds.