Partitioning is the process of dividing a single database table or data set into smaller subsets, each stored and managed independently. While users interact with the database as a whole, the underlying system accesses only the relevant partition(s), which reduces the amount of data each query has to read and can significantly improve query performance.
Partitioning is typically categorized into two main types: Horizontal Partitioning and Vertical Partitioning.
Horizontal partitioning involves splitting rows of a table into smaller partitions based on a specific criterion. Each partition contains a subset of rows, but all partitions have the same columns.
Let’s say we have a table Orders with the following schema:
OrderID | CustomerID | OrderDate | Amount |
---|---|---|---|
1 | 101 | 2023-01-05 | $50 |
2 | 102 | 2023-02-12 | $80 |
3 | 103 | 2023-01-18 | $100 |
4 | 104 | 2023-03-07 | $40 |
We can horizontally partition this table based on OrderDate:
Partition 1 (orders dated January 2023):
OrderID | CustomerID | OrderDate | Amount |
---|---|---|---|
1 | 101 | 2023-01-05 | $50 |
3 | 103 | 2023-01-18 | $100 |
Partition 2 (orders dated February 2023 onward):
OrderID | CustomerID | OrderDate | Amount |
---|---|---|---|
2 | 102 | 2023-02-12 | $80 |
4 | 104 | 2023-03-07 | $40 |
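The split above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the partition labels and the helper names `partition_key`, `partition_by_date`, and `orders_on` are hypothetical, amounts are stored as whole dollars, and a real database would apply such a rule internally rather than in application code.

```python
from collections import defaultdict
from datetime import date

# Rows of the example Orders table: (OrderID, CustomerID, OrderDate, Amount in dollars).
ORDERS = [
    (1, 101, date(2023, 1, 5), 50),
    (2, 102, date(2023, 2, 12), 80),
    (3, 103, date(2023, 1, 18), 100),
    (4, 104, date(2023, 3, 7), 40),
]


def partition_key(order_date):
    """Hypothetical rule: January 2023 in one partition, later dates in another."""
    return "2023-01" if order_date < date(2023, 2, 1) else "2023-02-onward"


def partition_by_date(rows):
    """Group full rows by partition key; every partition keeps all four columns."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[partition_key(row[2])].append(row)
    return dict(partitions)


def orders_on(partitions, order_date):
    """Find orders for one date by scanning only the partition that can hold it."""
    candidates = partitions.get(partition_key(order_date), [])
    return [row for row in candidates if row[2] == order_date]


partitions = partition_by_date(ORDERS)
january_ids = [row[0] for row in partitions["2023-01"]]
later_ids = [row[0] for row in partitions["2023-02-onward"]]
match = orders_on(partitions, date(2023, 2, 12))
```

Here `orders_on` looks at two rows instead of four, because the partition key tells it which subset can contain the requested date.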
Vertical partitioning splits a table into smaller tables based on columns rather than rows. Each partition contains a subset of the columns, typically grouping them based on their usage patterns.
Consider the same Orders table:
OrderID | CustomerID | OrderDate | Amount |
---|---|---|---|
1 | 101 | 2023-01-05 | $50 |
2 | 102 | 2023-02-12 | $80 |
3 | 103 | 2023-01-18 | $100 |
4 | 104 | 2023-03-07 | $40 |
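We can vertically partition this table as follows, keeping OrderID in both partitions so that the rows can be joined back together:
Partition 1 (OrderID, OrderDate, Amount):
OrderID | OrderDate | Amount |
---|---|---|
1 | 2023-01-05 | $50 |
2 | 2023-02-12 | $80 |
3 | 2023-01-18 | $100 |
4 | 2023-03-07 | $40 |
Partition 2 (OrderID, CustomerID):
OrderID | CustomerID |
---|---|
1 | 101 |
2 | 102 |
3 | 103 |
4 | 104 |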
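As a rough sketch of the same column split in Python, the snippet below divides each row into two narrower rows that share OrderID and then joins them back. The helper names `vertical_partition` and `rejoin` and the variable names are assumptions made for illustration, and amounts are stored as whole dollars.

```python
# The example Orders table as a list of dictionaries (one per row).
ORDERS = [
    {"OrderID": 1, "CustomerID": 101, "OrderDate": "2023-01-05", "Amount": 50},
    {"OrderID": 2, "CustomerID": 102, "OrderDate": "2023-02-12", "Amount": 80},
    {"OrderID": 3, "CustomerID": 103, "OrderDate": "2023-01-18", "Amount": 100},
    {"OrderID": 4, "CustomerID": 104, "OrderDate": "2023-03-07", "Amount": 40},
]


def vertical_partition(rows, key, column_groups):
    """Split rows into one narrower table per column group; each table keeps the key."""
    return [
        [{key: row[key], **{col: row[col] for col in columns}} for row in rows]
        for columns in column_groups
    ]


def rejoin(parts, key):
    """Rebuild full rows by merging the partitions on the shared key column."""
    merged = {}
    for part in parts:
        for row in part:
            merged.setdefault(row[key], {}).update(row)
    return list(merged.values())


order_details, order_customers = vertical_partition(
    ORDERS, "OrderID", [["OrderDate", "Amount"], ["CustomerID"]]
)

# A query that only needs amounts never touches the CustomerID partition.
total_amount = sum(row["Amount"] for row in order_details)

# Queries that need every column pay for a join on OrderID.
restored = rejoin([order_details, order_customers], "OrderID")
```

The trade-off shown here is the usual one: narrow queries read less data, while queries that need columns from both partitions incur the cost of the join.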
Partitioning is particularly useful in the following scenarios:
The table below compares the two approaches:
Aspect | Horizontal Partitioning | Vertical Partitioning |
---|---|---|
Division Criteria | Based on rows (e.g., date ranges or regions). | Based on columns (e.g., frequently used columns). |
Storage | Partitions stored as subsets of rows. | Partitions stored as subsets of columns. |
Query Optimization | Reduces row scanning for specific data ranges. | Reduces column retrieval for specific queries. |
Use Case | Distributed databases with large datasets. | Databases with column-specific access patterns. |
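To make the Query Optimization row concrete, here is a small back-of-the-envelope calculation for the four-row Orders example. It assumes, purely for illustration, that the work of a scan is proportional to the number of cells read (rows times columns); real storage engines have many other cost factors, and the function name `cells_read` is hypothetical.

```python
ROWS, COLUMNS = 4, 4  # size of the example Orders table


def cells_read(rows, columns):
    """Rough cost proxy (an assumption): one unit of work per cell scanned."""
    return rows * columns


# Unpartitioned table: every row and every column is scanned.
full_scan = cells_read(ROWS, COLUMNS)

# Horizontal: a January query scans only the 2 January rows, with all 4 columns.
horizontal_scan = cells_read(2, COLUMNS)

# Vertical: an amount-by-date query scans all 4 rows, but only the 3 columns
# stored in the (OrderID, OrderDate, Amount) partition.
vertical_scan = cells_read(ROWS, 3)
```

Under this simplified model the full scan reads 16 cells, the horizontal partition reads 8, and the vertical partition reads 12.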
Partitioning is a powerful technique to improve the scalability, performance, and manageability of databases. By splitting data into smaller, focused subsets, both horizontal and vertical partitioning reduce the load on individual nodes and optimize query performance. Horizontal partitioning is ideal for distributing data across nodes, while vertical partitioning is better suited for optimizing column-based queries.
In the next lesson, we will explore specific Partitioning Strategies in Databases to understand how partitioning can be implemented effectively in distributed systems.