0% completed
Hash-based indexing is a technique used to store and retrieve records efficiently by mapping keys to specific locations using a hash function. Unlike tree-based indexing, which organizes data hierarchically, hash-based indexing directly maps keys to buckets or slots, making it particularly efficient for point queries.
Hash-based indexing employs a hash function to compute a hash value for each key, determining the bucket or slot where the corresponding record will be stored. This eliminates the need for sequential searching, offering constant-time complexity O(1) in ideal cases.
In static hashing, the number of buckets is fixed when the hash table is created. Each record is mapped to a bucket based on its hash value. This structure works well when the size of the dataset is known and remains relatively constant.
Consider the image below.
Data Mapping:
98
, 106
, 104
, and 102
.Hash Calculation:
Bucket Representation:
98
(e.g., John, Delhi
).106
.104
.102
.This setup ensures efficient lookups because the hash function directly maps keys to their corresponding buckets.
Now, suppose we want to add the key 103
to the hash table:
98
, leading to a collision.Chaining is a common method to resolve collisions. When two or more keys map to the same bucket, a linked list (or chain) is created to store multiple records in that bucket.
Consider the image below.
Collision Handling:
98
and 103
both map to bucket 3. Instead of overwriting data, a linked list is created within bucket 3 to store both records.How It Works:
103
), the hash function maps it to bucket 3, and the linked list is traversed to locate the specific record.Chaining ensures that no data is lost due to collisions, and the hash table remains efficient.
Dynamic hashing, also known as extendible hashing, overcomes the limitations of static hashing by allowing the number of buckets to grow or shrink dynamically. This ensures better handling of collisions and adapts to changes in the dataset size.
Consider the image below.
Global Depth:
00
, 01
, 10
, and 11
.Mapping Keys to Buckets:
Handling Bucket Overflow:
If a bucket overflows (e.g., due to multiple keys hashing to the same bucket), bucket splitting occurs:
For example, if ( B_2 ) overflows, the directory IDs are updated to 100
and 101
, and the bucket splits to redistribute the records.
Keys are redistributed between the new buckets based on their updated hash values.
Dynamic hashing provides a scalable and efficient solution for managing large and dynamic datasets.
Feature | Static Hashing | Dynamic Hashing |
---|---|---|
Number of Buckets | Fixed | Grows or shrinks dynamically |
Collision Handling | Chaining or open addressing | Bucket splitting with updated directories |
Flexibility | Limited | Highly flexible |
Use Cases | Small, stable datasets | Large, dynamic datasets |
.....
.....
.....