L-6.2: Collision Resolution Techniques in Hashing | What are the collision resolution techniques?

Gate Smashers

Overview

This video explains collision resolution techniques in hashing, a crucial concept for efficient data storage and retrieval. It begins by defining what a hash collision is using an example and then introduces two primary methods for handling them: chaining (open hashing) and closed hashing (open addressing). Chaining involves creating linked lists to store multiple keys that hash to the same index. Closed hashing, on the other hand, probes for the next available slot within the hash table itself. The video briefly touches upon three common closed hashing strategies: linear probing, quadratic probing, and double hashing, providing a foundational understanding of how to manage and resolve collisions in hash tables.

Chapters

Hashing maps keys to indices in a hash table using a hash function (e.g., K mod N).
A collision occurs when two different keys are mapped to the same index in the hash table.
Example: In a K mod 6 table (indices 0-5), keys 32 and 44 both hash to index 2, causing a collision.

Collisions are inevitable in hashing and must be handled to ensure data can be stored and retrieved correctly without overwriting existing information.

When keys 32 and 44 are hashed using `K mod 6`, both result in an index of 2. Since index 2 can only hold one value directly, a collision occurs.
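The collision in this example can be reproduced in a few lines of Python. This is a minimal sketch; the function name `hash_index` is chosen here for illustration and does not come from the video.

```python
def hash_index(key, table_size=6):
    """Hash function from the example: K mod N, with N = 6."""
    return key % table_size


# Both keys map to the same index, which is the collision described above.
indices = {key: hash_index(key) for key in (32, 44)}
print(indices)  # {32: 2, 44: 2}
```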

Chaining, also known as open hashing, resolves collisions by creating an external data structure, typically a linked list, at each hash table index.
When a collision occurs, the new key is added to the linked list at that index.
This method uses additional memory for the linked lists, but the table never runs out of slots, because each index can hold any number of keys.

Chaining is a straightforward method that allows multiple elements to occupy the same hash index, preventing data loss and maintaining the integrity of the hash table.

If key 44 hashes to index 2, but index 2 is already occupied by key 32, key 44 is added to a linked list associated with index 2.
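A rough sketch of chaining, assuming a table of size 6 and the `K mod 6` hash from the example. A Python list stands in for each linked list, and the class name `ChainedHashTable` and its methods are hypothetical names used only for illustration.

```python
class ChainedHashTable:
    """Chaining sketch: each index holds its own chain of keys."""

    def __init__(self, size=6):
        self.size = size
        # One empty chain per index (a Python list stands in for a linked list).
        self.buckets = [[] for _ in range(size)]

    def insert(self, key):
        chain = self.buckets[key % self.size]
        if key not in chain:  # skip duplicates
            chain.append(key)

    def contains(self, key):
        # Only the chain at the key's own index has to be searched.
        return key in self.buckets[key % self.size]


table = ChainedHashTable()
table.insert(32)
table.insert(44)  # collides with 32 at index 2 and joins the same chain
print(table.buckets[2])  # [32, 44]
```

Lookups only scan the chain at one index, so performance depends on how long the chains grow.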

Closed hashing, or open addressing, resolves collisions by finding an alternative empty slot within the hash table itself.
All keys are stored in the table itself and no external structure is used, so the table can hold at most as many keys as it has slots.
Three common techniques are linear probing, quadratic probing, and double hashing.

Closed hashing keeps all data within the primary hash table, potentially leading to better cache performance and simpler memory management compared to chaining.

Instead of using a linked list, if index 2 is full, closed hashing looks for another empty slot in the table to store the colliding key. With linear probing, for example, it tries index 3, then 4, and so on.
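The "next available slot" search in this example can be sketched as follows. It assumes a table of size 6 represented as a Python list with `None` marking an empty slot; the function name `linear_probe_insert` is illustrative, not taken from the video.

```python
def linear_probe_insert(table, key):
    """Insert key by trying home, home + 1, home + 2, ... (wrapping around)."""
    size = len(table)
    home = key % size
    for i in range(size):
        index = (home + i) % size  # wrap to index 0 after the last slot
        if table[index] is None:
            table[index] = key
            return index
    raise RuntimeError("hash table is full")


table = [None] * 6
first = linear_probe_insert(table, 32)   # index 2 is free
second = linear_probe_insert(table, 44)  # index 2 is taken, so index 3 is used
print(table)  # [None, None, 32, 44, None, None]
```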

Linear probing checks the next sequential slot (index + 1, index + 2, ...), wrapping around at the end of the table, until an empty slot is found.
Quadratic probing uses a quadratic function of the probe number i, (index + i^2) mod table size, to find the next slot, helping to reduce primary clustering.
Both methods involve probing the table based on a calculated sequence when a collision occurs.

These probing techniques offer different strategies for searching for an open slot, impacting performance and the distribution of elements within the table.

For quadratic probing, if key 30 hashes to index 0 (30 mod 6 = 0) and index 0 is full, the next probes would be at (0 + 1^2) mod 6 = 1, then (0 + 2^2) mod 6 = 4, and so on, until an empty slot is found.
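The quadratic probe sequence from this example can be sketched as below. The table size of 6 comes from the example; the function name `quadratic_probe_insert` and the keys 6 and 7 used to pre-fill indices 0 and 1 are assumptions made only so that the probes have something to collide with.

```python
def quadratic_probe_insert(table, key):
    """Insert key by trying (home + i^2) mod size for i = 0, 1, 2, ..."""
    size = len(table)
    home = key % size
    for i in range(size):
        index = (home + i * i) % size
        if table[index] is None:
            table[index] = key
            return index
    raise RuntimeError("no empty slot found along the probe sequence")


table = [None] * 6
table[0] = 6  # hypothetical key already at index 0 (6 mod 6 = 0)
table[1] = 7  # hypothetical key already at index 1 (7 mod 6 = 1)

# 30 mod 6 = 0 is full, (0 + 1) mod 6 = 1 is full, (0 + 4) mod 6 = 4 is free.
slot = quadratic_probe_insert(table, 30)
print(slot)  # 4
```

With a table size of 6, the sequence starting from index 0 only ever reaches indices 0, 1, 3, and 4, so quadratic probing can fail to find an empty slot even when one exists.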

Double hashing employs a second hash function to determine the step size for probing when a collision occurs.
This technique uses the result of the second hash function to jump to different slots, further reducing clustering.
It aims to provide a more effective distribution of keys compared to linear or quadratic probing.

Because the step size comes from a second hash function of the key, two keys that collide at the same index usually follow different probe sequences, which can shorten probe sequences and improve performance.

If a collision occurs, instead of adding 1 or i^2, the step size for probing is determined by a separate hash function applied to the key, leading to a more varied search path.
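A minimal sketch of double hashing follows. The video does not specify the second hash function, so everything here is an assumption: a table size of 7 (a prime, so every step size eventually visits every slot), a first hash `K mod 7`, and a second hash `1 + (K mod 6)`, which is a common textbook choice because it never produces a step of 0.

```python
def double_hash_insert(table, key):
    """Insert key by probing home, home + step, home + 2 * step, ... (mod size)."""
    size = len(table)
    home = key % size              # first hash function (assumed)
    step = 1 + (key % (size - 1))  # second hash function (assumed), never 0
    for i in range(size):
        index = (home + i * step) % size
        if table[index] is None:
            table[index] = key
            return index
    raise RuntimeError("hash table is full")


table = [None] * 7
a = double_hash_insert(table, 10)  # home 3, free
b = double_hash_insert(table, 17)  # home 3 taken, step 6 -> index 2
c = double_hash_insert(table, 24)  # home 3 taken, step 1 -> index 4
print(a, b, c)  # 3 2 4
```

Keys 17 and 24 collide at the same index but use different step sizes, so they follow different search paths.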

Key takeaways

1. Collisions are a fundamental challenge in hashing that require specific resolution strategies.
2. Chaining (open hashing) resolves collisions by using external linked lists, while closed hashing (open addressing) finds alternative slots within the table.
3. Linear probing is simple but can lead to clustering; quadratic probing and double hashing offer more sophisticated ways to find open slots.
4. The choice of collision resolution technique impacts the efficiency (time complexity) and memory usage of a hash table.
5. Understanding these techniques is vital for designing and analyzing data structures used in databases, caches, and symbol tables.

Key terms

Hashing, Hash Function, Collision, Collision Resolution, Chaining, Open Hashing, Closed Hashing, Open Addressing, Linear Probing, Quadratic Probing, Double Hashing, Probe Number

Test your understanding

1. What is a hash collision and why does it occur?
2. How does chaining resolve collisions differently from closed hashing?
3. What is the primary advantage of using chaining over closed hashing?
4. Explain the difference between linear probing and quadratic probing in terms of how they find an alternative slot.
5. Why is double hashing considered an improvement over linear or quadratic probing?