Understanding Hashing: Efficient Data Storage Technique


Hashing is a technique for efficient data storage and retrieval: a hash function maps each key to a storage location, aiming for constant-time lookup and memory usage within a constant factor of the data itself. The approach must handle collisions, and universal hash functions are used to guarantee enough randomness. Using modular arithmetic, a suitable family of hash functions can be designed, from which efficient data structures such as hash tables are built.

  • Hashing
  • Data Storage
  • Hash Functions
  • Efficient Retrieval
  • Modular Arithmetic


Presentation Transcript


  1. Lecture 10 Hashing

  2. Motivation: Set and Map. Goal: an array whose index can be any object. Example: a dictionary, e.g. Dictionary["hash"] = "a dish of diced or chopped meat and often vegetables". Properties: 1. Efficient lookup: we hope lookup is O(1). 2. Space: within a constant factor of a plain list.
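
As a point of reference, this goal can be illustrated with Python's built-in dict, which is itself backed by a hash table (the snippet is an illustration, not part of the original slides):

```python
# An "array indexed by any object": Python's dict, a hash table.
dictionary = {}
dictionary["hash"] = "a dish of diced or chopped meat and often vegetables"

# Expected O(1) lookup by key, regardless of how many entries are stored.
print(dictionary["hash"])
```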

  3. Naïve implementations of a set. Method 1: maintain a linked list. Problem: lookup takes O(n) time. Method 2: use a large array with a[i] = 1 if i is in the set. Problem: needs a huge amount of memory.
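
A minimal sketch of the two naive methods, to make the trade-off concrete (the names ListSet and BitmapSet are illustrative, not from the slides):

```python
# Method 1: keep the elements in a list; lookup scans everything.
class ListSet:
    def __init__(self):
        self.items = []          # plays the role of the linked list

    def add(self, x):
        if x not in self.items:  # O(n) scan
            self.items.append(x)

    def contains(self, x):
        return x in self.items   # O(n) scan


# Method 2: one flag per possible value; lookup is O(1) but memory is huge.
class BitmapSet:
    def __init__(self, universe_size):
        self.a = [0] * universe_size   # needs space for the whole universe

    def add(self, x):
        self.a[x] = 1

    def contains(self, x):
        return self.a[x] == 1
```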

  4. Hashing. Idea: for each number, assign a (pseudo)random location. Example: {3, 10, 3424, 643523}. Store number i in a[f(i)], where f(i) is the hash function.
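
A sketch of the idea under simple assumptions: f(i) = i mod m is only a placeholder hash function, and m = 17 is picked so that these four example keys happen not to collide (collisions are the subject of the next slide):

```python
# Store each number i in slot f(i) of an array a.
m = 17
a = [None] * m

def f(i):
    return i % m        # placeholder hash function

for i in [3, 10, 3424, 643523]:
    a[f(i)] = i

print(a)                # each key sits at index f(i); most slots stay empty (None)
```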

  5. Collisions. Problem: we want to add 123, but f(123) = 4 = f(3424). (This will always happen, by the pigeonhole principle.) Solution: let 123 and 3424 share this location. (Slide figure: a 10-slot array with 10 in slot 1, 3 in slot 3, 643523 in slot 9, and 3424 and 123 sharing slot 4; all other slots are null.)
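
One standard way to let colliding keys share a location is chaining, where each slot holds a bucket (a list) of keys; the sketch below assumes the placeholder hash f(x) = x mod 10, under which 3, 643523, and 123 happen to collide:

```python
# Collision handling by chaining: every slot holds a bucket of all keys that
# hash to it, so colliding keys simply share the slot.
m = 10
table = [[] for _ in range(m)]

def f(x):
    return x % m                   # placeholder hash function

def insert(x):
    bucket = table[f(x)]
    if x not in bucket:
        bucket.append(x)

def contains(x):
    return x in table[f(x)]

for x in [3, 10, 3424, 643523, 123]:
    insert(x)

print(table[f(3)])                 # [3, 643523, 123]: these keys share slot 3
print(contains(123), contains(7))  # True False
```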

  6. Fixed Hash Function. If the hash function is fixed, it can be very slow on some bad examples. Example: we can find n numbers x1, x2, ..., xn such that f(xi) = y for some fixed y (always possible by the pigeonhole principle). Then the hash table degenerates into a linked list. Solution: use a family of random hash functions.
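
For instance, if the fixed function is the placeholder f(x) = x mod m from the earlier sketches, an adversary can simply feed in multiples of m (again an illustration, not from the slides):

```python
# With a fixed, publicly known hash function an adversary can choose keys that
# all collide: every multiple of m lands in slot 0, so the chained table
# degenerates into one long list.
m = 10
table = [[] for _ in range(m)]

def f(x):
    return x % m

adversarial_keys = [m * k for k in range(1000)]   # 0, 10, 20, ... all hash to slot 0
for x in adversarial_keys:
    table[f(x)].append(x)

print(len(table[0]))   # 1000 -> lookups in this bucket now take O(n) time
```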

  7. Universal Hash Function. The hash function should be as random as possible. Ideally: choose a random function out of all functions! However: we cannot store a totally random function. We can use modular arithmetic to construct good hash functions! Goal: construct a family of hash functions F such that for any x ≠ y, we have Pr_{f ∈ F}[f(x) = f(y)] = 1/p.
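
To see why a totally random function cannot be stored, note that it amounts to a lookup table with one independently random entry per possible key; the toy sketch below (with an assumed universe of 2**20 keys) already needs memory proportional to the whole universe:

```python
import random

# A "totally random function" is a lookup table with one random output per
# possible key, so storing it costs memory proportional to the key universe.
UNIVERSE_SIZE = 2 ** 20
p = 101
truly_random_f = [random.randrange(p) for _ in range(UNIVERSE_SIZE)]

def f(x):
    return truly_random_f[x]   # perfectly random, but the table itself is huge

print(len(truly_random_f))     # 1048576 stored values for a mere 2**20-key universe
```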

  8. Recap: Modular Arithmetic. For a prime number p, only consider the numbers {0, 1, 2, 3, ..., p-1}. Addition, subtraction, and multiplication work the usual way (take mod p at the end). Inverse: for any integer 0 < x < p, there is an integer 0 < y < p such that xy ≡ 1 (mod p). Example: p = 7, x = 2, then y = 4. We call y = x⁻¹. The inverse can be computed efficiently.
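
One way the inverse can be computed efficiently (a sketch assuming p is prime, so Fermat's little theorem applies; the extended Euclidean algorithm is another standard choice):

```python
# For prime p, Fermat's little theorem gives x^(p-1) ≡ 1 (mod p),
# so x^(p-2) mod p is the inverse x^{-1}.
p = 7

def mod_inverse(x, p):
    return pow(x, p - 2, p)    # fast modular exponentiation, O(log p) multiplications

x = 2
y = mod_inverse(x, p)
print(y)                       # 4
print((x * y) % p)             # 1, i.e. x * x^{-1} ≡ 1 (mod p)
```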

  9. Designing the Hash Function. Pick a prime number p and construct a hash family with p² functions: for every a, b in {0, 1, 2, ..., p-1}, define h_{a,b}(x) = ax + b (mod p). Claim: for every x, y with x ≠ y, and any two numbers u, v in {0, 1, 2, ..., p-1}, we have Pr[h(x) = u, h(y) = v] = 1/p².
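
A sketch of this family, together with an empirical check of the claim (p = 11 and the particular x, y, u, v values are arbitrary choices for the experiment):

```python
import random

# The hash family from the slide: choose a and b uniformly from {0, ..., p-1}
# and use h_{a,b}(x) = (a*x + b) mod p.
p = 11                                   # any prime works; 11 keeps the experiment small

def random_hash(p):
    a = random.randrange(p)
    b = random.randrange(p)
    return lambda x: (a * x + b) % p

# Empirical check of Pr[h(x) = u, h(y) = v] = 1/p^2 for x != y.
x, y, u, v = 3, 10, 5, 7
trials = 100_000
hits = 0
for _ in range(trials):
    h = random_hash(p)                   # a fresh random member of the family
    if h(x) == u and h(y) == v:
        hits += 1

print(hits / trials, 1 / p ** 2)         # both should be about 0.0083
```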
