Hashing with Linear Probing

HASHING

To store data into the hash table, one must know that in order to do this, a certain value must be divided. For example if the data is consist of n integers and we have k number of cells. Then the address wherein the data would be stored is the function:

hash= n % k;

The above algorithm will only work if the given data is in integer. How about strings or character datas? For strings of integers this is done by splitting the string into equal numbers of substring, where each substring can be represented as an integer. In this way, the n in the hash function can be found by adding the value of each substring. While in the character scenario, this can be solved by assigning each letter to their corresponding place in the alphabet. For example, the letter A is 1 while Z is 26. In this manner, all letters has their own number correspondence. In the string of character problem, this problem can be solved by just adding the corresponding number of each letter, thus gaining a single number to be hashed.

The main problem that would occur is the collision, this is a scenario wherein, there will be a limited number of memory, and two or more data can collide in a single address. This problem can be solved by a rehashing algorithm

Linear Probing

The need to have a rehash function arises when a collision occurs. This happens when two or more information would collide on the same cell allocated for the hash table. Thus, a rehashing is needed.

Since we know that the hash function for storing the information or data into the allocated memory is done by getting the remainder of the data’s numerical equivalent divide by the number of space allocated. Then, the rehashing function is the method for finding the second or third or so on location for the information.

One rehashing technique is the Linear Probing, where the rehashing is done by looking for the next empty space that it can occupy. The function for the rehashing is the following:

rehash(key)=(n+1) % k;

This method works in such a way that if the first location is not free, then it will go to the next location and check if that location is free or not, and so on until it finds a free location or can’t find anyone at all.

For formality and familiarity’s sake, an empty space would be given a -1 value while a deleted data’s space would be -2. In this way, finding an empty space is easy and also the search for a stored item would be easier.

To test this algorithm, the use of the following example is needed.

For example, we have a hash table that could accommodate 9 information, and the data to be stored were integers. To input 27, we use hash(key)= 27 % 9=0. Therefore, 27 is stored at 0. If another input 18 occurs, and we know that 18 % 9= 0, then a collision would occur. In this event, the need to rehash is needed. Using linear probing, we have the rehash(key)= (18+1) % 9= 1. Since 1 is empty, 18 can be stored in it.

To retrieve data, the hash function and the rehash function were also useful. Using the example from above, retrieving 18 is done by using the hash function to find the key and check if the data would coincide to the data needed. If not, then the rehash would be needed. Until such time the correct location is found or an empty space is encountered(that is the value of that space is -1), which means that the data does not exist. This is authentic because, the path of the search function would be the same path that was used in storing the data. In this sense, if an empty space is encountered, it all means that the data does not exist, logical isn’t it?

The use of the -2 value for deleted items is useful in such a way that in traversing the hash table, encountering a deleted cell would not end the traversal.

MORE EXAMPLE(Linear Probing)

For example, 5 spaces for integers. Input: 1,5,27,25 21, 30. Delete 5. Look for 25, 29.

Solution:

INPUT:

*To insert 1, 1 %5=1, therefore 1 is stored at array[1].

*To insert 5, 5%5=0, therefore 5 is stored at array[0].

*To insert 27, 27%5=2, therefore off to array[2].

*To insert 25, 25 % 5=0, since 0 is occupied, rehashing is done, 26%5=1, rehashing is done again, 27%5=2, then 28%5=3. Since 3 is -1, 25 can be stored here.

*30 cannot b e stored for insufficient number of memory allocation.

DELETE:

*Look for 5, using the hash, we found 5 at location 0. Then, we assign a new value -2 location 0 to indicate this value have been deleted.

RETRIEVE:

To look for 25, we use the hash key, to find its location. And found 0, but then the value stored I n array[0] is not the same, therefore, the rehashing would be used. Then we traverse the hash table, until we found the right place.
To see if 29 does exist, we use the hash and rehash to find if they exist.

6 Comments (+add yours?)

hima
Dec 19, 2009 @ 15:04:54

thanks

Dan
Jan 05, 2011 @ 21:30:21

Hey! Thanks a lot for linear probing! Can u do also quadratic probing as well ?

Mani
Feb 22, 2011 @ 05:03:14

Nice one.

Thanks

asif
Jun 21, 2011 @ 07:59:22

not bad… but need more detail..

kavita
Jan 11, 2012 @ 05:57:29

thanks a lot fr claring all my doubts f linear probing.cn u lso hlp me in sql queries??????????????

Luis
Feb 04, 2012 @ 14:15:12

Thanks so much for this.
I was just studying hashing tables and I was having some problems finding out how to use linear probing!
So useful, thanks!

	chandan on So I’m really alone
	chandan on So I’m really alone
	jheiyn on So I’m really alone
	jheiyn on So I’m really alone
	jheiyn on Fighting Paranoia while walkin…

From deep within