To hash means to grind up, and thatís essentially what hashing is
all about. The heart of a hashing algorithm is a hash function that
takes your nice, neat data and grinds it into some random-looking
The idea behind hashing is that some data either has no inherent
ordering (such as images) or is expensive to compare (such as
images). If the data has no inherent ordering, you canít perform
If the data is expensive to compare, the number of comparisons used
even by a binary search might be too many. So instead of looking at
the data themselves, youíll condense (hash) the data to an integer
(its hash value) and keep all the data with the same hash value in
the same place. This task is carried out by using the hash value as
an index into an array.
To search for an item, you simply hash it and look at all the data
whose hash values match that of the data youíre looking for. This
technique greatly lessens the number of items you have to look at.
If the parameters are set up with care and enough storage is
available for the hash table, the number of comparisons needed to
find an item can be made arbitrarily close to one.
One aspect that affects the efficiency of a hashing implementation
is the hash function itself. It should ideally distribute data
randomly throughout the entire hash table, to reduce the likelihood
of collisions. Collisions occur when two different keys have the
same hash value.
There are two ways to resolve this problem. In open addressing, the
collision is resolved by the choosing of another position in the
hash table for the element inserted later. When the hash table is
searched, if the entry is not found at its hashed position in the
table, the search continues checking until either the element is
found or an empty position in the table is found.
The second method of resolving a hash collision is called chaining.
In this method, a bucket or linked list holds all the elements whose
keys hash to the same value. When the hash table is searched, the
list must be searched linearly.