In Java, Hashtable has buckets whose quantity is equal to its capacity. Now how does it determine that it has to store an object in a particular bucket? I know it uses hashcode of the object but hashcode is a weird long string, what does hashtable do to the hashcode to determine place of entry?
3 Answers
Implementation-dependent (as in, if you rely on it working this way, your code is broken; the things HashMap guarantees are spelled out in its javadoc, and none of what I'm about to type is in there):
hashes are just a number. Between about -2billion and +2billion. That 'long weird string' you see is just a more convenient way to show it to you.
First, the higher digits of that number are mixed into the lower digits (actually, the higher bits are XORed into the lower ones): 12340005 is turned into 12341239.
Then, that number is divided by how many buckets there currently are, but the result is tossed out, it's the remainder we are interested in. This remainder is necessarily 0 or higher, and never more than '# of buckets there are', so always points exactly at one of the buckets.
That's the bucket that the object goes into.
if buckets grow too large, a resizing is done.
For more, well, HashMap is open source, as is HashSet - just have a look.
For behavior as of jdk7 see:
int index = (hash & 0x7FFFFFFF) % tab.length;
This is a common technique for hash tables. First bit is discarded (to make the value positive). The index is the remainder from division by table size.
I know it uses hashcode of the object but hashcode is a weird long string, what does hashtable do to the hashcode to determine place of entry?
A hashcode is not a "weird long string". It is a 32 bit signed integer.
(I think you are confusing the hashcode and what you get when you call Object::toString
... which is a string consisting of a hashcode AND a Java internal type name.)
So what HashMap
and HashTable
(and HashSet
and LinkedHashMap
) actually do is:
- call
hashCode()
to get the 32 bit integer, - perform some implementation-specific mangling1 on the integer,
- convert the mangled integer to a non-negative integer by removing the sign bit,
- compute an array index (for the bucket) as
value % array.length
wherearray
is the hash table's current array of hash chains (or trees).
1 - Some implementations of HashMap
/ HashTable
perform some simple / cheap bitwise mangling. The purpose is to reduce clustering in the case that the low few bits of the hashcode values are not evenly distributed.
but hashcode is a weird long string
- no.hashCode()
returns anint
.int
.HashTable
's source?