Skip to main content

You are not logged in. Your edit will be placed in a queue until it is peer reviewed.

We welcome edits that make the post easier to understand and more valuable for readers. Because community members review edits, please try to make the post substantially better than how you found it, for example, by fixing grammar or adding additional resources and hyperlinks.

5
  • i know the post is pretty old but can someone explain what (remainder + 1) means here
    – Hari
    Commented Dec 29, 2014 at 12:45
  • 3
    @Hari remainder refers to the result of the original modulo calculation, and we add 1 to it in order to find the next available slot.
    – x4nd3r
    Commented Jan 5, 2015 at 22:52
  • "The array itself will contain something in each slot. At a minimum you will store the hashvalue or the value itself in this slot." - it's common for "slots" (buckets) to store no value at all; open addressing implementations often store either NULL or a pointer to the first node in a linked list - with no value directly in the slot/bucket. "would be interested in any others" - the "+1" you illustrate is called linear probing, oft-better-performing: quadratic probing. "generally encounter very few to no bucket/slot collisions" - @ 70% capacity, ~12% slots w/ 2 values, ~3% 3.... Commented May 2, 2016 at 1:35
  • "I've done this using long instead of int size hashcode values and it's worked quite well on up to 32 million entires hashvalues in the hashtable with 0 collisions." - this simply isn't possible in the general case where the values of keys are effectively random in a much larger range than the number of buckets. Note that having distinct hash values is often easy enough (and your talk of long hash values implies that's what you've achieved), but ensuring they don't collide in the hash table after the mod/% operation is not (in the general case). Commented May 2, 2016 at 1:46
  • 1
    (Avoiding all collisions is known as perfect hashing. In general it's practical for a few hundred or thousand keys that are known in advance - gperf is an example of a tool to calculate such a hash function. You can also write your own in very limited circumstances - e.g. if your keys are pointers to objects from your own memory pool which is kept fairly full, with each pointer a fixed distance apart, you can divide the pointers by that distance and effectively have an index into a slightly-sparse array, avoiding collisions.) Commented May 2, 2016 at 1:49