The total number of unique Git commit hash values is 16^40: 16 possible hexadecimal digits in each of the 40 positions of the SHA-1 value.

This is roughly 10^48 (the exact value is somewhat larger, about 1.46 × 10^48).
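
For anyone who wants the exact figure rather than the approximation, Python's arbitrary-precision integers make it a quick check. This is only a minimal sketch; the variable names are mine:

```python
# A SHA-1 hash is 160 bits, written as 40 hexadecimal digits.
HEX_DIGITS = 40
total_hashes = 16 ** HEX_DIGITS

# 16 == 2**4, so 16**40 is the same number as 2**160.
assert total_hashes == 2 ** 160

print(total_hashes)           # exact value, a 49-digit integer
print(f"{total_hashes:.3e}")  # scientific notation: 1.462e+48
```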

My question is - as the values are unique for commits, how are they not exhausted by now?

Or

Are these values unique only inside a repository i.e. locally unique which will prevent them from being exhausted ever?

As you can see, I am not sure whether they are locally unique or globally.

Edit -

The question has been answered, but I would also recommend the question "Git hash duplicates", as it is similar to mine. Thanks to @torek for mentioning it.

  • I don't think you appreciate how big a number 10^48 actually is :)
    – hobbs
    Commented Jul 15, 2022 at 4:35
  • For comparison, the number of grains of sand on Earth is estimated at around 7.5*10^18. Let's round that up to roughly 10^19. Next to that, 10^48 is a huge number. However, the current worry is not exhaustion of SHA values but deliberate generation of files with the same SHA-1 hash (see: zdnet.com/article/…). Because of this, Git is slowly migrating to SHA-256.
    – slebetman
    Commented Jul 15, 2022 at 4:43
  • @slebetman That's interesting. Thanks for sharing. Commented Jul 15, 2022 at 4:49
  • Plus... Git will migrate to SHA-256 anyway (and here is why).
    – VonC
    Commented Jul 15, 2022 at 5:19
  • See also Git hash duplicates. Note my answer in particular, which mentions the birthday problem. For some pre-digested numbers, see my answer to stackoverflow.com/q/34802500/1256452 as well.
    – torek
    Commented Jul 15, 2022 at 13:37

1 Answer

Pay attention to what that "48" is counting. It's how many zeroes follow the leading "1".

Say there are ten billion people on Earth. That's 1e10. Say all ten billion of them are using Git and generating ten billion hash codes each, every second, nonstop. That's 1e20 hash codes used per second if we dedicate the entire human race full time with fantasy hardware. How long would it take them to get through even 0.01% of the Git hash codes? At that rate, all 1e48 codes would take 1e28 seconds. 0.01% of that is 1e24 seconds, which at 1e8 seconds per year (a real year is about 3.2e7 seconds, so rounding up only shortens the estimate) is 1e16 years. That's ten million billion years. We'd have gotten almost 0.0000014 of the way to using 0.01% of the Git hash codes by now if we'd started at the Big Bang.
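
If you'd rather not trust my mental arithmetic, here is a rough sketch of the same estimate in Python. Every constant is an assumption from the thought experiment above (plus a commonly cited age of the universe), not a measurement, and the names are made up for illustration:

```python
# Assumptions from the thought experiment, not real-world measurements.
TOTAL_HASHES = 16 ** 40                  # ~1.46e48 possible SHA-1 values
PEOPLE = 10 ** 10                        # ten billion people
HASHES_PER_PERSON_PER_SEC = 10 ** 10     # fantasy hardware
SECONDS_PER_YEAR = 365.25 * 24 * 3600    # ~3.16e7, not rounded up to 1e8
AGE_OF_UNIVERSE_YEARS = 1.38e10          # commonly cited estimate

rate = PEOPLE * HASHES_PER_PERSON_PER_SEC   # 1e20 hashes per second
target = TOTAL_HASHES / 10_000              # 0.01% of the hash space
years_needed = target / rate / SECONDS_PER_YEAR
fraction_done = AGE_OF_UNIVERSE_YEARS / years_needed

print(f"years to use 0.01% of the hashes: {years_needed:.2e}")
print(f"fraction of that elapsed since the Big Bang: {fraction_done:.2e}")
```

Without the rounding, the figure comes out a few times larger than 1e16 years, so the rounded numbers above are, if anything, too generous to the hash-burners.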

  • A single collision could likely be found with the square root of that amount of effort, or after about 10^24 steps. Commented Jul 15, 2022 at 12:24
  • @PresidentJamesK.Polk Work out how many terabytes just that many hash codes occupy.
    – jthill
    Commented Jul 15, 2022 at 13:43
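
The square-root figure in the comments comes from the birthday problem. As a rough sketch, the standard approximation p ≈ 1 - exp(-n^2 / (2N)) can be evaluated directly. It assumes hashes behave like uniformly random values, the function name is mine, and the 20 bytes per hash is simply the raw size of a 160-bit SHA-1:

```python
import math

N = 2 ** 160  # number of possible SHA-1 values

def collision_probability(n: int, space: int = N) -> float:
    """Approximate chance of at least one collision among n random hashes."""
    # -expm1(-x) computes 1 - exp(-x) without losing precision for tiny x.
    return -math.expm1(-(n * n) / (2 * space))

for n in (10 ** 12, 10 ** 18, 10 ** 24, 2 ** 80):
    print(f"{n:.1e} hashes -> p = {collision_probability(n):.3e}")

# Storage needed just to hold 10^24 raw 20-byte hashes, in terabytes.
storage_tb = 10 ** 24 * 20 / 1e12
print(f"storage for 1e24 hashes: {storage_tb:.1e} TB")
```

So the chance of an accidental collision only becomes appreciable around 10^24 hashes, and storing that many would take on the order of 10^13 terabytes.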
