12

Can I specify a Git commit hash, instead of letting Git generate it?

Is it even possible? I didn't find any way in the Git documentation or online.


For some background: I'm programmatically creating a new public Git repository from a private one. I need to make a lot of changes to each commit to remove confidential information, but I'd like to preserve the commit IDs.

1
  • To add on to all the answers, think about it - an arbitrary identifier can't be a hash value, by definition.
    – j4nw
    Commented Jan 12, 2018 at 13:00

4 Answers

9

A Git commit hash is a cryptographic checksum calculated from the state of your repository: the hashes of all the files in it, the hash of the previous commit, the author and committer timestamps, and so on.

It is not possible to specify this manually.

For more information, see this question.

8

You can't choose the hash, because it is computed from the content of the commit.

Here is an example of the content of a commit (anonymized 😉 ) that you could get using git cat-file -p da500aa4f54cbf8f3eb47a1dc2c136715c9197b9 (replace with the hash/sha1 of one of your commits):

tree 48038c4d189536a0862a2c20ed832dc34bd1c8b2
parent f0bb5942f47193d153a205dc089cbbf38299dd1a
author Firstname Lastname <[email protected]> 1513256382 +0100
committer Firstname Lastname <[email protected]> 1515152927 +0100

This is a commit message

If any of these fields changes, the whole hash changes, because the content shown above is exactly what is hashed to produce the commit ID:

  • The tree is the SHA-1 calculated from the contents of the files and directories.

  • Parent is the parent commit hash.

  • Notice that the content also includes dates, so if you make exactly the same commit at a different moment, the SHA-1 changes as well.

PS:

  1. You can keep using git cat-file -p to explore the tree and better understand how Git stores data.
  2. To be exact, the content hashed for a commit is the one shown above, prefixed with the string "commit" (the object type), a space, the size of the content in bytes (wc -c), and a NUL byte. The result is then compressed with zlib and stored under .git/objects (you can see it with the command line cat .git/objects/da/500aa4f54cbf8f3eb47a1dc2c136715c9197b9 | perl -MCompress::Zlib -e 'undef $/; print uncompress(<>)')
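As an illustration of the note above, here is a minimal Python sketch of how the ID is derived, assuming a repository that uses the default SHA-1 object format. The helper names (git_object_id, build_commit) and the e-mail address are hypothetical; the tree and parent IDs are the ones from the example commit.

```python
import hashlib


def git_object_id(obj_type: str, body: bytes) -> str:
    """Return the SHA-1 object ID for `body` stored as an object of type `obj_type`."""
    # Header: "<type> <size in bytes>" followed by a NUL byte.
    header = f"{obj_type} {len(body)}".encode("ascii") + b"\x00"
    return hashlib.sha1(header + body).hexdigest()


def build_commit(tree: str, parent: str, ident: str,
                 author_time: int, commit_time: int, message: str) -> bytes:
    """Assemble a commit body in the layout printed by `git cat-file -p`."""
    lines = [
        f"tree {tree}",
        f"parent {parent}",
        f"author {ident} {author_time} +0100",
        f"committer {ident} {commit_time} +0100",
        "",
        message,
    ]
    return ("\n".join(lines) + "\n").encode("utf-8")


# Hypothetical identity, used only for illustration.
ident = "Firstname Lastname <someone@example.com>"
tree = "48038c4d189536a0862a2c20ed832dc34bd1c8b2"
parent = "f0bb5942f47193d153a205dc089cbbf38299dd1a"

first = build_commit(tree, parent, ident, 1513256382, 1515152927,
                     "This is a commit message")
# Same commit, but committed one second later.
later = build_commit(tree, parent, ident, 1513256382, 1515152928,
                     "This is a commit message")

print(git_object_id("commit", first))
print(git_object_id("commit", later))  # a different ID
```

Changing a single second in the committer line is enough to produce an unrelated ID, which is why editing the content of a commit cannot leave its ID intact.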
5
  • What's da500aa4f54cbf8f3eb47a1dc2c136715c9197b9?
    – cst1992
    Commented Jan 12, 2018 at 13:42
  • I recently published a tool 'HashBeaf' (github.com/cnugteren/hashbeaf) to automatically change this data (in particular the date) in such a way to obtain a user specified commit hash. If you just want your commit to start with 4 characters of your preference (e.g. c0de) it is often sufficient to only go forward one or two minutes in time.
    – CNugteren
    Commented Sep 13, 2023 at 7:14
  • @CNugteren that's funny, but you are overselling it: you don't choose the hash; you try to generate a hash whose start is a hex string chosen by the user. And it's brute force, so not really something that we could use in a day-to-day workflow. (And you just reinvented Bitcoin's proof of work :D )
    – Philippe
    Commented Sep 13, 2023 at 12:38
  • When I said 'automatically' I meant from the user's perspective. Indeed, the computation is brute-force. But that doesn't prevent me from using it in day-to-day work: with a 4-digit start-of-the-hash the computation time is not noticeable and the changed commit time is typically within a minute or two.
    – CNugteren
    Commented Sep 13, 2023 at 12:43
  • @CNugteren my remark was not about the 'automatically' but about 'to obtain a user specified commit hash', since the user doesn't choose the whole commit hash, just the start of it. That is obvious to someone who knows how hashes or Git work, but not to a beginner. And your comment and the project README are (deliberately?) confusing. I even imagine that the more you use it with the same prefix, the longer the abbreviated hash displayed will be (which is counter-productive).
    – Philippe
    Commented Sep 13, 2023 at 14:05
4

I do believe it is possible, although it would require some time and work, and it is a hack. It is true that the hash is generated from values like:

  • content
  • commit date
  • ids of the previous commit(s)
  • ...

But if you can sacrifice the consistency of one of these values, it can be done. Here is an example: https://github.com/vog/beautify_git_hash
It's a script that makes a commit, searches for a date that makes the Git hash the one you are looking for, and then amends the commit with that date.
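The date-search idea can be sketched in a few lines of Python. This is only an illustration of the principle under stated assumptions, not how the linked script is actually implemented: the names commit_id and find_timestamp, the template, and the e-mail address are hypothetical, and only the committer timestamp is varied.

```python
import hashlib

# Commit body with a placeholder for the committer timestamp.
# The identity is hypothetical; the IDs are example values.
TEMPLATE = (
    "tree 48038c4d189536a0862a2c20ed832dc34bd1c8b2\n"
    "parent f0bb5942f47193d153a205dc089cbbf38299dd1a\n"
    "author Firstname Lastname <someone@example.com> 1513256382 +0100\n"
    "committer Firstname Lastname <someone@example.com> {ts} +0100\n"
    "\n"
    "This is a commit message\n"
)


def commit_id(body: bytes) -> str:
    """SHA-1 of the commit body preceded by Git's "commit <size>" header and a NUL byte."""
    header = b"commit " + str(len(body)).encode("ascii") + b"\x00"
    return hashlib.sha1(header + body).hexdigest()


def find_timestamp(template: str, start: int, prefix: str,
                   max_tries: int = 1_000_000):
    """Try successive timestamps until the commit ID starts with `prefix`.

    Returns (timestamp, commit_id), or None if nothing matched in max_tries.
    """
    for ts in range(start, start + max_tries):
        cid = commit_id(template.format(ts=ts).encode("utf-8"))
        if cid.startswith(prefix):
            return ts, cid
    return None


result = find_timestamp(TEMPLATE, 1515152927, "c0d")
if result is not None:
    ts, cid = result
    print(f"committer timestamp {ts} gives commit {cid}")
```

Each extra hex digit in the prefix multiplies the expected number of attempts by 16, so a short prefix is found quickly, while a full 40-digit hash is out of reach. In a real repository the timestamp found this way would still have to be applied to the commit, for instance through the GIT_COMMITTER_DATE environment variable while amending.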

2

If you make changes to the commits, the IDs are going to change. (They depend on the commit content itself, plus what changes are in them.)

So, no.

1
  • Here by 'commit content' I mean metadata such as date, author name, etc. So even if you do something like git commit --amend --reset-author it'll change the author, and so the id.
    – cst1992
    Commented Jan 12, 2018 at 13:40
