
I have a repo on a Samba share with 1.2 GB in .git/objects. When I clone that repo locally over the file:// protocol, the clone ends up with 1.5 GB in .git/objects. Why is Git doing this?

Then I tried it on another, smaller repo. The output shows that Git always recompresses part of the repo when cloning over the file:// protocol, which may be the reason. That doesn't happen when cloning over https from GitHub.

https protocol from GitHub to local repo2:

git clone https://github.com/someuser/somerepo.git ~/repo2
19:39:56.188035 git.c:371               trace: built-in: git 'clone' 'https://github.com/someuser/somerepo.git' 'repo2'
Klone nach 'repo2' ...
19:39:56.211036 run-command.c:350       trace: run_command: 'git-remote-https' 'origin' 'https://github.com/someuser/somerepo.git'
19:39:57.381345 run-command.c:350       trace: run_command: 'fetch-pack' '--stateless-rpc' '--stdin' '--lock-pack' '--thin' '--check-self-contained-and-connected' '--cloning' 'https://github.com/someuser/somerepo.git/'
19:39:57.400372 exec_cmd.c:116          trace: exec: 'git' 'fetch-pack' '--stateless-rpc' '--stdin' '--lock-pack' '--thin' '--check-self-contained-and-connected' '--cloning' 'https://github.com/someuser/somerepo.git/'
19:39:57.412380 git.c:371               trace: built-in: git 'fetch-pack' '--stateless-rpc' '--stdin' '--lock-pack' '--thin' '--check-self-contained-and-connected' '--cloning' 'https://github.com/someuser/somerepo.git/'
remote: Counting objects: 73839, done.
19:39:57.951242 run-command.c:350       trace: run_command: 'index-pack' '--stdin' '-v' '--fix-thin' '--keep=fetch-pack 122 on SAJTY' '--check-self-contained-and-connected' '--pack_header=2,73839'
19:39:57.960749 exec_cmd.c:116          trace: exec: 'git' 'index-pack' '--stdin' '-v' '--fix-thin' '--keep=fetch-pack 122 on SAJTY' '--check-self-contained-and-connected' '--pack_header=2,73839'
19:39:57.976259 git.c:371               trace: built-in: git 'index-pack' '--stdin' '-v' '--fix-thin' '--keep=fetch-pack 122 on SAJTY' '--check-self-contained-and-connected' '--pack_header=2,73839'
remote: Total 73839 (delta 0), reused 0 (delta 0), pack-reused 73839
Empfange Objekte: 100% (73839/73839), 21.96 MiB | 112.00 KiB/s, Fertig.
Löse Unterschiede auf: 100% (60640/60640), Fertig.
19:44:31.187832 run-command.c:350       trace: run_command: 'rev-list' '--objects' '--stdin' '--not' '--all' '--quiet' '--progress=Prüfe Konnektivität'
19:44:31.190358 exec_cmd.c:116          trace: exec: 'git' 'rev-list' '--objects' '--stdin' '--not' '--all' '--quiet' '--progress=Prüfe Konnektivität'
19:44:31.195362 git.c:371               trace: built-in: git 'rev-list' '--objects' '--stdin' '--not' '--all' '--quiet' '--progress=Prüfe Konnektivität'
remote: Total 73839 (delta 0), reused 0 (delta 0), pack-reused 73839

file protocol from local repo2 to repo3:

git clone file://$HOME/repo2 ~/repo3
19:46:53.852660 git.c:371               trace: built-in: git 'clone' 'file:///home/sajty/repo2' 'repo3'
Klone nach 'repo3' ...
19:46:53.874679 run-command.c:350       trace: run_command: 'git-upload-pack '\''/home/sajty/repo2'\'''
19:46:53.878697 run-command.c:209       trace: exec: '/bin/sh' '-c' 'git-upload-pack '\''/home/sajty/repo2'\''' 'git-upload-pack '\''/home/sajty/repo2'\'''
19:46:53.903236 run-command.c:350       trace: run_command: 'index-pack' '--stdin' '-v' '--fix-thin' '--keep=fetch-pack 129 on SAJTY' '--check-self-contained-and-connected'
19:46:53.903236 run-command.c:350       trace: run_command: 'pack-objects' '--revs' '--thin' '--stdout' '--progress' '--delta-base-offset'
19:46:53.907238 exec_cmd.c:116          trace: exec: 'git' 'index-pack' '--stdin' '-v' '--fix-thin' '--keep=fetch-pack 129 on SAJTY' '--check-self-contained-and-connected'
remote: 19:46:53.910226 exec_cmd.c:116          trace: exec: 'git' 'pack-objects' '--revs' '--thin' '--stdout' '--progress' '--delta-base-offset'
19:46:53.915230 git.c:371               trace: built-in: git 'index-pack' '--stdin' '-v' '--fix-thin' '--keep=fetch-pack 129 on SAJTY' '--check-self-contained-and-connected'
remote: 19:46:53.918234 git.c:371               trace: built-in: git 'pack-objects' '--revs' '--thin' '--stdout' '--progress' '--delta-base-offset'
remote: Zähle Objekte: 72156, Fertig.
remote: Komprimiere Objekte: 100% (12623/12623), Fertig.
remote: Total 72156 (delta 59287), reused 72156 (delta 59287)
Empfange Objekte: 100% (72156/72156), 21.65 MiB | 16.91 MiB/s, Fertig.
Löse Unterschiede auf: 100% (59287/59287), Fertig.
19:46:57.648781 run-command.c:350       trace: run_command: 'rev-list' '--objects' '--stdin' '--not' '--all' '--quiet' '--progress=Prüfe Konnektivität'
19:46:57.653804 exec_cmd.c:116          trace: exec: 'git' 'rev-list' '--objects' '--stdin' '--not' '--all' '--quiet' '--progress=Prüfe Konnektivität'
19:46:57.658788 git.c:371               trace: built-in: git 'rev-list' '--objects' '--stdin' '--not' '--all' '--quiet' '--progress=Prüfe Konnektivität'

So how do I disable recompression when cloning over the file:// protocol? It wastes disk space and makes the clone take a minute longer, and I don't understand why it happens. I just want the same packfile as on GitHub.

Note 1: I know about the --local option, which disables recompression, but then I don't see any progress or download speed, which is even worse for a big repo over a slow network connection.

Note 2: I'm using the default packing and gc settings, and I could reproduce this with multiple repos on different PCs.

1 Answer

You can check whether the clone is quicker with zlib compression turned off:

git -c core.compression=0 clone ...

See git config core.compression:

An integer -1..9, indicating a default compression level.
-1 is the zlib default.
0 means no compression, and 1..9 are various speed/size tradeoffs, 9 being slowest.
If set, this provides a default to other compression variables, such as core.looseCompression and pack.compression.
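
Before changing anything, it can help to see which compression-related settings are currently in effect. The following is a minimal sketch, not part of the original answer; it only reads the existing Git config keys core.compression, core.looseCompression, pack.compression and pack.window, and prints `<unset>` (a placeholder chosen here) when a key has no configured value.

```shell
#!/bin/sh
# Print the compression-related settings Git would pick up in this directory.
# "git config --get" exits non-zero when a key is unset, hence "|| true".
report=""
for key in core.compression core.looseCompression pack.compression pack.window; do
    value=$(git config --get "$key" || true)
    # "<unset>" is only a display placeholder, not a Git value.
    report="${report}${key}=${value:-<unset>}
"
done
printf '%s' "$report"
```

A key shown as `<unset>` means Git falls back to its built-in default for that setting.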

Also make sure to use the latest Git for Windows (2.12 was released today, Feb. 2017).

See also "git pull without remotely compressing objects":

core.compression 0 should disable zlib compression of loose objects and objects within packfiles. It can save a little time for objects which won't compress, but you will lose the size benefits for any text files.
But it won't turn off delta compression, which is what the "Compressing..." phase during push and pull is doing. And which is much more likely the cause of slowness.

pack.window 0 sets the number of other objects git will consider when doing delta compression. Setting it low should improve your push/pull times.
But you will lose the substantial benefit of delta-compression of your non-image files (and git's meta objects).

So you can try also:

git -c pack.window=0 clone ... 
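
To check whether the setting changes anything in your setup, you can measure it on a throwaway repository. This is only a sketch under assumptions: the repository contents, directory names and the `pack_bytes` helper are made up for illustration, and whether a `-c` setting actually reaches the pack-objects process on the sending side may depend on your Git version, so the script just reports the sizes and does not predict them.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
src="$work/src"

# Build a small source repository with 20 revisions of one growing text file,
# so that there is something for delta compression to work on.
git init -q "$src"
i=1
while [ "$i" -le 20 ]; do
    seq 1 $((i * 200)) > "$src/data.txt"
    git -C "$src" add data.txt
    git -C "$src" -c user.name=demo -c user.email=demo@example.invalid \
        -c commit.gpgsign=false commit -q -m "revision $i"
    i=$((i + 1))
done

# Hypothetical helper: total size in bytes of all packfiles in a clone.
pack_bytes() {
    cat "$1"/.git/objects/pack/*.pack | wc -c | tr -d ' '
}

# Clone twice over file://, once with defaults and once with pack.window=0.
git clone -q "file://$src" "$work/default"
git -c pack.window=0 clone -q "file://$src" "$work/window0"

default_bytes=$(pack_bytes "$work/default")
window0_bytes=$(pack_bytes "$work/window0")
echo "default clone:       $default_bytes bytes"
echo "pack.window=0 clone: $window0_bytes bytes"
```

If both numbers come out the same, the setting did not influence how the pack was built for that clone, and timing the two clones on your real repository is the more useful comparison.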

The conclusion is that, while you can avoid the "Compressing" phase, doing so is not recommended.
It is better to disable delta compression only for certain files, through a .gitattributes directive, than to remove it for the whole repository.
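
As a sketch of what such a directive could look like: the `delta` attribute is a real gitattributes attribute, and unsetting it (`-delta`) tells Git not to attempt delta compression for matching paths. The file patterns and the sample file names below are only examples; pick the ones that match the binary files in your repository.

```shell
#!/bin/sh
set -eu

# Throwaway repository, just to show the attribute being picked up.
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

# Example patterns (assumptions): no delta compression for PNG and ZIP files.
printf '%s\n' '*.png -delta' '*.zip -delta' > .gitattributes

# Ask Git how it resolves the delta attribute for two sample paths.
attr=$(git check-attr delta -- sample.png notes.txt)
printf '%s\n' "$attr"
```

Paths that match one of the patterns are reported with the attribute unset, while everything else stays unspecified and keeps the normal delta behaviour.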
