My local repository has about 500 files and a total size of about 125 MB. I initialized a git repository on storage provided by "http://repositoryhosting.com/".

I did the following steps via Git GUI (rough command-line equivalents are sketched after the list):

  • git commit (of my local repo)
  • git remote add
  • git push
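
For reference, the rough command-line equivalents would be something like the following; the remote URL here is a placeholder for my actual remote, and "origin" and "master" are the usual defaults:

git add -A
git commit -m "Initial commit"
git remote add origin https://repositoryhosting.com/yourteam/project.git
git push -u origin master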

The push reported success and I could see the files on the remote, but the remote repo now had a size of only 26 MB.

I tried git clone and git pull on two separate occasions from the remote repo onto another machine. Each time, exactly the 26 MB on the remote repo was downloaded, yet when I check the size of the resulting folder on that machine, it shows 125 MB.

Questions:

  1. Does 'git push' compress data while uploading to Remote Repo?
  2. Am I losing data?
  3. If I'm trying to make a copy of the Remote Repo on multiple local machines so that multiple people can work on the same project, do I use Git Clone or Git Pull?

3 Answers

Does 'git push' compress data while uploading to Remote Repo?

Yes. It pushes delta-compressed packfiles, which is why the upload is far smaller than your checked-out working tree.
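
You can verify this with a standard git command run inside the repo; the "size-pack" value it reports is the compressed, packed size that actually travels over the wire:

git count-objects -vH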

Am I losing data?

No.
Once you start working in a cloned repo, you:

  • check out those packed files into a working tree
  • work with newly added files stored as loose objects in .git/objects, which aren't packed yet (see the check after this list).
    See "Git Internals - Packfiles" for more.
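
If you want to convince yourself that nothing was lost, two standard, read-only git commands can check the clone from inside the cloned repository:

git fsck --full        # verifies connectivity and validity of every object
ls .git/objects/pack/  # lists the packfile(s) transferred during the clone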

If I'm trying to make a copy of the Remote Repo on multiple local machines so that multiple people can work on the same project, do I use Git Clone or Git Pull?

git clone for the initial copy and checkout of that repo.
Then git pull.
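
Concretely, on each developer's machine that looks something like this (the URL is a placeholder for your actual remote):

git clone https://repositoryhosting.com/yourteam/project.git
cd project
git pull   # later, whenever you want to pick up teammates' changes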

  • Any chance you can explain the last answer in a little more detail?
    – nitred
    Commented Feb 13, 2014 at 6:04
  • @NitRed for the first initialization of the repo on those machines, use a git clone (one clone for each machine). Then a git pull (again, in each cloned repo) will be enough to update those cloned repos.
    – VonC
    Commented Feb 13, 2014 at 6:34
  • git clone > cd into repo > git pull. Hope that sounds good. Thanks a lot!
    – nitred
    Commented Feb 13, 2014 at 9:21

Apart from what's already been said, Git's content-addressable storage model naturally deduplicates data: files with identical contents are stored only once. I highly doubt that this comes into play in your case, but generally, and depending on what type of data you store, this is another reason why Git's storage is fairly efficient.
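
As a small demonstration of that deduplication (file names made up for the example), hashing two files with identical contents yields the same object ID, so Git stores the content once:

echo "same content" > a.txt
echo "same content" > b.txt
git hash-object a.txt b.txt   # prints the same SHA-1 twice: one blob backs both files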


You're not losing data, as git pushes data using delta encoding. By the way, you can clean up unnecessary files and optimize the local repository by running:

git gc 

From the manual page of gc:

Runs a number of housekeeping tasks within the current repository, such as compressing file revisions (to reduce disk space and increase performance) and removing unreachable objects which may have been created from prior invocations of git add.
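
To see the effect, you could compare the repository's on-disk size before and after; this is a generic check, nothing specific to your setup:

du -sh .git   # size before housekeeping
git gc
du -sh .git   # usually smaller afterwards: loose objects are now packed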
