Say I pulled down 100GB of content from a VisualSVN server. Can I make any inferences about the space that is used on the server itself to store that content?

If it is all compressed I would imagine it takes slightly less than 100GB.

If there is a tremendous amount of logs or whatnot, that would theoretically add to the size.

Is there any way I can reliably determine how much storage the server uses?

1 Answer

Reliably? Not a chance.

Svn uses techniques similar to diff to reduce the size of individual commits: files are effectively stored as a "base" file plus the additions and subtractions from that file. This apparently even works for binary files.
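To get a feel for why a delta is so much smaller than the file it describes, here is a rough Python sketch. It uses the standard `difflib` module purely as a stand-in: Subversion's real delta format is different, and the file contents and the `delta_size` helper are made up for illustration.

```python
import difflib

def delta_size(old: str, new: str) -> int:
    """Return the size in bytes of a unified diff that turns old into new."""
    diff = difflib.unified_diff(
        old.splitlines(keepends=True),
        new.splitlines(keepends=True),
        n=0,  # no context lines, so only the changed lines are counted
    )
    return len("".join(diff).encode("utf-8"))

# Hypothetical file: 1000 short lines, with a single line edited.
base = "".join(f"line {i}\n" for i in range(1000))
edited = base.replace("line 500\n", "line 500 (edited)\n")

full = len(edited.encode("utf-8"))
delta = delta_size(base, edited)
print(f"full file: {full} bytes, delta: {delta} bytes")
```

A one-line edit costs a few dozen bytes as a delta, against several kilobytes for a second full copy of the file.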

The problem, though, is that you're not pulling down all of these intermediate blobs. Over the course of several revisions, the space used to store the differences between revisions could well be several times larger than the file itself.

You also don't pull down deleted files. If you work with particularly large files that regularly get deleted, each deleted file stays on the server until the end of time but never appears on your hard drive. This leaves the server holding more data than your copy.
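The effect of history and deleted files can be illustrated with a toy model. This is only a sketch under loud assumptions: every revision completely replaces an incompressible file (the worst case for deltas), `zlib` stands in for whatever compression the server applies, and the `simulate` function and its parameters are hypothetical.

```python
import os
import zlib

def stored_size(data: bytes) -> int:
    """Bytes a toy 'server' needs to keep this content, after compression."""
    return len(zlib.compress(data))

def simulate(revisions: int, file_size: int, deleted_size: int):
    """Return (server_bytes, checkout_bytes) for a toy repository."""
    server_bytes = 0
    latest = b""
    for _ in range(revisions):
        # Assumption: each revision is entirely new random content,
        # so the delta is about as large as the file itself.
        latest = os.urandom(file_size)
        server_bytes += stored_size(latest)
    # A file that was added and later deleted stays in the history,
    # but is absent from a checkout of the latest revision.
    server_bytes += stored_size(os.urandom(deleted_size))
    return server_bytes, len(latest)

server, checkout = simulate(revisions=5, file_size=10_000, deleted_size=20_000)
print(f"server: {server} bytes, checkout: {checkout} bytes")
```

In this worst case the server holds several times what the checkout shows. With small, compressible edits the gap shrinks dramatically, which is why no fixed ratio can be assumed.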

If you use externals a lot to link large projects together, you could effectively end up checking out several times more data than if you'd checked out just the individual projects. Tags also take almost no space on the server (unless you edit files under them), so checking them out does not correspond to space taken on the server either.

The only inference you can really make is that, given file deltas, logs, and deleted files, and as long as you exclude externals and tags/branches, the server probably holds more data than your local copy. File compression might affect this as well, though.
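If you can get filesystem access to the server, measuring beats inferring. The sketch below is one way to total up a directory tree in Python. The `dir_size` helper and the paths in the comments are hypothetical. It also lets you skip the `.svn` administrative directory, since a working copy keeps pristine copies of files there and so is itself noticeably larger on disk than the content you see.

```python
import os

def dir_size(root: str, skip_dirs: frozenset = frozenset()) -> int:
    """Sum the sizes in bytes of all regular files under root.

    Directories whose name is in skip_dirs are not descended into.
    """
    total = 0
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune skipped directories in place so os.walk ignores them.
        dirnames[:] = [d for d in dirnames if d not in skip_dirs]
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total

# Example usage (placeholder paths, adjust to your setup):
#   content = dir_size("/path/to/working-copy", frozenset({".svn"}))
#   on_server = dir_size("/path/to/repository")  # run on the server itself
```

Comparing the two numbers for your own repository tells you the actual ratio, which no rule of thumb will.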

  • What is the range of disparity that one might see? Could the storage on the server be using twice what the checked out size is?
    – Enigma
    Commented Jun 25, 2015 at 22:28
  • It's hard to say with any real authority. If your checkout includes a lot of tags or externals, the server could be using a tiny fraction of the size of your checkout. If your checkout is clear of tags and externals and the repository is relatively young, the sizes may be similar, or at least within the same order of magnitude. The server could be using twice as much, but at that sort of size I would be surprised, and you would have to be storing a lot of incompressible data.
    – Mokubai
    Commented Jun 26, 2015 at 6:07
