nietras

We have some really big repositories in git, and we have observed that remote/server-side compression is a bottleneck when cloning or pulling. Given how pervasive git has become and that it uses zlib, has this zlib compression been optimized?

An Intel paper details how DEFLATE compression can be sped up by a factor of about 4, although at the cost of a lower compression ratio:

http://www.intel.com/content/dam/www/public/us/en/documents/white-papers/ia-deflate-compression-paper.pdf

Another paper reports a speedup of about 1.8 times while preserving compression ratios for most compression 'levels' (1-9):

http://www.intel.com/content/dam/www/public/us/en/documents/white-papers/zlib-compression-whitepaper-copy.pdf

This latter optimization seems to be available on GitHub: https://github.com/jtkukunas/zlib

zlib seems to be quite old (for this fast-paced industry); the latest release is from April 2013. Have there been any attempts to SIMD-optimize zlib for newer processor generations? Or are there alternatives to using zlib in git?

I do understand that you can specify a compression level in git, which affects both speed and compression ratio. However, the papers above indicate that quite big performance improvements are possible in zlib without hurting compression ratios.
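
To illustrate the level trade-off I mean, here is a minimal sketch using Python's standard `zlib` module, which wraps the same library. The function name `measure_levels` and the sample data are made up for illustration; this is not how git itself calls zlib, and the numbers will vary by machine and input.

```python
import time
import zlib


def measure_levels(data, levels=(1, 6, 9)):
    """Compress data at each zlib level; return {level: (seconds, compressed_size)}."""
    results = {}
    for level in levels:
        start = time.perf_counter()
        compressed = zlib.compress(data, level)
        elapsed = time.perf_counter() - start
        # Sanity check: the round trip must reproduce the input exactly.
        assert zlib.decompress(compressed) == data
        results[level] = (elapsed, len(compressed))
    return results


if __name__ == "__main__":
    # Hypothetical sample input: repetitive text standing in for repository content.
    sample = b"".join(b"line %d: some source code text\n" % i for i in range(200_000))
    for level, (seconds, size) in measure_levels(sample).items():
        print(f"level {level}: {seconds:.3f}s, ratio {len(sample) / size:.2f}")
```

In git the equivalent knob is a configuration setting such as `core.compression`. That only picks a point on this speed/ratio curve, whereas the papers above are about making the curve itself better.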

So to recap, are there any existing git implementations that use a highly optimized zlib or a zlib alternative?

PS: It seems a lot of devs/servers would benefit from this (even greenhouse gas emissions ;)).

Git DEFLATE/optimized zlib
