42

In 7-Zip, when adding a folder to an archive, there is an option to change the word size.

How does this word size affect compression, in particular the final size of the zip?

I noticed that raising the compression level increases the word size; however, even on Ultra it only selects a word size of 128, even though the largest option is more than double that. Is there a reason why Ultra doesn't select the largest? Is the optimal word size somewhere between the biggest and the smallest?

7
  • Check out what Shell says on this post about part of your questions here --> The Post. Commented Jan 6, 2016 at 2:21
  • @LMFAO_A_JOKE that just says that for some files higher is better, and sometimes it's not
    – Aequitas
    Commented Jan 6, 2016 at 2:49
  • 3
    This doesn't ANSWER all your questions in great detail, but for the ONE question of "How does this word size affect compression, in particular the final size of the zip?", I think the part of the post stating "WordSize: usually the bigger, the better (and slower) for well-compressible data (such as documents). Archive size depends quite non-monotonically of it." gives you an explanation for PART of your set of questions. This is why I only put this here as a comment and did NOT answer; just trying to give you something! Commented Jan 6, 2016 at 2:52
  • What does the last sentence mean: "Archive size... non monotonically of it"?
    – Aequitas
    Commented Jan 6, 2016 at 3:23
  • 1
    I think this means that the archive size will "typically" be smaller (reduced further from the original size) the bigger the WordSize value is, but that it "depends" on how compressible the data is, for example text as opposed to image files. The suggestion was to test the different values on your own data so you can pick the value that best suits your needs. Commented Jan 6, 2016 at 3:58

1 Answer

13

It really depends on the data you're compressing and the algorithm used.

Word size

Enter the length of words, which will be used to find identical sequences of bytes for compression. For LZMA, big word size usually gives a little bit better compression ratio and slower compression process. Big word size parameter can significantly increase compression ratio in case when files contain long identical sequences of bytes. For PPMd word size has a big meaning. It strongly affects both compression ratio and compression/decompression speed.
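
Since the effect depends on the data, the simplest check is to measure it yourself. Below is a rough sketch using Python's standard `lzma` module rather than 7-Zip itself. It assumes that the `nice_len` filter option plays the same role as 7-Zip's LZMA word size (the "fast bytes" setting). The helper names `compressed_sizes` and `sample_data` and the synthetic test data are made up for illustration, so treat the numbers as indicative only and run it on your own files.

```python
import lzma
import random


def compressed_sizes(data, nice_lens=(8, 32, 64, 128, 273)):
    """Compress `data` once per nice_len value and return {nice_len: size in bytes}.

    Assumption: nice_len is treated here as the counterpart of 7-Zip's
    "word size" for LZMA. All other options come from preset 6.
    """
    sizes = {}
    for nice_len in nice_lens:
        filters = [{"id": lzma.FILTER_LZMA2, "preset": 6, "nice_len": nice_len}]
        blob = lzma.compress(data, format=lzma.FORMAT_XZ, filters=filters)
        # Sanity check: the archive must decompress back to the input.
        if lzma.decompress(blob) != data:
            raise ValueError("round trip failed")
        sizes[nice_len] = len(blob)
    return sizes


def sample_data(seed=0):
    """Hypothetical test input: long repeated blocks around some random noise."""
    rng = random.Random(seed)
    block = bytes(rng.randrange(256) for _ in range(2000))
    noise = bytes(rng.randrange(256) for _ in range(20000))
    return block * 20 + noise + block * 20


if __name__ == "__main__":
    data = sample_data()
    print(f"original: {len(data)} bytes")
    for nice_len, size in compressed_sizes(data).items():
        print(f"nice_len={nice_len:3d}: {size} bytes")
```

To try it on a real file, replace `sample_data()` with something like `open(path, "rb").read()` and compare the sizes printed for each value.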

There are some comparisons here
