0

The DEFLATE algorithm is specified in RFC 1951. However, the encoder can choose freely whether to insert a literal byte, or a sub-match from output buffer for each input byte. Assuming that everything is encoded in a single block with dynamic Huffman compression, what is the time complexity to achieve the optimal compression ratio as allowed by the spec?

I fear it may not be polynomial, especially because Zopfli exists (and they don't claim to produce optimal result). However, I was unable to find any actual analysis or proof that confirms this.

1 Answer 1

0

Your premise is flawed. A single dynamic block will rarely be optimal. Better compression can be achieved with multiple dynamic blocks that adapt to the changing literal, length, and distance distributions in the data. If you make the blocks too long, you lose out on compression efficiency that easily pays for the occasional dynamic block header to describe new codes.

Your question is then also flawed, if by "time complexity", you mean O(some function of the uncompressed data length). All deflate compressors that I have seen will emit multiple deflate blocks of some constrained range of sizes. In that case, the time complexity is O(n), where n is the input length.

Perhaps what you are really after is information about the constant in front of the n. That constant can be large or small, depending on how much effort you want to put into finding matches, selecting matches, and arranging them into deflate blocks. That constant is not the time complexity.

1
  • My premise wasn't that it should be possible to encode optimally using a single block -- I meant that if you_must_ use only a single block, is it possible to encode it optimally in polynomial time? Is such algorithm known?
    – Gojus
    Commented Apr 27 at 10:00

Not the answer you're looking for? Browse other questions tagged or ask your own question.