Archive size and time (real seconds) to compress and extract 10 GB of files on various systems. Results on the Pareto frontier (no other result is both faster and smaller) are marked with *.
Benchmark created July 28, 2013. Last update July 25, 2019.
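The * marks can be reproduced mechanically. The following is a minimal Python sketch, not the procedure actually used for the tables; it assumes each result is a (size, seconds) pair and that a tie in time with a smaller archive does not earn a mark, which is how the tables appear to treat ties. The function name is illustrative only.

```python
def pareto_frontier(results):
    """Return indices of (size, seconds) pairs whose time is strictly better
    than that of every result with a smaller (or equal, earlier) size."""
    # Visit results from smallest to largest archive (fastest first among equal sizes).
    order = sorted(range(len(results)), key=lambda i: results[i])
    best_time = float("inf")
    frontier = set()
    for i in order:
        seconds = results[i][1]
        # Strict comparison: equalling an earlier time does not count.
        if seconds < best_time:
            frontier.add(i)
            best_time = seconds
    return frontier


# System 2 rows as (size, compress seconds, extract seconds).
rows = [
    (2960050436, 15685, 1330),
    (3701584921, 1529, 679),
    (3833521072, 1162, 650),
    (3852399593, 10394, 787),
    (4851942767, 2194, 488),
    (10065158144, 879, 562),
]
compress_marks = pareto_frontier([(s, c) for s, c, _ in rows])
extract_marks = pareto_frontier([(s, x) for s, _, x in rows])
for i, (s, c, x) in enumerate(rows):
    print(s, f"{c}{'*' if i in compress_marks else ''}",
          f"{x}{'*' if i in extract_marks else ''}")
```

The compress and extract columns are marked independently, so the function is called once per column. Rows with a missing time would have to be filtered out before the call.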
Dual Xeon E5-2620, 2.00 GHz, 12+12 hyperthreads, 64 GiB, Fedora Linux 2.6.18-348.1.1.e15. Tested by Matt Mahoney.
       Size Compress  Extract Sys Program version Options
 ---------- -------- -------- --- --------------- --------
 2720359988   43888*   45359*  1  zpaq 6.41       -m 611 -th 1
 2726432291   12491*   12816*  1  zpaq 6.40       -m 610 -th 4
 2761115298   35500    19466   1  nanozip 0.09a   -cc -m16g
 2767789726    3402*    3617*  1  nanozip 0.09a   -cc -m20g -p6 -t16
 2791243335    4060     2926*  1  zpaq 6.40       -m 6
 2917361916    1306*     852*  1  zpaq 6.40       -m 5
 3009174091    1734      277*  1  nanozip 0.09a   -co -m16g
 3066519365     476*     251*  1  zpaq 6.40       -m 4
 3307719332     821       78*  1  nanozip 0.09a   -cD -m16g
 3399589497     241*     115   1  zpaq 6.40       -m 3
 3594933877   10003      519   1  7zip 4.47b      -mx
 3701584921     187*      67*  1  zpaq 6.40       -m 2 -noa
 3711047191    3109      789   1  freearc 0.666
 3712136038     827      592   1  bsc 3.1.0       -b1024 -e2
 3712136038    1855      560   1  bsc 3.1.0       -b1024 -e2 -t
 3712136038    3669     2873   1  bsc 3.1.0       -b1024 -e2 -T
 3826336763     552      153   1  nanozip 0.09a   -co
 3833498676     177*      67   1  zpaq 6.40       -m 1 -noa
 3833514680     178       68   1  zpaq 6.42       -m 1
 3891900426    6600      584   1  7zip 4.47b
 3939774592     764      111   1  rar 5.00b2      -s -m5 -ma5
 4068167998     193      121   1  bsc 3.1.0
 4092635369     149       78   1  nanozip 0.09a   -cD
 4163457712     393      135   1  rar 5.00b2      -s
 4435077429      70*      82   1  nanozip 0.09a   -cd
 4473397485    2126      669   1  bzip2 1.0.3
 4761505815    2975      111   1  gzip 1.3.5      -9
 4797936165    1220      113   1  gzip 1.3.5
 4981509000      64*      86   1  nanozip 0.09a   -cf -m10g -t6 -p2 -nm -br1g -bw5g
 4982889276      39*      56*  1  nanozip 0.09a   -cf -m10g -t16 -p4 -nm
10005192274      21*      57   1  nanozip 0.09a   -cn -p1 -t1
10005192274      15*           1  nanozip 0.09a   -cn -p1 -t1 to /dev/shm
10065018880      21       30   1  tar 1.15.1
10065018880      15            1  tar 1.15.1      to /dev/shm
10065018880      14*           1  cat 10gb.tar >/dev/null
10065018880      53            1  cat 10gb.tar >10gb.out
10065018880      53            1  cp 10gb.tar 10gb.out
10065018880     171            1  cp /dev/shm/10gb.tar /dev/shm/10gb.out
10065018880     191            1  cat /dev/shm/10gb.tar >/dev/null
Input error on enwik9.pmd      1  zip 2.31
Cannot find libcurl.so.4       1  freearc 0.666
Kernel too old                 1  exdupe 0.4.2
GCC version 4.4 required       1  pcompress 2.4
Gateway M-7301U laptop, T3200, 2.00 GHz, 2 cores, 3 GiB, 32 bit Vista with compression to and decompression from a Buffalo MiniStation 312 GB external USB drive. Tested by Matt Mahoney.
       Size Compress  Extract Sys Program version Options
 ---------- -------- -------- --- --------------- --------
 2960050436   15685*    1330*  2  freearc 0.666   -m9
 3701584921    1529*     679*  2  zpaq 6.40       -m 2 -noa
 3833521072    1162*     650*  2  zpaq 6.40       -m 1
 3852399593   10394      787   2  7zip 9.30a      -m 1
 4851942767    2194      488*  2  zip 3.00
10065158144     879*     562   2  tar 1.11.2
Intel Core i7-3960X Extreme (OC) 4.4GHz+ 5.7GHz turbo 6 core (12 hyper-threads), 32GB PC3-17000 2133MHz DDR3, nVIDIA GeForce GTX 680, Corsair Performance Pro 256GB single SSD, Windows 8 Pro 64-bit, ramdisk. Tested by sportman.
       Size Compress  Extract Sys Program version Options
 ---------- -------- -------- --- --------------- --------
 2791249747    4934*    5882*  3  zpaq 6.40       -m 6
 2917368308    1270*     794*  3  zpaq 6.40       -m 5
 2945941964    4159      365*  3  freearc 0.67    -m9
 2958024159    3979      365   3  freearc 0.67    -m8
 2964753763    3583      362*  3  freearc 0.67    -m7
 3066525757     434*     242*  3  zpaq 6.40       -m 4
 3321666324    3117      380   3  freearc 0.67    -m6
 3399586889     179*      92*  3  zpaq 6.40       -m 3
 3535037233    2633      245   3  freearc 0.67    -m5
 3550095020    1062      107   3  7zip 9.30a      -m8
 3550095020    1066      108   3  7zip 9.30a      -m9
 3570746826    1063      108   3  7zip 9.30a      -m7
 3701607317     124*      32*  3  zpaq 6.40       -m 2
 3706964420     203       72   3  freearc 0.67    -m4
 3762859007     838      109   3  7zip 9.30a      -m6
 3833521072     105*      33   3  zpaq 6.40       -m 1
 3858399477     105       65   3  freearc 0.67    -m3
 3874193937     748      108   3  7zip 9.30a      -m 5
 3977316653      52*      55   3  freearc 0.67    -m2
 4245851383     180      128   3  7zip 9.30a      -m4
 4313513505      26*      32   3  freearc 0.67    -m1
 4349045028     155      141   3  7zip 9.30a      -m3
 4437691066    1523            3  flashzip 1.1.3  -m3
 4459028364     314       87   3  rar 5.00b       -m5
 4468445589     261       86   3  rar 5.00b       -m4
 4489759408     206       71   3  rar 5.00b       -m3
 4492928788     124      135   3  7zip 9.30a      -m2
 4515946515    1220            3  flashzip 1.1.3  -m0
 4566147911     111      156   3  7zip 9.30a      -m1
 4573249469    3997            3  packet 1.0      -m4
 4617030100     152      101   3  rar 5.00b       -m2
 4677844032    2124            3  packet 1.0      -m0
 4983787863      91       98   3  rar 5.00b       -m1
Dell Latitude E6510 laptop, Core i7 M620, 2.66 GHz, 2+2 hyperthreads, 4 GB, Ubuntu. Compression to and decompression from a Buffalo MiniStation 312 GB external USB drive. win32/64 programs tested under wine. Tested by Matt Mahoney.
Size Compress Extract Sys Program version Options ---------- -------- -------- --- --------------- -------- 2788126729 20291* 19898* 4 zpaq 6.51 -m 5 2788126729 20429 19955 4 zpaq 6.50 -m 5 2788126887 18653* 18205* 4 zpaq 7.14 -m5 2788133244 19426 18617 4 zpaq64.exe 7.14 -m5 2788133244 35345 36901 4 zpaq.exe 7.14 -m5 -t1 2791243359 21082 20305 4 zpaq 6.40 -m 6 2791485467 20947 19923* 4 zpaq 6.48 -m 6 2791485467 20764 20124 4 zpaq 6.47 -m 6 2791485467 20722 19987 4 zpaq 6.45 -m 6 2804620772 4072* 3092* 4 zcm 0.93 -m8 -t1 2893018181 9510 4623 4 zpaq 6.41 -m 57 2893742274 1756* 838* 4 pcompress 3.1 -l14 -s60 2898272738 3733 3586 4 zpaq 6.44 -m 57 2899901372 3761 3559 4 zpaq 6.45 -m 57 2917361920 6510 4293 4 zpaq 6.40 -m 5 2932527028 3395 3340 4 zpaq 6.50 -m 4 2932527615 3428 3354 4 zpaq 6.51 -m 4 2932527773 3340 3210 4 zpaq 7.14 -m4 2932534130 3453 3313 4 zpaq64.exe 7.14 -m4 2932534130 4842 4584 4 zpaq.exe 7.14 -m4 -t2 2933199750 9520 412* 4 packet 1.9 -mx -h8 -b5 -r -s 2933330987 4246 3587 4 zcm_x64 0.92 -m8 -t1 -r -s 2936202342 3403 3300 4 zpaq 6.48 -m 5 2936202342 3426 3320 4 zpaq 6.47 -m 5 2936202342 3539 3313 4 zpaq 6.45 -m 5 2937922856 1572* 754 4 pcompress 3.1 -l14 2939488402 3475 3254 4 zpaq 6.44 -m 5 2954727653 3568 328* 4 packet 1.2 -r -mx -b512 -h4 2959337875 11676 905 4 freearc 0.666-win32 -m9 2969410664 4253 3570 4 zcm_x64 0.92 -m7 -t1 -r -s 2969410664 4545 3875 4 zcm 0.92 -m7 -t1 -r -s 2969891797 3658 785 4 nanozip 0.09a -co -m3.5g 3013933248 2018 1122 4 rings 2.5 -m8 -t1 -r -s 3029112322 1675 805 4 pcompress 3.0b -l14 3035559935 1689 1313 4 zpaq 6.51 -m 3 3035560093 1659 1276 4 zpaq 7.14 -m3 3035566450 1712 1379 4 zpaq64.exe 7.14 -m3 3035566450 2548 1991 4 zpaq.exe 7.14 -m3 -t2 3039097696 1948 346 4 csarc 3.3 -m5 -d512 -t1 -r -f 3036900386 1687 1321 4 zpaq 6.50 -m 3 3048268618 2383 1420 4 zpaq 6.41 -m 47 3060935581 2256 1320 4 zpaq 6.48 -m 4 3060935581 2231 1348 4 zpaq 6.47 -m 4 3060935581 2308 1337 4 zpaq 6.45 -m 4 3061984592 1536* 650 4 pcompress 3.0b 
-l10 3066519369 2212 1421 4 zpaq 6.40 -m 4 3091759649 1523* 612 4 pcompress 3.0b -l8 3093233656 2812 1270 4 pcompress 3.0b -l6 -t1 -v 3093233656 1444* 596 4 pcompress 3.0b -l6 3096159495 1154* 610 4 pcompress 3.1 -l6 3112349650 8227 315* 4 lza 0.82b -mx9 -h7 -b7 -r -s 3129833881 7575 315 4 lza 0.80 -mx9 -h7 -b7 -r -s 3131331323 6411 302* 4 lza_x64 0.70b -mx9 -h7 -b7 -r -s 3144102764 1910 1721 4 pcompress 2.4 -G -L -P -c adapt2 -l 14 -t2 - 3161399085 4746 296* 4 lza_x64 0.63 -mx5 -h8 -b7 -r -s 3165275675 730* 563 4 pcompress 2.4 -G -L -P -c adapt2 - 3167195159 4025 313 4 lza_x64 0.51 -mx5 -h8 -b7 -t1 -r -s 3175066122 1130 837 4 pcompress 2.4 -G -L -P -c ppmd 3176404830 911 804 4 pcompress 2.4 -G -L -P -c adapt - 3186833109 2669 189* 4 pcompress 2.4 -G -L -P -c lzmaMt - 3186833306 2485 195 4 pcompress 2.4 -G -L -P -c lzma - 3199119158 1339 615 4 pcompress 3.0b -l4 3204716369 3454 238 4 rar 5.04b -m5 -ma5 -md512m -s -r 3234943741 4312 3804 4 zcm 0.90 -m7 -t1 -r -s 3278149230 3694 3042 4 zcm 0.92 -m7 -t2 -r -s 3314339844 7442 440 4 packet 1.1 -r -mx -b512 -h4 3324783324 5857 430 4 packet 1.1 -r -m9 -b512 -h4 3354077533 1323 600 4 zpaq 6.45 -m 38 3358033965 1226 543 4 zpaq 6.41 -m 38 3370238323 1637 222 4 csarc 3.3 -m5 -d256m -t2 -r -f 3391292321 832 492 4 zpaq 6.48 -m 3 3391292321 831 489 4 zpaq 6.47 -m 3 3391292321 944 487 4 zpaq 6.45 -m 3 3395466812 606* 296 4 pcompress 2.4 -G -L -P -c bzip2 - 3399580501 848 504 4 zpaq 6.40 -m 3 3422372179 4202 271 4 lza_x64 0.10 -mx5 -h7 -b7 -t1 -r -s 3485028038 1282 234 4 zpaq 6.51 -m 2 3485028196 1299 213 4 zpaq 7.14 -m2 3485034553 1286 252 4 zpaq64.exe 7.14 -m2 3845034553 1880 245 4 zpaq.exe 7.14 -m2 -t2 3505127388 1012 228 4 zpaq 6.50 -m 2 3522585547 3735 325 4 lza 0.51 -mx5 -h6 -b6 -t1 -r -s 3571388156 1012 203 4 csarc 3.3 -m5 -d128m -t4 -r -f 3595106502 9580 446 4 7zip 9.20 -mx 3621557490 786 216 4 csarc 3.3 -m3 -d128 -t4 -r -f 3651862591 491* 171* 4 pcompress 2.4 -G -L -P -c zlib - 3659330007 3285 493 4 RH4_x64.exe v8 -r2 c6 
3669928697 1080 532 4 exdupe 0.4.2 -x3 3671183744 1138 538 4 exdupe 0.5.0b -x3 3679066660 743 211 4 csarc 3.3 -m1 -d128 -t4 -r -f 3689892103 2803 403 4 RH5_x64 -window:27 c6 3693913382 512 234 4 zpaq 6.48 -m 2 3693913382 515 238 4 zpaq 6.47 -m 2 3693913382 527 222 4 zpaq 6.45 -m 2 3701584921 529 247 4 zpaq 6.40 -m 2 -noa 3701600929 529 235 4 zpaq 6.41 -m 2 3702166757 3287 250 4 RH4_x64.exe v6 c6 3705463377 3741 531 4 RH4_x64.exe v7 c6 3711048040 1448 422 4 freearc 0.666-win32 3741983103 1702 515 4 RH4_x64.exe v8 -r2 c2 3749797683 3569 269 4 lza 0.10 -mx5 -h6 -b6 -t1 -r -s 3751765805 1421 538 4 RH4_x64.exe v8 -r0 c2 3751837479 4919 261 4 RH4_x64.exe v5 c6 3753809471 5009 289 4 lza 0.62 -mx5 -h8 -b7 -r -s 3780469075 1635 562 4 RH4_x64.exe v8 -r1 c2 3785182357 1373 269 4 RH4_x64.exe v6 c2 3788426871 1843 530 4 RH4_x64.exe v7 c2 3794814167 403* 216 4 zpaq 7.14 -m16 -t4 3794814167 403 230 4 zpaq 7.14 -m16 -t3 3794814167 453 240 4 zpaq 7.14 -m16 -t2 3794814167 675 269 4 zpaq 7.14 -m16 -t1 3800332723 359* 249 4 zpaq 6.50 -m 1 3800332723 382 256 4 zpaq 6.51 -m 1 3800332881 379 237 4 zpaq 7.14 -m1 -t4 3800332881 390 249 4 zpaq 7.14 -m1 -t3 3800332881 439 260 4 zpaq 7.14 -m1 -t2 3800332881 634 272 4 zpaq 7.14 -m1 -t1 3800339238 415 258 4 zpaq64.exe 7.14 -m1 -t4 3800339238 418 256 4 zpaq64.exe 7.14 -m1 -t3 3800339238 450 247 4 zpaq64.exe 7.14 -m1 -t2 3800339238 632 274 4 zpaq64.exe 7.14 -m1 -t1 3800339238 472 261 4 zpaq.exe 7.14 -m1 -t4 3800339238 476 275 4 zpaq.exe 7.14 -m1 -t2 3814215328 5127 646 4 flashzip 1.1.3 -r -mx3 -k7 -b1024 -s 3825567647 379 249 4 zpaq 6.48 -m 1 3825567647 378 232 4 zpaq 6.47 -m 1 3825567647 353* 231 4 zpaq 6.45 -m 1 3832641851 317* 245 4 zpaq 6.40 -m 1 -fragile 3832734358 2941 255 4 RH4_x64.exe v5 c2 3833521161 377 255 4 zpaq64.exe 6.41 -m 1 3833521161 407 272 4 zpaq.exe 6.41 -m 1 3833514684 330 238 4 zpaq 6.40 -m 1 3833514684 355 215 4 zpaq 6.41 -m 1 3833514684 345 208 4 zpaq 6.42 -m 1 3833514716 372 235 4 zpaq 6.43 -m 1 -key x 3868519543 1391 383 
4 RH5_x64 3873173915 3461 424 4 packet 1.1 -r 3890128551 13179 209 4 tornado 0.6a -16 3892353718 6953 472 4 7zip 9.20 3901424352 3021 668 4 flashzip 1.1.3 -r -mx0 -k7 -b1024 -s 3903806097 304 221 4 zpaq 6.41 tar | -m 1 3938386309 3115 267 4 rar 5.00b7 -m5 -ma5 -s 3958233075 1337 560 4 obnam 1.1 --compress-with=deflate 4056247901 5236 277 4 lza 0.01 -m5 -h6 -b6 -t1 -r -s 4165451731 485 229 4 exdupe 0.4.2 -x2 4166697886 552 226 4 exdupe 0.5.0b -x2 4175719735 2392 186 4 pcompress 2.3 -c lzma - 4190018637 2634 1292 4 rings 2.2 -m7 -o -r -s 4193774664 1970 1409 4 rings 2.1 -m7 -t1 -r -s 4202837002 758 636 4 pcompress 2.3 -c adapt2 - 4215558609 952 865 4 pcompress 2.3 -c adapt - 4226884442 637 310 4 pcompress 2.3 -D -s256m -c bzip2 - 4476910745 1665 288 4 RH4_x64.exe v2 c6 4480680312 1284 277 4 RH4_x64.exe v2 c5 4480815838 1175 271 4 RH4_x64.exe v2 c4 4488096549 1240 258 4 RH4_x64.exe v2 c3 4493860462 1423 239 4 rar 5.00b7 4513183867 429 180 4 pcompress 2.3 -D -s256m -c zlib - 4524741017 1144 255 4 RH4_x64.exe v2 c2 4552586028 637 183 4 tornado 0.6a -5 4554281498 715 205 4 tornado 0.6 -5 4577377673 1101 287 4 RH4_x64.exe v2 c1 4788875190 11263 24115 4 packARC.exe 0.7RC13 -i -np 4788875190 11633 10777 4 packARC.exe 0.7RC15 -np 4802080309 2450 245 4 zip 3.00 -9 4844449350 1206 229 4 zip 3.00 5012014150 226* 178 4 pcompress 2.4 -G -L -P -c lzfx - 5093029081 229 209 4 pcompress 2.4 -G -P -c lzfx - 5101253738 189* 179 4 pcompress 2.4 -G -c lzfx - 5253142361 212 195 4 pcompress 2.4 -G -L -P -c lz4 - 5289673104 430 239 4 exdupe 0.4.2 -x1 5290752762 513 250 4 exdupe 0.5.0b -x1 5306346845 186 186 4 pcompress 2.4 -G -c lz4 - 5995380166 332 189 4 pcompress 2.3 -D -s256m -c lzfx - 6148381553 201 199 4 pcompress 2.3 -D -s32m -c lzfx - 6236232530 210 201 4 pcompress 2.3 -D -c lzfx - 6284419084 215 210 4 pcompress 2.3 -D -s1m -c lzfx - 6311889423 208 201 4 pcompress 2.3 -c lzfx - 6882012804 226 285 4 tornado 0.6a -1 8285885170 276 274 4 pcompress 2.4 -G -c none 8452938083 292 420 4 
zpaq 6.40 -m 0 8465736563 478 332 4 exdupe 0.4.2 -x0 10065018880 408 368 4 tar 1.26 10065018880 334 335 4 cp 10gb.tar libgmp.so.4: wrong ELF class 4 freearc 0.666-linux-i386 close task's active + >3 hrs 4 flashzip 1.1.3 -r -mx0 -k7-b1024 -s -t3 unhandled exception: page fault 4 packARC 0.7RC11
Same as system 4 but without external USB drive. Tested by Matt Mahoney.
       Size Compress  Extract Sys Program version Options
 ---------- -------- -------- --- --------------- --------
 3701584921     537*     319*  5  zpaq 6.40       -m 2 -noa
 5289676858     499*     336   5  exdupe 0.4.2    -x1
10065018880     537      405   5  tar 1.26
10065018880     394*     394   5  cp 10gb.tar
Same as system 4 but in Windows 7 64-bit Enterprise with McAfee Antivirus. Tested by Matt Mahoney.
       Size Compress  Extract Sys Program version Options
 ---------- -------- -------- --- --------------- --------
 3833521072     796     1159   6  zpaq64 6.41     -m 1
 3833521072     996     1159   6  zpaq 6.41       -m 1
 4852089611    2158     1767   6  zip 3.00
Intel Core i7 920, 2.66 GHz, 4+4 HT, 6 GB, Win7. Tested by Nania Francesco.
       Size Compress  Extract Sys Program version Options
 ---------- -------- -------- --- --------------- --------
 3008160754    4891*     470*  7  packet 1.1      -r -m9 -b512 -h4
 3108986473    6864      377*  7  lza 0.70b       -r -s -mx9 -b7 -h9
 3629647937    1335*     479   7  packet 1.1      -r -v -m0 -b512 -h4
 3872143561    2345      629   7  flashzip 1.1.3  -r -mx0 -k7 -b1024 -s
HP Pavilion DV6 Laptop, Core i5 430M, 2.27GHz, 2+2 HT, 8GB RAM, 5.4K RPM 500GB internal HDD, 7.2K RPM 2TB WD Caviar Black external drive over eSATA. 64-bit Linux Mint 14 (Nadia) OS. Gcc 4.7.2 Compiler. Compression is from external to internal drive. Decompression is reversed. Before each test, disk cache is cleared with "echo 3 > /proc/sys/vm/drop_caches". Tested by Moinak Ghosh.
Size Compress Extract Sys Program version Options ---------- -------- -------- --- --------------- -------- 2763952499 25688* 25077* 8 nanozip 0.09a -cc -m6g 2791243359 26359 25627 8 zpaq 6.41 -m6 2917361920 8359* 5369* 8 zpaq 6.41 -m5 2969926451 5430* 1277* 8 nanozip 0.09a -co -m3.5g 2984395827 16962 1726 8 pcompress 2.4 -G -P -c lzma -s1400m -l14 -B0 -t1 - 3032331922 5369* 154* 8 pcompress 2.4 -G -P -c lzma -s120m -l14 -B0 - 3052163641 2124* 1891 8 pcompress 2.4 -G -L -P -c adapt2 -s80m -l14 -B0 - 3066519369 2841 1744 8 zpaq 6.41 -m4 3094638497 4332 164 8 pcompress 2.4 -G -L -P -c lzmaMt -s64m - 3138482578 1128* 856 8 pcompress 2.4 -G -L -P -c adapt2 -s64m - 3167835671 938* 632 8 pcompress 2.4 -G -L -P -c adapt2 - 3189732880 3277 160 8 pcompress 2.4 -G -L -P -c lzmaMt - 3235809003 5979 1406 8 nanozip 0.09a -co -m6g 3350270033 794* 356 8 pcompress 2.4 -G -L -P -c bzip2 -s64m -l9 -B0 - 3358033965 1243 687 8 zpaq 6.41 -m38 3377694567 766* 327 8 pcompress 2.4 -G -L -P -c bzip2 -s64m - 3399580501 1094 600 8 zpaq 6.41 -m3 3417368730 766 314 8 pcompress 2.4 -G -c bzip2 -s64m - 3595106502 11776 532 8 7zip 9.20 -mx 3599420509 856 105* 8 pcompress 2.4 -G -L -P -c zlib -s64m -l9 - 3630445464 457* 104* 8 pcompress 2.4 -G -L -P -c zlib -s64m - 3650211291 501 110 8 pcompress 2.4 -G -L -P -c zlib - 3670069137 1223 792 8 exdupe 0.4.3 -x3 3701584921 730 261 8 zpaq 6.41 -m2 -noa 3832641851 425 186 8 zpaq 6.41 -m1 -fragile 3833514684 514 269 8 zpaq 6.41 -m1 3892353718 8542 563 8 7zip 9.20 3938386309 3560 274 8 rar 5.00b8 -m5 -ma5 -s 4075474017 830 116 8 pcompress 2.4 -G -L -P -c lz4 -l3 - 4165458298 410* 167 8 exdupe 0.4.3 -x2 4183444673 854 368 8 pcompress 2.4 -D -c bzip2 -s256m - 4461561437 382* 92* 8 pcompress 2.4 -D -c zlib -s256m - 4493853463 1498 248 8 rar 5.00b8 4802267341 3406 232 8 zip 3.00 -9 -r 4844635686 1546 235 8 zip 3.00 -r 4979169087 197* 111 8 pcompress 2.4 -G -L -P -c lzfx -s32m -l9 -B0 - 4983709337 256 123 8 pcompress 2.4 -G -L -P -c lzfx -B0 - 5009464875 249 116 
8 pcompress 2.4 -G -L -P -c lzfx - 5090836384 198 110 8 pcompress 2.4 -G -P -c lzfx - 5098696491 187* 118 8 pcompress 2.4 -G -c lzfx - 5214153196 201 127 8 pcompress 2.4 -G -L -P -c lz4 -s16m -B0 - 5250732541 230 127 8 pcompress 2.4 -G -L -P -c lz4 - 5272105182 164* 122 8 pcompress 2.4 -G -c lz4 -B0 - 5289546149 282 162 8 exdupe 0.4.3 -x1 5304341438 162* 122 8 pcompress 2.4 -G -c lz4 - 5923183422 117* 113 8 pcompress 2.4 -D -c lzfx -s256m -l9 - 6123410208 130 114 8 pcompress 2.4 -D -c lzfx -s32m - 6197343313 170 110 8 pcompress 2.4 -D -c lzfx - 6310806836 148 116 8 pcompress 2.4 -c lzfx -l9 - 6310806836 152 115 8 pcompress 2.4 -c lzfx - 8204889618 150 162 8 pcompress 2.4 -G -c none -s16m -B0 - 8279076467 146 165 8 pcompress 2.4 -G -c none -s16m - 8452938083 412 400 8 zpaq 6.41 -m0 8466685879 306 208 8 exdupe 0.4.3 -x0 10065018880 181 221 8 tar 1.26 decomp. hangs or CRC errors 8 freearc
Listed alphabetically.
7-zip. Option -mx selects best compression.
bsc. Single file compressor tested on 10gb.tar. Option -b1024 selects the maximum BWT block size of 1024 MB. Default is -b25. -e2 selects adaptive coding. Default is -e1 (fixed). Options -p (disable preprocessing), -s (adaptive block size segmentation) and -r (structured data reordering) make compression worse. All blocks are compressed in parallel, using 10 threads and 50 GB memory. -t compresses blocks serially using 5 GB but allows multithreading within blocks. -T disables all multithreading.
bzip2. Single file compressor tested on 10gb.tar. -9 selects best compression and is also the default.
csarc v3.3. Options -m1..-m5 select compression level. -d selects dictionary size. -t selects threads. -r recurses directories. -f forces overwrite of existing archive. Archives cannot be updated. File dates but not directory dates are restored.
exdupe. Compression options are -x0 (deduplicate with no compression), -x1 (quicklz, default), -x2 (zlib), -x3 (bzip2). Designed for fast backups with deduplication. v0.4.2 on system 4 and 5 restores file dates with the wrong time zone (5 hours too young in EDT), and some directory times are not restored.
flashzip for Windows only. -r selects recurse subdirectories. -m selects compression level. Maximum is -mx3 (1.1 GB memory). -k7 selects maximum ROLZ dictionary size of 256 MB. -b1024 selects I/O maximum buffer size of 1024 MB (default -b16). -s selects solid archive. -t3 selects 3 tasks (default -t1). Does not restore file dates or empty directory trees.
freearc. Option -m9 selects maximum compression. In v0.666 (system 2, Windows, and 4, wine) some directory dates are not restored. In system 2, files created during standard time and extracted during daylight savings time are restored 1 hour too young.
gzip. Single file compressor tested on 10gb.tar. -9 selects best compression.
lza (derived from zcm). -mx5 (-m5 in v0.01) selects maximum compression (default -m3). -h6 selects 512 MB LZ hash table size (default -h2, max -h7 = 1 GB). -b6 selects 512 MB LZ buffer (default -b3 max -b7 = 1 GB. Note: -h7 -b6 or -h6 -b7 or higher causes out of memory error in 32 bit versions). -t1 selects 1 thread (-t2 or higher is faster but makes compression worse and decompression slower. This option is removed in v0.62). -r selects recurse subdirectories. -s selects solid mode.
nanozip. Option -co selects default (BWT) compression level. -m16g selects 16 GB memory (default 512 MB). Memory is divided among threads. -cc selects best compression and uses 1 thread by default. All others (-cf, -cd, -cD, etc.) are multi-threaded. -t selects number of threads (default: number of cores or hyperthreads detected). -p selects number of parallel compressors (default: auto depending on -t, -c, and files). -nm does not save file metadata. -br1g sets the read buffer to 1 GB. -bw10g sets the write buffer to 10 GB. The 62 empty directories in 51 directory trees are not restored. Directory dates are not restored. Some file dates are not restored correctly on system 4.
obnam is an incremental, versioning, deduplicating backup utility for Linux. Compression options are --compress-with=deflate or no compression. The "archive" is a repository with 72265 files in 105 directories. The total size of all files is reported.
packARC.exe 0.7RC13 for Windows and Linux. The Windows version was used to extract on system 4 because the Linux version fails to extract 7483 out of 83437 files and directories. The Linux version was used to compress. The Windows version works under Wine but does not restore dates or extract empty directories. -i means ignore checksum errors. -np means don't pause when finished.
packet for Windows only. Tested in Linux/wine. -m0..-m9, -mx select compression level (fast..slow), default -m3. -b0..-b5 selects buffer size 16..512 MB for decompression, default -b2 = 64 MB. -h0..-h9 selects hash buffer memory 8 MB..4 GB for compression, default -h2 = 32 MB. -r recurses directories. -s selects solid mode.
pcompress for Linux only. v2.4 is a single-file compressor tested on 10gb.tar. -G selects global deduplication (in v2.3 this caused decompression to crash; fixed Aug. 24, 2013). -D selects non-global deduplication within chunks using Rabin filtering to decide fragment boundaries (default is no deduplication). -s256m, -s1g select a chunk size of 256 MB or 1 GB for non-global deduplication. -c selects compression algorithm: none, lzfx, lz4, zlib, bzip2, lzma (7zip), ppmd, or libbsc. lzmaMt is multi-threaded lzma. adapt2 uses ppmd for text, else lzma. -L selects LZP preprocessing. -P selects adaptive delta coding of numeric tables. -p selects pipe from stdin to stdout. A trailing - sends the output to stdout, which is necessary in v2.4 to compress to external disk. Default is to add a .pz extension. -l 14 selects maximum compression for adapt2 (varies by algorithm). -t2 selects 2 threads to save memory (default is the number of cores and hyperthreads detected, 4 on system 4). -v selects verbose mode.
pcompress v3.0 beta and v3.1 are archivers tested on the directory tree. Archives cannot be updated. -l14 selects maximum compression level (range -l1 to -l14, default -l6). -s60m selects 60 MB block size for parallel compression (default 8). -v selects verbose mode. -t1 selects the number of threads. The default is the number of hyperthreads detected (4 on system 4).
rar. Option -s selects solid archive. -m5 selects maximum compression (default -m3). -ma5 selects new archive format with better compression (default -ma4 is compatible with older versions). -md512m selects 512 MB dictionary, requiring 1 GB memory per thread. -r selects recursive directory traversal (default).
RH4. Compression levels are c1 to c6. Does not restore file dates or empty directory trees. v2 refers to the release on Mar. 7, 2014. v5 is Mar. 19, 2014. v5 deduplicates whole files but does not restore empty files. v6, Mar. 22, 2014, fixes this bug, restoring empty files (but not empty directories). v7 refers to Apr. 24, 2014. v8 (Apr. 29, 2014) adds the -r option: -r0 (default) orders files by path, -r1 by extension and size bin, -r2 by content analysis.
RH5 is a newer version of RH4. c6 selects maximum compression (default is c2). -window:27 selects maximum window size of 2^27 bytes (default is 23).
rings is a BWT based archiver. Does not save file dates or empty directories. Option -m7 (-m8 in v2.5) selects maximum compression. -t1 selects 1 thread (default). -r recurses directories. -s is solid mode. -o (v2.2, removed in v2.5) selects multi-threaded compression.
tar (GNU 1.11.2) saves but does not restore directory dates in Windows.
tornado 0.6. Single file compressor tested on 10gb.tar. -5 selects default compression level (1..16).
zcm for Windows only. Option -r means recurse directories. -m7 selects maximum compression (1.6 GB memory per thread, default -m4). -t1 selects 1 thread (default). -s selects solid archive. Empty directory trees are not restored. File dates are not preserved.
zip. -9 selects best compression (default is -6).
zpaq. Option -noa means do not save attributes. -m selects compression level (0..6, default -m 1 or -m1). An optional second number N (0..12) specifies a block size of 2^N MB with corresponding increase in memory. For example -m 611 specifies level 6 with 2 GB blocks (requires 20 GB memory per thread). Default is -m 14, 25, 36, 46, 56, 66 selecting 16, 32, 64, 64, 64, 64 MB block size. -m 0 deduplicates with no compression. In v6.50 and higher, compression levels are 0..5 (default 1 or 14, 26..56). -th 4 or -t4 selects 4 threads (default is number of cores: 24, 2, 12, 4, 4 on systems 1..5). -fragile does not add checksums or recovery info and does not verify checksums during extraction. (Bug: v6.40 incorrectly reports extraction failure even though OK. Fixed in v6.41; no compression changes). -q or -q 1000000 is used prior to v6.49 to remove all or most console output. -key encrypts with AES-256 in CTR mode. "tar|" indicates test on 10gb.tar.
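The arithmetic behind the zpaq -m argument can be illustrated with a short Python sketch. It assumes the pre-v6.50 numbering described above (first digit is the level, remaining digits are N, block size is 2^N MB); the function and table names are hypothetical and are not part of zpaq.

```python
# Default block-size exponents for levels 1..6, taken from the defaults
# -m 14, 25, 36, 46, 56, 66 listed above (pre-v6.50 numbering).
DEFAULT_EXPONENT = {1: 4, 2: 5, 3: 6, 4: 6, 5: 6, 6: 6}


def parse_method(method):
    """Split a -m argument such as '611' into (level, block size in MB)."""
    digits = str(method).strip()
    level = int(digits[0])
    if len(digits) > 1:
        exponent = int(digits[1:])
    elif level in DEFAULT_EXPONENT:
        exponent = DEFAULT_EXPONENT[level]
    else:
        # The notes above give no default block size for this level.
        raise ValueError("no default block size known for level %d" % level)
    if not 0 <= exponent <= 12:
        raise ValueError("block size exponent must be in 0..12")
    return level, 2 ** exponent


for m in ("611", "610", "57", "38", "1"):
    print(m, parse_method(m))
```

For example, "611" splits into level 6 and N = 11, giving 2^11 = 2048 MB (2 GB) blocks, in line with the example in the zpaq note.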
This is an open benchmark. Anyone may submit results by emailing me (Matt Mahoney) at mattmahoneyfl (at) gmail.com. I will credit you unless you prefer to be anonymous. Please indicate the program, version, compressed size of the archive, compression and decompression times in real seconds, system used (hardware and OS), and the options used to compress the directory 10gb. If you use an option to select the number of threads, then use the same option to decompress or submit a separate entry.
You can time the program with your watch or use a program like timer32.exe or timer64.exe in Windows (use global time) or the time command in Linux (use real time). If you run a test more than once, then report the fastest time. Download 10gb.zpaq here or here (3.7 GB). You will need zpaq to extract:
zpaq x 10gb.zpaq

which will create a directory 10gb in the current directory. It may take a few minutes to extract the files. You will need 14 GB of free disk space to download and extract, and probably another 15-20 GB to do any testing. In Windows, the following commands should show 10,000,000,000 bytes in 79,431 files and 12,017 directories:

cd 10gb
dir/s

There are really 4006 directories but Windows also counts "." and ".." in each one.
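On systems without dir/s, the same check can be sketched in Python. This is an illustration only. It assumes the 4006 figure counts the top-level 10gb directory itself, and that dir/s adds "." and ".." for every directory while also listing each subdirectory once in its parent; under those assumptions 4006 real directories come out as 12,017. Both function names are made up for this example.

```python
import os


def count_tree(root):
    """Return (files, directories, total bytes) under root,
    counting root itself as one directory."""
    files = 0
    directories = 1  # the root directory itself
    total_bytes = 0
    for dirpath, dirnames, filenames in os.walk(root):
        directories += len(dirnames)
        for name in filenames:
            files += 1
            total_bytes += os.path.getsize(os.path.join(dirpath, name))
    return files, directories, total_bytes


def dir_s_count(directories):
    """Directory count as dir/s is assumed to report it: '.' and '..' for
    every directory, plus each subdirectory listed once in its parent."""
    return 2 * directories + (directories - 1)


print(dir_s_count(4006))
# Usage (not run here): count_tree("10gb") is expected to return
# (79431, 4006, 10000000000) if the extraction succeeded.
```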
Example of testing an archiver (zip) in Windows:
timer32 zip -9 10gb-9.zip 10gb      (get size and compression time)
rename 10gb 10gb_tmp
timer32 unzip 10gb-9.zip            (time decompression)
zpaq list 10gb -not == -force       (compare file contents*)
rmdir/s/q 10gb
rename 10gb_tmp 10gb                (restore the original data)
del 10gb-9.zip

Alternatively in Linux or cygwin you can test (instead of the zpaq comparison):
diff -r 10gb 10gb_tmp
*zpaq v6.58 or later. For versions 6.46 through 6.57 use "zpaq compare 10gb". For versions 6.38 through 6.45 use "zpaq compare 10gb -force". Earlier versions cannot compare at all. It is not required that last-modified dates be restored to pass the test.
If successful, zpaq should report that 0 of 83437 files differ, or diff should print nothing. zpaq compares the external file contents for each internal file to the SHA-1 hashes stored in the archive. diff might be faster, but on my PC (system 2) it runs out of memory.
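Where diff runs out of memory and zpaq is not at hand, file contents can be compared by hashing one file at a time. The Python sketch below is only an illustration under stated assumptions: it compares regular files by SHA-1, ignores dates and empty directories, and all function names are invented for this example.

```python
import hashlib
import os


def file_sha1(path, chunk_size=1 << 20):
    """SHA-1 of one file, read in 1 MiB chunks so memory use stays small."""
    h = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def tree_digests(root):
    """Map each file's path relative to root to its SHA-1."""
    digests = {}
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            digests[os.path.relpath(full, root)] = file_sha1(full)
    return digests


def differing_files(original, extracted):
    """Relative paths that are missing on one side or differ in content."""
    a, b = tree_digests(original), tree_digests(extracted)
    return sorted(p for p in a.keys() | b.keys() if a.get(p) != b.get(p))


# Usage (not run here): differing_files("10gb_tmp", "10gb") should return []
# after a correct round trip.
```

Only the path-to-hash tables are held in memory, never whole files, which is the point of the design.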
To test a single-file compressor, make a tar file to test (e.g. "tar cf 10gb.tar 10gb"). Do not include the time to create and extract the tar file. You can report tar as a separate entry.
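Timing in real (wall clock) seconds, keeping the fastest of several runs, can also be scripted. The following Python sketch is one possible way to do it, not the method used for the results above; the helper name and the example command line are assumptions.

```python
import subprocess
import time


def best_real_time(cmd, runs=1):
    """Run cmd (a list of arguments) `runs` times and return the fastest
    wall-clock time in seconds. Raises if the command exits with an error."""
    best = None
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, check=True)
        elapsed = time.perf_counter() - start
        best = elapsed if best is None else min(best, elapsed)
    return best


# Usage (not run here; the archiver command line is only an example):
#   seconds = best_real_time(["zip", "-9", "-r", "10gb-9.zip", "10gb"], runs=2)
#   print(round(seconds))
```

Note that a second run may benefit from files already held in the disk cache, so the fastest time can depend on whether the cache was cleared between runs.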
10gb.zpaq SHA-1 hash is 224547185f873fe414d7d97ea812af5c8935a5d8 (You can use "zpaqd s 10gb.zpaq" to test). Size is exactly 3,701,584,921 bytes. From the extracted directory, you can re-create it exactly with zpaq v6.40..v6.44 with the command:
zpaq a 10gb.zpaq 10gb -method 2 -noattributes -until 20130728203305
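If zpaqd is not available, the size and SHA-1 of the download can be checked with a few lines of Python. This is a sketch using only the standard hashlib module; the function names are illustrative, and the expected values are the ones quoted above.

```python
import hashlib
import os

EXPECTED_SIZE = 3701584921
EXPECTED_SHA1 = "224547185f873fe414d7d97ea812af5c8935a5d8"


def size_and_digests(path, chunk_size=1 << 20):
    """Return (size in bytes, SHA-1 hex, MD5 hex) for a file, read in one pass."""
    sha1 = hashlib.sha1()
    md5 = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            sha1.update(chunk)
            md5.update(chunk)
    return os.path.getsize(path), sha1.hexdigest(), md5.hexdigest()


def check_download(path):
    """True if the file has the size and SHA-1 quoted above for 10gb.zpaq."""
    size, sha1, _ = size_and_digests(path)
    return size == EXPECTED_SIZE and sha1 == EXPECTED_SHA1


# Usage (not run here): check_download("10gb.zpaq")
```

The MD5 value returned alongside can be compared with the MD5 list at the end of this page.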
Some virus detectors may complain about some of the .exe files, in particular, paq8hp*.exe, lpaq*.exe and bbb.exe in 10gb/www.mattmahoney.net/dc/ and 10gb/2011/www.mattmahoney.net/dc/. These are false alarms. Some of the programs were packed with upack, which compresses better than upx, but unfortunately proved to be popular with virus writers too.
The test data is designed to test archivers in realistic backup scenarios with lots of already-compressed or hard to compress files and lots of duplicate or nearly identical files. It consists of exactly 10 GB (10^10 bytes) in 79,431 files in 4006 directories from my Windows laptop collected from 2009 to 2013. The archive preserves file dates but not attributes. Sizes are shown in MB. Contents:
Size Name                 Description
---- ----                 -----------
3240 hg19/                Human genome in FASTA format
1998 benchmarks/          Collection of data compression benchmarks
1584 mingw/               Several versions of MinGW g++ compiler
1212 www.mattmahoney.net/ Backup copy of my website in 2013
 730 2011/                dc subdirectory of my website in 2011
 678 progs/               Several open source applications
 502 cygwin/              Cygwin version from 2009
  52 zeropad              File of all zero bytes
The human genome is extracted from chromFa.tar.gz found here. It is in FASTA format, one file per chromosome, as text. Each line is 50 bases (A, C, G, T) or N for unknown, mostly near highly repetitive regions such as the centromeres (middles) and telomeres (ends). Other repetitive regions (12 or more repetitions) are lower case. Several variants of a 4 Mb region of chromosome 6 are in separate files with high mutual information. There are a number of small files that could not be matched to the rest of the chromosomes.
The benchmarks directory contains the following:
Size Name           Description
---- ----           -----------
1000 enwik9         XML text from Wikipedia from large text benchmark.
 211 silesia/       12 files from Silesia corpus benchmark.
 151 simple/        Synthetic data described in simple.zip.
 139 act-jpeg       3 photos from ACT-JPEG in many formats.
 100 enwik8/        First 100 MB of enwik9.
  78 gimp.tar       Gimp v2.0.0 from the (defunct) UCLC benchmark.
  75 gimp-2.0.0/    gimp.tar extracted.
  56 primes         List of prime numbers up to 10^8 as a text file.
  53 maxcomp/       10 files from the Maximum Compression SFC test.
  51 wav/           16 music and speech clips in WAV format.
  28 kodak/         24 images in .bmp format.
  14 reuters21578/  Newspaper articles used in machine learning research.
  12 waterloo/      8 color images in .tiff format.
  11 cantrbry_large 3 files (text and DNA) from the large Canterbury corpus.
   7 protein        4 files from the protein corpus.
   3 calgary        14 files from the Calgary corpus.
   2 cantrbry       11 files from the Canterbury corpus.
 0.4 random.bin     A Million Random Digits converted to binary.
 0.3 e8test         Files designed to test E8E9 filters.
mingw contains 7 versions of the MinGW g++ compiler: 3.4.5, 4.4.0, 4.5.0, 4.6.1, 4.7.0 (32 and 64 bit versions), and 4.8.0. The two versions 4.4.0 and 4.5.0 are also used in the mingw benchmark.
www.mattmahoney.net is a backup of my website from 2013. Most of the files are in already compressed formats. The distribution (in MB) is as follows:
Size Type  Description
---- ----  -----------
1212 All   All files.
 543 .zip  Zip archives, mostly compression software.
 205 .pmd  Files compressed with ppmd (enwik9.pmd).
 192 .zpaq Compressed zpaq archives.
 143 .jpg  JPEG images.
  51 .pdf  Race applications and other documents.
  14 .txt  Mostly race results.
   9 .bmp  Uncompressed images.
   8 .rar  RAR archives.
   8 .exe  Programs.
   5 .png  Compressed images.
   4 .cpp  Source code.
   4 .epub "Data Compression Explained" in e-reader format.
   4 .html HTML text files.
2011 contains the dc (data compression) subdirectory of www.mattmahoney.net as it existed in 2011. It has many (mostly already compressed) files in common with the current version.
progs contains several open source applications.
Size Name                                   Description
---- ----                                   -----------
 676 Total size
 376 OpenOffice 3.org                       Office software
  99 VideoLAN                               Video player
  32 InstallShield Installation Information Leftover files
  30 PDFCreator                             Printer to PDF converter
  52 Mozilla Firefox                        Browser
  23 PeaZip                                 Archiver with GUI
  15 ImageMagick-6.5.7-Q16                  Image processing software
   5 MediaInfo                              Firefox plugins
   3 7-Zip                                  Archiver
   0 Uninstall Information                  Empty
The size of zeropad (56 MB) was chosen to make the files total exactly 10 GB.
Overall, the most common file types (sorted by total size in MB) and their compression ratios with zpaq -m2 after deduplication from 10 GB to 8.45 GB are as follows:
Type   Files   Size   Ratio
----   -----   ----   -----
All    83437  10000   .4655
.fa       93   3199   .3489
.      27128   1665   .3433
.zip     461   1035   .9759
.exe    1219    763   .3605
.dll    1672    529   .3559
.a      4665    504   .1190
.pmd       4    410   .9504
.jpg    1491    236   .9455
.zpaq     13    193  1.0001
.h      9377    148   .1455
.tar       5     85   .2052
.mo     1253     82   .2774
.lzma    123     73   .9365
.bmp     135     71   .5980
.pdf     157     57   .9010
1gb.zpaq (mirror) and 100mb.zpaq (mirror) are not official benchmarks but may be useful for testing and tuning compression algorithms. Both test sets are subsets of the files in the 10GB data set, within a minimal subset of the directory tree. File dates but not directory dates are preserved from 10GB. Both data sets were created on Feb. 10, 2014 and compressed with zpaq v6.49 -method 2 -noattributes.
Each set is produced by randomly selecting files from 10GB until exactly 100 MB or 1 GB is reached. At each step a file is selected with the restriction that its size be between 0.1% and 10% of the remaining space. When fewer than 1000 bytes remain, the size restrictions are removed and a file only has to fit. The Perl script below was used, with input from a listing of d:\10gb.zpaq, then renaming the output directory 10gb to 1gb or 100mb.
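#!/usr/bin/perl
# Create 1GB database from input `zpaq l d:/10gb.zpaq`

# Collect "size name" for every file line in the listing.
while (<>) {
  if (/>.{32} *(\d+) \d\.\d\d\d\d (.*)/) {
    push(@a, "$1 $2");
    ++$n;
  }
}

$t=1000000000;  # remove a 0 to make 100MB
while ($t>0) {
  # Pick a random entry. Entries already extracted are blank and fail the match.
  $r=int(rand($n));
  # Accept sizes above 0.1% and below 10% of the remaining space,
  # or files under 1000 bytes that still fit.
  if ((($sz,$fn)=$a[$r]=~/(\d+) (.*)/)
      && $sz>$t/1000 && ($sz<$t/10 || ($sz<=$t && $sz<1000))) {
    printf("%9d %9d %s\n", $t, $sz, $fn);
    `zpaq x d:/10gb "$fn" -q`;
    $t-=$sz;
    ++$count;
    $a[$r]="";
  }
}
print "$count of $n files extracted\n";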
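The size rule in the script can be tried without an archive. The following is a minimal Python sketch, not the script used to build the data sets: it works on an in-memory list of (size, name) pairs and never calls zpaq. The names `eligible` and `select_files` are made up for illustration. It picks uniformly among the eligible files at each step, which should give the same per-step distribution as the script's retry loop, and it raises an error where the Perl script would loop forever.

```python
import random


def eligible(size, remaining):
    """Size rule from the script: above 0.1% of the remaining space, and
    either below 10% of it or a file under 1000 bytes that still fits."""
    above_min = size * 1000 > remaining
    below_max = size * 10 < remaining
    small_and_fits = size <= remaining and size < 1000
    return above_min and (below_max or small_and_fits)


def select_files(files, target, seed=None):
    """Return (size, name) pairs whose sizes sum to exactly `target` bytes."""
    rng = random.Random(seed)
    pool = list(files)
    remaining = target
    chosen = []
    while remaining > 0:
        candidates = [i for i, (size, _) in enumerate(pool)
                      if eligible(size, remaining)]
        if not candidates:
            # The Perl script would retry forever here; fail loudly instead.
            raise ValueError(f"no eligible file with {remaining} bytes left")
        size, name = pool.pop(rng.choice(candidates))
        chosen.append((size, name))
        remaining -= size
    return chosen
```

Because the lower bound falls below one byte once fewer than 1000 bytes remain, any non-empty file that fits becomes eligible at that point, which is how the selection can land on the target exactly.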
MD5 hashes
6453fcff794fb515288ea8b38bb38857  10gb.zpaq
994cfce8a0a7c1048c0ceb3ce868b7bf  1gb.zpaq
f05f7c80dd7a42ac1665cc28e5da5196  100mb.zpaq
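A download can be checked against these hashes with any MD5 tool. As a minimal sketch using Python's standard hashlib module (the function names and the `EXPECTED_MD5` table are illustrative, and the location of the downloaded file is an assumption):

```python
import hashlib

# Published hashes copied from the list above.
EXPECTED_MD5 = {
    "10gb.zpaq": "6453fcff794fb515288ea8b38bb38857",
    "1gb.zpaq": "994cfce8a0a7c1048c0ceb3ce868b7bf",
    "100mb.zpaq": "f05f7c80dd7a42ac1665cc28e5da5196",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, name):
    """True if the file at `path` matches the published hash for `name`."""
    return md5_of_file(path) == EXPECTED_MD5[name]
```

For example, `verify("100mb.zpaq", "100mb.zpaq")` would return True for an intact copy in the current directory. Reading in chunks matters here because the largest archive is several gigabytes.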
Comparison of data sets.
          10GB             1GB           100MB
--------------  --------------  --------------
 3,701,584,921     424,346,384      44,304,455  Download size (zpaq)
10,000,000,000   1,000,000,000     100,000,000  Uncompressed size
        79,431           1,524             788  Number of files
         4,006             825             630  Number of directories
       125,895         656,168         126,904  Average file size
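The last row follows from two of the rows above it: the average file size is the uncompressed size divided by the number of files. A quick arithmetic check, with the figures copied from the table (the names `DATA_SETS` and `average_file_size` are illustrative):

```python
# (uncompressed bytes, number of files), copied from the comparison table.
DATA_SETS = {
    "10GB": (10_000_000_000, 79_431),
    "1GB": (1_000_000_000, 1_524),
    "100MB": (100_000_000, 788),
}


def average_file_size(total_bytes, file_count):
    """Mean file size in bytes, rounded to the nearest integer."""
    return round(total_bytes / file_count)


for name, (total_bytes, file_count) in DATA_SETS.items():
    print(name, average_file_size(total_bytes, file_count))
```

The 1GB set has a much larger average because it holds far fewer files per byte than either of the other two sets.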
Fractional distribution by top level directory.
          10GB             1GB           100MB
--------------  --------------  --------------
         .3240           .0275           .0044  hg19
         .1998           .2334           .1912  benchmarks
         .1584           .3184           .3738  mingw
         .1212           .1211           .1276  www.mattmahoney.net
         .0730           .0733           .0492  2011
         .0502           .0744           .1104  cygwin
         .0678           .1516           .1431  progs
Fractional distribution by file type.
          10GB             1GB           100MB
--------------  --------------  --------------
         .3199           .0106           .0044  .fa
         .1665           .0979           .0409  (no extension)
         .1035           .1429           .0563  .zip
         .0763           .1965           .1592  .exe
         .0529           .0151           .0926  .dll
         .0504           .1098           .1604  .a
         .0410           .0000           .0000  .pmd
         .0236           .0361           .1016  .jpg
         .0193           .0020           .0014  .zpaq
         .0147           .0112           .0400  .h
         .0085           .0787           .0000  .tar
         .0082           .0107           .0133  .mo
         .0073           .0120           .0189  .lzma
         .0071           .0167           .0177  .bmp
         .0058           .0128           .0332  .pdf
         .0054           .0102           .0351  .wav
         .0048           .0015           .0126  .html
         .0044           .0018           .0097  .py
         .0042           .0065           .0044  .txt
         .0037           .0000           .0000  .7z
         .0035           .0046           .0022  .dat
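A distribution like the one above can be computed for any extracted copy of a data set by summing file sizes per extension and dividing by the total. The following is a rough sketch under stated assumptions, not the method used to produce the table: extensions are lowercased, names with no dot map to an empty string, and `fraction_by_type` and `extension_of` are made-up names.

```python
import os
from collections import Counter


def extension_of(name):
    """Lowercased extension including the dot, or '' if there is none."""
    return os.path.splitext(name)[1].lower()


def fraction_by_type(root):
    """Map each file extension under `root` to its share of the total bytes."""
    sizes = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            sizes[extension_of(name)] += os.path.getsize(path)
    total = sum(sizes.values())
    if total == 0:
        return {}
    return {ext: size / total for ext, size in sizes.items()}
```

Sorting the result by value in descending order gives a listing in the same shape as the table. How the original figures treated letter case or multi-part extensions is not stated, so small differences from the published fractions would not be surprising.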
Compression ratios and + or - differences from 10GB.
          10GB             1GB           100MB
--------------  --------------  --------------
         .8446   +.0640  .9086   +.1308  .9754  Deduplication (zpaq)
         .5290   +.0097  .5387   +.0295  .5585  exdupe 0.5.0b -x1
         .4844   +.0497  .5341   -.0048  .4796  zip
         .4797   +.0597  .5324   -.0030  .4767  tar|gzip
         .4473   +.0572  .5045   +.0051  .4524  tar|bzip2
         .4166   +.0622  .4778   +.0666  .4832  exdupe 0.5.0b -x2
         .3938   +.0237  .4175   +.0010  .3948  rar 5.0 -m5 -ma5 -s
         .3892   +.0248  .4140   -.0045  .3847  7zip 9.20
         .3825   +.0590  .4415   +.0729  .4554  zpaq 6.49 -m 1
         .3711   -.0064  .3647   -.0004  .3707  freearc 0.666
         .3693   +.0550  .4243   +.0737  .4430  zpaq 6.49 -m 2
         .3671   +.0686  .4357   +.0744  .4415  exdupe 0.5.0b -x3
         .3595   +.0067  .3662   +.0228  .3823  7zip 9.20 -mx
         .3391   +.0556  .3947   +.0736  .4127  zpaq 6.49 -m 3
         .3060   +.0821  .3881   +.1067  .4127  zpaq 6.49 -m 4
         .3029   +.0622  .3651   +.0684  .3713  pcompress 3.0b -l14
         .2969   +.0521  .3490   +.0653  .3622  nanozip 0.09a -co -m3.5g
         .2959   +.0532  .3482   +.0668  .3627  freearc 0.666 -m9
         .2936   +.0557  .3493   +.0675  .3611  zpaq 6.49 -m 5
         .2791   +.0556  .3347   +.0681  .3472  zpaq 6.49 -m 6
10gb.zpaq is copyright (C) 2013, Matt Mahoney. 1gb.zpaq and 100mb.zpaq are copyright (C) 2014, Matt Mahoney. You are granted permission to download these files for your own use. You are granted permission to distribute exact copies of these files provided that you attribute the source and include a copy of this license or a link to the web page with this license. You may distribute this data in a different format (for example, 10gb.zip or 10gb.tar.gz) with such attribution provided that the exact directory structure is preserved, including file and directory names and last-modified dates. This license does not imply permission to distribute derived works, including any work where any files are removed, added, or modified from the data set.
Many of the files in this data set are copyrighted by other people and licensed under varying terms. Nothing in this license restricts you from using those files under the terms of the original license (for example, applications licensed under GPL). For some files, there is no explicit license. In particular, some documents, software, photos, and other files under the directories 10gb/www.mattmahoney.net and 10gb/2011 were hosted on my website (www.mattmahoney.net) with permission of the owners. I do not have written copies of such permissions, nor in most cases, any records of ownership, which may be hard to determine. Please do not use this data set for any purpose other than benchmarking data compression and archiving programs and related research. The following files and directories are created entirely by me and released to the public domain:
10gb/zeropad
10gb/benchmarks/e8test
10gb/benchmarks/primes
10gb/benchmarks/simple