How does a 5 GB gzipped file get read into memory and decompressed? Does the whole file need to be read into memory before decompression? My question relates to processing gzipped files in Hadoop, which cannot split them for parallel processing the way it does for uncompressed files. What about bzip2? Any differences?
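For context on the memory part of the question: gzip (DEFLATE) is a streaming format, so a decompressor only keeps a small sliding window (about 32 KB) plus the current chunk in memory, never the whole file. A minimal Python sketch of that idea (the payload and 64 KB chunk size are arbitrary choices, and an in-memory buffer stands in for a large file on disk):

```python
import gzip
import io

# Build a gzipped payload in memory (stands in for a big file on disk).
raw = b"line of data\n" * 10000
buf = io.BytesIO()
with gzip.GzipFile(fileobj=buf, mode="wb") as gz:
    gz.write(raw)
buf.seek(0)

# Stream-decompress in fixed-size chunks: only one chunk plus the
# decompressor's ~32 KB window is resident at a time, regardless of
# how large the compressed file is.
total = 0
with gzip.GzipFile(fileobj=buf, mode="rb") as gz:
    while True:
        chunk = gz.read(64 * 1024)  # 64 KB per read
        if not chunk:
            break
        total += len(chunk)

print(total == len(raw))  # full payload recovered via streaming
```

Streaming is a separate question from splitting, though: being able to read sequentially with little memory does not mean a reader can jump to an arbitrary byte offset and start decompressing there, which is what Hadoop needs to split input across mappers.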
Thanks,