There are two parts to the answer to your question.
I. NPY vs. NPZ
As the docs already say, the .npy format is:
the standard binary file format in NumPy for persisting a single arbitrary NumPy array on disk. ... The format is designed to be as simple as possible while achieving its limited goals. (sources)
And .npz is only a
simple way to combine multiple arrays into a single file, one can use ZipFile to contain multiple “.npy” files. We recommend using the file extension “.npz” for these archives. (sources)
So, .npz is just a ZipFile containing multiple “.npy” files. And this ZipFile can be either compressed (by using np.savez_compressed) or uncompressed (by using np.savez).
It's similar to a tarball archive on Unix-like systems: a tarball can be just an uncompressed archive containing other files, or a compressed archive produced by combining it with a compression program (gzip, bzip2, etc.).
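The claim that an .npz file is a zip archive of .npy files can be checked with the standard zipfile module. This is only a minimal sketch: the array names a and b are made up for illustration, and an in-memory buffer stands in for a file on disk.

```python
import io
import zipfile

import numpy as np

# Write two small arrays into an in-memory .npz archive.
buf = io.BytesIO()
np.savez(buf, a=np.arange(3), b=np.ones((2, 2)))

# Open the very same bytes as an ordinary zip archive.
buf.seek(0)
with zipfile.ZipFile(buf) as zf:
    member_names = sorted(zf.namelist())
    # Each member is itself a complete .npy file that np.load can read.
    with zf.open("a.npy") as member:
        a_again = np.load(io.BytesIO(member.read()))

print(member_names)
print(a_again)
```

Each keyword passed to np.savez becomes one member named after the keyword, with .npy appended.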
II. Different APIs for binary serialization
NumPy also provides different APIs to produce these binary files:
np.save --> Save an array to a binary file in NumPy .npy format
np.savez --> Save several arrays into a single file in uncompressed .npz format
np.savez_compressed --> Save several arrays into a single file in compressed .npz format
np.load --> Load arrays or pickled objects from .npy, .npz or pickled files
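As a rough illustration of how these four functions fit together, here is a small round trip. The file names and the arrays x and y are hypothetical, and everything is written to a temporary directory.

```python
import os
import tempfile

import numpy as np

x = np.arange(6).reshape(2, 3)
y = np.linspace(0.0, 1.0, 5)

with tempfile.TemporaryDirectory() as tmp:
    npy_path = os.path.join(tmp, "single.npy")
    npz_path = os.path.join(tmp, "bundle.npz")
    npzc_path = os.path.join(tmp, "bundle_compressed.npz")

    np.save(npy_path, x)                      # one array  -> .npy
    np.savez(npz_path, x=x, y=y)              # many arrays -> uncompressed .npz
    np.savez_compressed(npzc_path, x=x, y=y)  # many arrays -> compressed .npz

    # For a .npy file, np.load returns the array directly.
    x_loaded = np.load(npy_path)

    # For a .npz file, np.load returns a dict-like object keyed by array name.
    with np.load(npz_path) as bundle:
        keys = sorted(bundle.files)
        y_loaded = bundle["y"]

    # Compressed archives are read with exactly the same call.
    with np.load(npzc_path) as bundle_c:
        x_from_compressed = bundle_c["x"]

print(keys)
```

Note that np.load does not need to be told which variant it is reading; it works it out from the file contents.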
If we skim the NumPy source code, this is what happens under the hood:
def _savez(file, args, kwds, compress, allow_pickle=True, pickle_kwargs=None):
    ...
    if compress:
        compression = zipfile.ZIP_DEFLATED
    else:
        compression = zipfile.ZIP_STORED
    ...

def savez(file, *args, **kwds):
    _savez(file, args, kwds, False)

def savez_compressed(file, *args, **kwds):
    _savez(file, args, kwds, True)
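The effect of that compress flag should be visible from the outside by inspecting the compression method recorded for each archive member. A minimal sketch, where member_compression is a hypothetical helper and data is an arbitrary array name:

```python
import io
import zipfile

import numpy as np


def member_compression(save_func):
    """Return the zip compression method used for the 'data.npy' member."""
    buf = io.BytesIO()
    save_func(buf, data=np.zeros(1000))
    buf.seek(0)
    with zipfile.ZipFile(buf) as zf:
        return zf.getinfo("data.npy").compress_type


stored = member_compression(np.savez)
deflated = member_compression(np.savez_compressed)

print(stored == zipfile.ZIP_STORED)
print(deflated == zipfile.ZIP_DEFLATED)
```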
Then back to the question:
- If you only use np.save or np.savez, there is no compression on top of the .npy format; np.savez just gives you a single archive file for the convenience of managing multiple related arrays.
- If you use np.savez_compressed, the file takes less space on disk, at the cost of more CPU time spent on compression (i.e. it is a bit slower).
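To get a feel for the size difference, one can compare the number of bytes each function writes. This sketch deliberately uses an all-zeros array, which compresses extremely well; real data (especially noisy floating-point data) will usually shrink far less, so treat the result as an upper bound on the benefit rather than a typical figure. saved_size is a hypothetical helper.

```python
import io

import numpy as np

# Assumption for illustration: a highly compressible 8 MB array of zeros.
data = np.zeros((1000, 1000))


def saved_size(save_func):
    """Return how many bytes save_func writes for the test array."""
    buf = io.BytesIO()
    save_func(buf, data=data)
    return len(buf.getvalue())


size_plain = saved_size(np.savez)
size_compressed = saved_size(np.savez_compressed)

print(f"np.savez:            {size_plain} bytes")
print(f"np.savez_compressed: {size_compressed} bytes")
```

Wrapping the two calls in time.perf_counter() would show the CPU cost side of the trade-off; the numbers depend heavily on the machine and the data.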