Background:
I attempted to recover data from a failing HDD using ddrescue in a recent Knoppix environment. At first I wrote the output to an NTFS partition, but at some point the copy rate became consistently very slow (~600 KB/s). I then read in a French guide for the tool that recovering to NTFS is not recommended, especially for a large volume, as it has been reported to cause a major slowdown. I switched to ext4 and the performance seemed to improve significantly. (See this question.)
I noticed that the “size on disk” of the output image files generated by ddrescue was much lower than their apparent size. This suggests that ddrescue only allocates the data that has been successfully read from the input, so the output is “partially sparse” even when the -S switch (“sparse writes”) is not used. With the -S switch, sectors that were actually read from the input but contain only zeros are also left unallocated in the output, making it “fully” sparse. (See that other question.)
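To make the “size on disk” versus apparent size comparison concrete, here is a minimal Python sketch of how the two values can be checked. It is only an illustration, not what ddrescue does internally: the helper names `size_report` and `make_sparse_demo` are hypothetical, and it assumes a Linux system where `st_blocks` is counted in 512-byte units. The demo file is created by extending the file without writing data and then writing one small chunk, which leaves a hole on filesystems that support sparse files.

```python
import os
import tempfile


def size_report(path):
    """Return (apparent_bytes, allocated_bytes) for the file at path."""
    st = os.stat(path)
    apparent = st.st_size
    # Assumption: st_blocks is in 512-byte units, as on Linux.
    allocated = st.st_blocks * 512
    return apparent, allocated


def make_sparse_demo(path, total_size, data, offset):
    """Create a file of total_size bytes with data written only at offset."""
    with open(path, "wb") as f:
        # Extending the file without writing leaves a hole on filesystems
        # that support sparse files.
        f.truncate(total_size)
        f.seek(offset)
        f.write(data)
        f.flush()
        os.fsync(f.fileno())


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        demo = os.path.join(tmp, "demo.img")
        make_sparse_demo(demo, 64 * 1024 * 1024, b"\xff" * 4096, 1024 * 1024)
        apparent, allocated = size_report(demo)
        print(f"apparent size: {apparent} bytes")
        print(f"size on disk:  {allocated} bytes")
```

Running `size_report` on a ddrescue image should show the same kind of gap when the image is sparse. The actual numbers depend on the filesystem and on how much of the input has been read so far.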
Question:
How can this performance gap between NTFS and ext4 be explained? Is it indeed related to sparseness? Is the Linux NTFS driver known to have trouble handling large sparse files?