I wonder if there are filesystem enhancements/flags that could help mitigate the issues here? Linux ext4 is rather well-known as being far less fragmentation-prone than NTFS, because it too rearranges blocks on the fly when writing file data, and decouples metadata writes (to the journal) from file block writes. But that could worsen things with SMR, especially in the default layout where every write involves both journal writes and filesystem block updates, in completely separate areas of the disk.
Kernel dev Ted Ts'o and others looked into this way back in 2017, and came up with an "ext4-lazy" variation on the standard filesystem implementation that avoided triggering a lot of SMR's worst behaviors. Unfortunately, it looks like those patches never made it into the kernel, and as it's been four years I wouldn't hold out much hope that they will.
But, still, their work pointed to some adjustments to the current ext4 implementation that might benefit you some:
- You could try growing the journal to the maximum allowable size, 40GB with 4K blocks. (It's 10,240,000 blocks max, so 10GB with 1K blocks.) You'd do that with `tune2fs -J size=40000 /dev/foo`. Ts'o's research showed that using a large journal as a write cache was a significant improvement.
- Coupled with the previous, you could turn on eager data journaling with `tune2fs -o journal_data /dev/foo`, so that "all data (not just metadata) is committed into the journal prior to being written into the main file system."
- For the absolute maximum bang for your buck, you could move the filesystem's journal to a separate device entirely (one that's not SMR), so that writes to the SMR disk only happen for data blocks and only to the area where the data is stored. You'd do that by formatting a journal device on another drive using `mke2fs -O journal_dev /dev/bar`. "Note that [/dev/bar] must be formatted with the same block size as file systems which will be using it." Then you'd set that device as the journal for `/dev/foo` using `tune2fs -J device=/dev/bar /dev/foo`.
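The size limits in the first suggestion are just block-count arithmetic. Here's a minimal sketch, assuming the 10,240,000-block ceiling quoted above and that `size=` is given in megabytes; `max_journal_mib` is a hypothetical helper name of mine, not part of e2fsprogs:

```shell
#!/bin/sh
# Largest journal size (in MiB) for a given filesystem block size (in bytes),
# assuming a ceiling of 10,240,000 journal blocks.
max_journal_mib() {
    block_size=$1
    max_blocks=10240000
    # Convert the block size to KiB first so the product stays small.
    echo $(( max_blocks * (block_size / 1024) / 1024 ))
}

max_journal_mib 4096   # prints 40000, the value used for size= above
max_journal_mib 1024   # prints 10000
```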
If you go with the external journal, forget about the first suggestion, as the size of the journal will just be the size of `/dev/bar`. (Though I suspect it still won't use any more than 40GB, so on the bright side you don't need a big journal disk!)
Needless to say, it's probably safest to make any of these adjustments only after unmounting the filesystem in question. I'm pretty sure `tune2fs` would balk if asked to do anything destructive, but for maximum safety it's a good idea to take a backup, or at least experiment on a throwaway filesystem first.
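For the throwaway-filesystem route, a scratch image file is one option. This is only a sketch: `try_journal_tuning`, the 512 MiB image size, and the 64 MB journal are arbitrary choices for illustration, and it assumes e2fsprogs and GNU `truncate` are installed. As far as I know, `tune2fs` won't add a journal to a filesystem that already has one, so the sketch drops the existing journal before setting a new size. It leaves out the external-journal variant, which I'd expect to need real block devices.

```shell
#!/bin/sh
# Try the journal tweaks on a scratch image file instead of a real disk.
# Usage: try_journal_tuning /path/to/scratch.img
try_journal_tuning() {
    img=$1
    # Sparse 512 MiB file with an ext4 filesystem using 4K blocks.
    truncate -s 512M "$img" || return 1
    mke2fs -q -F -t ext4 -b 4096 "$img" || return 1
    # Remove the default journal, then recreate it at the chosen size (in MB).
    tune2fs -O '^has_journal' "$img" >/dev/null || return 1
    tune2fs -J size=64 "$img" >/dev/null || return 1
    # Make full data journaling the default mount option.
    tune2fs -o journal_data "$img" >/dev/null || return 1
    # Print the superblock lines that mention the journal or mount options.
    dumpe2fs -h "$img" 2>/dev/null | grep -Ei 'journal|default mount options'
}
```

Running `try_journal_tuning /tmp/scratch.img` should list the journal-related superblock fields, so you can check that the new size and the `journal_data` default took effect before touching a real disk.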