Can rsync --remove-source-files to the same drive reduce fragmentation?

Question

I know that rsync with --remove-source-files (which I use instead of mv so I can merge directory hierarchies) creates new inodes:

stat 2021_07_30_20_18_17.pdf~
rsync --remove-source-files 2021_07_30_20_18_17.pdf~ 2021_07_30_20_18_17.pdf~.moved
stat 2021_07_30_20_18_17.pdf~.moved

Device: 805h/2053d  Inode: 4850411     Links: 1
Device: 805h/2053d  Inode: 4849693     Links: 1

Since it's allocating a new inode, does that mean it allocates new space for the target file? Or does --remove-source-files just make the new inode point to the memory locations of the original file?

Background

The reason I'm asking is that I've got a drive that is very slow, probably because of huge directory hierarchies with big and small files mixed together, and I'm guessing this makes fragmentation worse. Since most Linux file systems don't experience fragmentation, there isn't a simple tool to defragment like on Windows.

I know I can rsync to a new drive to reduce fragmentation, but what about to the same drive? Does moving files work the same way from a memory allocation perspective?

grawity_u1686 · Accepted Answer · 2022-10-10 18:15:26Z

rsync always runs in two-process mode – there's still a "sender" process and a "receiver" process (just a fork of the parent), exchanging data through a socketpair in the same way as they'd communicate across the network. This means that even a local copy/move still involves reading the entire original file, streaming it to the rsync receiver process, and writing all of it to the new file.
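To see how this differs from a plain rename, here is a minimal sketch that compares inode numbers. It assumes GNU coreutils (for stat -c) and uses throwaway file names in a temporary directory; cp followed by rm is only a stand-in for the local rsync --remove-source-files run, not rsync itself.

```shell
#!/bin/sh
# Compare inode numbers after a rename versus after a copy-and-delete.
# The files a, b and c are placeholders created in a temporary directory.
dir=$(mktemp -d)
printf 'example data\n' > "$dir/a"
before=$(stat -c %i "$dir/a")

# mv within one filesystem is a rename: the inode stays the same.
mv "$dir/a" "$dir/b"
after_mv=$(stat -c %i "$dir/b")

# Copy, then delete the source: a new file with a new inode is created
# and the contents are written into it.
cp "$dir/b" "$dir/c"
after_cp=$(stat -c %i "$dir/c")
rm "$dir/b"

echo "original=$before mv=$after_mv copy=$after_cp"
rm -r "$dir"
```

The first two numbers should match and the third should differ, which mirrors the two different inode numbers shown in the question.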

(This architecture is specific to rsync, and not necessarily applicable to other tools. For example, cp A B or even cat A > B do not guarantee a full copy will be done – they might copy data in full but they might also deliberately ask the filesystem to link existing data to the new file.

Currently ext4 does not support such links at all, however; "reflink"-based copies are only found in Btrfs and XFS. So a cp A B on ext4 will at this time result in a full copy.)
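As a rough illustration of the reflink behaviour, the sketch below asks GNU cp for a reflink copy and falls back to an ordinary copy when the filesystem (or the installed cp) refuses. The file names and the mode variable are made up for this example.

```shell
#!/bin/sh
# Try a reflink copy first; fall back to a full copy if it is not supported.
dir=$(mktemp -d)
src="$dir/original"
dst="$dir/copy"
printf 'example data\n' > "$src"

if cp --reflink=always "$src" "$dst" 2>/dev/null; then
    # The new file shares data blocks with the original until one is modified.
    mode="reflink"
else
    # No reflink support here (e.g. ext4): the data is copied in full.
    cp "$src" "$dst"
    mode="full copy"
fi

echo "copied via: $mode"
# The files are left in $dir so they can be inspected afterwards.
```

On ext4 this is expected to take the fallback branch; on Btrfs, or on XFS formatted with reflink support, the first branch should succeed.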

"Since most Linux file systems don't experience fragmentation, there isn't a simple tool to defragment like on Windows."

They do¹, and there are tools – e2fsprogs (the official ext4 toolset) has e4defrag, btrfs-progs similarly has btrfs fi defrag, and so on.
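Before defragmenting anything, it may be worth checking how fragmented the files actually are. A small sketch, assuming an ext4 filesystem with e2fsprogs installed; report_fragmentation is a hypothetical helper name, and e4defrag may need root to give full results.

```shell
#!/bin/sh
# report_fragmentation is a hypothetical helper, not part of any package.
report_fragmentation() {
    target=$1
    if command -v e4defrag >/dev/null 2>&1; then
        # -c only reports the current fragmentation; it does not move data.
        e4defrag -c "$target" ||
            echo "e4defrag could not inspect $target (not ext4, or root needed)"
    else
        echo "e4defrag is not installed (it ships with e2fsprogs)"
    fi
}
```

It could be called as report_fragmentation /path/to/directory. On Btrfs the rough counterpart for actually defragmenting is btrfs filesystem defragment -r /path/to/directory.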

¹ Just not as much as e.g. FAT16/FAT32 would (ext2 was designed in part to avoid the fragmentation issues that the previous Linux "ext" filesystem used to have), but that doesn't mean they somehow avoid fragmentation completely, or even significantly more so than NTFS. In particular, any kind of "copy on write" filesystem (such as Btrfs or ZFS or NILFS) will by design become heavily fragmented when files are overwritten in place.

Thanks for both halves of the answer. The network part I never knew and it is very helpful for understanding. I'll have to try e4defrag to see if it helps. — Sridhar Sarnobat, Commented Oct 10, 2022 at 18:09
