
I noticed that every time I copy a large amount of data to my hard drive, the drive keeps working after the file transfer is over. It makes sounds as if something were still being written, as if it were defragmenting.

It is now attached to a Raspberry Pi Linux server, but it did the same thing earlier when we connected it to a Windows laptop. And I didn't have any defragmentation program running... The more data I copy, the longer it keeps working at... god knows what. It is a WD Blue 1TB connected over USB 3.0. It has an EXT4 partition under Linux and previously had NTFS.

What do you think it is? Should I get worried?

  • Are you seeing the drive doing work (e.g., a status light on the drive enclosure)? Or are you seeing traffic (e.g., a status light on the USB connection)? The drive could be SMR (as described very well in an answer). The connection would be an OS-related issue (e.g., flushing a write cache). Commented Dec 25, 2022 at 16:06
  • It doesn't have an LED. I just hear it making sounds as if it is working to read or write something... Commented Dec 26, 2022 at 11:11
  • For how long do these drive sounds continue after the transfer has completed? That may be relevant for answering the question.
    – marcelm
    Commented Dec 26, 2022 at 13:03
  • About 30 minutes... after I write hundreds of GB. Commented Dec 26, 2022 at 15:50
  • Ah, in that case it won't be the OS flushing write caches. Mokubai's SMR theory seems very plausible.
    – marcelm
    Commented Dec 26, 2022 at 18:36

3 Answers


Depending on the specific model of drive it may be using SMR technology. Shingled Magnetic Recording is a technology that can increase the density of data written and make more effective use of the physical space on the drive but it has a couple of costs.

  • Due to the overlapping of data tracks, writing to the SMR area of the drive is slower and more involved: data can only be written in whole strips, rather than randomly as on a CMR drive.
  • Because the SMR area is slower, many SMR drives include a more conventional "old style" CMR/PMR zone, which they use to cache incoming writes before the disk controller copies them to the SMR zone.

The net result is the behaviour you see. The drive accepts data seemingly normally and then spends some time afterwards rearranging data on the disk as it copies the data from the CMR zone to the SMR zone.

Half an hour seems excessive, but it could be moving the data in small packets over time to keep the drive interface available for use.

NASCompares has a list of drives and whether they use normal CMR or SMR technology at List of WD CMR and SMR hard drives (HDD). You can filter the list by make, model and so on to find your specific drive.
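If you're not sure of the exact model string to look up, the drive usually reports it to the OS. A minimal check, assuming a Linux system (device names are examples and will differ on your machine):

```shell
# Print each physical disk's model string so it can be matched
# against the CMR/SMR list:
lsblk -d -o NAME,MODEL,SIZE

# With smartmontools installed, smartctl reports the model and
# firmware as well (replace /dev/sda with your drive):
smartctl -i /dev/sda
```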

I have seen some drives that literally look like they have died during writing. They accept several GB of data, then writes drop to zero for some period of time while the drive reorganises data to clear the CMR zone. After the drive has freed up the "cache" it carries on writing, but for a period measured in minutes the drive is as good as unusable.

As long as you are aware that you have one of these drives it is not a problem, but if you don't know then it can look a lot like the drive is defective or doing something very odd.
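One way to tell this drive-internal housekeeping apart from the OS still flushing its own cache is to watch the host-visible write counters: if they stop increasing while the drive is audibly busy, the work is happening inside the drive, invisible to the OS. A sketch, assuming Linux and a drive named sdb (yours may differ):

```shell
# Sectors written to sdb so far, per /proc/diskstats (field 10);
# sample it twice, a few seconds apart:
awk '$3 == "sdb" { print $10 }' /proc/diskstats
sleep 5
awk '$3 == "sdb" { print $10 }' /proc/diskstats
# If the two numbers match but the drive is still making noise,
# the activity is internal (e.g. SMR cache cleaning), not OS writes.
```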

  • I checked the list, it's SMR! :( WD Blue 1TB WD10SPZX Commented Dec 25, 2022 at 10:03
  • And that's why SMR drives are OK if you write an office document every half hour, but not much else. And if you use your PC for nothing much more than that, the "advantage" of SMR drives, providing more disk space in less physical space, isn't even needed. Commented Dec 25, 2022 at 11:45
  • There is no sane reason (other than flawed metrics like GB/$) to buy an SMR disk.
    – fraxinus
    Commented Dec 25, 2022 at 19:12
  • There was quite a bit of controversy a while ago about hard drive manufacturers not labeling their drives as SMR: arstechnica.com/gadgets/2020/04/…
    – qwr
    Commented Dec 25, 2022 at 19:31
  • "As long as you are aware that you have one of these drives it is not a problem" 🡄 Oh, you have a problem all right, just a different one than we're talking about here.
    – davidbak
    Commented Dec 26, 2022 at 18:34

Default Linux behavior is to cache writes in RAM. As a result, the read can finish much sooner than the write. Depending on the sizes and speeds of the devices, this difference can be quite large, especially with a lot of RAM available.

I'm not sure why "Complete" is announced based on the read, but you need to take care to allow the write to finish before removing the drive.
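You can watch the kernel's write-back cache draining directly; a small sketch, assuming a Linux system:

```shell
# Dirty/Writeback are pages cached in RAM that haven't reached the
# disk yet; they drop toward zero as the kernel flushes them out:
grep -E '^(Dirty|Writeback):' /proc/meminfo

# Block until every cached write has been handed to the device:
sync
```

Only once `sync` returns (and the drive itself goes quiet) is it safe to unplug.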

Another possibility is related to relative block sizes. If your read device has a gazillion tiny files with a small block size and your write device has a large block size, you can completely overwhelm the write device writing tiny files that each take up a large block.
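To see what each tiny file costs on the destination, check that filesystem's block size; a sketch assuming a Linux mount at /mnt/usb (the path is an example):

```shell
# Fundamental block size of the destination filesystem; every file
# occupies at least one block, so a 100-byte file still consumes
# this much space on disk:
stat -f -c 'block size: %S bytes' /mnt/usb
```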

  • I have only 4 GB of RAM and the "extra work" continues for about 30 minutes after I finish the transfer... I don't think it's cache related... Commented Dec 24, 2022 at 22:54
  • One can run the sync command to make sure all write buffers have been flushed. But most file managers take this into account in their file-copy progress windows.
    – jpa
    Commented Dec 25, 2022 at 6:33

I wonder if there are filesystem enhancements/flags that could help mitigate the issues here? Linux ext4 is rather well-known as being far less fragmentation-prone than NTFS, because it too rearranges blocks on the fly when writing file data, and decouples metadata writes (to the journal) from file block writes. But that could worsen things with SMR, especially in the default layout where every write involves both journal writes and filesystem block updates, in completely separate areas of the disk.

Kernel dev Ted Ts'o and others looked into this way back in 2017, and came up with an "ext4-lazy" variation on the standard filesystem implementation that avoided triggering a lot of SMR's worst behaviors. Unfortunately, it looks like those patches never made it into the kernel, and as it's been five years I wouldn't hold out much hope that they will.

But, still, their work pointed to some adjustments to the current ext4 implementation that might benefit you some:

  • You could try growing the journal to the maximum allowable size, 40GB with 4K blocks. (It's 10,240,000 blocks max, so 10GB with 1K blocks.) You'd do that with tune2fs -J size=40000 /dev/foo. Ts'o's research showed that using a large journal as a write cache was a significant improvement.
  • Coupled with the previous, you could turn on eager data journaling with tune2fs -o journal_data /dev/foo, so that "all data (not just metadata) is committed into the journal prior to being written into the main file system."
  • For the absolute maximum bang for your buck, you could move the filesystem's journal to a separate device entirely (one that's not SMR), so that writes to the SMR disk only happen for data blocks and only to the area where the data is stored. You'd do that by formatting a journal device on another drive using mke2fs -O journal_dev /dev/bar. "Note that [/dev/bar] must be formatted with the same block size as file systems which will be using it." then you'd set that device as the journal for /dev/foo using tune2fs -J device=/dev/bar /dev/foo.

If you're going to do the latter, forget about the first suggestion, as the size of the journal will just be the size of /dev/bar. (Though I suspect it still won't use any more than 40GB, so on the bright side you don't need a big journal disk!) Needless to say, it's probably safest to try making any of these adjustments only after first unmounting the filesystem in question, and even though I'm pretty sure tune2fs would balk if asked to do anything destructive, for maximum safety it's a good idea to take a backup, or at least experiment on a throwaway filesystem first.
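Putting the external-journal suggestion together as one hedged sequence (/dev/foo, /dev/bar, and the mount point are placeholders from the text, not real device names; unmount first, keep a backup, or rehearse on a throwaway filesystem):

```shell
# Take the ext4 filesystem offline before touching its journal:
umount /dev/foo

# Format a dedicated journal device on the non-SMR drive; its block
# size must match the data filesystem's (4096 assumed here):
mke2fs -O journal_dev -b 4096 /dev/bar

# tune2fs refuses to attach a journal while one already exists,
# so drop the internal journal first, then attach the external one
# and enable data journaling as well as metadata:
tune2fs -O ^has_journal /dev/foo
tune2fs -J device=/dev/bar /dev/foo
tune2fs -o journal_data /dev/foo

mount /dev/foo /mnt/usb
```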
