
This command:

badblocks -svn /dev/sda

What does it do? Does it just report the bad blocks? Or does it somehow handle the bad blocks so that I don't need to be worried about them?

I read the manual (man badblocks), but I don't understand the -n option:


       -s     Show  the  progress  of the scan by writing out rough percentage completion of
              the current badblocks pass over the disk.  Note that badblocks may do multiple
              test  passes  over the disk, in particular if the -p or -w option is requested
              by the user.


       -v     Verbose mode.  Will write the number of read errors, write  errors  and  data-
              corruptions to stderr.


       -n     Use  non-destructive read-write mode.  By default only a non-destructive read-
              only test is done.  This option must not be combined with the  -w  option,  as
              they are mutually exclusive.

The output of running badblocks -svn /dev/sda, which took almost two days:

[Screenshot: badblocks output]

Update

Some posts suggest that after running badblocks -svn /dev/sda, the hard disk controller takes care of the bad blocks. I'm not sure about that.

"... to have the hard disk controller replace bad blocks by spare blocks."

https://askubuntu.com/a/490552/507217

"If you have fully processed your disk this way, the disk controller should have replaced all bad blocks by working ones and the reallocated count will be increased in the SMART log."

https://askubuntu.com/a/490549/507217

SMART

After running the badblocks command, I checked the SMART table with:

smartctl --all /dev/sda

Note that the Current_Pending_Sector raw value is 56, exactly twice the 28 bad blocks reported by badblocks, so the two numbers are probably related.
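One possible explanation for the factor of two is a unit mismatch: badblocks counts blocks, while SMART counts sectors. The sketch below assumes badblocks ran with its default block size of 1024 bytes (no -b was given) and that the drive reports 512-byte logical sectors. The sector size is an assumption, not something the output above confirms.

```shell
#!/bin/sh
# Convert a badblocks block count into a sector count.
# Assumptions: default badblocks block size (1024 bytes) and
# 512-byte logical sectors on the drive.
bad_blocks=28
block_size=1024
sector_size=512

bad_sectors=$(( bad_blocks * block_size / sector_size ))
echo "$bad_sectors"
```

Under those assumptions each bad block covers two sectors, which would match the pending-sector count reported by SMART.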

[Screenshot: smartctl output]

Error interpretation

According to this:

How to interpret badblocks output

The badblocks error summary has the form read/write/compare. In my case, all 28 errors are read errors, meaning no application can read those blocks.

OS logs

I looked at the OS logs with sudo journalctl -xe. smartd is indeed reporting errors about those 56 bad sectors (28 bad blocks):

smartd[1243]: Device: /dev/sda [SAT], 56 Currently unreadable (pending) sectors

[Screenshot: journalctl log]

Conclusion

I'd rather back up the data and replace the hard disk before it's too late.

1 Answer

The "non-destructive read-write mode" triggered by the -n option writes test data to each block, just like -w, and forces the disk either to accept the write, to reallocate a faulty block, or to return a write error.

However, its big win is that it first reads the block it's about to overwrite, and re-writes that data after the test data has been written. This means that after badblocks has completed, the disk should contain the same data as it did before it started running.

Process

  1. Read block and save
  2. Write block of test data
  3. Capture status result and report if necessary
  4. Rewrite saved block
  5. Repeat with next block until done
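The steps above can be sketched with dd on a scratch file standing in for the disk. This is only an illustration of the idea, not how badblocks is implemented: the variable names, the 0xAA test pattern and the four-block image are all made up for the example, and the read-back comparison in step 3 is an assumption about what "capture status" involves.

```shell
#!/bin/sh
# Non-destructive read-write test on a scratch FILE (never a real disk here).
bs=1024      # block size in bytes
count=4      # number of blocks in the scratch image

img=$(mktemp); saved=$(mktemp); pattern=$(mktemp); readback=$(mktemp)

# Scratch "disk" filled with the letter A; one test block filled with 0xAA.
dd if=/dev/zero bs="$bs" count="$count" 2>/dev/null | tr '\0' '\101' > "$img"
dd if=/dev/zero bs="$bs" count=1 2>/dev/null | tr '\0' '\252' > "$pattern"
before=$(cksum < "$img")

bad=0
i=0
while [ "$i" -lt "$count" ]; do
    # 1. Read block and save; an unreadable block is reported, not written.
    if ! dd if="$img" of="$saved" bs="$bs" skip="$i" count=1 2>/dev/null; then
        echo "block $i: read error"
        bad=$((bad + 1)); i=$((i + 1)); continue
    fi
    # 2. Write block of test data in place.
    dd if="$pattern" of="$img" bs="$bs" seek="$i" count=1 conv=notrunc 2>/dev/null
    # 3. Read it back and report a mismatch.
    dd if="$img" of="$readback" bs="$bs" skip="$i" count=1 2>/dev/null
    cmp -s "$pattern" "$readback" || { echo "block $i: compare error"; bad=$((bad + 1)); }
    # 4. Rewrite saved block.
    dd if="$saved" of="$img" bs="$bs" seek="$i" count=1 conv=notrunc 2>/dev/null
    # 5. Next block.
    i=$((i + 1))
done

after=$(cksum < "$img")
echo "bad blocks: $bad"
[ "$before" = "$after" ] && echo "contents unchanged"
rm -f "$img" "$saved" "$pattern" "$readback"
```

On a healthy scratch file every block passes and the checksum before and after is identical, which is the property that makes -n "non-destructive".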

Caveat

Writing a good block of data to a disk results in the expected operation: the block is written. However, if the write fails, the disk firmware will automatically and transparently remap the block address to one of its spare blocks and retry the write for you at that new location on the disk. Provided that write is successful, you won't notice anything different and the disk will seem perfectly normal. (In the SMART table, the Reallocated_Sector_Ct raw value will increase by one.) As time progresses, the set of spare blocks may eventually get used up, and from that point on disk writes that would have been remapped will simply fail.
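To keep an eye on remapping, the raw SMART counters can be pulled out of the attribute table. A minimal sketch, assuming the usual ten-column layout printed by smartctl -A, where the attribute name is column 2 and the raw value is column 10; the helper name smart_raw is hypothetical.

```shell
#!/bin/sh
# Print the raw value of one SMART attribute.
# Reads `smartctl -A` output on stdin; $1 is the attribute name.
# Assumes the name is in column 2 and the raw value in column 10.
smart_raw() {
    awk -v name="$1" '$2 == name { print $10 }'
}

# Usage (needs root and a real device, so it is left commented out):
#   smartctl -A /dev/sda | smart_raw Reallocated_Sector_Ct
#   smartctl -A /dev/sda | smart_raw Current_Pending_Sector
```

A Reallocated_Sector_Ct that keeps growing between runs suggests the spare pool is being consumed.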

A full disk write test such as one provided by badblocks with either -w or -n will force writes to all disk blocks, ensuring that they are all available to you, or else highlighting disk blocks that cannot be remapped.

Notice that badblocks does not guarantee you haven't lost data: if it cannot read a block it cannot rewrite it after the test, so it doesn't perform the write test (but does report the block as bad). If badblocks cannot read a block then neither would any other application have been able to do so, and your data is lost.

My recommendation would be that if you get any disk blocks that cannot be remapped you replace the disk as soon as possible because you no longer have any safety net. (Personally, I would replace such a disk before reaching this stage.) The ddrescue tool may help in copying data from this broken disk to a new one.

  • 1
    The real question is, what causes badblocks -svn to report a bad block? I haven’t checked, but I imagine that if it fails the read before the test write, then presumably nothing is written, so blocks which can’t be read aren’t reallocated either, and they will still cause trouble in the future. Commented Dec 21, 2021 at 10:06
  • 1
    @user3405291 I've added some tangential explanation for you. Hope this is useful Commented Dec 21, 2021 at 10:48
  • 1
    @roaima Thanks. I'm going to backup the data and replace the hard disk before it's too late.
    – Megidd
    Commented Dec 21, 2021 at 11:15
  • 1
    @StephenKitt you can also have the situation where the read succeeds but the write fails, losing data that was otherwise just about viable. There's no easy way out at this point. But please do edit my answer if you think it's unclear Commented Dec 21, 2021 at 12:06
  • 1
    @roaima yes, and when badblocks writes the original data back, failures are actually ignored AFAICT... Commented Dec 21, 2021 at 12:15
