My Samsung 970 EVO M.2 500GB SSD (MZ-V7E500BW) suddenly failed yesterday during a power outage.

I now have a warning during POST ("WARNING! Please back up your data and replace your hard disk drive. WARNING! Your HDD/SSD might crash at any moment."). The last time I rebooted before this was about 5 days earlier, and the warning was not present then.

By booting a live USB stick I managed to check the SMART log:

Smart Log for NVME device:nvme0 namespace-id:ffffffff
critical_warning                        : 0x8
temperature                             : 49 C
available_spare                         : 29%
available_spare_threshold               : 10%
percentage_used                         : 0%
endurance group critical warning summary: 0
data_units_read                         : 4,948,748
data_units_written                      : 20,573,476
host_read_commands                      : 100,316,217
host_write_commands                     : 357,643,056
controller_busy_time                    : 1,790
power_cycles                            : 24
power_on_hours                          : 4,570
unsafe_shutdowns                        : 11
media_errors                            : 41
num_err_log_entries                     : 70
Warning Temperature Time                : 0
Critical Composite Temperature Time     : 0
Temperature Sensor 1           : 49 C
Temperature Sensor 2           : 74 C
Thermal Management T1 Trans Count       : 0
Thermal Management T2 Trans Count       : 0
Thermal Management T1 Total Time        : 0
Thermal Management T2 Total Time        : 0
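
The critical_warning value in this log is a bit field rather than a plain number. Below is a minimal sketch of decoding it, assuming the bit assignments of the SMART / Health Information log page in the NVMe base specification. The names `CRITICAL_WARNING_BITS` and `decode_critical_warning` are made up for illustration and are not part of nvme-cli.

```python
# Bit meanings of the NVMe SMART "critical_warning" byte, as assumed from
# the NVMe base specification (SMART / Health Information log page).
CRITICAL_WARNING_BITS = {
    0: "available spare capacity has fallen below the threshold",
    1: "a temperature is outside its threshold",
    2: "NVM subsystem reliability is degraded",
    3: "media has been placed in read-only mode",
    4: "volatile memory backup device has failed",
    5: "persistent memory region is read-only or unreliable",
}


def decode_critical_warning(value):
    """Return the description of every bit that is set in the warning byte."""
    return [text for bit, text in CRITICAL_WARNING_BITS.items()
            if value & (1 << bit)]


if __name__ == "__main__":
    # 0x8 is the value reported in the log above.
    for line in decode_critical_warning(0x8):
        print(line)
```

Under that assumption, 0x8 sets only bit 3, which would mean the controller has switched the media to read-only mode. Bit 0 is clear, which is consistent with available_spare (29%) still being above its threshold (10%).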

Messages from the kernel mentioning nvme during startup of the live USB OS:

Oct 26 19:18:58 ubuntu kernel: [    1.233479] nvme nvme0: pci function 0000:06:00.0
Oct 26 19:18:58 ubuntu kernel: [    1.243303] nvme nvme0: missing or invalid SUBNQN field.
Oct 26 19:18:58 ubuntu kernel: [    1.243323] nvme nvme0: Shutdown timeout set to 8 seconds
Oct 26 19:18:58 ubuntu kernel: [    1.252449] nvme nvme0: 4/0/0 default/read/poll queues
Oct 26 19:18:58 ubuntu kernel: [    1.254855]  nvme0n1: p1 p2 p3
Oct 26 19:18:58 ubuntu kernel: [    3.629244] EXT4-fs (nvme0n1p2): INFO: recovery required on readonly filesystem
Oct 26 19:18:58 ubuntu kernel: [    3.629246] EXT4-fs (nvme0n1p2): write access will be enabled during recovery
Oct 26 19:18:58 ubuntu kernel: [    3.674861] blk_update_request: critical medium error, dev nvme0n1, sector 124928 op 0x1:(WRITE) flags 0x800 phys_seg 4 prio class 0
Oct 26 19:18:58 ubuntu kernel: [    3.674893] Buffer I/O error on dev nvme0n1p2, logical block 0, lost async page write
Oct 26 19:18:58 ubuntu kernel: [    3.674913] Buffer I/O error on dev nvme0n1p2, logical block 1, lost async page write
Oct 26 19:18:58 ubuntu kernel: [    3.674931] Buffer I/O error on dev nvme0n1p2, logical block 2, lost async page write
Oct 26 19:18:58 ubuntu kernel: [    3.674949] Buffer I/O error on dev nvme0n1p2, logical block 3, lost async page write
Oct 26 19:18:58 ubuntu kernel: [    3.674967] blk_update_request: critical medium error, dev nvme0n1, sector 133200 op 0x1:(WRITE) flags 0x800 phys_seg 1 prio class 0
Oct 26 19:18:58 ubuntu kernel: [    3.674995] Buffer I/O error on dev nvme0n1p2, logical block 1034, lost async page write
Oct 26 19:18:58 ubuntu kernel: [    3.675013] blk_update_request: critical medium error, dev nvme0n1, sector 133384 op 0x1:(WRITE) flags 0x800 phys_seg 1 prio class 0
Oct 26 19:18:58 ubuntu kernel: [    3.675040] Buffer I/O error on dev nvme0n1p2, logical block 1057, lost async page write
Oct 26 19:18:58 ubuntu kernel: [    3.675059] blk_update_request: critical medium error, dev nvme0n1, sector 147176 op 0x1:(WRITE) flags 0x800 phys_seg 1 prio class 0
Oct 26 19:18:58 ubuntu kernel: [    3.675086] Buffer I/O error on dev nvme0n1p2, logical block 2781, lost async page write
Oct 26 19:18:58 ubuntu kernel: [    3.675105] blk_update_request: critical medium error, dev nvme0n1, sector 4319360 op 0x1:(WRITE) flags 0x800 phys_seg 1 prio class 0
Oct 26 19:18:58 ubuntu kernel: [    3.675132] Buffer I/O error on dev nvme0n1p2, logical block 524304, lost async page write
Oct 26 19:18:58 ubuntu kernel: [    3.675151] blk_update_request: critical medium error, dev nvme0n1, sector 4319488 op 0x1:(WRITE) flags 0x800 phys_seg 1 prio class 0
Oct 26 19:18:58 ubuntu kernel: [    3.675178] Buffer I/O error on dev nvme0n1p2, logical block 524320, lost async page write
Oct 26 19:18:58 ubuntu kernel: [    3.675197] blk_update_request: critical medium error, dev nvme0n1, sector 4319544 op 0x1:(WRITE) flags 0x800 phys_seg 2 prio class 0
Oct 26 19:18:58 ubuntu kernel: [    3.675224] Buffer I/O error on dev nvme0n1p2, logical block 524327, lost async page write
Oct 26 19:18:58 ubuntu kernel: [    3.675243] blk_update_request: critical medium error, dev nvme0n1, sector 4319816 op 0x1:(WRITE) flags 0x800 phys_seg 1 prio class 0
Oct 26 19:18:58 ubuntu kernel: [    3.675270] blk_update_request: critical medium error, dev nvme0n1, sector 4320256 op 0x1:(WRITE) flags 0x800 phys_seg 1 prio class 0
Oct 26 19:18:58 ubuntu kernel: [    3.675297] blk_update_request: critical medium error, dev nvme0n1, sector 4320936 op 0x1:(WRITE) flags 0x800 phys_seg 1 prio class 0
Oct 26 19:18:58 ubuntu kernel: [    3.729319] EXT4-fs (nvme0n1p2): error loading journal
Oct 26 19:18:58 ubuntu kernel: [    3.743157] EXT4-fs (nvme0n1p3): INFO: recovery required on readonly filesystem
Oct 26 19:18:58 ubuntu kernel: [    3.743158] EXT4-fs (nvme0n1p3): write access will be enabled during recovery
Oct 26 19:18:58 ubuntu kernel: [    3.806113] EXT4-fs (nvme0n1p3): error loading journal
Oct 26 19:19:04 ubuntu kernel: [   30.724414] blk_update_request: critical medium error, dev nvme0n1, sector 0 op 0x1:(WRITE) flags 0x800 phys_seg 0 prio class 0
Oct 26 19:19:04 ubuntu kernel: [   30.752254] blk_update_request: critical medium error, dev nvme0n1, sector 0 op 0x1:(WRITE) flags 0x800 phys_seg 0 prio class 0
Oct 26 19:19:05 ubuntu kernel: [   31.346630] blk_update_request: critical medium error, dev nvme0n1, sector 0 op 0x1:(WRITE) flags 0x800 phys_seg 0 prio class 0
Oct 26 19:19:05 ubuntu kernel: [   31.365831] blk_update_request: critical medium error, dev nvme0n1, sector 0 op 0x1:(WRITE) flags 0x800 phys_seg 0 prio class 0
Oct 26 19:19:29 ubuntu kernel: [   55.502099] blk_update_request: critical medium error, dev nvme0n1, sector 0 op 0x1:(WRITE) flags 0x800 phys_seg 0 prio class 0
Oct 26 19:19:29 ubuntu kernel: [   55.516704] blk_update_request: critical medium error, dev nvme0n1, sector 0 op 0x1:(WRITE) flags 0x800 phys_seg 0 prio class 0
Oct 26 19:24:44 ubuntu kernel: [  370.116101] blk_update_request: critical medium error, dev nvme0n1, sector 0 op 0x1:(WRITE) flags 0x800 phys_seg 0 prio class 0
Oct 26 19:24:44 ubuntu kernel: [  370.130330] blk_update_request: critical medium error, dev nvme0n1, sector 0 op 0x1:(WRITE) flags 0x800 phys_seg 0 prio class 0
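
The blk_update_request lines give sector numbers on the whole device, while the Buffer I/O lines give logical blocks inside the partition. A small sketch of how the two relate, assuming 512-byte sectors, 4096-byte logical blocks, and that nvme0n1p2 starts at sector 124928. All three values are inferred from the pairs of lines in this log, not read from the partition table.

```python
SECTOR_SIZE = 512          # assumed bytes per sector in the kernel messages
BLOCK_SIZE = 4096          # assumed size of a "logical block" in the Buffer I/O lines
P2_START_SECTOR = 124928   # assumed start of nvme0n1p2, inferred from the log


def sector_to_logical_block(sector, part_start=P2_START_SECTOR):
    """Map a whole-device sector number to a logical block in the partition."""
    return (sector - part_start) * SECTOR_SIZE // BLOCK_SIZE


if __name__ == "__main__":
    # Sectors taken from the blk_update_request lines above.
    for sector in (124928, 133200, 133384, 147176, 4319360, 4319488, 4319544):
        print(sector, "->", sector_to_logical_block(sector))
```

With these assumptions the computed blocks match the ones the kernel reports (0, 1034, 1057, 2781, 524304, 524320, 524327), so every failed write in that burst falls inside nvme0n1p2.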

Thanks to ddrescue I managed to clone all of its partitions to a different machine over the network. There were IO errors while extracting both ext4 partitions but with enough retries it eventually got everything.
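
The idea of getting everything "with enough retries" can be illustrated with a simplified multi-pass copy loop. This is only a sketch of the general approach and not ddrescue's actual algorithm. `read_block` is a hypothetical callable standing in for a read from the failing drive.

```python
def copy_with_retries(read_block, n_blocks, max_retries=3):
    """Read blocks 0..n_blocks-1, retrying failed ones in later passes.

    read_block(i) is a hypothetical callable that returns the bytes of
    block i or raises OSError on a read failure.
    Returns (blocks, bad), where bad lists blocks that never read cleanly.
    """
    blocks = {}
    pending = list(range(n_blocks))
    for _ in range(max_retries + 1):   # first pass plus max_retries retry passes
        still_bad = []
        for i in pending:
            try:
                blocks[i] = read_block(i)
            except OSError:
                still_bad.append(i)    # leave it for a later pass
        pending = still_bad
        if not pending:
            break
    return blocks, pending
```

Good blocks are copied once and only the failing ones are revisited, which keeps the number of reads on a dying drive low.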

After that I was able to run e2fsck on the images, which appeared to succeed, and now I can mount them as read-only loop devices. Data appears to be intact.

I suppose the first question is: is there anything I can do to fix whatever the problem is and keep using this drive? I'm assuming not, but I'm definitely open to suggestions.

If I try to run fsck on one of the partitions from the live USB, this is what happens. I tried all combinations of answers to the questions as you'll see below. I can't understand enough of the manual pages and don't know enough about filesystems or drives to know what options, if any, might help me.

ubuntu@ubuntu:~$ sudo fsck /dev/nvme0n1p3
fsck from util-linux 2.36.1
e2fsck 1.46.3 (27-Jul-2021)
/dev/nvme0n1p3: recovering journal
Superblock needs_recovery flag is clear, but journal has data.
Run journal anyway<y>? yes
fsck.ext4: Input/output error while recovering journal of /dev/nvme0n1p3
fsck.ext4: unable to set superblock flags on /dev/nvme0n1p3


/dev/nvme0n1p3: ********** WARNING: Filesystem still has errors **********

ubuntu@ubuntu:~$ sudo fsck /dev/nvme0n1p3
fsck from util-linux 2.36.1
e2fsck 1.46.3 (27-Jul-2021)
/dev/nvme0n1p3: recovering journal
Superblock needs_recovery flag is clear, but journal has data.
Run journal anyway<y>? no
Clear journal<y>? no
fsck.ext4: Input/output error while recovering journal of /dev/nvme0n1p3
fsck.ext4: unable to set superblock flags on /dev/nvme0n1p3


/dev/nvme0n1p3: ********** WARNING: Filesystem still has errors **********

ubuntu@ubuntu:~$ sudo fsck /dev/nvme0n1p3
fsck from util-linux 2.36.1
e2fsck 1.46.3 (27-Jul-2021)
/dev/nvme0n1p3: recovering journal
Superblock needs_recovery flag is clear, but journal has data.
Run journal anyway<y>? no
Clear journal<y>? yes
fsck.ext4: Input/output error while recovering journal of /dev/nvme0n1p3
fsck.ext4: unable to set superblock flags on /dev/nvme0n1p3


/dev/nvme0n1p3: ********** WARNING: Filesystem still has errors **********

ubuntu@ubuntu:~$ 

I believe the drive is still under warranty, and I'm trying to get in contact with Samsung support to try to get a replacement or refund.

If they ask me to send it back, that's going to pose a problem since there's sensitive data on this drive.

The drive resists all attempts to write to it. I can't mount it and write to it normally. The kernel emits IO errors if I try to write to it at the block level. Even Samsung's secure erase tool (their Windows-only software offers to produce a bootable USB drive with such a tool) fails.

Is there some way to force secure erasure of this device?

5
  • Does your computer's BIOS have an option to secure erase the drive? Failing that, hdparm does have some security erase options.
    – Bravo
    Commented Oct 26, 2021 at 23:32
  • I found a secure erase tool in there but it doesn't do NVMe. I also found an old Windows hard drive, put that in, installed Samsung's software, and hoped that would let me clear the "read only" flag but nope. It also has a firmware update option, which fails (possibly because the drive is read-only?), and it also offers to make a bootable USB stick with an NVMe secure eraser tool. I did this, and that too fails, presumably also because the drive is read-only. So I have to ask now... what's the best way to destroy the NVMe without signs of physical damage...?
    – tremby
    Commented Oct 27, 2021 at 0:10
  • I can look into hdparm but in all the hoops I've jumped through today I've overwritten my Linux live USB. Writing it again now...
    – tremby
    Commented Oct 27, 2021 at 0:13
  • I'm still waiting for a response from Samsung about whether there's some way I can reset the flag, but in the meantime I got permission from Amazon, to whom I'm returning the item, to physically damage it, and they say they'll still refund me. I'll give Samsung one more day, then I guess I'll pull some chips off the thing.
    – tremby
    Commented Oct 28, 2021 at 22:25
  • guess that's easier than bombarding it with cosmic radiation :D
    – Bravo
    Commented Oct 28, 2021 at 23:32

2 Answers

0

You can never be too careful: I have managed to recover many files from a file system attacked by the Chernobyl virus!

To erase data, you can run dd bs=1M if=/dev/zero of=/dev/… but if it stops somewhere, you may have to restart it with the seek option to skip past some blocks.

With NVRAM, this may not actually erase some blocks, because the drive can reallocate them instead, but reading the unerased blocks would require really low-level access. That is OK for personal sensitive data, but not for military secrets!
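
In dd, seek= skips blocks of the output while skip= skips blocks of the input, so resuming a zero-fill part-way through a device uses seek. A rough Python sketch of that behaviour follows, assuming a Unix-like system. `zero_fill` is a made-up helper name, and it should only ever be pointed at something you really want to wipe.

```python
import os


def zero_fill(path, block_size=1024 * 1024, seek_blocks=0):
    """Overwrite `path` with zeros starting at block `seek_blocks`.

    Roughly what `dd if=/dev/zero of=path bs=block_size seek=seek_blocks`
    does on a block device: nothing before the seek offset is touched.
    Returns the number of bytes written.
    """
    zeros = bytes(block_size)
    fd = os.open(path, os.O_WRONLY)
    try:
        end = os.lseek(fd, 0, os.SEEK_END)                    # size of file or device
        pos = os.lseek(fd, seek_blocks * block_size, os.SEEK_SET)
        written = 0
        while pos < end:
            chunk = zeros[: min(block_size, end - pos)]
            n = os.write(fd, chunk)                            # may raise OSError on I/O errors
            pos += n
            written += n
        os.fsync(fd)                                           # push the data to the device
        return written
    finally:
        os.close(fd)
```

If a write fails at some block, the call can be repeated with a larger `seek_blocks` to continue past it, which is the restart described above.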

5
  • Trying to write to it with dd exits with a success code, but the kernel emits errors like blk_update_request: critical medium error, dev nvme0n1, sector 0 op 0x1:(WRITE) flags 0x800 phys_seg 0 prio class 0. If I run sync and then read off the first bytes, I can see that they were not changed.
    – tremby
    Commented Oct 26, 2021 at 21:20
  • How does this answer the OP's question?
    Commented Oct 27, 2021 at 1:45
  • You can try dd with the seek=… option in case only the first sectors can't be written. @LinuxSecurityFreak: the OP asked for a way to erase a file system which can't be mounted. I answered with dd, which is typically a program designed to do this kind of job.
    Commented Oct 27, 2021 at 6:26
  • Yeah, I don't think it was necessarily a bad answer. It didn't help in my case (doesn't help with seek either), and my clarification in my question about the drive resisting all attempts to write to it was added after the answer was given. I don't think it deserves downvotes!
    – tremby
    Commented Oct 28, 2021 at 22:27
  • Did you install nvme-cli? If so, you can try the sanitize command, which totally and quickly erases the entire drive. wiki.archlinux.org/title/Solid_state_drive/Memory_cell_clearing
    – oldfred
    Commented Jul 6, 2023 at 21:46
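
The check described in the first comment above (write, sync, then read the bytes back) can be sketched as follows, assuming a Unix-like system. `write_and_verify` is a made-up helper name, and the sketch is meant to be tried on a scratch file first.

```python
import os


def write_and_verify(path, offset, data):
    """Write `data` at `offset`, flush it, then read it back and compare.

    Returns True only if the bytes read back equal the bytes written.
    On a failing device, os.pwrite or os.fsync may raise OSError instead.
    """
    fd = os.open(path, os.O_RDWR)
    try:
        os.pwrite(fd, data, offset)
        os.fsync(fd)                      # comparable to running `sync` after dd
        return os.pread(fd, len(data), offset) == data
    finally:
        os.close(fd)
```

A success code from the write alone proves little, as the dd run above shows. On a real block device the read-back may also be served from the page cache, so a True result is best treated as necessary rather than sufficient.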
0

The only thing that might work is the "secure erase" command. However, SSDs that have gone read-only because of media defects normally cannot be written to at all: the internal firmware cannot record new writes anywhere without risking the loss of the remaining data (which is what you want, but most users want to rescue what they can).
