Linux: how to change HDD state from read-only after a temporary crash?

Question

At this time there is no answer to this problem.

Usually, after some problem reading from or writing to a block device, the kernel decides to flag the WHOLE DEVICE as read-only. After that, any write to any partition or filesystem located on the device causes that filesystem to be switched to read-only as well, because no writes are possible.

Example from dmesg. This is a simulation with a Linux guest on Windows 8 under VirtualBox, while a defragmenter is working on the guest's disk image:

[11903.002030] ata3.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x6 frozen
[11903.003179] ata3.00: failed command: READ FPDMA QUEUED
[11903.003364] ata3.00: cmd 60/08:00:a8:77:57/00:00:00:00:00/40 tag 0 ncq 4096 in
[11903.003385]          res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[11903.004074] ata3.00: status: { DRDY }
[11903.004248] ata3: hard resetting link
[11903.325703] ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[11903.327097] ata3.00: configured for UDMA/133
[11903.328025] ata3.00: device reported invalid CHS sector 0
[11903.329664] ata3: EH complete
[11941.000472] ata3.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x6 frozen
[11941.000769] ata3.00: failed command: READ FPDMA QUEUED
[11941.000952] ata3.00: cmd 60/08:00:c8:77:57/00:00:00:00:00/40 tag 0 ncq 4096 in
[11941.000961]          res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[11941.001353] ata3.00: status: { DRDY }
[11941.001504] ata3: hard resetting link
[11941.320297] ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[11941.321252] ata3.00: configured for UDMA/133
[11941.321379] ata3.00: device reported invalid CHS sector 0
[11941.321553] ata3: EH complete
[11980.001746] ata3.00: exception Emask 0x0 SAct 0x11fff SErr 0x0 action 0x6 frozen
[11980.002070] ata3.00: failed command: WRITE FPDMA QUEUED
[11980.002255] ata3.00: cmd 61/18:00:28:23:59/00:00:00:00:00/40 tag 0 ncq 12288 out
[11980.002265]          res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
-------------------
There are many other errors, such as "lost write page", "Journal has aborted", "Buffer I/O error", "hard resetting link" and more.

After this, a remount fails with:

mount / -o remount,rw
mount: cannot remount block device /dev/sda1 read-write, is write-protected

because the WHOLE device sda, which holds the root filesystem sda1, is READ-ONLY.
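
Before changing anything, it can help to confirm that the block-device flag itself is set, and not only the filesystem's mount state. The following is a minimal sketch, assuming a disk named sda (a placeholder). The kernel exposes the flag in /sys/block/<dev>/ro, and blockdev --getro /dev/sda reports the same value (1 means read-only). The SYSFS_ROOT variable is a hypothetical override, added only so the function can be tried against a scratch directory.

```shell
#!/bin/sh
# Report the kernel's read-only flag for a whole block device via sysfs.
# SYSFS_ROOT is a hypothetical override for testing; it defaults to /sys.
ro_state() {
    dev="$1"                                   # e.g. sda (placeholder name)
    flag_file="${SYSFS_ROOT:-/sys}/block/$dev/ro"
    if [ ! -r "$flag_file" ]; then
        echo "unknown"
        return 1
    fi
    # The attribute holds 1 when the device is flagged read-only, 0 otherwise.
    if [ "$(cat "$flag_file")" = "1" ]; then
        echo "read-only"
    else
        echo "read-write"
    fi
}

# Example: ro_state sda
```

If this prints read-only while the hardware is known to be healthy, the situation matches the one described in this question.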

In my experience this occurs in the following situations:

1. The HDD is really damaged. The write errors returned depend on the HDD's condition.
2. The host machine is overloaded, so writes to the Linux guest's virtual HDD time out.
3. The FC cable or SAN device (a disk array over Fibre Channel) is overloaded.
4. The connection over FC or FCoE is lost momentarily, perhaps through a lost or timed-out FC packet.

In these situations the device really is read-write, but the Linux kernel marks it internally as read-only and treats it as read-only. This kernel behaviour exists to prevent damage, but it is useful only in case 1.

The question is: how do I manually tell the kernel that the HDD block device is operating normally?

Without this, the kernel serves the device as read-only, like a CD-ROM, and no other command has a chance of working properly, including mount/remount read-write, fsck and others.

Unusable answers, which really qualify as spam, from people who want to help but do not understand the nature of the problem:

Try remounting as read-write (impossible, the device is R-O)

fsck it (what for? the device is R-O, so no repair is possible)

'I don't know' (the first one that makes sense, but unusable)

'Replace your device' (usually the problem is something else)

Does anybody have a recipe for the question above? A way to switch the flag on a writable block device so that it reverts from the read-only to the read-write state? At this time it seems that nobody knows how.

There are some workarounds, but they are usually semi-usable or unusable:

1. Remove the module that provides access to the affected HDD or storage array. Unfortunately the damaged device usually holds the rootfs, or the driver serves both the damaged device and the device that holds the rootfs.
2. Remove FC access to the device and join it again (fctools). This is not always possible and does not always work.
3. Restart the WHOLE machine. Usually only this is always possible, and we are always forced to do it.

In points 1 and 2 we tell the kernel that we have completely disconnected the device and connected it again. The kernel treats this as a new, properly operating device being attached. We can simulate this with a USB device by momentarily removing its power. Point 3 is the last resort and usually works. But why should we have to restart everything? Unfortunately, in all three cases we lose all journal updates and dirty buffers.
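
The disconnect-and-reconnect idea in points 1 and 2 can also be sketched per device through the SCSI sysfs interface, without unloading a whole driver. This is only a sketch under assumptions: the disk is handled by the SCSI layer (SATA disks driven by libata are), it does not hold the rootfs, and nothing on it is mounted. The names sdb and host0 are placeholders, and SYSFS_ROOT is a hypothetical override for dry runs. As with the other workarounds, dirty buffers for the device are lost.

```shell
#!/bin/sh
# Detach a SCSI-layer disk and rescan its host so the kernel re-adds it.
# SYSFS_ROOT is a hypothetical override for dry runs; it defaults to /sys.

detach_disk() {
    dev_dir="${SYSFS_ROOT:-/sys}/block/$1/device"
    [ -d "$dev_dir" ] || { echo "no such device: $1" >&2; return 1; }
    # Writing 1 to "delete" asks the SCSI layer to remove the device.
    echo 1 > "$dev_dir/delete"
}

rescan_host() {
    host_dir="${SYSFS_ROOT:-/sys}/class/scsi_host/$1"
    [ -d "$host_dir" ] || { echo "no such host: $1" >&2; return 1; }
    # "- - -" is a wildcard for channel, target and LUN: scan everything.
    echo "- - -" > "$host_dir/scan"
}

# Example (as root, on an unmounted non-root disk):
#   detach_disk sdb && rescan_host host0
```

After a rescan the disk may come back under a different name, so check dmesg before touching it again.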

Note that in the same situations I have no problems with Windows (desktop and server).

Not an answer, but possibly related in case of #2 (high host load, guest hdd timeout): Increase the Linux hdd timeout to prevent filesystem corruption caused by hdd timeouts in guest system. — basic6, Commented Aug 22, 2016 at 10:57
@Znik, are these guest virtual machines running on Citrix XenServer? Or physical hardware? Our StorageServer bridges from the land of ethernet to land of mini-sas. When this bridge machine panics, it has to be forcefully rebooted. Windows guest VMs come back. Linux guest virtual machines exhibit the same exact problem you have. Nothing suggested here brings the mount points back to rw. — rjt, Commented Jul 23, 2017 at 3:42
@rjt, this occurs in many situations. The main one is when the device slows down extremely because of some problem, such as physical damage, device overload, cabling, external FC over Ethernet where the Ethernet is overloaded, sometimes a switch reset while a block is being transferred, a timeout, a lost packet, etc. The device usually remains visible but is marked as read-only. A reboot is not a solution; it is a workaround, as I described in the main question. — Znik, Commented Aug 4, 2017 at 7:29
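
As a rough illustration of the timeout suggestion in basic6's comment above: for disks handled by the SCSI layer, the per-command timeout is exposed in seconds at /sys/block/<dev>/device/timeout. The sketch below assumes a disk named sda and uses 180 seconds purely as an illustrative value; SYSFS_ROOT is a hypothetical override for dry runs. A value written this way does not survive a reboot.

```shell
#!/bin/sh
# Raise the SCSI command timeout for one disk.
# SYSFS_ROOT is a hypothetical override for dry runs; it defaults to /sys.
set_io_timeout() {
    dev="$1"
    secs="$2"
    # Accept only a non-empty string of digits.
    case "$secs" in
        ''|*[!0-9]*)
            echo "timeout must be a whole number of seconds" >&2
            return 1
            ;;
    esac
    timeout_file="${SYSFS_ROOT:-/sys}/block/$dev/device/timeout"
    [ -e "$timeout_file" ] || { echo "no timeout attribute for $dev" >&2; return 1; }
    echo "$secs" > "$timeout_file"
}

# Example (as root): set_io_timeout sda 180
```

This does not clear a read-only flag that is already set. It only makes it less likely that a slow host or array triggers the error handling in the first place.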

Jose Luis Martin · Accepted Answer · 2013-04-29 15:21:05Z

16

Try blockdev --setrw or hdparm -r 0.
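
A minimal sketch of how the blockdev variant might be applied, assuming the disk is /dev/sda with a partition /dev/sda1 (both placeholders). It uses only blockdev --getro and blockdev --setrw; the hdparm form would be hdparm -r0 /dev/sda. The helper name clear_ro is made up for this example.

```shell
#!/bin/sh
# Clear the kernel's read-only flag on each device given, then report it.
clear_ro() {
    for dev in "$@"; do
        # --getro prints 1 when the device is flagged read-only, 0 otherwise.
        if [ "$(blockdev --getro "$dev")" = "1" ]; then
            blockdev --setrw "$dev" || return 1
        fi
        printf '%s ro=%s\n' "$dev" "$(blockdev --getro "$dev")"
    done
}

# Example (as root): clear_ro /dev/sda /dev/sda1
```

Clearing the flag affects only the block layer. A filesystem whose journal has already aborted may still refuse a read-write remount, which matches what several commenters on this page report.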


Thanks, this should be useful. I'm waiting for the next timeout on the FC controller.
– Znik
Commented May 6, 2013 at 12:47
1

An important part that needs to be added: Sometimes it is necessary to do a fsck on the read-only file system, before it can be mounted again.
– anon
Commented Nov 20, 2016 at 16:08
3

Didn't work for me. I have a similar problem.
– Jono
Commented Apr 4, 2017 at 18:24
1

Did not work for me even with fsck. Citrix XenServer Linux guests.
– rjt
Commented Jul 23, 2017 at 3:52
Not working! These commands seem to take effect, but the dongle is still RO. (It is software, but from where?) If you want to try, take any Debian 9.4 ISO.
– Sandburg
Commented May 20, 2018 at 13:26


Roberto · Accepted Answer · 2016-05-24 20:40:43Z

7

As Jose Luis Martin suggested, use blockdev. My 2 cents: also remount read-write and force an fsck.

(assuming sda is your disk)

blockdev --setrw /dev/sda
mount /dev/sda -o remount,rw
touch /forcefsck


1

It makes more sense to just run fsck before the mount, as it will fail to mount without fsck. (At least in my case it did.)
– anon
Commented Nov 20, 2016 at 16:09
# blockdev --setrw /dev/xvda1
#
# touch /tmp/`date +%Y%m%d-%H%M%S`
touch: cannot touch '/tmp/20170722-221904': Read-only file system
#
# mount -o remount,rw /dev/xvda1
[137010.709883] EXT4-fs error (device xvda1): ext4_remount:4824: Abort forced by user
mount: cannot remount block device /dev/xvda1 read-write, is write-protected
– rjt
Commented Jul 23, 2017 at 3:20


UnX · Accepted Answer · 2013-05-01 22:18:03Z

3

Check this wiki page; it explains the errors thrown by libata:

https://ata.wiki.kernel.org/index.php/Libata_error_messages

From what I see above, you have a timeout issue, and according to that document:

Controller failed to respond to an active ATA command. This could be any number of causes. Most often this is due to an unrelated interrupt subsystem bug (try booting with 'pci=nomsi' or 'acpi=off' or 'noapic'), which failed to deliver an interrupt when we were expecting one from the hardware.

You may want to disable ACPI (check how to do that on your distro), or check your kernel for known bugs and possibly update it if it is not the latest (or downgrade it).


Yes, this really is a timeout. It usually occurs on the FC controller when the array device is overloaded. You're right, on the local ATA subsystem this is usually a hardware bug or a driver/chipset implementation issue.
– Znik
Commented May 6, 2013 at 12:49
So it's a timeout? Well, what does sudo hdparm -I /dev/sdX | grep locked say? It must say: 'not locked'. It showed these enigmatic timeouts in the past here whenever a HDD was locked by ATA password (due to a previous security erase and a system crash later which caused the security pw not to be cleared again). This password stuff really has a huge impact, also on your nerves.:) Even standard tools shipped by your HD drive vendor behave crazily, as if the HDD is about to die when the password is active. The culprit for countless tufts of hair torn out through the years.
– syntaxerror
Commented Dec 3, 2014 at 20:56


samson · Accepted Answer · 2021-03-18 17:09:59Z

0

This is not precisely the situation referred to in the question, but I came here through google and thought others might find my experience useful: HFS+ volumes will be mounted as read-only in Ubuntu derivatives (and probably other distros), so you'll have to disable journaling. There's a fix here which mostly worked! I needed access to a Mac to do it, though.


Chris Nzoka-okoye · Accepted Answer · 2022-07-23 07:46:38Z

Hello, the following commands can help. However, it is not safe to unmount or attempt to modify the root filesystem of a running system. Instead, boot the system from a bootable device.

Locate the drive on the system

$ mount | grep /dev/

Unmount the drive

$ sudo umount <your-mount-point-name>

Check and repair the file system with any of the following commands

For an ext4 device:

$ sudo fsck.ext4 -f /dev/sda1

For a DOS (FAT) device:

$ sudo dosfsck -a /dev/sda1

Or you can simply run the fsck command:

$ sudo fsck /dev/sda1

Remount the device

$ sudo mkdir -p /media/<your-mount-point-name>

This will create a new mount point. Then run:

$ sudo mount -o rw,uid=1000,gid=1000,user,exec,umask=003,blksize=4096 /dev/sdc1 /media/<your-mount-point-name>

You're good to go. For more details on the commands, you can check out Baeldung.

You say “boot the system from a bootable device.”   Well, the OP already said that rebooting fixes the problem, so you seem to misunderstand what the question is about. — G-Man Says 'Reinstate Monica', Commented Jul 23, 2022 at 23:14

awas · Accepted Answer · 2019-11-19 16:54:55Z

-1

Reboot into Windows 10, go to Power Options and turn off fast startup, then reboot into Linux. Bam, all is fine.

Fast startup in Windows 10 hibernates some files and leaves the drive partly in use, so Linux sees it as busy.


John · Accepted Answer · 2020-06-21 03:06:52Z

-2

After months of everything working fine, I had the problem of Linux sometimes seeing the disk drive shared with Windows as read-only. I eventually tracked down what had changed. FIX: turn off Windows 10 "Storage Sense", which automates disk cleanup.

