
I have a Windows 10 VM set up using Virtual Machine Manager (i.e. libvirtd) that is fully configured and works. The primary drive it uses is a qcow2 file.

Now I am trying to attach a physical SATA drive as a second drive. The VM will not boot off of the second disk; it only needs to read from and write to it well after booting. The second disk has multiple partitions.

I have added the "SATA Disk 2" device with a source path of /dev/disk/by-path/pci-0000:00:1f.2-ata-5 so that it is uniquely identifiable (though the same problem occurs with /dev/sdc or /dev/sdd as the source). The device is not mounted on the Linux host.
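
For reference, the relevant disk definition in the domain XML looks something like the sketch below (the target name sdb and the raw driver type are assumptions based on the "SATA Disk 2" label; only the source path is from my actual setup):

    <disk type='block' device='disk'>
      <!-- pass the whole physical disk through as a raw block device -->
      <driver name='qemu' type='raw'/>
      <source dev='/dev/disk/by-path/pci-0000:00:1f.2-ata-5'/>
      <!-- guest-side name; unrelated to the host's sdc/sdd naming -->
      <target dev='sdb' bus='sata'/>
    </disk>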

A few seconds after the VM begins to boot, the Linux host plays the sounds for an external drive being detached and reattached. Checking /dev shows that the sdc device is gone and a new sdd device has appeared (or sdd becomes sdc), and the symlink in /dev/disk/by-path/pci-... now points to the "newly attached" device.
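
To watch this happen from the host side, a udev monitor session along these lines (an observation sketch, not part of my setup) shows the remove and add events as the VM boots:

    udevadm monitor --kernel --subsystem-match=block

Each detach/reattach appears as a burst of remove events for sdc and its partitions, followed by add events for the new sdd device.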

The detaching and reattaching of the drive usually causes the Windows 10 guest machine to "segfault" during boot ("segfault" is probably not the right term here; QEMU moves the machine to the paused state and it cannot be resumed).
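
When this happens, the paused state and its reason can be inspected with virsh; the domain name win10 here is a placeholder:

    virsh domstate win10 --reason    # typically reports something like: paused (ioerror)
    # the QEMU log usually names the block device that hit the error
    sudo tail -n 50 /var/log/libvirt/qemu/win10.log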

One time the Windows 10 guest machine did make it past boot (even though the disk still detached and reattached) and partially recognized the device: it showed up in Disk Management with all of its partitions, and the drive letters were even created in Explorer, but the drives could not be opened or worked with in any way.

Questions:

  • What could be causing this behavior?
    • Could it be the Linux file manager (Nemo)?
    • Or possibly a systemd service (i.e. something managed by systemctl)?
  • How can I prevent this from happening?

Edit: adding more information based on the comments.

lsblk confirms that no partitions on the disk are mounted.
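
For completeness, the checks were along these lines (device names are whichever the disk currently has, sdc here):

    # no MOUNTPOINT should be listed for the disk or any of its partitions
    lsblk -o NAME,MOUNTPOINT /dev/sdc
    # findmnt prints nothing when a partition is not mounted
    findmnt --source /dev/sdc1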

dmesg does show several errors and messages. The following block is repeated 4 times (note this run was going from sdd to sdc), with a few differences: the third time the kernel limits the link to 1.5 Gbps, and the fourth time it brings the link completely down, disables it, and a bit later brings it back up from the disabled state as sdc:

[  762.078570] ata5.00: exception Emask 0x10 SAct 0x4000000 SErr 0x400101 action 0x6 frozen
[  762.078574] ata5.00: irq_stat 0x0c000000, interface fatal error
[  762.078575] ata5: SError: { RecovData UnrecovData Handshk }
[  762.078577] ata5.00: failed command: WRITE FPDMA QUEUED
[  762.078578] ata5.00: cmd 61/10:d0:00:e8:2e/00:00:00:00:00/40 tag 26 ncq dma 8192 out
                        res 40/00:d0:00:e8:2e/00:00:00:00:00/40 Emask 0x10 (ATA bus error)
[  762.078591] ata5.00: status: { DRDY }
[  762.078594] ata5: hard resetting link
[  762.388642] ata5: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[  762.389395] ata5.00: ACPI cmd ef/10:06:00:00:00:00 (SET FEATURES) succeeded
[  762.389401] ata5.00: ACPI cmd f5/00:00:00:00:00:00 (SECURITY FREEZE LOCK) filtered out
[  762.389404] ata5.00: ACPI cmd b1/c1:00:00:00:00:00 (DEVICE CONFIGURATION OVERLAY) filtered out
[  762.390476] ata5.00: ACPI cmd ef/10:06:00:00:00:00 (SET FEATURES) succeeded
[  762.390479] ata5.00: ACPI cmd f5/00:00:00:00:00:00 (SECURITY FREEZE LOCK) filtered out
[  762.390481] ata5.00: ACPI cmd b1/c1:00:00:00:00:00 (DEVICE CONFIGURATION OVERLAY) filtered out
[  762.390904] ata5.00: configured for UDMA/100
[  762.390918] ata5: EH complete
  • It sounds like "attaching" it to the VM somehow caused the drive to detach from the host (which should not happen normally). Have you checked dmesg to see whether it somehow triggered an ATA error or so? And are you sure no filesystem on the drive was mounted on the host? (Confirmed with lsblk?)
    – Tom Yan
    Commented Jun 24, 2021 at 23:16
  • Does e.g. sudo cat /dev/disk/by-path/pci-0000:00:1f.2-ata-5 > /dev/null trigger a similar problem? I wonder if it has anything to do with virtualization/emulation at all.
    – Tom Yan
    Commented Jun 25, 2021 at 17:41
  • No, it does not. Also, the only other page I can find with the exact same first error line (same Emask, SAct, and SErr) is lists.debian.org/debian-kernel/2017/04/msg00340.html, which also involves Qemu/KVM. It has other similarities as well, except that it is from 2017 and running kernel 3.16.0 (which was the issue there), while I am running 5.12.5.
    – thaimin
    Commented Jun 25, 2021 at 17:51
  • Does your setting look something like this? Have you tried different advanced options?
    – Tom Yan
    Commented Jun 25, 2021 at 18:01
  • Also see if booting the host with 5:noncq or 5:1.5 set in the libata.force= kernel parameter helps (a sketch of this appears after these comments).
    – Tom Yan
    Commented Jun 25, 2021 at 18:17
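
A sketch of the last suggestion, assuming a GRUB-based host (the port number 5 matches the ata5 lines in the dmesg output above; note the kernel documentation spells the speed limit as 1.5Gbps):

    # in /etc/default/grub, disable NCQ on ATA port 5 only:
    GRUB_CMDLINE_LINUX_DEFAULT="quiet splash libata.force=5:noncq"
    # or limit the link speed instead:
    # GRUB_CMDLINE_LINUX_DEFAULT="quiet splash libata.force=5:1.5Gbps"

    sudo update-grub    # grub2-mkconfig -o /boot/grub2/grub.cfg on some distros
    sudo reboot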
