Description of problem:
The SATA emulator (ide-hd, ide-cd or ide-drive when used with the AHCI host bus adapter) can occasionally cause a guest to hang because of a race condition in the completion code in hw/ide/ahci.c.
Version-Release number of selected component (if applicable):
All versions of qemu since v0.14.0.
I do not presently know when SATA was first considered "supported" in a Red Hat product, but it was no earlier than 2.4.0.
How reproducible:
2.11 and prior: ~0%
2.12 and later: ~0% - 100%, depending on threads, timing, and guest operating system.
Steps to Reproduce:
1. Boot a guest (such as Windows 10) using -M q35 and an AHCI disk
2. Observe that the spinning dots loading animation will freeze for several seconds (around 10 to 12 seconds) before reaching the login screen.
Actual results:
- SATA performance is marred by occasional freezes, characterized by guest-driver errors. Linux may emit warnings in dmesg. Windows may freeze for 10-12 seconds at a time before attempting to reset the device.
Expected results:
- The SATA emulator, while slow, should not freeze or cause error messages or hangs in guest operating systems.
Additional info:
This is caused by a race condition where the PxCI register was not cleared prior to raising an IRQ upon AHCI command completion. Prior to v2.12.0, the timing for this was apparently not an issue, but changes in the locking primitives in 2.12.0 made the bug more likely to hit.
For a guest operating system to see the bug, the guest SATA driver must read the PxCI register in its interrupt handler and find that it has not yet been cleared. The driver may then opt to take corrective action, such as resetting the device.
See the launchpad for more information.
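The ordering problem described above can be illustrated with a small toy model. This is only a sketch under stated assumptions: `ToyAhciPort`, `complete_buggy`, `complete_fixed`, and `px_ci_seen_by_guest` are hypothetical names invented for illustration and are not taken from hw/ide/ahci.c. The guest interrupt handler is called synchronously here, which stands in for the worst-case interleaving in which the guest reads PxCI immediately after the IRQ is raised.

```python
class ToyAhciPort:
    """Toy stand-in for one AHCI port. Hypothetical; not QEMU's real structure."""

    def __init__(self, irq_handler):
        self.px_ci = 0                  # Command Issue bitmap: bit N set = slot N outstanding
        self.irq_handler = irq_handler  # models the guest driver's interrupt handler

    def issue(self, slot):
        # Guest marks a command slot as issued.
        self.px_ci |= 1 << slot

    def complete_buggy(self, slot):
        # Problematic order: the IRQ is raised while PxCI still shows the slot as busy.
        self.irq_handler(self)
        self.px_ci &= ~(1 << slot)

    def complete_fixed(self, slot):
        # Intended order: clear the slot in PxCI first, then raise the IRQ.
        self.px_ci &= ~(1 << slot)
        self.irq_handler(self)


def px_ci_seen_by_guest(fixed):
    """Return the PxCI value the toy guest handler observes for slot 0."""
    seen = []
    port = ToyAhciPort(irq_handler=lambda p: seen.append(p.px_ci))
    port.issue(0)
    if fixed:
        port.complete_fixed(0)
    else:
        port.complete_buggy(0)
    return seen[0]


if __name__ == "__main__":
    print("buggy order, guest sees PxCI =", px_ci_seen_by_guest(fixed=False))
    print("fixed order, guest sees PxCI =", px_ci_seen_by_guest(fixed=True))
```

In the buggy ordering the toy handler still sees the slot bit set, which is the condition a real guest driver may treat as an error. In the real emulator the outcome depends on thread timing, so the stale value is only seen some of the time.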
I tried 20 times and hit the issue once.
1. Test with qemu-kvm-rhev-2.12.0-1.el7
2. Boot a Windows 10 guest using -M q35 and an AHCI disk
3. Observe that the spinning dots loading animation will freeze for several seconds (around 10 to 12 seconds) before reaching the login screen.
4. The guest hangs for around 10 to 12 seconds. I can move the mouse, but everything needing disk access is unresponsive.
Thank you for testing and reproducing this. Sorry I was not able to give better reproduction instructions.
I tried personally with `./x86_64-softmmu/qemu-system-x86_64 -m 4096 -cpu host -M q35 -enable-kvm -smp 4 -drive id=sda,if=none,file=/home/bos/jhuston/windows_10.qcow -device ide-hd,drive=sda -qmp tcp::4444,server,nowait -snapshot`
and was able to reproduce it fairly often on my T460S laptop:
- Intel(R) Core(TM) i7-6600U CPU @ 2.60GHz
- MemTotal: 20423644 kB
I saw it most frequently on 2.12.0 upstream and did not test with our downstream product, but the underlying bug has existed for all versions of the AHCI emulator, so it may be more or less likely to trigger on various versions for various reasons.
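For anyone who wants to script repeated boot attempts, the command line above could be assembled as an argument list along these lines. This is a minimal sketch: `build_repro_argv` is a hypothetical helper, the image path is a placeholder to be replaced with a local Windows 10 image, and only the options quoted in this comment are used.

```python
import shlex


def build_repro_argv(image_path, qemu_binary="./x86_64-softmmu/qemu-system-x86_64"):
    """Build the QEMU argument list from the options quoted in this comment.

    image_path is a placeholder for a local guest disk image.
    """
    return [
        qemu_binary,
        "-m", "4096",
        "-cpu", "host",
        "-M", "q35",                   # q35 machine type provides the AHCI controller
        "-enable-kvm",
        "-smp", "4",
        "-drive", f"id=sda,if=none,file={image_path}",
        "-device", "ide-hd,drive=sda",  # ide-hd on q35 attaches to the AHCI bus
        "-qmp", "tcp::4444,server,nowait",
        "-snapshot",                   # do not write changes back to the image
    ]


if __name__ == "__main__":
    # Print the command rather than running it, so it can be inspected first.
    print(shlex.join(build_repro_argv("windows_10.qcow")))
```

The list form can be passed to `subprocess.run` unchanged, which avoids shell quoting issues if the image path contains spaces.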
Comment 10 Miroslav Rezanina 2018-07-24 14:22:29 UTC
Following Comment 4, I tried 30 times and did not hit the issue, so I am marking this bug as verified.
1. Test with qemu-kvm-rhev-2.12.0-8.el7
2. Boot a Windows 10 guest using -M q35 and an AHCI disk
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2018:3443