Description of problem:
I tried to install Fedora 10 as a guest on a RHEL-5 host. The installation process froze at the end, which I believe to be a consequence of this error (as it was in the bootloader installation phase). Despite that, I am able to boot the guest system, but I am getting a lot of ATA errors.

Version-Release number of selected component (if applicable):
kernel-xen-2.6.18-128.el5
xen-3.0.3-80.el5

How reproducible:
always

Steps to Reproduce:
1. (run xen enabled system)
2. virt-install -n F10 -r 512 -f F10.img -s 10 --vnc --hvm -c ./boot.iso
3. perform the default installation
4. reboot the guest

Actual results:
the guest console is flooded with repetitions of the following error message:
ata2: soft resetting link
ata2.00: configured for MWDMA2
ata2: EH complete
ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
ata2.00: cmd a0/00:00:00:00:00/00:00:00:00:00/a0 tag 0
         cdb 1e 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
         res 41/20:03:00:00:00/00:00:00:00:00/a0 Emask 0x3 (HSM violation)
ata2.00: status: { DRDY ERR }
ata2: soft resetting link

Expected results:
no errors occur

Additional info:
the physical hardware is Dell Precision 490
what I find strange is that virt-manager reports the virtual disk drive as IDE (hda), while during the guest installation it was detected as sda
Created attachment 332698 [details]
bootscreen
I have experienced the same problem (installation freezing at the end) with recent Rawhide as well. Unfortunately, after rebooting the guest, I am unable to boot it; see the screenshot, and also note the reported hard drive size.
I forgot to mention that the virtual guest in the screenshot uses a disk partition instead of an image file as the hard drive device.
Reproduced on an FC10 guest with an image file as the hard drive device (hda). The host is Scientific Linux 5.2 x86_64, xen-3.0.3-64.el5_2.9.x86_64
Can you try passing "clocksource=acpi_pm" to the guest kernel before you boot it, and see if that makes a difference? There is a bug in F-10 having to do with paravirtualized clocks, and I'm wondering if this is another instance of it. I'm also going to change the component to "xen" for the time being; this is either a bug in the guest emulation (i.e. xen), or it's a bug in the guest kernel (in which case we would move it to F-10 kernel). But it's definitely not python-virtinst's problem. Chris Lalancette
Yes, I tried, but it didn't help - I see the same ata2 errors in dmesg output:
[root@fc10 ~]# cat /proc/cmdline
ro root=/dev/VolGroup00/LogVol00 rhgb quiet clocksource=acpi_pm
[root@fc10 ~]# dmesg|tail
ata2.00: status: { DRDY ERR }
ata2: soft resetting link
ata2.00: configured for MWDMA2
ata2: EH complete
ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
ata2.00: cmd a0/00:00:00:00:00/00:00:00:00:00/a0 tag 0
         cdb 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
         res 41/20:03:00:00:00/00:00:00:00:00/a0 Emask 0x3 (HSM violation)
ata2.00: status: { DRDY ERR }
ata2: soft resetting link
[root@fc10 ~]# uname -a
Linux fc10.xen.home 2.6.27.15-170.2.24.fc10.x86_64 #1 SMP Wed Feb 11 23:14:31 EST 2009 x86_64 x86_64 x86_64 GNU/Linux
And I agree that python-virtinst is not the source of this problem.
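The check in the comment above (confirming that clocksource=acpi_pm really reached the guest kernel) can be sketched in Python as follows. This is only an illustration: the helper names kernel_param and check_clocksource are hypothetical, and the sysfs path is the usual Linux location for the active clocksource, assumed to be present on the guest.

```python
def kernel_param(cmdline, name):
    """Return the value of name=value on a kernel command line, or None."""
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key == name and sep:
            return value
    return None


def check_clocksource():
    """Compare the requested clocksource with the one the kernel is using.

    Hypothetical helper: reads /proc/cmdline and the standard sysfs
    clocksource file, so it only makes sense on a running Linux guest.
    """
    with open("/proc/cmdline") as f:
        requested = kernel_param(f.read(), "clocksource")
    path = "/sys/devices/system/clocksource/clocksource0/current_clocksource"
    with open(path) as f:
        active = f.read().strip()
    return requested, active


# Parsing the command line quoted in the comment above:
sample = "ro root=/dev/VolGroup00/LogVol00 rhgb quiet clocksource=acpi_pm"
print(kernel_param(sample, "clocksource"))
```

If check_clocksource() returns the same value twice on the guest, the option took effect, and the remaining ata2 errors are not explained by the clocksource.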
(In reply to comment #5)
> Can you try passing "clocksource=acpi_pm" to the guest kernel before you
> boot it, and see if that makes a difference?
The same for me: passing this option does not help. Tried using Rawhide, kernel 2.6.29-0.179.rc6.git5.fc11.x86_64
Karel: What about attaching the domain configuration file to this BZ? Sergey: I have run into this issue using xen-3.0.3-87.el5 RPMs with kernel-xen-2.6.18-146.el5xen too. This may be a kernel-xen problem as well as an IOEMU problem, and it is most definitely not a python-virtinst problem, because it happens even after the VM is installed. I tried an F10 i386 FV guest...
I've been poking around in the IOEMU code but with no luck so far. It may be a kernel issue, because I found some information on fedora-kernel-list at: http://www.mail-archive.com/fedora-kernel-list@redhat.com/msg00087.html It may be related to 2.6.20+ kernels, but "pci=nomsi" is not the answer because it does not work either. Michal
I have been running into this issue as well. However, it may still be related to libvirt somehow: once I created a secondary disk in virt-manager set to "SCSI Disk", there were no errors when accessing that disk, whereas there is a stream of "soft resetting link" errors when accessing the device created as IDE (which shows as /dev/sda1). Sam.
Hi Sam, well, you're talking about a relation to libvirt or something like that. I don't think that's the issue, but for clarification, could you provide us with your libvirt version and the exact steps you took to see and not see those errors? Thanks, Michal
I'm also hitting this error installing early builds of RHEL 6.0 on a RHEL 5.4 host with
kernel-xen-2.6.18-164.el5
libvirt-0.6.3-20.1.el5_4
libvirt-python-0.6.3-20.1.el5_4
python-virtinst-0.400.3-5.el5
virt-manager-0.6.1-8.el5
xen-3.0.3-94.el5
xen-libs-3.0.3-94.el5
I started the RHEL 6 install with virt-install:
virt-install -n rhel6 -r 512 --vcpus=1 -f /var/lib/xen/images/rhel6 \
    -b xenbr0 --vnc --noautoconsole -v --os-type=linux --os-variant=fedora11 \
    -c /tmp/rhel6/boot.iso
Note that I used an OS variant of fedora11 since rhel6 is not listed yet for virt-install. On the first boot after installation it spit out hundreds of these errors, but it eventually booted all the way. This thread implies this is fixed upstream: http://www.mail-archive.com/linux-ide@vger.kernel.org/msg14513.html
*** Bug 526662 has been marked as a duplicate of this bug. ***
Upstream patch is here: http://www.mail-archive.com/qemu-devel@nongnu.org/msg11844.html The backport to Xen's qemu is almost trivial.
(In reply to comment #16)
> Upstream patch is here:
> http://www.mail-archive.com/qemu-devel@nongnu.org/msg11844.html
>
> The backport to Xen's qemu is almost trivial.
Thanks for pointing this out. I'll backport this one ... Michal
Created attachment 374020 [details]
Qemu Libata fix
Well, I have this backported, but I am unable to reproduce the problem even with Fedora 10 and Fedora 12 x86_64... This is the patch, but could somebody tell me how to reproduce the issue? Michal
(In reply to comment #18)
> Well, I have this backported but I am unable to reproduce it even with Fedora
> 10 and Fedora 12 x86_64...
Could it be that it is somehow hardware dependent? (unfortunately, I can't reinstall my machine to RHEL-5 right now to try)
When I boot a RHEL-6 64b fv guest with xen -100 I get tons and tons of ata errors on the console. After applying the patch in comment 18 I don't get those errors on the console anymore, but dmesg still shows a few of these.
ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
ata2.00: BMDMA stat 0x5
ata2.00: cmd a0/01:00:00:80:00/00:00:00:00:00/a0 tag 0 dma 16512 in
         cdb 5a 00 2a 00 00 00 00 00 80 00 00 00 00 00 00 00
         res 48/20:02:00:1c:00/00:00:00:00:00/a0 Emask 0x2 (HSM violation)
ata2.00: status: { DRDY DRQ }
ata2: soft resetting link
The same results for f11 (2.6.30.9-96).
(In reply to comment #21)
> When I boot a RHEL-6 64b fv guest with xen -100 I get tons of and tons of ata
> errors on the console. After applying the patch in comment 18 I don't get those
> errors to the console anymore, but dmesg still shows a few of these.
>
> ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
> ata2.00: BMDMA stat 0x5
> ata2.00: cmd a0/01:00:00:80:00/00:00:00:00:00/a0 tag 0 dma 16512 in
> cdb 5a 00 2a 00 00 00 00 00 80 00 00 00 00 00 00 00
> res 48/20:02:00:1c:00/00:00:00:00:00/a0 Emask 0x2 (HSM violation)
> ata2.00: status: { DRDY DRQ }
> ata2: soft resetting link
>
> The same results for f11 (2.6.30.9-96).
Well, maybe this is an effect of the upstream qemu patch: once an error occurs, the other HSM violations are not shown, so only a few of them appear. So did this improve the situation? Michal
Now that I look closer, the error I reported in comment 21 is different from the error originally reported in this bug. I have DRDY DRQ and the original report was for DRDY ERR. It looks like the proposed patch does eliminate the DRDY ERRs. So the DRDY DRQ errors are something else and deserve a different bug.
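The distinction drawn in the comment above can be read straight from the "res" field of the libata messages: its first byte is the ATA status register, and the flags printed in "status: { ... }" are its set bits. A minimal Python sketch, assuming the standard ATA status bit layout (the names used for bits 4, 2 and 1 follow older ATA specifications and libata itself does not print them); decode_res_status is a hypothetical helper name:

```python
# ATA status register bits, highest bit first (standard ATA definitions).
ATA_STATUS_BITS = [
    (0x80, "BSY"),
    (0x40, "DRDY"),
    (0x20, "DF"),
    (0x10, "DSC"),
    (0x08, "DRQ"),
    (0x04, "CORR"),
    (0x02, "IDX"),
    (0x01, "ERR"),
]


def decode_res_status(res_field):
    """Return the status flag names for a libata 'res' field.

    The field looks like '41/20:03:00:00:00/00:00:00:00:00/a0'; the byte
    before the first '/' is the status register in hex.
    """
    status = int(res_field.split("/", 1)[0], 16)
    return [name for mask, name in ATA_STATUS_BITS if status & mask]


# The three 'res' values quoted in this report.
for res in (
    "41/20:03:00:00:00/00:00:00:00:00/a0",
    "48/20:02:00:1c:00/00:00:00:00:00/a0",
    "49/20:01:00:24:00/00:00:00:00:00/a0",
):
    print(res.split("/", 1)[0], " ".join(decode_res_status(res)))
```

Status 0x41 decodes to DRDY ERR (the original report), 0x48 to DRDY DRQ, and 0x49 to DRDY DRQ ERR, which matches the three variants seen in the logs in this bug.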
Ok, I just backed up and double-checked without the patch. The error I have continuously output to the console is Emask 0x2 { DRDY DRQ ERR }. So I never reproduced exactly the same thing as the originator. This may not make a difference, but should maybe be investigated. I'll sort it out and open a new bug for it if necessary. As far as this bug goes, I believe the patch works. When not violating the HSM, we avoid getting constant exceptions.
I still see the same errors under a fully-virtualized environment:
Linux fedora-11-64 2.6.30.9-96.fc11.x86_64 #1 SMP Wed Nov 4 00:02:04 EST 2009 x86_64 x86_64 x86_64 GNU/Linux
{{{
ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
ata2.00: cmd a0/00:00:00:08:00/00:00:00:00:00/a0 tag 0 pio 16392 in
         cdb 4a 01 00 00 10 00 00 00 08 00 00 00 00 00 00 00
         res 41/50:03:00:08:00/00:00:00:00:00/a0 Emask 0x3 (HSM violation)
ata2.00: status: { DRDY ERR }
ata2: soft resetting link
ata2.00: configured for MWDMA2
ata2: EH complete
}}}
Apart from that, the guest tends to hang every few days.
(In reply to comment #26)
> I still see the same errors under a fully-virtualized environment:
>
> Linux fedora-11-64 2.6.30.9-96.fc11.x86_64 #1 SMP Wed Nov 4 00:02:04 EST 2009
> x86_64 x86_64 x86_64 GNU/Linux
>
> {{{
> ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
> ata2.00: cmd a0/00:00:00:08:00/00:00:00:00:00/a0 tag 0 pio 16392 in
> cdb 4a 01 00 00 10 00 00 00 08 00 00 00 00 00 00 00
> res 41/50:03:00:08:00/00:00:00:00:00/a0 Emask 0x3 (HSM violation)
> ata2.00: status: { DRDY ERR }
> ata2: soft resetting link
> ata2.00: configured for MWDMA2
> ata2: EH complete
> }}}
>
> Apart from that, the guest tends to hand every few days.
Well, this may be kernel related... Does it happen with older/newer kernels? Michal
Seems to be doing it with all fedora 11 kernels I tried, including the last one. It may be related to bug #543947
(In reply to comment #28)
> Seems to be doing it with all fedora 11 kernels I tried, including the last
> one. It may be related to bug #543947
Well, I can't claim I understand that stuff well, but could you also try with F10 or F12 kernels? If this is not an issue on F10 and F12 kernels, it may be related to the bug you mentioned above... Michal
I can tell you for sure that it does not happen with Fedora 9 kernels. I did not try with Fedora 12 or Fedora 10.
Any updates on this?
Well, I did some testing with Fedora 8, 9 and 10 kernels (all 32-bit i386 guests) just to be sure, and this problem didn't occur on those guests: DRDY DRQ messages do appear in the dmesg output, but not DRDY DRQ ERR ones. It seems like it's related to BZ #543947. Also, I've not been able to install Fedora 12 again - there were some errors - and we need that to be sure... Michal
Well, I managed to install a Fedora 12 32-bit guest and I saw no DRDY DRQ ERR errors, only DRDY DRQ messages, so it seems the problem is really related to bug #543947, because I saw no such issue on any guest other than Fedora 11. Michal
I am seeing this problem with a Fedora 12 fully virtualized guest on RHEL 5.3.
kernel-2.6.18-164.6.1.el5xen
xen-3.0.3-94.el5_4.2
This string of errors is logged every few seconds whenever the guest is up:
Dec 24 19:05:24 web1 kernel: ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Dec 24 19:05:24 web1 kernel: ata2.00: ST_FIRST: DRQ=1 with device error, dev_stat 0x49
Dec 24 19:05:24 web1 kernel: ata2.00: cmd a0/00:00:00:24:00/00:00:00:00:00/a0 tag 0 pio 36 in
Dec 24 19:05:24 web1 kernel: cdb 12 00 00 00 24 00 00 00 00 00 00 00 00 00 00 00
Dec 24 19:05:24 web1 kernel: res 49/20:01:00:24:00/00:00:00:00:00/a0 Emask 0x2 (HSM violation)
Dec 24 19:05:24 web1 kernel: ata2.00: status: { DRDY DRQ ERR }
Dec 24 19:05:24 web1 kernel: ata2: soft resetting link
Dec 24 19:05:25 web1 kernel: ata2.00: configured for MWDMA2
Dec 24 19:05:25 web1 kernel: ata2: EH complete
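In every log in this bug the failing command is "cmd a0", the ATA PACKET command, and the first byte after "cdb" is the opcode of the packet that was sent. The Python sketch below names the opcodes that appear in this report, using the standard SCSI/MMC opcode assignments; cdb_opcode_name is a hypothetical helper and the table is deliberately limited to those five opcodes. That ata2.00 is the emulated CD-ROM rather than the hard disk is an inference from the use of PACKET commands and is not confirmed anywhere in this thread.

```python
import re

# SCSI/MMC opcode names, restricted to the opcodes seen in this bug report.
CDB_OPCODES = {
    0x00: "TEST UNIT READY",
    0x12: "INQUIRY",
    0x1E: "PREVENT ALLOW MEDIUM REMOVAL",
    0x4A: "GET EVENT STATUS NOTIFICATION",
    0x5A: "MODE SENSE(10)",
}

# Matches the first byte after 'cdb' in a libata log line.
CDB_RE = re.compile(r"\bcdb\s+([0-9a-fA-F]{2})\b")


def cdb_opcode_name(line):
    """Return the opcode name for a libata 'cdb ...' log line, or None."""
    match = CDB_RE.search(line)
    if match is None:
        return None
    opcode = int(match.group(1), 16)
    return CDB_OPCODES.get(opcode, "unknown opcode 0x%02x" % opcode)


line = "Dec 24 19:05:24 web1 kernel: cdb 12 00 00 00 24 00 00 00 00 00 00 00 00 00 00 00"
print(cdb_opcode_name(line))
```

Applied to the logs in this bug, the failing packets are ordinary device-polling and identification commands (1e, 00, 5a, 4a and 12), not reads or writes of user data.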
A little more info on my post above:
host hardware: Proliant DL365 Opteron
I tried clocksource=acpi_pm, but it made no difference.
guest: 2.6.31.6-166.fc12.x86_64
Alan, packages that fix this bug will be available shortly.
According to Comment #52, I checked this bug on xen-3.0.3-102.el5 and rhel5.4 for the x86_64 and i386 platforms:
1. (run xen enabled system)
2. virt-install -n F10 -r 512 -f F10.img -s 10 --vnc --hvm -c /root/Fedora-10-i386-DVD.iso
3. perform the default installation
4. reboot the guest
5. dmesg | grep ata2
After step 5, I get the following results:
ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
ata2.00: BMDMA stat 0x5
ata2.00: cmd a0/01:00:00:80:00/00:00:00:00:00/a0 tag 0 dma 16512 in
ata2.00: status: { DRDY DRQ }
ata2: soft resetting link
ata2.00: configured for MWDMA2
ata2: EH complete
These ata2 messages are about {DRDY DRQ}, not the original {DRDY ERR}. So this bug is fixed on xen-3.0.3-102.el5, and I am changing the bug's status to verified.
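The inspection in step 5 above could be automated roughly as follows. This is a sketch, not part of the original verification procedure: it counts how often each "status: { ... }" flag set appears in captured dmesg text, so that { DRDY ERR } (this bug) can be told apart from { DRDY DRQ } at a glance. The name tally_status is hypothetical, and the sample text is assembled from lines quoted in this report.

```python
import re
from collections import Counter

# Matches e.g. 'ata2.00: status: { DRDY DRQ }' and captures 'DRDY DRQ'.
STATUS_RE = re.compile(r"ata\d+\.\d+: status: \{ ([A-Z ]+?) \}")


def tally_status(dmesg_text):
    """Count occurrences of each libata status flag set in dmesg output."""
    return Counter(m.group(1) for m in STATUS_RE.finditer(dmesg_text))


sample = """\
ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
ata2.00: BMDMA stat 0x5
ata2.00: status: { DRDY DRQ }
ata2: soft resetting link
ata2.00: configured for MWDMA2
ata2: EH complete
"""

counts = tally_status(sample)
# With the fix verified above, no 'DRDY ERR' entries are expected.
print(dict(counts), "DRDY ERR" in counts)
```

On a guest, the same function can be fed the saved output of dmesg; a non-zero count for "DRDY ERR" would indicate that the originally reported error is still present.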
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2010-0294.html
This bug was closed during 5.5 development and it's being removed from the internal tracking bugs (which are now for 5.6).