Bug 494114
| Summary: | 2.6.18-128.1.6.el5xen panic! | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 5 | Reporter: | Alexander Lindqvist <alexander> |
| Component: | kernel-xen | Assignee: | Prarit Bhargava <prarit> |
| Status: | CLOSED ERRATA | QA Contact: | Red Hat Kernel QE team <kernel-qe> |
| Severity: | high | Docs Contact: | |
| Priority: | low | | |
| Version: | 5.3 | CC: | clalance, dzickus, mishu, pasteur, qcai, rhelbugzilla, riel, xen-maint |
| Target Milestone: | --- | Keywords: | Reopened |
| Target Release: | --- | | |
| Hardware: | i686 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2009-09-02 09:01:10 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: | | | |

Created attachment 338166 [details]
Just before 2.6.18-128.1.6.el5xen panics
Created attachment 338167 [details]
Server booting 2.6.18-128.1.6.el5xen filmed with mobile.
Opens in VLC.

CentOS bugtracker: http://bugs.centos.org/view.php?id=3489

Are you able to reproduce using RHEL?

No, as I don't have access to RHEL 5 software. I could download a test kernel from dzickus or a clalance virttest kernel if you want. If so, please specify which kernel you want me to test.

If you could test with the virttest kernels at http://people.redhat.com/clalance/virttest, that would be useful. That being said, I don't know that we have any fixes in place for something like this, so I'm not that hopeful it will make a difference. Also, if you can test whether the bare-metal kernel has the same problem, that would be useful. Finally, if at all possible, getting a serial console output of the crash would be extremely useful. While the movie shows the crash, it's too blurry and short to really read the oops output, so it's hard to see what's going on.
Chris Lalancette

Created attachment 338219 [details]
Capture of serial console
Capture of serial console during boot of 2.6.18-128.1.5.el5xen

Attachment above is 2.6.18-128.1.(6).el5xen (mistyped)

kernel-xen-2.6.18-137.el5virttest15.i686.rpm tested and has the same problem.

OK, great, that's very good info. I've asked one of the PCI bus enumeration experts to have a quick look at this BZ, but in all likelihood it won't be until tomorrow.
Chris Lalancette

OK, it seems that there is a patch available that *should* fix this issue. I've built a test kernel with it; it's available at:
http://people.redhat.com/clalance/bz494114
Can you give this test kernel a try, and see if it fixes the issue for you?
Thanks,
Chris Lalancette

That kernel did it! I'm running both servers on this kernel with 8 paravirt guests now and so far so good. Can you confirm which kernel release will contain this bugfix?

OK, great, thanks for testing. This patch is currently slated for 5.4, barring any problems we find with it. I'm going to close this as a dup of BZ 470202.
Chris Lalancette

*** This bug has been marked as a duplicate of bug 470202 ***

Un-duped by clalance, and POSTed by me.
P.

I posted a comment to the 481500 bug post which tracked this issue:
https://bugzilla.redhat.com/show_bug.cgi?id=481500
and Chris Lalancette redirected me here. Considering that the 5.4 release is a ways out, can we get the specific PCI enumeration patch for this issue so that we can apply it to the released 5.3 kernels? I'd rather not run a more experimental kernel than I need to. Btw, I can reproduce this problem without the Xen HV, so this is more of a core kernel issue than a virtualization-specific issue.

Btw, I can confirm that 2.6.18-138.el5virttest16 does boot on my 2x1ghz P3 platform but XVC serial console redirection (ttyS0,9600n1) is broken and only posts the following:
--
Kernel 2.6.18-138.el5virttest16 on an i686
Filesystem type is ext2fs, partition type 0x83
dhcp-49 login: -2.6.18-138.el5virttest16 ro root=/dev/VolGroup00/LogVol00 conso
le=xvc console=tty xencons=xvc
[Linux-bzImage, setup=0x1e00, size=0x1beb74]
initrd /initrd-2.6.18-138.el5virttest16.img
[Linux-initrd @ 0x37cf2000, 0x2fd280 bytes]

ÿ <---- that's the last character. Initially looks like baud mismatch but no other characters come up so I don't think Xen is taking over the serial port as expected
--

(In reply to comment #17)
> Btw, I can confirm that 2.6.18-138.el5virttest16 does boot on my 2x1ghz P3
> platform but XVC serial console redirection (ttyS0,9600n1) is broken and only
> posts the following:
> --
> Kernel 2.6.18-138.el5virttest16 on an i686
> Filesystem type is ext2fs, partition type 0x83
> dhcp-49 login: -2.6.18-138.el5virttest16 ro root=/dev/VolGroup00/LogVol00 conso
> le=xvc console=tty xencons=xvc
> [Linux-bzImage, setup=0x1e00, size=0x1beb74]
> initrd /initrd-2.6.18-138.el5virttest16.img
> [Linux-initrd @ 0x37cf2000, 0x2fd280 bytes]
>
> ÿ <---- that's the last character. Initially looks like baud mismatch but no
> other characters come up so I don't think Xen is taking over the serial port as
> expected
> --

Seems like a new issue, unrelated to this BZ. Please open a new bugzilla on your issue.
Thanks,
P.

(In reply to comment #17)
> Btw, I can confirm that 2.6.18-138.el5virttest16 does boot on my 2x1ghz P3
> platform but XVC serial console redirection (ttyS0,9600n1) is broken and only
> posts the following:
> --
> Kernel 2.6.18-138.el5virttest16 on an i686
> Filesystem type is ext2fs, partition type 0x83
> dhcp-49 login: -2.6.18-138.el5virttest16 ro root=/dev/VolGroup00/LogVol00 conso
> le=xvc console=tty xencons=xvc
> [Linux-bzImage, setup=0x1e00, size=0x1beb74]
> initrd /initrd-2.6.18-138.el5virttest16.img
> [Linux-initrd @ 0x37cf2000, 0x2fd280 bytes]
>
> ÿ <---- that's the last character. Initially looks like baud mismatch but no
> other characters come up so I don't think Xen is taking over the serial port as
> expected
> --

As Prarit said, that's something else. Although to be honest, I can't imagine what could cause that in the virttest kernels. In any case, please open up a new BZ, with details of which kernel, which guest, which dom0, and the output from /boot/grub/grub.conf.
Chris Lalancette

in kernel-2.6.18-141.el5
You can download this test kernel from http://people.redhat.com/dzickus/el5
Please do NOT transition this bugzilla state to VERIFIED until our QE team has sent specific instructions indicating when to do so. However, feel free to provide a comment indicating that this fix has been verified.

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.
http://rhn.redhat.com/errata/RHSA-2009-1243.html

Created attachment 338165 [details]
2.6.18-128.1.6.el5xen panic

Description of problem:
2.6.18-128.1.6.el5xen panics during boot.

Version-Release number of selected component (if applicable):
2.6.18-128.1.6.el5xen

How reproducible:
Always during boot.

Steps to Reproduce:
1.
2.
3.

Actual results:
Panics during boot and restarts the server in an endless loop. 2.6.18-92.1.22.el5xen works.

Expected results:
Boot without crashing.

Additional info:
2 identical Proliant 6400R (both panic)
Server config:
4x P3 Xeon 550MHz 2MB Cache
HP SmartArray 5304
4GB RAM
CentOS 5.3

Upgraded the DomUs first to CentOS 5.3 kernel 2.6.18-128.1.6 and they are running OK on this hardware. Dom0 boots the 2.6.18-128.1.6.el5xen kernel but reboots in the middle of the boot process. 2.6.18-92.1.22.el5xen boots fine. It is probably the same problem with the bare-metal kernel, but this is untested.