Bug 236474
Summary: | kernel-xen-2.6.20-1.2944.fc6 reboots | ||
---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Askar Ali Khan <asraikhn> |
Component: | kernel-xen | Assignee: | Eduardo Habkost <ehabkost> |
Status: | CLOSED DUPLICATE | QA Contact: | Brian Brock <bbrock> |
Severity: | medium | Docs Contact: | |
Priority: | medium | ||
Version: | 6 | CC: | bstein, itamar, katzj, sts+redhat-bugzilla, xen-maint |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | i686 | ||
OS: | Linux | ||
Whiteboard: | | ||
Fixed In Version: | | Doc Type: | Bug Fix |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2007-06-07 17:10:40 UTC | Type: | --- |
Description
Askar Ali Khan
2007-04-14 20:33:05 UTC
I added the noreboot option to grub, and my dom0 seemed to survive without rebooting. But noreboot has not fixed it: the machine was dead after 1 hour of uptime.

[root@serv ~]#
(XEN) (file=extable.c, line=77) Pre-exception: ff1619f4 -> ff163ce3
(XEN) (file=traps.c, line=1518) GPF (4814): ff163d28 -> ff163d39
(XEN) (file=traps.c, line=1518) GPF (0000): ff161ba1 -> ff161cba
(XEN) domain_crash_sync called from entry.S (ff163d78)
(XEN) Domain 0 (vcpu#0) crashed on cpu#0:
(XEN) ----[ Xen-3.0.3-0-1.2944.fc6 x86_32p debug=n Not tainted ]----
(XEN) CPU: 0
(XEN) EIP: 4817:[<c1ebcee4>]
(XEN) EFLAGS: 40690fed CONTEXT: guest
(XEN) eax: 00000000 ebx: c04013a7 ecx: 00000061 edx: 00000246
(XEN) esi: c04080fa edi: c1ec2500 ebp: ffffffff esp: c073425a
(XEN) cr0: 8005003b cr4: 000006f0 cr3: 77d56000 cr2: b7ff5000
(XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0000 cs: 4817
(XEN) Guest stack trace from esp=c073425a:
(XEN) 00000000 00040000 00000000 00000000 00000000 00200000 00040000 00000000
(XEN) 02000000 00000000 00000000 00000000 428c0000 428cc073 0001c073 4ead0000
(XEN) ffffdead ffffffff e658ffff ccccc093 cccccccc cccccccc cccccccc cccccccc
(XEN) cccccccc 6ec0cccc 4140c0e4 8d40c073 dd78ed7e e3c0c0d3 0002ee09 00000000
(XEN) 000d0000 02000000 00000000 00000000 1eed0100 ffffdeaf ffffffff 0000ffff
(XEN) 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00040000
(XEN) 00000000 00040000 00000000 00000000 00000000 00200000 00030000 00000000
(XEN) 02000000 00000000 00000000 00000000 434c0000 434cc073 0001c073 4ead0000
(XEN) ffffdead ffffffff eb00ffff ccccc093 cccccccc cccccccc cccccccc cccccccc
(XEN) cccccccc 4740cccc 7368c073 0000c046 00000000 00000000 00000000 00000000
(XEN) 00000000 00000000 00000000 00000000 1eed0100 ffffdeaf ffffffff 0000ffff
(XEN) 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
(XEN) 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
(XEN) 00000000 00000000 00000000 00000000 440c0000 440cc073 0001c073 4ead0000
(XEN) ffffdead ffffffff 0000ffff cccc0000 cccccccc cccccccc cccccccc cccccccc
(XEN) cccccccc 4080cccc 7368c073 0000c046 00000000 00000000 00000000 00000000
(XEN) 00000000 00000000 00000000 00000000 1eed0100 ffffdeaf ffffffff 0000ffff
(XEN) 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
(XEN) 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
(XEN) 00000000 00000000 00000000 00000000 44cc0000 44ccc073 0001c073 4ead0000
(XEN) Domain 0 crashed: rebooting machine in 5 seconds.

One way to reproduce appears to be as follows:
1. Export an NFS file system from the domain0 to the domainU, over a brouter interface.
2. Mount it in the domainU.
3. Copy a file >2Gb from a non-NFS mounted partition to the NFS-mounted partition.
Notes:
- The domain0 may lose access to any host accessible on the brouter shortly before the crash. This is hard to catch in time without logging from a console port.
- Changing the domU to use UDP instead of TCP, or reducing the rsize/wsize= NFS mount options in the domU, appears to delay, but not prevent, triggering this.
- This behavior did not occur in 2.6.19 based kernels.

No updates? Looks like we have to stick with 2.6.19-1.2911.6.5.fc6xen, which is the last working kernel-xen. Or are they planning to fix it when releasing 2.6.21.x? :)
Thanks.
Askar

2.6.19-1.2911.6.5.fc6xen doesn't work for me. Does anyone have an estimated time for a new release of the xen packages in Fedora?

Why doesn't 2.6.19 work? Does it have problems for you, too? I don't have an estimate of the time needed to debug the instability/rebooting bugs. But there may be some work (in parallel) to update the FC6 kernel to 2.6.21 soon, and there is a possibility that the 2.6.21 update will solve the instability problems. At least I hope so. :)

I'm seeing some similar crashes on my home machine, and was able to get the stack trace on the serial console by adding the noreboot option to the xen kernel command line (without that, the box would just spontaneously reboot before ever outputting it):

(XEN) domain_crash_sync called from entry.S (ff161d99)
(XEN) Domain 0 (vcpu#0) crashed on cpu#0:
(XEN) ----[ Xen-3.0.3-0-1.2948.fc6 x86_32p debug=n Not tainted ]----
(XEN) CPU: 0
(XEN) EIP: 0061:[<c0404e99>]
(XEN) EFLAGS: 00210292 CONTEXT: guest
(XEN) eax: 00000000 ebx: 007e564f ecx: 00000073 edx: 00200297
(XEN) esi: bfccf44c edi: 0000007b ebp: 00000000 esp: e72ee010
(XEN) cr0: 80050033 cr4: 000006f0 cr3: 9aa7b000 cr2: 08df40) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0069 cs: 0061
(XEN) Guest stack trace from esp=e72ee010:
(XEN) 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
(XEN) 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
(XEN) 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
(XEN) 00000000 00000000 2f610025 00000000 00000000 00000000 2f612025 00000000
(XEN)
(XEN) 2f617025 00000000 2f618025 00000000 2f619025 00000000 00000000 00000000
(XEN) 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
(XEN) 2f61f025 00000000 2f620025 00000000 00000000 00000000 00000000 00000000
(XEN) 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
(XEN) 00000000 00000000 00000000 00000000 00000000 000000000 00000000 00000000 00000000 00000000
(XEN) 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
(XEN) 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
(XEN) 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
(XEN) 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
(XEN) 00000000 00000000 00000000 00000000 00000000000
(XEN) 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
(XEN) 00000000 00000000 2f650025 00000000 2f651025 00000000 00000000 00000000
(XEN) Domain 0 crashed: 'noreboot' set - not rebooting.

It says "not tainted", but I actually have the nvidia module loaded in this case. I've also been able to reproduce the problem without it, though. The 2.6.19 xen kernels were pretty stable on this machine, but I've had no luck whatsoever with the 2.6.20 series. Both the 2.6.19 and 2.6.20 xen kernels, however, work fine on my work machine. So there seems to be something hardware-specific about these problems. My home box is an AMD X2, and work is a dual dual-core Xeon box, so there are some not-insignificant differences. I'll be happy to collect info or test kernels/boot options if you can suggest anything...

So we got another update for kernel-xen, kernel-xen.i686 (2.6.20-1.2948.fc6). I don't know whether it fixes the reboot/crash problem or not.
Thanks.

This may be the same problem reported in bug #234008. Could you test using kernel-xen-2.6.20-1.2952.fc6, which is available in the Fedora Core 6 updates-testing repository?

Any test results using kernel-xen-2.6.20-1.2952.fc6?

I'll give kernel-xen-2.6.20-1.2952.fc6, which is available via yum, a try and then report back if the problem persists.
Askar.

Thanks. I will keep the "needinfo" flag set, so the system will remind me that I am waiting for the 2.6.20-1.2952.fc6 test results when I check the list of open bugs. :)

I have updated kernel-xen to 2.6.20-1.2952.fc6xen on one of our hosts and it has been working fine for the last 17 hours, with nothing in the logs. Dom0 and the domUs (5) are working just fine. I hope we are finally back on track :) I'll keep watching this host for 24+ hours, then update kernel-xen on the other 2 hosts.
Thanks.
Askar

Thanks for the information. Marking this bug as another instance of bug #234008.

*** This bug has been marked as a duplicate of 234008 ***
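
For anyone retesting a newer kernel-xen build, step 3 of the reproduction recipe reported in this bug (copying a file larger than 2 GB from a non-NFS partition onto the NFS mount inside the domU) could be scripted roughly as follows. This is only a sketch under assumptions: the function names `make_test_file` and `copy_to_nfs` are hypothetical, the NFS export and mount from steps 1 and 2 are assumed to be in place already, and nothing here is taken from the reporters' actual setup.

```python
import os
import shutil


def make_test_file(path, size_bytes, chunk_size=1024 * 1024):
    """Write size_bytes of zero-filled data to path and return the resulting file size."""
    chunk = b"\0" * chunk_size
    remaining = size_bytes
    with open(path, "wb") as f:
        while remaining > 0:
            n = min(chunk_size, remaining)
            f.write(chunk[:n])
            remaining -= n
    return os.path.getsize(path)


def copy_to_nfs(src, dst_dir, chunk_size=1024 * 1024):
    """Copy src into dst_dir (assumed to be the NFS mount) and return the copied size.

    The data is flushed and fsynced so the writes are actually pushed out
    instead of sitting in the client's page cache.
    """
    dst = os.path.join(dst_dir, os.path.basename(src))
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout, chunk_size)
        fout.flush()
        os.fsync(fout.fileno())
    return os.path.getsize(dst)
```

In a domU one might call `make_test_file("/var/tmp/xen-repro.bin", 3 * 1024**3)` followed by `copy_to_nfs("/var/tmp/xen-repro.bin", "/mnt/dom0-export")`, where both paths are placeholders for a local directory and the NFS mount point, while watching the dom0 serial console for the crash output.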