Bug 442949
| Field | Value |
|---|---|
| Summary | F-9 xen pv_ops : unimplemented failsafe_callback() called while running prelink |
| Product | [Fedora] Fedora |
| Component | kernel-xen |
| Version | rawhide |
| Hardware | x86_64 |
| OS | Linux |
| Status | CLOSED RAWHIDE |
| Severity | medium |
| Priority | high |
| Reporter | Stephen Tweedie <sct> |
| Assignee | Eduardo Habkost <ehabkost> |
| QA Contact | Virtualization Bugs <virt-bugs> |
| CC | xen-maint |
| Fixed In Version | kernel-xen-2.6.25.2-2.fc10 |
| Doc Type | Bug Fix |
| Last Closed | 2008-05-11 22:04:56 UTC |
| Bug Blocks | 434756 |

Description
Stephen Tweedie
2008-04-17 20:03:16 UTC
Created attachment 302798 [details]
dmesg log of oopses
So, here's our invalid opcode:

    ENTRY(xen_failsafe_callback)
        /*FIXME: implement me! */
        ud2a
    ENDPROC(xen_failsafe_callback)

Next thing, of course, is to find out what's going wrong that the failsafe callback is invoked.

(Note: "invalid opcode" is generally just a BUG(), which is implemented using ud2 ... that caught me out before. Interesting that report_bug() continues to claim that it's a BUG() even if it can't find IP in the bug table, like in this case)

Seems to be reproducible: running /etc/cron.daily/prelink manually just resulted in the same error within under a minute for me.

(Removing from F9Blocker again - this isn't reproducible for me on a fresh install and it doesn't cause problems during installation. At this point it doesn't look like it would warrant holding up the release)

It seems to happen every time for me, at least if I force the prelink with

    touch /var/lib/misc/prelink.force

I also noticed that the prelink job itself errors out with:

>>>>>
/etc/cron.daily/prelink: line 47: 2738 Segmentation fault /usr/sbin/prelink -av $PRELINK_OPTS >> /var/log/prelink/prelink.log 2>&1
/usr/bin/ldd: line 161: /lib/ld-linux.so.2: cannot execute binary file
>>>>>

where /lib/ld-linux.so.2 is the old 32-bit glibc. Did you have this installed on your test-case install that completed prelink without error?

(In reply to comment #5)
> It seems to happen every time for me, at least if I force the prelink with
>
> touch /var/lib/misc/prelink.force

Yeah, had tried that and variations of e.g. "prelink -au" followed by "prelink -avf"

> I also noticed that the prelink job itself errors out with:
>
> >>>>>
> /etc/cron.daily/prelink: line 47: 2738 Segmentation fault
> /usr/sbin/prelink -av $PRELINK_OPTS >> /var/log/prelink/prelink.log 2>&1
> /usr/bin/ldd: line 161: /lib/ld-linux.so.2: cannot execute binary file
> >>>>>
> where /lib/ld-linux.so.2 is the old 32-bit glibc. Did you have this installed
> on your test-case install that completed prelink without error?

Yep, have that.
I couldn't reproduce it here, either. Maybe running it with 'kstack=64' on the kernel command-line could reveal other useful kernel addresses on the stack.

'kstack=64' makes no difference.

(In reply to comment #8)
> 'kstack=64' makes no difference.

Hasn't it shown more data after the "Stack:" line on the oops?

'rpm -qa' output may help me to reproduce the bug. I bet there is a specific file that triggers the bug when loaded by the prelink script, so maybe having the exact set of packages installed will make the bug reproducible.

Created attachment 303656 [details]
clear %fs when loading new TLS descriptors (1/2)
__switch_to() is on the backtrace before failsafe_callback(). Probably it is
being triggered when returning from a hypercall at
paravirt_leave_lazy_cpu_mode().
The two attached patches were an attempt to fix this, but I haven't tested them
enough to make sure they are correct.
Created attachment 303657 [details]
clear %fs when loading new TLS descriptors (2/2)
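
For readers unfamiliar with the idea behind the patch titles, here is a rough userspace sketch of why clearing %fs before loading new TLS descriptors might avoid the failsafe callback. It is a toy model, not the attached patches: it assumes the callback fires because the hypervisor faults while reloading a segment register whose selector still points at a descriptor that the context switch has just replaced. All names below (toy_cpu, toy_desc, toy_load_tls, toy_reload_fs_ok) are hypothetical and do not exist in the kernel or in Xen.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: hypothetical types, not kernel or Xen structures. */
#define TLS_SLOTS 3

struct toy_desc {
    bool present;                   /* stands in for the descriptor "present" bit */
};

struct toy_cpu {
    struct toy_desc tls[TLS_SLOTS]; /* stands in for the per-CPU TLS descriptor slots */
    int fs;                         /* slot that %fs refers to, or -1 for the null selector */
};

/*
 * Models the segment reload performed on return to the guest:
 * a null selector always loads; any other selector needs a
 * present descriptor, otherwise the reload would fault.
 */
static bool toy_reload_fs_ok(const struct toy_cpu *cpu)
{
    assert(cpu != NULL);
    if (cpu->fs < 0)
        return true;
    return cpu->tls[cpu->fs].present;
}

/*
 * Installs the next task's TLS descriptors. With clear_fs set, %fs is
 * first reset to the null selector so that no stale selector survives
 * the descriptor update (the assumed effect of the patches).
 */
static void toy_load_tls(struct toy_cpu *cpu, const struct toy_desc *next,
                         bool clear_fs)
{
    assert(cpu != NULL && next != NULL);
    if (clear_fs)
        cpu->fs = -1;
    for (size_t i = 0; i < TLS_SLOTS; i++)
        cpu->tls[i] = next[i];
}
```

In this model, calling toy_load_tls() with clear_fs set to false while %fs still selects a slot that the next task leaves empty makes toy_reload_fs_ok() return false, which corresponds to the fault path. With clear_fs set to true the reload always succeeds, and the real kernel would then load the new task's %fs value later in the context switch.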
Here's where this shows up on kerneloops.org:
http://www.kerneloops.org/oops.php?number=9341

Should be fixed with kernel-xen-2.6.25-4.fc9 and kernel-xen-2.6.25.2-2.fc10

* Sun May 11 2008 Mark McLoughlin <markmc>
- Fix oops during prelink (ehabkost, #442949)

kernel-xen-2.6-2.6.25-4.fc9 has been submitted as an update for Fedora 9