Bug 213929 - 2.6.18 kernel and kdump vmcores require crash utility upgrade
Summary: 2.6.18 kernel and kdump vmcores require crash utility upgrade
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: crash
Version: 5.0
Hardware: x86_64
OS: Linux
medium
medium
Target Milestone: ---
: ---
Assignee: Dave Anderson
QA Contact: David Lawrence
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2006-11-03 20:07 UTC by Dave Anderson
Modified: 2007-11-30 22:07 UTC (History)
1 user (show)

Fixed In Version: 5.0.0
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2006-11-28 21:43:02 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)

Description Dave Anderson 2006-11-03 20:07:50 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030611

Description of problem:

The upgrade of RHEL5 to a 2.6.18-based kernel and the introduction
of kdump-generated vmcore dumpfiles require that the RHEL5 crash utility
be upgraded from its current version of 4.0-3.7 to upstream version
4.0-3.9 or greater.  The kernel changes reflect themselves in the
backtraces of x86_64 kernel dumpfiles.

Version-Release number of selected component (if applicable):
crash-4.0-3.7

How reproducible:
Always


Steps to Reproduce:
1. execute "bt" command on a task that has transitioned from a
process stack to a per-cpu IRQ stack due to a hardware interrupt.
2. execute "bt" command on a task that has transitioned from a
process stack to a per-cpu NMI exception stack.
3. execute "bt" command on a task that has transitioned from a
process stack to a per-cpu IRQ stack via the call_softirq entry
point (not a hardware interrupt).

Actual Results:
Examples of failures provided below.

Expected Results:
Examples of correct output provided below.

Additional info:

Comment 1 Dave Anderson 2006-11-03 20:14:49 UTC
The 2.6.18 kernel has changed the manner of linking an IRQ stack
back to the process stack from whence it came.  As a result of that,
a bogus exception frame is being displayed by the "bt" command.

Here is an example of issue #1, where a task as transitioned from
a process stack to a per-cpu IRQ stack.  The exception frame that
is shown just below frame #19 is incorrect, and in fact is reported
as a "possibly bogus exception frame":

crash> bt
PID: 0      TASK: ffff81003fe48100  CPU: 1   COMMAND: "swapper"
 #0 [ffff81003fe6bb40] crash_kexec at ffffffff800ab7c4
 #1 [ffff81003fe6bbc8] mwait_idle at ffffffff800553b7
 #2 [ffff81003fe6bc00] sysrq_handle_crashdump at ffffffff8019301f
 #3 [ffff81003fe6bc10] __handle_sysrq at ffffffff80192e1c
 #4 [ffff81003fe6bc50] kbd_event at ffffffff8018dbc1
 #5 [ffff81003fe6bca0] input_event at ffffffff801e9b9f
 #6 [ffff81003fe6bcd0] hidinput_hid_event at ffffffff801e42cb
 #7 [ffff81003fe6bd00] hid_process_event at ffffffff801df66b
 #8 [ffff81003fe6bd40] hid_input_report at ffffffff801df9d9
 #9 [ffff81003fe6bdc0] hid_irq_in at ffffffff801e0dc0
#10 [ffff81003fe6bde0] usb_hcd_giveback_urb at ffffffff801d33d8
#11 [ffff81003fe6be10] uhci_giveback_urb at ffffffff88168724
#12 [ffff81003fe6be50] uhci_scan_schedule at ffffffff88168e07
#13 [ffff81003fe6bed0] uhci_irq at ffffffff8816ac08
#14 [ffff81003fe6bf10] usb_hcd_irq at ffffffff801d3dc7
#15 [ffff81003fe6bf20] handle_IRQ_event at ffffffff80010704
#16 [ffff81003fe6bf50] __do_IRQ at ffffffff800b5238
#17 [ffff81003fe6bf58] __do_softirq at ffffffff80011c0b
#18 [ffff81003fe6bf90] do_IRQ at ffffffff8006a762
--- <IRQ stack> ---
#19 [ffff81003fe65e70] ret_from_intr at ffffffff8005bac9
    [exception RIP: cpu_idle+149]
    RIP: ffffffff800473be  RSP: ffffffff8042e220  RFLAGS: ffffffff80074188
    RAX: ffffffffffffff16  RBX: 0000000000000000  RCX: ffffffff800553b7
    RDX: 0000000000000010  RSI: 0000000000000246  RDI: ffff81003fe65ef0
    RBP: ffff81003fe64000   R8: ffffffff8034e838   R9: 0000000000000001
    R10: 0000000000000000  R11: 0000000000000000  R12: 000000000000003f
    R13: ffff810037d0c008  R14: 0000000000000246  R15: 0000000000000001
    ORIG_RAX: 0000000000000018  CS: 0020  SS: 0000
bt: WARNING: possibly bogus exception frame
crash>

With the proper fix, it looks like this:

crash> bt
PID: 0      TASK: ffff81003fe48100  CPU: 1   COMMAND: "swapper"
 #0 [ffff81003fe6bb40] crash_kexec at ffffffff800ab7c4
 #1 [ffff81003fe6bbc8] mwait_idle at ffffffff800553b7
 #2 [ffff81003fe6bc00] sysrq_handle_crashdump at ffffffff8019301f
 #3 [ffff81003fe6bc10] __handle_sysrq at ffffffff80192e1c
 #4 [ffff81003fe6bc50] kbd_event at ffffffff8018dbc1
 #5 [ffff81003fe6bca0] input_event at ffffffff801e9b9f
 #6 [ffff81003fe6bcd0] hidinput_hid_event at ffffffff801e42cb
 #7 [ffff81003fe6bd00] hid_process_event at ffffffff801df66b
 #8 [ffff81003fe6bd40] hid_input_report at ffffffff801df9d9
 #9 [ffff81003fe6bdc0] hid_irq_in at ffffffff801e0dc0
#10 [ffff81003fe6bde0] usb_hcd_giveback_urb at ffffffff801d33d8
#11 [ffff81003fe6be10] uhci_giveback_urb at ffffffff88168724
#12 [ffff81003fe6be50] uhci_scan_schedule at ffffffff88168e07
#13 [ffff81003fe6bed0] uhci_irq at ffffffff8816ac08
#14 [ffff81003fe6bf10] usb_hcd_irq at ffffffff801d3dc7
#15 [ffff81003fe6bf20] handle_IRQ_event at ffffffff80010704
#16 [ffff81003fe6bf50] __do_IRQ at ffffffff800b5238
#17 [ffff81003fe6bf58] __do_softirq at ffffffff80011c0b
#18 [ffff81003fe6bf90] do_IRQ at ffffffff8006a762
--- <IRQ stack> ---
#19 [ffff81003fe65e48] ret_from_intr at ffffffff8005bac9
    [exception RIP: mwait_idle+54]
    RIP: ffffffff800553b7  RSP: ffff81003fe65ef0  RFLAGS: 00000246
    RAX: 0000000000000000  RBX: 0000000000000001  RCX: 0000000000000000
    RDX: 0000000000000000  RSI: 0000000000000001  RDI: ffffffff8034e838
    RBP: 0000000000000000   R8: ffff81003fe64000   R9: 000000000000003f
    R10: ffff810037d0c008  R11: 0000000000000246  R12: ffff810037fef040
    R13: 0000000000000001  R14: ffff81003fe482f0  R15: 0000000002245f5e
    ORIG_RAX: ffffffffffffff16  CS: 0010  SS: 0018
#20 [ffff81003fe65ef0] cpu_idle at ffffffff800473be
crash>



Comment 2 Dave Anderson 2006-11-03 20:20:15 UTC
The second issue is associated with a task that has transitioned
from its process stack to a per-cpu NMI stack.  This will be seen
in all SMP kdump-generated dumpfiles, because the kdump facility
utilized NMI interrupts to stop all non-panicking cpus.  netdump
and diskdump did not use NMI interrupts, but rather they just used
simple inter-processor interrupts (IPIs).  

In any case, on the active tasks killed by NMI in a kdump dumpfile,
the exception frame data is displayed correctly, but crash is
incorrectly reporting that it is a "possibly bogus exception frame": 

crash> bt
PID: 0      TASK: ffff810037d0d100  CPU: 3   COMMAND: "swapper"
 #0 [ffff810037d38f30] crash_nmi_callback at ffffffff8007732f
 #1 [ffff810037d38f40] do_nmi at ffffffff80063b2c
 #2 [ffff810037d38f50] nmi at ffffffff8006302f
    [exception RIP: mwait_idle+54]
    RIP: ffffffff80055293  RSP: ffff810037d31ef0  RFLAGS: 00000246
    RAX: 0000000000000000  RBX: ffffffff8005525d  RCX: 0000000000000000
    RDX: 0000000000000000  RSI: 0000000000000001  RDI: ffffffff8034e8b8
    RBP: 0000000000000003   R8: ffff810037d30000   R9: 000000000000003c
    R10: ffff810009e4f6f0  R11: ffffffff8013b8fc  R12: ffffffff8040e280
    R13: ffffffff8040e280  R14: 0000000000000000  R15: 0000000000000000
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0000
bt: WARNING: possibly bogus exception frame
--- <exception stack> ---
 #3 [ffff810037d31ef0] mwait_idle at ffffffff80055293
 #4 [ffff810037d31ef0] cpu_idle at ffffffff8004729a
crash>

With the proper fix it looks like this:

crash> bt
PID: 0      TASK: ffff810037d0d100  CPU: 3   COMMAND: "swapper"
 #0 [ffff810037d38f30] crash_nmi_callback at ffffffff8007732f
 #1 [ffff810037d38f40] do_nmi at ffffffff80063b2c
 #2 [ffff810037d38f50] nmi at ffffffff8006302f
    [exception RIP: mwait_idle+54]
    RIP: ffffffff80055293  RSP: ffff810037d31ef0  RFLAGS: 00000246
    RAX: 0000000000000000  RBX: ffffffff8005525d  RCX: 0000000000000000
    RDX: 0000000000000000  RSI: 0000000000000001  RDI: ffffffff8034e8b8
    RBP: 0000000000000003   R8: ffff810037d30000   R9: 000000000000003c
    R10: ffff810009e4f6f0  R11: ffffffff8013b8fc  R12: ffffffff8040e280
    R13: ffffffff8040e280  R14: 0000000000000000  R15: 0000000000000000
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0000
--- <exception stack> ---
 #3 [ffff810037d31ef0] mwait_idle at ffffffff80055293
 #4 [ffff810037d31ef0] cpu_idle at ffffffff8004729a
crash>


Comment 3 Dave Anderson 2006-11-03 20:33:47 UTC
The third problem arises when a transition is made from the process stack
to a per-cpu IRQ stack via the recently-introduced "call_softirq" entry point.
In that case, there is no exception frame laid down on the process stack at
the time of the interrupt -- because no interrupt has occurred.  The
call_softirq facility is essentially a "call" to a function, but the
underlying stack gets switched from the process stack to the per-cpu IRQ stack.

Here's an example of a process that was performing soft IRQ processing
on its per-cpu IRQ stack, and while doing so, received a NMI interrupt
from a panicking cpu.  The problem with the "bt" command is shown below in
the transition from the IRQ stack back to the process stack, just below
frame #17.  Note that it displays a "phantom" exception frame (which doesn't
exist in the case of soft IRQ handling), it reports that the exception frame
is "possibly bogus" (which it is), and then just "stops":

crash> bt
PID: 15334  TASK: ffff810011ce47a0  CPU: 2   COMMAND: "diff"
 #0 [ffff81003ff66f30] crash_nmi_callback at ffffffff8007732f
 #1 [ffff81003ff66f40] do_nmi at ffffffff80063b2c
 #2 [ffff81003ff66f50] nmi at ffffffff8006302f
    [exception RIP: avc_has_perm_noaudit+117]
    RIP: ffffffff80118259  RSP: ffff81003ff63bf8  RFLAGS: 00000202
    RAX: 0000000000001910  RBX: 0000000000000039  RCX: 0000000000000002
    RDX: ffff81003ff19e40  RSI: 0000000000000039  RDI: 0000000000000039
    RBP: 0000000000000000   R8: ffff81003ff63c98   R9: 0000000000000000
    R10: ffff81003ff63d18  R11: 0000000000000048  R12: 0000000000000039
    R13: 0000000000000000  R14: 0000000000000001  R15: ffff81003ff63d18
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
--- <exception stack> ---
 #3 [ffff81003ff63bf8] avc_has_perm_noaudit at ffffffff80118259
 #4 [ffff81003ff63c80] avc_has_perm at ffffffff8011915f
 #5 [ffff81003ff63cf0] selinux_socket_sock_rcv_skb at ffffffff8011a48b
 #6 [ffff81003ff63db0] tcp_v4_rcv at ffffffff800268d6
 #7 [ffff81003ff63de8] udp_rcv at ffffffff80050801
 #8 [ffff81003ff63e08] scsi_next_command at ffffffff88006a92
 #9 [ffff81003ff63e40] ip_local_deliver at ffffffff800340be
#10 [ffff81003ff63e70] ip_rcv at ffffffff8003519b
#11 [ffff81003ff63ec0] netif_receive_skb at ffffffff8001fe97
#12 [ffff81003ff63f00] process_backlog at ffffffff8003008e
#13 [ffff81003ff63f40] net_rx_action at ffffffff8000c37b
#14 [ffff81003ff63f80] __do_softirq at ffffffff80011c0b
#15 [ffff81003ff63f88] call_softirq at ffffffff8005c6b0
#16 [ffff81003ff63fb0] call_softirq at ffffffff8005c6b0
--- <IRQ stack> ---
#17 [ffff8100335a3668] call_softirq at ffffffff8005c6b0
    RIP: 0000000000000000  RSP: ffffffff80060f6f  RFLAGS: ffff8100335a37d8
    RAX: ffff810080000000  RBX: ffffffff8002f5bf  RCX: ffffffff800874af
    RDX: ffff8100280a4670  RSI: ffff81002cad30cc  RDI: ffff81002b967880
    RBP: ffffffff8037e600   R8: ffffffff8003181a   R9: ffff81001282de00
    R10: ffff8100280a4670  R11: ffff81003e0b90e0  R12: ffffffff8002b596
    R13: ffffffff8037e400  R14: ffffffff8006a7d9  R15: 0000000000000000
    ORIG_RAX: 0000000000000001  CS: ffffffff80033ecd  SS: ffff8100335a37e8
bt: WARNING: possibly bogus exception frame
crash>

The reason that it "stops" is because of the RIP of 0000000000000000 in the
bogus exception frame; crash presumes that it's a user-space virtual address,
and therefore that it must be the kernel-entry-from-user-space exception frame,
so there can't be any more to the backtrace.

With the proper fix, there is no dumping of the "phantom" exception frame
upon return to the process stack, and furthermore, the remainder of the
trace on the process stack is displayed:

crash> bt
PID: 15334  TASK: ffff810011ce47a0  CPU: 2   COMMAND: "diff"
 #0 [ffff81003ff66f30] crash_nmi_callback at ffffffff8007732f
 #1 [ffff81003ff66f40] do_nmi at ffffffff80063b2c
 #2 [ffff81003ff66f50] nmi at ffffffff8006302f
    [exception RIP: avc_has_perm_noaudit+117]
    RIP: ffffffff80118259  RSP: ffff81003ff63bf8  RFLAGS: 00000202
    RAX: 0000000000001910  RBX: 0000000000000039  RCX: 0000000000000002
    RDX: ffff81003ff19e40  RSI: 0000000000000039  RDI: 0000000000000039
    RBP: 0000000000000000   R8: ffff81003ff63c98   R9: 0000000000000000
    R10: ffff81003ff63d18  R11: 0000000000000048  R12: 0000000000000039
    R13: 0000000000000000  R14: 0000000000000001  R15: ffff81003ff63d18
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
--- <exception stack> ---
 #3 [ffff81003ff63bf8] avc_has_perm_noaudit at ffffffff80118259
 #4 [ffff81003ff63c80] avc_has_perm at ffffffff8011915f
 #5 [ffff81003ff63cf0] selinux_socket_sock_rcv_skb at ffffffff8011a48b
 #6 [ffff81003ff63db0] tcp_v4_rcv at ffffffff800268d6
 #7 [ffff81003ff63de8] udp_rcv at ffffffff80050801
 #8 [ffff81003ff63e08] scsi_next_command at ffffffff88006a92
 #9 [ffff81003ff63e40] ip_local_deliver at ffffffff800340be
#10 [ffff81003ff63e70] ip_rcv at ffffffff8003519b
#11 [ffff81003ff63ec0] netif_receive_skb at ffffffff8001fe97
#12 [ffff81003ff63f00] process_backlog at ffffffff8003008e
#13 [ffff81003ff63f40] net_rx_action at ffffffff8000c37b
#14 [ffff81003ff63f80] __do_softirq at ffffffff80011c0b
#15 [ffff81003ff63f88] call_softirq at ffffffff8005c6b0
#16 [ffff81003ff63fb0] call_softirq at ffffffff8005c6b0
--- <IRQ stack> ---
#17 [ffff8100335a3670] do_softirq at ffffffff8006a7d9
#18 [ffff8100335a3680] local_bh_enable at ffffffff8002b596
#19 [ffff8100335a3690] dev_queue_xmit at ffffffff8002f5bf
#20 [ffff8100335a36b0] ip_output at ffffffff8003181a
#21 [ffff8100335a36c0] dequeue_task at ffffffff800874af
#22 [ffff8100335a36f0] ip_queue_xmit at ffffffff80033ecd
#23 [ffff8100335a3700] thread_return at ffffffff80060f6f
#24 [ffff8100335a37a0] tcp_transmit_skb at ffffffff80021add
#25 [ffff8100335a3800] __tcp_push_pending_frames at ffffffff80032ba4
#26 [ffff8100335a3820] __alloc_skb at ffffffff8002de03
#27 [ffff8100335a3860] tcp_sendmsg at ffffffff800261bc
#28 [ffff8100335a38a0] rpc_release_task at ffffffff882f171f
#29 [ffff8100335a38f0] sock_sendmsg at ffffffff80052f3d
#30 [ffff8100335a3900] nfs3_rpc_wrapper at ffffffff8843697d
#31 [ffff8100335a3950] wake_up_bit at ffffffff8002cd74
#32 [ffff8100335a39a0] call_transmit_status at ffffffff882ec220
#33 [ffff8100335a3a90] kernel_sendmsg at ffffffff802030e1
#34 [ffff8100335a3aa0] xs_tcp_send_request at ffffffff882ef566
#35 [ffff8100335a3b60] xprt_transmit at ffffffff882ee440
#36 [ffff8100335a3b90] call_transmit at ffffffff882ec412
#37 [ffff8100335a3bc0] __rpc_execute at ffffffff882f17cc
#38 [ffff8100335a3be0] rpc_call_sync at ffffffff882ec592
#39 [ffff8100335a3bf0] rpc_save_sigmask at ffffffff882ec50f
#40 [ffff8100335a3c10] nfs3_rpc_wrapper at ffffffff88436946
#41 [ffff8100335a3c50] nfs3_proc_access at ffffffff88436f71
#42 [ffff8100335a3c70] dput at ffffffff8000ce75
#43 [ffff8100335a3c90] __link_path_walk at ffffffff8000a29f
#44 [ffff8100335a3ce0] rpcauth_lookup_credcache at ffffffff882f28b2
#45 [ffff8100335a3d80] nfs_permission at ffffffff8842a182
#46 [ffff8100335a3d90] inode_has_perm at ffffffff80119cae
#47 [ffff8100335a3df0] permission at ffffffff8000d3ce
#48 [ffff8100335a3e10] may_open at ffffffff80011cda
#49 [ffff8100335a3e40] open_namei at ffffffff8001a7a4
#50 [ffff8100335a3e60] __dentry_open at ffffffff8001e030
#51 [ffff8100335a3ea0] do_filp_open at ffffffff80027041
#52 [ffff8100335a3f50] do_sys_open at ffffffff80019308
#53 [ffff8100335a3f80] tracesys at ffffffff8005b641
    RIP: 000000303c8bf580  RSP: 00007fff46414868  RFLAGS: 00000246
    RAX: ffffffffffffffda  RBX: ffffffff8005b641  RCX: ffffffffffffffff
    RDX: 0000000000000000  RSI: 0000000000000000  RDI: 00000000006151a0
    RBP: 0000000000616a70   R8: 00000000fffffffe   R9: 766972642f30742d
    R10: 0000000000000002  R11: 0000000000000246  R12: 00000000ffffff01
    R13: 00000000ffffffff  R14: 00007fff46414b00  R15: 0000000000000000
    ORIG_RAX: 0000000000000002  CS: 0033  SS: 002b
crash>


Comment 4 Dave Anderson 2006-11-03 20:41:11 UTC
Linda, 

  Is it possible to get this into RHEL5-B2?



Comment 5 Linda Wang 2006-11-03 20:44:43 UTC
The fix is ready to go, would be good to pull this into B2.

Comment 6 RHEL Program Management 2006-11-03 20:45:52 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux release.  Product Management has requested further review
of this request by Red Hat Engineering.  This request is not yet committed for
inclusion in release.

Comment 7 Dave Anderson 2006-11-03 21:27:38 UTC
crash version 4.0-3.10 checked into RHEL5 CVS.


Comment 8 Jay Turner 2006-11-06 15:00:35 UTC
QE ack for RHEL5B2.

Comment 11 Dave Anderson 2006-11-08 19:56:55 UTC
IBM's testing showed that this fix can fail based upon the
size of the user's display window column size, because a
behind-the-scenes gdb function does line-wrapping even when
the output is redirected to a temporary file.  My testing
had been done in a larger window size.

Anyway, another upgrade needs to be done -- setting back
to ASSIGNED, and working on a fix...

Dave


Comment 13 Dave Anderson 2006-11-09 13:23:32 UTC
Fix is in crash version 4.0-3.11, which has been committed into RHEL5 CVS.

Please restore rhel-5.0.0, blocker-beta and pm_ack.


Note You need to log in before you can comment on or make changes to this bug.