Bug 506242 - irq timeout message resulting in system hanging
Status: CLOSED INSUFFICIENT_DATA
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.2
Hardware: x86_64 Linux
Priority: low
Severity: medium
Target Milestone: rc
Target Release: ---
Assigned To: Prarit Bhargava
QA Contact: Red Hat Kernel QE team
Depends On:
Blocks:
Reported: 2009-06-16 06:53 EDT by cormac
Modified: 2009-10-20 14:37 EDT (History)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2009-10-20 14:37:13 EDT


Attachments: None
Description cormac 2009-06-16 06:53:26 EDT
Description of problem:
The root filesystem disk reports an irq timeout and the system goes into an unresponsive state. A hard reboot is required to rectify the issue.


Version-Release number of selected component (if applicable):  

Kernel version 2.6.18-53.el5


How reproducible:

System set up with Oracle 10G running. There are no exact steps to reproduce, as the issue occurs over time; this is the second instance in a number of weeks.


Steps to Reproduce:
1.  IBM x3950M2 hardware setup with external SAN.
2.  System has oracle 10G running on it.
3.  A number of weeks ago the system was unresponsive and required a hardware reset.
4.  Examined /var/log/messages and there was no info related to any issues with the system.
5.  Increased logging levels in /var/log/messages in case we ran into the issue in future.
6.  The issue occurred again overnight, with the system totally unresponsive. /var/log/messages has a number of unfamiliar messages, as below:

Jun 15 20:32:11 $HOSTNAME setroubleshoot:      SELinux is preventing access to files with the label, file_t.      For complete SELinux messages. run sealert -l c6f5dcfc-9982-4261-bfae-330a6f231206

Jun 15 20:54:17 $HOSTNAME kernel: hda: irq timeout: status=0xd0 { Busy }
Jun 15 20:54:17 $HOSTNAME kernel: ide: failed opcode was: unknown
Jun 15 20:54:47 $HOSTNAME kernel: hda: ATAPI reset timed-out, status=0xd0
Jun 15 20:55:18 $HOSTNAME kernel: ide0: reset timed-out, status=0xd0
Jun 15 20:55:22 $HOSTNAME kernel: hda: status timeout: status=0xd0 { Busy }
Jun 15 20:55:22 $HOSTNAME kernel: ide: failed opcode was: unknown
Jun 15 20:55:22 $HOSTNAME kernel: hda: drive not ready for command
Jun 15 20:55:52 $HOSTNAME kernel: hda: ATAPI reset timed-out, status=0xd0
Jun 15 20:56:22 $HOSTNAME kernel: ide0: reset timed-out, status=0xd0
Jun 15 20:56:31 $HOSTNAME kernel: BUG: soft lockup detected on CPU#21!
Jun 15 20:56:31 $HOSTNAME kernel:
Jun 15 20:56:31 $HOSTNAME kernel: Call Trace:
Jun 15 20:56:31 $HOSTNAME kernel:  <IRQ>  [<ffffffff800b50fa>] softlockup_tick+0xd5/0xe7
Jun 15 20:56:31 $HOSTNAME kernel:  [<ffffffff800930e2>] update_process_times+0x42/0x68
Jun 15 20:56:31 $HOSTNAME kernel:  [<ffffffff800746e3>] smp_local_timer_interrupt+0x23/0x47
Jun 15 20:56:31 $HOSTNAME kernel:  [<ffffffff80074da5>] smp_apic_timer_interrupt+0x41/0x47
Jun 15 20:56:31 $HOSTNAME kernel:  [<ffffffff8005bc8e>] apic_timer_interrupt+0x66/0x6c
Jun 15 20:56:31 $HOSTNAME kernel:  <EOI>  [<ffffffff80062ad0>] _spin_unlock_irqrestore+0x8/0x9
Jun 15 20:56:31 $HOSTNAME kernel:  [<ffffffff8000ae66>] ide_end_request+0xf0/0xfc
Jun 15 20:56:31 $HOSTNAME kernel:  [<ffffffff8000edc7>] ide_do_request+0x708/0x787
Jun 15 20:56:31 $HOSTNAME kernel:  [<ffffffff8003d03b>] lock_timer_base+0x1b/0x3c
Jun 15 20:56:31 $HOSTNAME kernel:  [<ffffffff80031c4f>] del_timer+0x4e/0x57
Jun 15 20:56:31 $HOSTNAME kernel:  [<ffffffff80134f37>] elv_insert+0xd6/0x1f7
Jun 15 20:56:31 $HOSTNAME kernel:  [<ffffffff801367d6>] blk_execute_rq_nowait+0x7e/0x9a
Jun 15 20:56:31 $HOSTNAME kernel:  [<ffffffff80136890>] blk_execute_rq+0x9e/0xce
Jun 15 20:56:31 $HOSTNAME kernel:  [<ffffffff80139ac0>] sg_io+0x235/0x333
Jun 15 20:56:31 $HOSTNAME kernel:  [<ffffffff8013a036>] scsi_cmd_ioctl+0x1c3/0x3a6
Jun 15 20:56:31 $HOSTNAME kernel:  [<ffffffff801c0af4>] generic_ide_ioctl+0x1f/0x50c
Jun 15 20:56:31 $HOSTNAME kernel:  [<ffffffff881e0d8b>] :cdrom:cdrom_ioctl+0x31/0xc18
Jun 15 20:56:31 $HOSTNAME kernel:  [<ffffffff8000a2e0>] __link_path_walk+0xdf8/0xf42
Jun 15 20:56:31 $HOSTNAME kernel:  [<ffffffff881fad4e>] :ide_cd:idecd_ioctl+0x13f/0x159
Jun 15 20:56:31 $HOSTNAME kernel:  [<ffffffff8011c70b>] avc_has_perm+0x43/0x55
Jun 15 20:56:31 $HOSTNAME kernel:  [<ffffffff80138055>] blkdev_driver_ioctl+0x5d/0x72
Jun 15 20:56:31 $HOSTNAME kernel:  [<ffffffff801386a9>] blkdev_ioctl+0x63f/0x69a
Jun 15 20:56:31 $HOSTNAME kernel:  [<ffffffff8011d242>] inode_has_perm+0x56/0x63
Jun 15 20:56:31 $HOSTNAME kernel:  [<ffffffff800da67d>] blkdev_open+0x0/0x4f
Jun 15 20:56:31 $HOSTNAME kernel:  [<ffffffff800da6b7>] blkdev_open+0x3a/0x4f
Jun 15 20:56:31 $HOSTNAME kernel:  [<ffffffff8001e11c>] __dentry_open+0x101/0x1dc
Jun 15 20:56:31 $HOSTNAME kernel:  [<ffffffff800d9af4>] block_ioctl+0x1b/0x1f
Jun 15 20:56:31 $HOSTNAME kernel:  [<ffffffff8003fc22>] do_ioctl+0x21/0x6b
Jun 15 20:56:31 $HOSTNAME kernel:  [<ffffffff8002fc67>] vfs_ioctl+0x248/0x261
Jun 15 20:56:31 $HOSTNAME kernel:  [<ffffffff8004a242>] sys_ioctl+0x59/0x78
Jun 15 20:56:31 $HOSTNAME kernel:  [<ffffffff8005b28d>] tracesys+0xd5/0xe0
Jun 15 20:56:31 $HOSTNAME kernel:
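
For reference when reading the log above, the status=0xd0 value can be decoded against the standard ATA status register bit layout. The following is only a minimal sketch and is not taken from the kernel source; the names ATA_STATUS_BITS and decode_ata_status are hypothetical helpers introduced for illustration. Bit 4 is labelled DSC here, although its meaning can differ for ATAPI devices.

```python
# Standard ATA status register bits, most significant bit first.
# Helper names are illustrative, not kernel identifiers.
ATA_STATUS_BITS = [
    (0x80, "BSY"),   # device busy
    (0x40, "DRDY"),  # device ready
    (0x20, "DF"),    # device fault / write fault
    (0x10, "DSC"),   # seek complete (meaning may differ for ATAPI)
    (0x08, "DRQ"),   # data request
    (0x04, "CORR"),  # corrected data
    (0x02, "IDX"),   # index
    (0x01, "ERR"),   # error
]


def decode_ata_status(status):
    """Return the names of all bits set in an ATA status byte."""
    return [name for mask, name in ATA_STATUS_BITS if status & mask]


if __name__ == "__main__":
    # Value reported repeatedly in the log above.
    print(decode_ata_status(0xD0))
```

For 0xd0 the BSY bit is set, which matches the "{ Busy }" annotation in the kernel messages. Per the ATA specification the remaining status bits are not meaningful while BSY is set, so the value mainly indicates that the device on hda never left the busy state.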

  
Actual results:


Expected results:


Additional info:
Comment 1 Prarit Bhargava 2009-06-29 09:35:01 EDT
Cormac,

I've seen similar reports to this BZ -- IBM x3XXX systems hanging in IDE/CDROM access.

Can you verify that you are running the latest FW, and that there are no HW upgrades to apply? In some of the other reported cases, updating to the latest FW or performing a HW upgrade seems to resolve the problem.

P.
Comment 2 cormac 2009-07-10 09:12:42 EDT
I will take a look at the current firmware version on the system and see if it needs updating. Thankfully we have not encountered the issue since we last saw it.

I will provide feedback early next week (week starting 13th July 2009).

Cormac.
Comment 3 Prarit Bhargava 2009-09-23 10:16:43 EDT
Cormac, any updates?

Thanks,

P.
