Bug 501075

Summary: soft lockups with software RAID6 create and re-sync'
Product: Red Hat Enterprise Linux 5 Reporter: Issue Tracker <tao>
Component: kernelAssignee: Doug Ledford <dledford>
Status: CLOSED ERRATA QA Contact: Red Hat Kernel QE team <kernel-qe>
Severity: high Docs Contact:
Priority: high    
Version: 5.3CC: cward, daniel.brnak, djeffery, jtluka, kbaxley, paul.gregory-cullen, rrajaram, tao
Target Milestone: rc   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2010-03-30 07:13:32 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 499522, 525215, 533192    
Attachments:
Description Flags
add cond_resched to loop in function raid5d none

Description Issue Tracker 2009-05-15 20:20:23 UTC
Escalated to Bugzilla from IssueTracker

Comment 1 Issue Tracker 2009-05-15 20:20:25 UTC
Description of problem: On RHLE5.3 system customer create a RAID6 with 24 disks which results into a kernel panic and call trace in the messages file. The issue also reproducible when try to re-sync the disk after removing a disk.

#mdadm --create /dev/md_d1 --metadata=1.0 --level=6 --raid-devices=24 /dev/sdd1 /dev/sdh1 /dev/sdl1 /dev/sdp1 /dev/sdt1 /dev/sdx1 /dev/sdab1 /dev/sdaf1 /dev/sdaj1 /dev/sdan1 /dev/sdar1 /dev/sdav1 /dev/sdc1 /dev/sdg1 /dev/sdk1 /dev/sdo1 /dev/sds1 /dev/sdw1 /dev/sdaa1 /dev/sdae1 /dev/sdai1 /dev/sdam1 /dev/sdaq1 /dev/sdau1

dmesg:
Jan 27 17:05:14 sunstorage01 kernel: md: syncing RAID array md_d1
Jan 27 17:05:14 sunstorage01 kernel: md: minimum _guaranteed_ reconstruction speed: 1000 KB/sec/disc.
Jan 27 17:05:14 sunstorage01 kernel: md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for reconstruction.
Jan 27 17:05:14 sunstorage01 kernel: md: using 128k window, over a total of 976759808 blocks.
Jan 27 17:05:14 sunstorage01 kernel:  md_d1: unknown partition table
Jan 27 17:05:26 sunstorage01 kernel: BUG: soft lockup - CPU#1 stuck for 10s! [md_d1_raid5:6923]
Jan 27 17:05:26 sunstorage01 kernel: CPU 1:
Jan 27 17:05:26 sunstorage01 kernel: Modules linked in: raid456 xor ipv6 xfrm_nalgo crypto_api mptctl autofs4 ipmi_devintf ipmi_si ipmi_msghandler hidp l2cap bluetooth sunrpc dm_mirror dm_multipath dm_mod video sbs backlight i2c_ec button battery asus_acpi acpi_memhotplug ac parport_pc lp parport joydev sg shpchp i2c_nforce2 mlx4_core forcedeth i2c_core pcspkr usb_storage mptsas mptscsih mptbase scsi_transport_sas sd_mod scsi_mod raid1 ext3 jbd uhci_hcd ohci_hcd ehci_hcd
Jan 27 17:05:26 sunstorage01 kernel: Pid: 6923, comm: md_d1_raid5 Not tainted 2.6.18-92.1.22.el5 #1
Jan 27 17:05:56 sunstorage01 kernel: RIP: 0010:[<ffffffff8000b5c7>]  [<ffffffff8000b5c7>] memcmp+0x1a/0x22
Jan 27 17:05:56 sunstorage01 kernel: RSP: 0018:ffff811027d09cd8  EFLAGS: 00000206
Jan 27 17:05:56 sunstorage01 kernel: RAX: 0000000000000000 RBX: ffff81080b7c60b8 RCX: 0000000000000665
Jan 27 17:05:56 sunstorage01 kernel: RDX: 0000000000000000 RSI: ffff81080b7d899b RDI: ffff81081574c99b
Jan 27 17:05:56 sunstorage01 kernel: RBP: ffff81080b7d6000 R08: ffff810001000810 R09: 0000000000001000
Jan 27 17:05:56 sunstorage01 kernel: R10: ffff81080b7d9000 R11: ffff81080b7d8000 R12: ffff81080b7d5000
Jan 27 17:05:56 sunstorage01 kernel: R13: ffff81080b7d4000 R14: ffff81080b7d3000 R15: ffff81080b7d2000
Jan 27 17:05:56 sunstorage01 kernel: FS:  00002b47d593bb40(0000) GS:ffff81011c9608c0(0000) knlGS:0000000000000000
Jan 27 17:05:56 sunstorage01 kernel: CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
Jan 27 17:05:56 sunstorage01 kernel: CR2: 00000038a806abe0 CR3: 000000081aef4000 CR4: 00000000000006e0
Jan 27 17:05:56 sunstorage01 kernel: 
Jan 27 17:05:56 sunstorage01 kernel: Call Trace:
Jan 27 17:05:56 sunstorage01 kernel:  [<ffffffff883f7d10>] :raid456:handle_stripe+0xe5b/0x24d4
Jan 27 17:05:56 sunstorage01 kernel:  [<ffffffff800892a6>] __wake_up_common+0x3e/0x68
Jan 27 17:05:56 sunstorage01 kernel:  [<ffffffff8002e284>] __wake_up+0x38/0x4f
Jan 27 17:05:56 sunstorage01 kernel:  [<ffffffff8009dc54>] keventd_create_kthread+0x0/0xc4
Jan 27 17:05:56 sunstorage01 kernel:  [<ffffffff8009dc54>] keventd_create_kthread+0x0/0xc4
Jan 27 17:05:56 sunstorage01 kernel:  [<ffffffff883f94d6>] :raid456:raid5d+0x14d/0x17b
Jan 27 17:05:56 sunstorage01 kernel:  [<ffffffff8003a940>] prepare_to_wait+0x34/0x5c
Jan 27 17:05:56 sunstorage01 kernel:  [<ffffffff802030c7>] md_thread+0xf8/0x10e
Jan 27 17:05:56 sunstorage01 kernel:  [<ffffffff8009de6c>] autoremove_wake_function+0x0/0x2e
Jan 27 17:05:56 sunstorage01 kernel:  [<ffffffff80202fcf>] md_thread+0x0/0x10e
Jan 27 17:05:56 sunstorage01 kernel:  [<ffffffff80032569>] kthread+0xfe/0x132
Jan 27 17:05:56 sunstorage01 kernel:  [<ffffffff8005dfb1>] child_rip+0xa/0x11
Jan 27 17:05:56 sunstorage01 kernel:  [<ffffffff8009dc54>] keventd_create_kthread+0x0/0xc4
Jan 27 17:05:56 sunstorage01 kernel:  [<ffffffff8003246b>] kthread+0x0/0x132
Jan 27 17:05:56 sunstorage01 kernel:  [<ffffffff8005dfa7>] child_rip+0x0/0x11
Jan 27 17:05:56 sunstorage01 kernel: 


How reproducible: Random

Steps to Reproduce:

As per customer comment most of the time issue reproducible while creating and resync of the devices. They simply use the command provided in Problem Desc.

Actual results: Call trace visible with vmcore generation.

Expected results: RAID creation and resync should go smoothly with no Call trace.

Additional info:

We tried reproducing the issue in test environment however no success. Vmcore was not uploaded to the ftp site by the customer and we have setup for the vmcore on a test system :

vmcore ---

IP :10.65.6.56
root/redhat

#crash /usr/lib/debug/lib/modules/2.6.18-128.el5/vmlinux vmcore-1887814

I am uploading the core to ftp site as well with name : vmcore-1887814.gz
This event sent from IssueTracker by djeffery  [Support Engineering Group]
 issue 268390

Comment 2 David Jeffery 2009-05-15 20:39:57 UTC
With large numbers of disks and raid6, it seems the raid5 worker thread is saturating the cpu.  And there's no guarantee it will ever schedule in a timely manner.  Instead, it's stuck in a loop where more work is continuously added as it finishes older work.

As an example, in one case while debugging the issue over 1 million iterations of the loop in raid5d() occurred before the loop was finally broken.  It took almost 3.5 minutes before finally leaving the raid5d() function.

Comment 3 David Jeffery 2009-05-15 20:46:31 UTC
Created attachment 344232 [details]
add cond_resched to loop in function raid5d

After adding a call to cond_resched() in raid5d() where no locks are held, I have not been able to reproduce the soft lockup.  The raid5 worker thread still runs almost continuously, but will no longer monopolize a cpu as the resync thread feeds it requests.

Comment 14 Don Zickus 2009-12-15 20:17:35 UTC
in kernel-2.6.18-181.el5
You can download this test kernel from http://people.redhat.com/dzickus/el5

Please update the appropriate value in the Verified field
(cf_verified) to indicate this fix has been successfully
verified. Include a comment with verification details.

Comment 17 errata-xmlrpc 2010-03-30 07:13:32 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on therefore solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2010-0178.html