Bug 501075

Summary:

soft lockups with software RAID6 create and re-sync'

Product:

Red Hat Enterprise Linux 5

Reporter:

Issue Tracker <tao>

Component:

kernel

Assignee:

Doug Ledford <dledford>

Status:

CLOSED ERRATA

QA Contact:

Red Hat Kernel QE team <kernel-qe>

Severity:

high

Docs Contact:

Priority:

high

Version:

5.3

CC:

cward, daniel.brnak, djeffery, jtluka, kbaxley, paul.gregory-cullen, rrajaram, tao

Target Milestone:

Target Release:

---

Hardware:

All

OS:

Linux

Whiteboard:

Fixed In Version:

Doc Type:

Bug Fix

Doc Text:

Story Points:

---

Clone Of:

Environment:

Last Closed:

2010-03-30 07:13:32 UTC

Type:

---

Regression:

---

Mount Type:

---

Documentation:

---

CRM:

Verified Versions:

Category:

---

oVirt Team:

---

RHEL 7.3 requirements from Atomic Host:

Cloudforms Team:

---

Target Upstream Version:

Embargoed:

Bug Depends On:

Bug Blocks:

499522, 525215, 533192

Attachments:

Description	Flags
add cond_resched to loop in function raid5d	none

Description Issue Tracker 2009-05-15 20:20:23 UTC

Escalated to Bugzilla from IssueTracker

Comment 1 Issue Tracker 2009-05-15 20:20:25 UTC

Description of problem: On RHLE5.3 system customer create a RAID6 with 24 disks which results into a kernel panic and call trace in the messages file. The issue also reproducible when try to re-sync the disk after removing a disk.

#mdadm --create /dev/md_d1 --metadata=1.0 --level=6 --raid-devices=24 /dev/sdd1 /dev/sdh1 /dev/sdl1 /dev/sdp1 /dev/sdt1 /dev/sdx1 /dev/sdab1 /dev/sdaf1 /dev/sdaj1 /dev/sdan1 /dev/sdar1 /dev/sdav1 /dev/sdc1 /dev/sdg1 /dev/sdk1 /dev/sdo1 /dev/sds1 /dev/sdw1 /dev/sdaa1 /dev/sdae1 /dev/sdai1 /dev/sdam1 /dev/sdaq1 /dev/sdau1

dmesg:
Jan 27 17:05:14 sunstorage01 kernel: md: syncing RAID array md_d1
Jan 27 17:05:14 sunstorage01 kernel: md: minimum _guaranteed_ reconstruction speed: 1000 KB/sec/disc.
Jan 27 17:05:14 sunstorage01 kernel: md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for reconstruction.
Jan 27 17:05:14 sunstorage01 kernel: md: using 128k window, over a total of 976759808 blocks.
Jan 27 17:05:14 sunstorage01 kernel:  md_d1: unknown partition table
Jan 27 17:05:26 sunstorage01 kernel: BUG: soft lockup - CPU#1 stuck for 10s! [md_d1_raid5:6923]
Jan 27 17:05:26 sunstorage01 kernel: CPU 1:
Jan 27 17:05:26 sunstorage01 kernel: Modules linked in: raid456 xor ipv6 xfrm_nalgo crypto_api mptctl autofs4 ipmi_devintf ipmi_si ipmi_msghandler hidp l2cap bluetooth sunrpc dm_mirror dm_multipath dm_mod video sbs backlight i2c_ec button battery asus_acpi acpi_memhotplug ac parport_pc lp parport joydev sg shpchp i2c_nforce2 mlx4_core forcedeth i2c_core pcspkr usb_storage mptsas mptscsih mptbase scsi_transport_sas sd_mod scsi_mod raid1 ext3 jbd uhci_hcd ohci_hcd ehci_hcd
Jan 27 17:05:26 sunstorage01 kernel: Pid: 6923, comm: md_d1_raid5 Not tainted 2.6.18-92.1.22.el5 #1
Jan 27 17:05:56 sunstorage01 kernel: RIP: 0010:[<ffffffff8000b5c7>]  [<ffffffff8000b5c7>] memcmp+0x1a/0x22
Jan 27 17:05:56 sunstorage01 kernel: RSP: 0018:ffff811027d09cd8  EFLAGS: 00000206
Jan 27 17:05:56 sunstorage01 kernel: RAX: 0000000000000000 RBX: ffff81080b7c60b8 RCX: 0000000000000665
Jan 27 17:05:56 sunstorage01 kernel: RDX: 0000000000000000 RSI: ffff81080b7d899b RDI: ffff81081574c99b
Jan 27 17:05:56 sunstorage01 kernel: RBP: ffff81080b7d6000 R08: ffff810001000810 R09: 0000000000001000
Jan 27 17:05:56 sunstorage01 kernel: R10: ffff81080b7d9000 R11: ffff81080b7d8000 R12: ffff81080b7d5000
Jan 27 17:05:56 sunstorage01 kernel: R13: ffff81080b7d4000 R14: ffff81080b7d3000 R15: ffff81080b7d2000
Jan 27 17:05:56 sunstorage01 kernel: FS:  00002b47d593bb40(0000) GS:ffff81011c9608c0(0000) knlGS:0000000000000000
Jan 27 17:05:56 sunstorage01 kernel: CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
Jan 27 17:05:56 sunstorage01 kernel: CR2: 00000038a806abe0 CR3: 000000081aef4000 CR4: 00000000000006e0
Jan 27 17:05:56 sunstorage01 kernel: 
Jan 27 17:05:56 sunstorage01 kernel: Call Trace:
Jan 27 17:05:56 sunstorage01 kernel:  [<ffffffff883f7d10>] :raid456:handle_stripe+0xe5b/0x24d4
Jan 27 17:05:56 sunstorage01 kernel:  [<ffffffff800892a6>] __wake_up_common+0x3e/0x68
Jan 27 17:05:56 sunstorage01 kernel:  [<ffffffff8002e284>] __wake_up+0x38/0x4f
Jan 27 17:05:56 sunstorage01 kernel:  [<ffffffff8009dc54>] keventd_create_kthread+0x0/0xc4
Jan 27 17:05:56 sunstorage01 kernel:  [<ffffffff8009dc54>] keventd_create_kthread+0x0/0xc4
Jan 27 17:05:56 sunstorage01 kernel:  [<ffffffff883f94d6>] :raid456:raid5d+0x14d/0x17b
Jan 27 17:05:56 sunstorage01 kernel:  [<ffffffff8003a940>] prepare_to_wait+0x34/0x5c
Jan 27 17:05:56 sunstorage01 kernel:  [<ffffffff802030c7>] md_thread+0xf8/0x10e
Jan 27 17:05:56 sunstorage01 kernel:  [<ffffffff8009de6c>] autoremove_wake_function+0x0/0x2e
Jan 27 17:05:56 sunstorage01 kernel:  [<ffffffff80202fcf>] md_thread+0x0/0x10e
Jan 27 17:05:56 sunstorage01 kernel:  [<ffffffff80032569>] kthread+0xfe/0x132
Jan 27 17:05:56 sunstorage01 kernel:  [<ffffffff8005dfb1>] child_rip+0xa/0x11
Jan 27 17:05:56 sunstorage01 kernel:  [<ffffffff8009dc54>] keventd_create_kthread+0x0/0xc4
Jan 27 17:05:56 sunstorage01 kernel:  [<ffffffff8003246b>] kthread+0x0/0x132
Jan 27 17:05:56 sunstorage01 kernel:  [<ffffffff8005dfa7>] child_rip+0x0/0x11
Jan 27 17:05:56 sunstorage01 kernel: 


How reproducible: Random

Steps to Reproduce:

As per customer comment most of the time issue reproducible while creating and resync of the devices. They simply use the command provided in Problem Desc.

Actual results: Call trace visible with vmcore generation.

Expected results: RAID creation and resync should go smoothly with no Call trace.

Additional info:

We tried reproducing the issue in test environment however no success. Vmcore was not uploaded to the ftp site by the customer and we have setup for the vmcore on a test system :

vmcore ---

IP :10.65.6.56
root/redhat

#crash /usr/lib/debug/lib/modules/2.6.18-128.el5/vmlinux vmcore-1887814

I am uploading the core to ftp site as well with name : vmcore-1887814.gz
This event sent from IssueTracker by djeffery  [Support Engineering Group]
 issue 268390

Comment 2 David Jeffery 2009-05-15 20:39:57 UTC

With large numbers of disks and raid6, it seems the raid5 worker thread is saturating the cpu.  And there's no guarantee it will ever schedule in a timely manner.  Instead, it's stuck in a loop where more work is continuously added as it finishes older work.

As an example, in one case while debugging the issue over 1 million iterations of the loop in raid5d() occurred before the loop was finally broken.  It took almost 3.5 minutes before finally leaving the raid5d() function.

Comment 3 David Jeffery 2009-05-15 20:46:31 UTC

Created attachment 344232 [details]
add cond_resched to loop in function raid5d

After adding a call to cond_resched() in raid5d() where no locks are held, I have not been able to reproduce the soft lockup.  The raid5 worker thread still runs almost continuously, but will no longer monopolize a cpu as the resync thread feeds it requests.

Comment 14 Don Zickus 2009-12-15 20:17:35 UTC

in kernel-2.6.18-181.el5
You can download this test kernel from http://people.redhat.com/dzickus/el5

Please update the appropriate value in the Verified field
(cf_verified) to indicate this fix has been successfully
verified. Include a comment with verification details.

Comment 17 errata-xmlrpc 2010-03-30 07:13:32 UTC

An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on therefore solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2010-0178.html