Bug 1258153
Summary: | md is hanging in break_stripe_batch_list | ||
---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Thomas Davis <tdavis> |
Component: | kernel | Assignee: | Kernel Maintainer List <kernel-maint> |
Status: | CLOSED INSUFFICIENT_DATA | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
Severity: | high | Docs Contact: | |
Priority: | unspecified | ||
Version: | 22 | CC: | calle, gansalmon, itamar, jonathan, kernel-maint, madhu.chinakonda, mchehab |
Target Milestone: | --- | Flags: | jforbes: needinfo? |
Target Release: | --- | ||
Hardware: | x86_64 | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2015-11-23 17:24:11 UTC | Type: | Bug |
Description
Thomas Davis
2015-08-29 17:40:08 UTC
I forgot - I have a script that runs on boot that does:

    MDS="md0 md1 md2"
    for MD in $MDS
    do
        echo 8192 > /sys/block/$MD/md/stripe_cache_size
        echo 2048 > /sys/block/$MD/queue/read_ahead_kb
    done

I've since commented out the stripe_cache_size increase and will see if the system is any more stable.

I've had the same issue twice now in two days on F21. The machine had been running fine for a long time on F20 and was then upgraded to F21. It did not happen under load either time. The RAID resynced after the crash yesterday and seemed clean, and the data seems intact, but last night it blew up again. Is this a problem with the actual md driver, or do I have a hardware issue?

Sep 9 04:46:05 a.hostname kernel: [56837.517234] WARNING: CPU: 2 PID: 685 at drivers/md/raid5.c:4226 break_stripe_batch_list+0x1b6/0x260 [raid456]()
Sep 9 04:46:05 a.hostname kernel: [56837.517237] Modules linked in: vhost_net vhost macvtap macvlan xt_geoip(OE) xt_iprange xt_comment xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack tun bridge stp llc raid456 async_raid6_recov async_memcpy async_pq async_xor kvm_amd ppdev xor async_tx raid6_pq kvm crct10dif_pclmul snd_hda_codec_hdmi crc32_pclmul snd_hda_intel crc32c_intel snd_hda_controller snd_hda_codec snd_hda_core ghash_clmulni_intel snd_hwdep snd_seq snd_seq_device k10temp i2c_piix4 snd_pcm parport_pc snd_timer parport shpchp snd tpm_infineon tpm_tis tpm soundcore acpi_cpufreq nfsd auth_rpcgss nfs_acl lockd grace sunrpc amdkfd amd_iommu_v2 radeon i2c_algo_bit drm_kms_helper ttm serio_raw drm r8169 mii
Sep 9 04:46:05 a.hostname kernel: [56837.517303] CPU: 2 PID: 685 Comm: md0_raid5 Tainted: G OE 4.1.6-100.fc21.x86_64 #1
Sep 9 04:46:05 a.hostname kernel: [56837.517306] Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./F2A75M-D3H, BIOS F3 09/20/2012
Sep 9 04:46:05 a.hostname kernel: [56837.517310] 0000000000000000 000000002e5f5144 ffff880818faba98 ffffffff817940d5
Sep 9 04:46:05 a.hostname kernel: [56837.517314] 0000000000000000 0000000000000000 ffff880818fabad8 ffffffff810a163a
Sep 9 04:46:05 a.hostname kernel: [56837.517319] 0000000000000348 0000000000000000 ffff88037e721f60 ffff8806f3395ea8
Sep 9 04:46:05 a.hostname kernel: [56837.517323] Call Trace:
Sep 9 04:46:05 a.hostname kernel: [56837.517332] [<ffffffff817940d5>] dump_stack+0x45/0x57
Sep 9 04:46:05 a.hostname kernel: [56837.517338] [<ffffffff810a163a>] warn_slowpath_common+0x8a/0xc0
Sep 9 04:46:05 a.hostname kernel: [56837.517343] [<ffffffff810a176a>] warn_slowpath_null+0x1a/0x20
Sep 9 04:46:05 a.hostname kernel: [56837.517349] [<ffffffffa052fa26>] break_stripe_batch_list+0x1b6/0x260 [raid456]
Sep 9 04:46:05 a.hostname kernel: [56837.517357] [<ffffffffa05337b0>] handle_stripe+0x880/0x26d0 [raid456]
Sep 9 04:46:05 a.hostname kernel: [56837.517366] [<ffffffffa05357ae>] handle_active_stripes.isra.46+0x1ae/0x520 [raid456]
Sep 9 04:46:05 a.hostname kernel: [56837.517371] [<ffffffff815f53b9>] ? md_wakeup_thread+0x39/0x70
Sep 9 04:46:05 a.hostname kernel: [56837.517377] [<ffffffffa0529cd3>] ? do_release_stripe+0xe3/0x190 [raid456]
Sep 9 04:46:05 a.hostname kernel: [56837.517384] [<ffffffffa05367a8>] raid5d+0x4b8/0x680 [raid456]
Sep 9 04:46:05 a.hostname kernel: [56837.517389] [<ffffffff8179632d>] ? __schedule+0x2dd/0x960
Sep 9 04:46:05 a.hostname kernel: [56837.517393] [<ffffffff815f7584>] md_thread+0x154/0x160
Sep 9 04:46:05 a.hostname kernel: [56837.517398] [<ffffffff810e4830>] ? wait_woken+0x90/0x90
Sep 9 04:46:05 a.hostname kernel: [56837.517402] [<ffffffff815f7430>] ? find_pers+0x80/0x80
Sep 9 04:46:05 a.hostname kernel: [56837.517406] [<ffffffff810c06c8>] kthread+0xd8/0xf0
Sep 9 04:46:05 a.hostname kernel: [56837.517410] [<ffffffff810c05f0>] ? kthread_create_on_node+0x1b0/0x1b0
Sep 9 04:46:05 a.hostname kernel: [56837.517415] [<ffffffff8179ada2>] ret_from_fork+0x42/0x70
Sep 9 04:46:05 a.hostname kernel: [56837.517418] [<ffffffff810c05f0>] ? kthread_create_on_node+0x1b0/0x1b0
Sep 9 04:46:05 a.hostname kernel: [56837.517422] ---[ end trace ab62cdc458212cb4 ]---

I ended up moving back to the only 4.0 kernel I could find:

    Linux tank 4.0.4-301.fc22.x86_64 #1 SMP Thu May 21 13:10:33 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

and the problem went away. I also found that the code generating this problem was introduced in the 4.1.x kernels. I noticed that a kernel-4.1.6 was released into updates lately, but I have no idea whether it contains any updates to the md driver.

Same for me, works well with 4.0.4-301.fc22.x86_64.

*********** MASS BUG UPDATE **************

We apologize for the inconvenience. There is a large number of bugs to go through and several of them have gone stale. Due to this, we are doing a mass bug update across all of the Fedora 22 kernel bugs.

Fedora 22 has now been rebased to 4.2.3-200.fc22. Please test this kernel update (or newer) and let us know if your issue has been resolved or if it is still present with the newer kernel.

If you have moved on to Fedora 23, and are still experiencing this issue, please change the version to Fedora 23.

If you experience different issues, please open a new bug report for those.

*********** MASS BUG UPDATE **************

This bug is being closed with INSUFFICIENT_DATA as there has not been a response in over 4 weeks. If you are still experiencing this issue, please reopen and attach the relevant data from the latest kernel you are running and any data that might have been requested previously.
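For anyone repeating the stripe_cache_size experiment from the boot script in the first comment, here is a minimal sketch of a more defensive variant. It is not from the reporter and is not a fix for the warning. The helper names `stripe_cache_bytes` and `set_stripe_cache` and the `SYSFS_ROOT` override are hypothetical, introduced only for illustration. The memory estimate uses the formula given in the kernel's md documentation (page size * number of member disks * stripe_cache_size).

```shell
#!/bin/sh
# Sketch only: helper names and SYSFS_ROOT are assumptions, not part of
# the original boot script.

# Estimate the memory pinned by a stripe cache, following the md
# documentation: page_size * nr_disks * stripe_cache_size.
stripe_cache_bytes() {
    entries=$1   # value that would be written to stripe_cache_size
    disks=$2     # number of member devices in the array
    page=$3      # system page size in bytes (4096 on x86_64)
    echo $((entries * disks * page))
}

# Write the setting only if the array exposes a writable attribute
# (stripe_cache_size exists only for RAID4/5/6 arrays).
set_stripe_cache() {
    md=$1
    entries=$2
    attr="${SYSFS_ROOT:-/sys}/block/$md/md/stripe_cache_size"
    if [ -w "$attr" ]; then
        echo "$entries" > "$attr"
    else
        echo "skip $md: $attr not writable" >&2
        return 1
    fi
}
```

As a usage example, `stripe_cache_bytes 8192 4 4096` prints 134217728 (128 MiB) for a hypothetical four-disk array; the number of disks in the reporter's arrays is not stated. Running `set_stripe_cache md0 256` as root would put md0 back on the kernel default of 256 entries.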