Bug 1301220
Summary: | [abrt] WARNING: CPU: 4 PID: 1191 at drivers/md/raid5.c:4246 break_stripe_batch_list+0x1a9/0x250 [raid456]() [raid456] | ||||||
---|---|---|---|---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Brian <bugzilla-redhat> | ||||
Component: | kernel | Assignee: | fedora-kernel-raid | ||||
Status: | CLOSED CURRENTRELEASE | QA Contact: | Fedora Extras Quality Assurance <extras-qa> | ||||
Severity: | unspecified | Docs Contact: | |||||
Priority: | unspecified | ||||||
Version: | 23 | CC: | bugzilla-redhat, extras-qa, gansalmon, itamar, jonathan, kernel-maint, labbott, madhu.chinakonda, mchehab | ||||
Target Milestone: | --- | Flags: | bugzilla-redhat: needinfo- | ||||
Target Release: | --- | ||||||
Hardware: | x86_64 | ||||||
OS: | Unspecified | ||||||
URL: | https://retrace.fedoraproject.org/faf/reports/bthash/5857f2cd15c43af419e6962bfcc11806dc073ded | ||||||
Whiteboard: | abrt_hash:d0f4f9cfcf46183648ff065569980002ffa2efe4;VARIANT_ID=server; | ||||||
Fixed In Version: | Doc Type: | Bug Fix | |||||
Doc Text: | Story Points: | --- | |||||
Clone Of: | Environment: | ||||||
Last Closed: | 2016-10-03 11:50:21 UTC | Type: | --- | ||||
Regression: | --- | Mount Type: | --- | ||||
Documentation: | --- | CRM: | |||||
Verified Versions: | Category: | --- | |||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
Cloudforms Team: | --- | Target Upstream Version: | |||||
Embargoed: | |||||||
Attachments: |
|
Description
Brian
2016-01-22 22:05:28 UTC
Created attachment 1117353
File: dmesg
This happened last week, Jan 15 09:09:32, with kernel 4.2.8-300:

kernel: ------------[ cut here ]------------
kernel: WARNING: CPU: 4 PID: 1191 at drivers/md/raid5.c:4246 break_stripe_batch_list+0x1a9/0x250 [raid456]()
kernel: Modules linked in: rpcsec_gss_krb5 bluetooth vhost_net vhost macvtap macvlan xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun 8021q garp mrp cfg80211 rfkill ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ebtable_nat ebtable_filter ebtable_broute bridge stp llc ebtables ip6table_security ip6table_raw ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_filter ip6_tables iptable_security iptable_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle w83795 xfs libcrc32c btrfs raid456 kvm_amd async_raid6_recov kvm async_memcpy async_pq async_xor async_tx xor crct10dif_pclmul crc32_pclmul crc32c_intel amd64_edac_mod joydev edac_core raid6_pq sp5100_tco fam15h_power k10temp edac_mce_amd i2c_piix4 shpchp tpm_tis
kernel: tpm acpi_cpufreq nfsd auth_rpcgss nfs_acl lockd grace sunrpc raid1 mgag200 i2c_algo_bit drm_kms_helper ttm drm e1000 serio_raw e1000e sata_sil24 mpt2sas raid_class ptp scsi_transport_sas pps_core
kernel: CPU: 4 PID: 1191 Comm: md124_raid6 Not tainted 4.2.8-300.fc23.x86_64 #1
kernel: Hardware name: Supermicro H8SGL/H8SGL, BIOS 3.5 11/25/2013
kernel: 0000000000000000 00000000a427c98a ffff880bdd1f3a98 ffffffff817738ca
kernel: 0000000000000000 0000000000000000 ffff880bdd1f3ad8 ffffffff8109e4c6
kernel: 0000000000000010 0000000000000000 ffff8817e87a48f0 ffff8817c8092500
kernel: Call Trace:
kernel: [<ffffffff817738ca>] dump_stack+0x45/0x57
kernel: [<ffffffff8109e4c6>] warn_slowpath_common+0x86/0xc0
kernel: [<ffffffff8109e5fa>] warn_slowpath_null+0x1a/0x20
kernel: [<ffffffffa04a61b9>] break_stripe_batch_list+0x1a9/0x250 [raid456]
kernel: [<ffffffffa04af9b9>] handle_stripe+0x9b9/0x2550 [raid456]
kernel: [<ffffffff81779d1e>] ? _raw_spin_unlock_irqrestore+0xe/0x10
kernel: [<ffffffffa04b16e6>] handle_active_stripes.isra.45+0x196/0x4b0 [raid456]
kernel: [<ffffffffa04b1e7f>] raid5d+0x47f/0x660 [raid456]
kernel: [<ffffffff815e3909>] md_thread+0x139/0x150
kernel: [<ffffffff810df9a0>] ? wake_atomic_t_function+0x70/0x70
kernel: [<ffffffff815e37d0>] ? find_pers+0x80/0x80
kernel: [<ffffffff810bc8c8>] kthread+0xd8/0xf0
kernel: [<ffffffff810bc7f0>] ? kthread_worker_fn+0x160/0x160
kernel: [<ffffffff8177a69f>] ret_from_fork+0x3f/0x70
kernel: [<ffffffff810bc7f0>] ? kthread_worker_fn+0x160/0x160
kernel: ---[ end trace fed71451b49ee7b6 ]---

Another Friday, another crash. This time kernel-4.3.3-303.fc23.x86_64. Related to bug 1258153? I also set /sys/block/md124/md/stripe_cache_size to 16384 on boot, as noted in bug 1258153, comment 1.

Feb 5 15:18:11 vh0 kernel: ------------[ cut here ]------------
Feb 5 15:18:11 vh0 kernel: WARNING: CPU: 11 PID: 1177 at drivers/md/raid5.c:4240 break_stripe_batch_list+0x1a9/0x250 [raid456]()
Feb 5 15:18:11 vh0 kernel: Modules linked in: vhost_net vhost macvtap macvlan xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun 8021q garp mrp cfg80211 rfkill ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_mangle ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_security iptable_raw w83795 xfs libcrc32c kvm_amd kvm btrfs crct10dif_pclmul crc32_pclmul raid456 crc32c_intel async_raid6_recov async_memcpy async_pq async_xor xor async_tx joydev raid6_pq amd64_edac_mod edac_mce_amd fam15h_power sp5100_tco edac_core k10temp shpchp i2c_piix4 tpm_tis tpm acpi_cpufreq nfsd
Feb 5 15:18:11 vh0 kernel: auth_rpcgss nfs_acl lockd grace sunrpc raid1 mgag200 i2c_algo_bit drm_kms_helper ttm drm e1000 serio_raw e1000e mpt2sas sata_sil24 raid_class ptp scsi_transport_sas pps_core fjes
Feb 5 15:18:11 vh0 kernel: CPU: 11 PID: 1177 Comm: md124_raid6 Not tainted 4.3.3-303.fc23.x86_64 #1
Feb 5 15:18:11 vh0 kernel: Hardware name: Supermicro H8SGL/H8SGL, BIOS 3.5 11/25/2013
Feb 5 15:18:11 vh0 kernel: 0000000000000000 000000004b473b69 ffff88003661bad8 ffffffff813a625f
Feb 5 15:18:11 vh0 kernel: 0000000000000000 ffff88003661bb10 ffffffff810a07c2 0000000000000000
Feb 5 15:18:11 vh0 kernel: ffff8817cb9fc8f0 ffff8817dc89bd50 ffff8817dc89bcc8 ffff8817cb9fc8f0
Feb 5 15:18:11 vh0 kernel: Call Trace:
Feb 5 15:18:11 vh0 kernel: [<ffffffff813a625f>] dump_stack+0x44/0x55
Feb 5 15:18:11 vh0 kernel: [<ffffffff810a07c2>] warn_slowpath_common+0x82/0xc0
Feb 5 15:18:11 vh0 kernel: [<ffffffff810a090a>] warn_slowpath_null+0x1a/0x20
Feb 5 15:18:11 vh0 kernel: [<ffffffffa038c0d9>] break_stripe_batch_list+0x1a9/0x250 [raid456]
Feb 5 15:18:11 vh0 kernel: [<ffffffffa03959e4>] handle_stripe+0xa44/0x2640 [raid456]
Feb 5 15:18:11 vh0 kernel: [<ffffffffa0059e61>] ? _scsih_qcmd+0x281/0x7c0 [mpt2sas]
Feb 5 15:18:11 vh0 kernel: [<ffffffffa039776d>] handle_active_stripes.isra.44+0x18d/0x4a0 [raid456]
Feb 5 15:18:11 vh0 kernel: [<ffffffffa038b87d>] ? do_release_stripe+0x8d/0x170 [raid456]
Feb 5 15:18:11 vh0 kernel: [<ffffffff815fbd96>] ? bitmap_daemon_work+0x1c6/0x350
Feb 5 15:18:11 vh0 kernel: [<ffffffffa038b975>] ? __release_stripe+0x15/0x20 [raid456]
Feb 5 15:18:11 vh0 kernel: [<ffffffffa0397efc>] raid5d+0x47c/0x710 [raid456]
Feb 5 15:18:11 vh0 kernel: [<ffffffff811086be>] ? try_to_del_timer_sync+0x5e/0x90
Feb 5 15:18:11 vh0 kernel: [<ffffffff81108460>] ? trace_event_raw_event_tick_stop+0xf0/0xf0
Feb 5 15:18:11 vh0 kernel: [<ffffffff815ed089>] md_thread+0x139/0x150
Feb 5 15:18:11 vh0 kernel: [<ffffffff810e2370>] ? wake_atomic_t_function+0x70/0x70
Feb 5 15:18:11 vh0 kernel: [<ffffffff815ecf50>] ? find_pers+0x70/0x70
Feb 5 15:18:11 vh0 kernel: [<ffffffff810bede8>] kthread+0xd8/0xf0
Feb 5 15:18:11 vh0 kernel: [<ffffffff810bed10>] ? kthread_worker_fn+0x160/0x160
Feb 5 15:18:11 vh0 kernel: [<ffffffff81781adf>] ret_from_fork+0x3f/0x70
Feb 5 15:18:11 vh0 kernel: [<ffffffff810bed10>] ? kthread_worker_fn+0x160/0x160
Feb 5 15:18:11 vh0 kernel: ---[ end trace 537dd668d3493211 ]---
Feb 5 15:18:12 vh0 abrt-dump-journal-oops: abrt-dump-journal-oops: Found oopses: 1
Feb 5 15:18:12 vh0 abrt-dump-journal-oops: abrt-dump-journal-oops: Creating problem directories
Feb 5 15:18:13 vh0 abrt-server: Looking for kernel package
Feb 5 15:18:13 vh0 abrt-server: Kernel package kernel-core-4.3.3-303.fc23.x86_64 found
Feb 5 15:18:13 vh0 abrt-dump-journal-oops: Reported 1 kernel oopses to Abrt

Crashed again in kernel-4.3.4-300.fc23.x86_64, after 2 days.

Feb 7 22:46:39 vh0 kernel: ------------[ cut here ]------------
Feb 7 22:46:39 vh0 kernel: WARNING: CPU: 1 PID: 1184 at drivers/md/raid5.c:4240 break_stripe_batch_list+0x1a9/0x250 [raid456]()
Feb 7 22:46:39 vh0 kernel: Modules linked in: vhost_net vhost macvtap macvlan xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun 8021q garp mrp cfg80211 rfkill ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ebtable_broute bridge stp llc ebtable_filter ebtable_nat ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_raw ip6table_mangle ip6table_security ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_raw iptable_mangle iptable_security w83795 xfs libcrc32c kvm_amd kvm raid456 btrfs async_raid6_recov crct10dif_pclmul crc32_pclmul async_memcpy async_pq crc32c_intel async_xor xor async_tx joydev raid6_pq sp5100_tco amd64_edac_mod edac_mce_amd i2c_piix4 k10temp fam15h_power shpchp edac_core acpi_cpufreq tpm_tis tpm nfsd
Feb 7 22:46:39 vh0 kernel: auth_rpcgss nfs_acl lockd grace sunrpc raid1 mgag200 i2c_algo_bit drm_kms_helper ttm drm serio_raw e1000 e1000e mpt2sas sata_sil24 raid_class ptp scsi_transport_sas pps_core fjes
Feb 7 22:46:39 vh0 kernel: CPU: 1 PID: 1184 Comm: md124_raid6 Not tainted 4.3.4-300.fc23.x86_64 #1
Feb 7 22:46:39 vh0 kernel: Hardware name: Supermicro H8SGL/H8SGL, BIOS 3.5 11/25/2013
Feb 7 22:46:39 vh0 kernel: 0000000000000000 00000000c0bdb046 ffff8817e175fad8 ffffffff813a625f
Feb 7 22:46:39 vh0 kernel: 0000000000000000 ffff8817e175fb10 ffffffff810a07c2 0000000000000000
Feb 7 22:46:39 vh0 kernel: ffff8814d8e28c28 ffff8817d0acd5a0 ffff8817d0acd518 ffff8817d442d518
Feb 7 22:46:39 vh0 kernel: Call Trace:
Feb 7 22:46:39 vh0 kernel: [<ffffffff813a625f>] dump_stack+0x44/0x55
Feb 7 22:46:39 vh0 kernel: [<ffffffff810a07c2>] warn_slowpath_common+0x82/0xc0
Feb 7 22:46:39 vh0 kernel: [<ffffffff810a090a>] warn_slowpath_null+0x1a/0x20
Feb 7 22:46:39 vh0 kernel: [<ffffffffa03340d9>] break_stripe_batch_list+0x1a9/0x250 [raid456]
Feb 7 22:46:39 vh0 kernel: [<ffffffffa033d9e4>] handle_stripe+0xa44/0x2640 [raid456]
Feb 7 22:46:39 vh0 kernel: [<ffffffffa005ce61>] ? _scsih_qcmd+0x281/0x7c0 [mpt2sas]
Feb 7 22:46:39 vh0 kernel: [<ffffffffa033f76d>] handle_active_stripes.isra.44+0x18d/0x4a0 [raid456]
Feb 7 22:46:39 vh0 kernel: [<ffffffffa033387d>] ? do_release_stripe+0x8d/0x170 [raid456]
Feb 7 22:46:39 vh0 kernel: [<ffffffff815fbdf6>] ? bitmap_daemon_work+0x1c6/0x350
Feb 7 22:46:39 vh0 kernel: [<ffffffffa0333975>] ? __release_stripe+0x15/0x20 [raid456]
Feb 7 22:46:39 vh0 kernel: [<ffffffffa033fefc>] raid5d+0x47c/0x710 [raid456]
Feb 7 22:46:39 vh0 kernel: [<ffffffff811086be>] ? try_to_del_timer_sync+0x5e/0x90
Feb 7 22:46:39 vh0 kernel: [<ffffffff81108460>] ? trace_event_raw_event_tick_stop+0xf0/0xf0
Feb 7 22:46:39 vh0 kernel: [<ffffffff815ed0e9>] md_thread+0x139/0x150
Feb 7 22:46:39 vh0 kernel: [<ffffffff810e2370>] ? wake_atomic_t_function+0x70/0x70
Feb 7 22:46:39 vh0 kernel: [<ffffffff815ecfb0>] ? find_pers+0x70/0x70
Feb 7 22:46:39 vh0 kernel: [<ffffffff810bede8>] kthread+0xd8/0xf0
Feb 7 22:46:39 vh0 kernel: [<ffffffff810bed10>] ? kthread_worker_fn+0x160/0x160
Feb 7 22:46:39 vh0 kernel: [<ffffffff81781b9f>] ret_from_fork+0x3f/0x70
Feb 7 22:46:39 vh0 kernel: [<ffffffff810bed10>] ? kthread_worker_fn+0x160/0x160
Feb 7 22:46:39 vh0 kernel: ---[ end trace 9d45f5089741982a ]---

Happy Valentine's Day to me, with a trip to the office to hard reset the server ... again.

Is there anything else I can do to help diagnose this? I ask because I'm currently compiling a 4.0.8 kernel for F23 (per advice from bug 1258153) and I'll be unable to help further. At some point, though, I'd like to have confidence that I can use future kernels again.

This is a RAID 6, and the stripe_cache_size was set to 16384 via udev rules at boot. It runs anywhere between 3 and 10 days before this happens.

Feb 14 07:31:47 vh0 kernel: ------------[ cut here ]------------
Feb 14 07:31:47 vh0 kernel: WARNING: CPU: 2 PID: 1230 at drivers/md/raid5.c:4240 break_stripe_batch_list+0x1a9/0x250 [raid456]()
Feb 14 07:31:47 vh0 kernel: Modules linked in: xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ebtable_nat ebtable_broute ebtable_filter ebtables ip6table_raw ip6table_mangle ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_security ip6table_filter ip6_tables iptable_raw iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_security vhost_net vhost macvtap macvlan tun 8021q garp mrp cfg80211 rfkill bridge stp llc w83795 xfs libcrc32c kvm_amd btrfs kvm raid456 crct10dif_pclmul crc32_pclmul async_raid6_recov crc32c_intel async_memcpy async_pq async_xor xor async_tx raid6_pq joydev amd64_edac_mod sp5100_tco acpi_cpufreq edac_mce_amd shpchp k10temp fam15h_power i2c_piix4 edac_core tpm_tis tpm nfsd
Feb 14 07:31:47 vh0 kernel: auth_rpcgss nfs_acl lockd grace sunrpc raid1 mgag200 i2c_algo_bit drm_kms_helper ttm drm e1000 serio_raw e1000e sata_sil24 mpt2sas ptp raid_class scsi_transport_sas pps_core fjes [last unloaded: iptable_raw]
Feb 14 07:31:47 vh0 kernel: CPU: 2 PID: 1230 Comm: md124_raid6 Not tainted 4.3.5-300.fc23.x86_64 #1
Feb 14 07:31:47 vh0 kernel: Hardware name: Supermicro H8SGL/H8SGL, BIOS 3.5 11/25/2013
Feb 14 07:31:47 vh0 kernel: 0000000000000000 00000000b32e7713 ffff880be70dfad8 ffffffff813a643f
Feb 14 07:31:47 vh0 kernel: 0000000000000000 ffff880be70dfb10 ffffffff810a07d2 0000000000000000
Feb 14 07:31:47 vh0 kernel: ffff8817ddb18000 ffff8817c9412500 ffff8817c9412478 ffff8817dbfe48f0
Feb 14 07:31:47 vh0 kernel: Call Trace:
Feb 14 07:31:47 vh0 kernel: [<ffffffff813a643f>] dump_stack+0x44/0x55
Feb 14 07:31:47 vh0 kernel: [<ffffffff810a07d2>] warn_slowpath_common+0x82/0xc0
Feb 14 07:31:47 vh0 kernel: [<ffffffff810a091a>] warn_slowpath_null+0x1a/0x20
Feb 14 07:31:47 vh0 kernel: [<ffffffffa03650d9>] break_stripe_batch_list+0x1a9/0x250 [raid456]
Feb 14 07:31:47 vh0 kernel: [<ffffffffa036e9e4>] handle_stripe+0xa44/0x2640 [raid456]
Feb 14 07:31:47 vh0 kernel: [<ffffffff810c9dd7>] ? try_to_wake_up+0x47/0x350
Feb 14 07:31:47 vh0 kernel: [<ffffffffa037076d>] handle_active_stripes.isra.44+0x18d/0x4a0 [raid456]
Feb 14 07:31:47 vh0 kernel: [<ffffffffa0370efc>] raid5d+0x47c/0x710 [raid456]
Feb 14 07:31:47 vh0 kernel: [<ffffffff811086ce>] ? try_to_del_timer_sync+0x5e/0x90
Feb 14 07:31:47 vh0 kernel: [<ffffffff815ed479>] md_thread+0x139/0x150
Feb 14 07:31:47 vh0 kernel: [<ffffffff810e2380>] ? wake_atomic_t_function+0x70/0x70
Feb 14 07:31:47 vh0 kernel: [<ffffffff815ed340>] ? find_pers+0x70/0x70
Feb 14 07:31:47 vh0 kernel: [<ffffffff810bedf8>] kthread+0xd8/0xf0
Feb 14 07:31:47 vh0 kernel: [<ffffffff810bed20>] ? kthread_worker_fn+0x160/0x160
Feb 14 07:31:47 vh0 kernel: [<ffffffff8178219f>] ret_from_fork+0x3f/0x70
Feb 14 07:31:47 vh0 kernel: [<ffffffff810bed20>] ? kthread_worker_fn+0x160/0x160
Feb 14 07:31:47 vh0 kernel: ---[ end trace 8ebf5228f7cce4cf ]---
Feb 14 07:31:48 vh0 abrt-dump-journal-oops: abrt-dump-journal-oops: Found oopses: 1
Feb 14 07:31:48 vh0 abrt-dump-journal-oops: abrt-dump-journal-oops: Creating problem directories
Feb 14 07:31:48 vh0 abrt-server: Deleting problem directory oops-2016-02-14-07:31:48-1742-0 (dup of oops-2016-01-15-09:09:33-1582-0)
Feb 14 07:31:48 vh0 dbus[1690]: [system] Activating service name='org.freedesktop.problems' (using servicehelper)
Feb 14 07:31:49 vh0 dbus[1690]: [system] Successfully activated service 'org.freedesktop.problems'
Feb 14 07:31:49 vh0 abrt-dump-journal-oops: Reported 1 kernel oopses to Abrt
Feb 14 07:31:51 vh0 abrt-server: This problem has already been reported.

Do I need to reinitialize the entire array to upgrade the metadata?
https://bbs.archlinux.org/viewtopic.php?id=205801

The metadata on this array is actually 1.2, and thus comment 6 is irrelevant.

I'm still running 4.3.5-300.fc23.x86_64 with

# cat /etc/udev/rules.d/99-md-raid6-tuning.rules
SUBSYSTEM=="block", KERNEL=="md*", ACTION=="change", TEST=="md/stripe_cache_size", ATTR{md/stripe_cache_size}="8192"

So far with the 8192 value, it's at 4 days uptime, but after the next crash, I'm reverting to a 4.0 kernel. Previously the mean uptime was about 7 days with 16384.
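The traces reported so far all stop at the same function and offset (break_stripe_batch_list+0x1a9/0x250); only the raid5.c line number moves, from 4246 on 4.2.8 to 4240 on the 4.3.x kernels. For anyone comparing their own journal against these, here is a minimal sketch that pulls those fields out of a WARNING line. The regular expression and the `parse_warning` name are illustrative only and assume the line format seen in the logs above.

```python
import re

# Hypothetical helper: matches the "WARNING: CPU: ... at file:line func+off/size"
# format seen in the logs in this report. Other kernel versions may differ.
WARNING_RE = re.compile(
    r"WARNING: CPU: (?P<cpu>\d+) PID: (?P<pid>\d+) at "
    r"(?P<file>\S+):(?P<line>\d+) "
    r"(?P<func>[\w.]+)\+(?P<offset>0x[0-9a-f]+)/(?P<size>0x[0-9a-f]+)"
)


def parse_warning(log_line):
    """Return the WARNING fields as a dict, or None if the line does not match."""
    match = WARNING_RE.search(log_line)
    if match is None:
        return None
    fields = match.groupdict()
    # Convert the numeric fields; offset and size stay as hex strings.
    for key in ("cpu", "pid", "line"):
        fields[key] = int(fields[key])
    return fields


sample = (
    "Feb 5 15:18:11 vh0 kernel: WARNING: CPU: 11 PID: 1177 at "
    "drivers/md/raid5.c:4240 break_stripe_batch_list+0x1a9/0x250 [raid456]()"
)
info = parse_warning(sample)
print(info["file"], info["line"], info["func"], info["offset"])
```

Matching on function name and offset rather than on the source line number is a design choice here: the line number shifts between kernel builds, as it does between the first and later traces in this report.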
# mdadm --detail /dev/md123
/dev/md123:
        Version : 1.2
  Creation Time : Thu Mar 10 08:13:56 2011
     Raid Level : raid6
     Array Size : 5859787776 (5588.33 GiB 6000.42 GB)
  Used Dev Size : 976631296 (931.39 GiB 1000.07 GB)
   Raid Devices : 8
  Total Devices : 8
    Persistence : Superblock is persistent
  Intent Bitmap : Internal
    Update Time : Thu Feb 18 20:21:48 2016
          State : clean
 Active Devices : 8
Working Devices : 8
 Failed Devices : 0
  Spare Devices : 0
         Layout : left-symmetric
     Chunk Size : 512K
           Name : hq2.advancedopen.net:aos1
           UUID : 35a4aca9:e895efd5:4d6dc573:354ba5c3
         Events : 5651279

    Number   Major   Minor   RaidDevice State
      15       8      112        0      active sync   /dev/sdh
      13       8       64        1      active sync   /dev/sde
      14       8       80        2      active sync   /dev/sdf
      10       8      128        3      active sync   /dev/sdi
      11       8      144        4      active sync   /dev/sdj
       9       8      160        5      active sync   /dev/sdk
       8       8      176        6      active sync   /dev/sdl
      12       8       96        7      active sync   /dev/sdg

Well, the crash took a LOT longer to happen with stripe_cache_size of 8192, but it happened again today. Anything from anybody? I can't be the only one suffering this.
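For context on the two stripe_cache_size values tried here, the sketch below estimates the memory each one pins. It relies on the commonly cited approximation page_size * nr_disks * stripe_cache_size, which is an assumption not stated anywhere in this report, together with a 4 KiB page size (the x86_64 default) and the eight member devices from the mdadm output above. The function name is illustrative.

```python
def stripe_cache_bytes(stripe_cache_size, nr_disks, page_size=4096):
    """Approximate stripe cache memory in bytes.

    Assumption: memory ~= page_size * nr_disks * stripe_cache_size.
    page_size defaults to 4096, the usual x86_64 page size.
    """
    return stripe_cache_size * nr_disks * page_size


# The two values tried in this report, for the 8-device RAID 6 array.
for entries in (8192, 16384):
    mib = stripe_cache_bytes(entries, nr_disks=8) / 2**20
    print(f"stripe_cache_size={entries}: about {mib:.0f} MiB")
```

Under those assumptions this works out to about 256 MiB for 8192 and about 512 MiB for 16384. It says nothing about why the smaller value delayed the warning; that observation comes only from the uptimes reported in this thread.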
Apr 12 15:03:29 vh0 kernel: ------------[ cut here ]------------
Apr 12 15:03:29 vh0 kernel: WARNING: CPU: 4 PID: 1349 at drivers/md/raid5.c:4240 break_stripe_batch_list+0x1a9/0x250 [raid456]()
Apr 12 15:03:29 vh0 kernel: Modules linked in: bluetooth vhost_net vhost macvtap macvlan xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun 8021q garp mrp cfg80211 rfkill ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ebtable_broute bridge stp llc ebtable_filter ebtable_nat ebtables ip6table_mangle ip6table_raw ip6table_security ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_filter ip6_tables iptable_mangle iptable_raw iptable_security iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx btrfs w83795 xor raid6_pq xfs libcrc32c kvm_amd kvm crct10dif_pclmul crc32_pclmul crc32c_intel joydev amd64_edac_mod edac_mce_amd edac_core sp5100_tco acpi_cpufreq k10temp fam15h_power i2c_piix4 tpm_tis shpchp
Apr 12 15:03:29 vh0 kernel: tpm nfsd auth_rpcgss nfs_acl lockd grace sunrpc raid1 mgag200 i2c_algo_bit drm_kms_helper ttm drm e1000 serio_raw e1000e sata_sil24 mpt2sas raid_class ptp scsi_transport_sas pps_core fjes
Apr 12 15:03:29 vh0 kernel: CPU: 4 PID: 1349 Comm: md123_raid6 Not tainted 4.3.5-300.fc23.x86_64 #1
Apr 12 15:03:29 vh0 kernel: Hardware name: Supermicro H8SGL/H8SGL, BIOS 3.5 11/25/2013
Apr 12 15:03:29 vh0 kernel: 0000000000000000 000000002c6177e2 ffff880be73dfad8 ffffffff813a643f
Apr 12 15:03:29 vh0 kernel: 0000000000000000 ffff880be73dfb10 ffffffff810a07d2 0000000000000000
Apr 12 15:03:29 vh0 kernel: ffff8817d1e16140 ffff8802fb143d50 ffff8802fb143cc8 ffff88162b5c0c28
Apr 12 15:03:29 vh0 kernel: Call Trace:
Apr 12 15:03:29 vh0 kernel: [<ffffffff813a643f>] dump_stack+0x44/0x55
Apr 12 15:03:29 vh0 kernel: [<ffffffff810a07d2>] warn_slowpath_common+0x82/0xc0
Apr 12 15:03:29 vh0 kernel: [<ffffffff810a091a>] warn_slowpath_null+0x1a/0x20
Apr 12 15:03:29 vh0 kernel: [<ffffffffa05860d9>] break_stripe_batch_list+0x1a9/0x250 [raid456]
Apr 12 15:03:29 vh0 kernel: [<ffffffffa058f9e4>] handle_stripe+0xa44/0x2640 [raid456]
Apr 12 15:03:29 vh0 kernel: [<ffffffff810d8684>] ? set_next_entity+0xa4/0x880
Apr 12 15:03:29 vh0 kernel: [<ffffffffa059176d>] handle_active_stripes.isra.44+0x18d/0x4a0 [raid456]
Apr 12 15:03:29 vh0 kernel: [<ffffffffa0591efc>] raid5d+0x47c/0x710 [raid456]
Apr 12 15:03:29 vh0 kernel: [<ffffffff811086ce>] ? try_to_del_timer_sync+0x5e/0x90
Apr 12 15:03:29 vh0 kernel: [<ffffffff815ed479>] md_thread+0x139/0x150
Apr 12 15:03:29 vh0 kernel: [<ffffffff810e2380>] ? wake_atomic_t_function+0x70/0x70
Apr 12 15:03:29 vh0 kernel: [<ffffffff815ed340>] ? find_pers+0x70/0x70
Apr 12 15:03:29 vh0 kernel: [<ffffffff810bedf8>] kthread+0xd8/0xf0
Apr 12 15:03:29 vh0 kernel: [<ffffffff810bed20>] ? kthread_worker_fn+0x160/0x160
Apr 12 15:03:29 vh0 kernel: [<ffffffff8178219f>] ret_from_fork+0x3f/0x70
Apr 12 15:03:29 vh0 kernel: [<ffffffff810bed20>] ? kthread_worker_fn+0x160/0x160
Apr 12 15:03:29 vh0 kernel: ---[ end trace a3997021533c32c3 ]---
Apr 12 15:03:30 vh0 abrt-dump-journal-oops: abrt-dump-journal-oops: Found oopses: 1
Apr 12 15:03:30 vh0 abrt-dump-journal-oops: abrt-dump-journal-oops: Creating problem directories
Apr 12 15:03:30 vh0 abrt-server: Deleting problem directory oops-2016-04-12-15:03:30-1362-0 (dup of oops-2016-01-15-09:09:33-1582-0)
Apr 12 15:03:31 vh0 dbus[1289]: [system] Activating service name='org.freedesktop.problems' (using servicehelper)
Apr 12 15:03:31 vh0 dbus[1289]: [system] Successfully activated service 'org.freedesktop.problems'
Apr 12 15:03:31 vh0 abrt-dump-journal-oops: Reported 1 kernel oopses to Abrt
Apr 12 15:03:32 vh0 abrt-server: This problem has already been reported.
Apr 12 15:03:32 vh0 abrt-server: https://retrace.fedoraproject.org/faf/reports/980973/

Discussion: https://bugzilla.kernel.org/show_bug.cgi?id=108741
Patch: http://thread.gmane.org/87r3fkjttq.fsf@notabene.neil.brown.name
Mainlined here: https://cdn.kernel.org/pub/linux/kernel/v4.x/ChangeLog-4.4.7

But 4.4.7 isn't built yet in Koji or in F23 updates ... Laura, help?

Koji was down for scheduled maintenance for part of the afternoon, so I couldn't get the build out. It's building now. In the future, please give us at least 24 hours after a stable release is available to get it building (4.4.7 only came out today).

In my indescribable glee that I had found a patch, I failed to note the day of the kernel release. You are absolutely correct. Apologies for any implied deficiencies on your part. Thank you for your snappy response. I rest well tonight knowing a fix is coming.

https://bodhi.fedoraproject.org/updates/kernel-4.4.7-300.fc23 is now available in bodhi for testing. Please give karma as appropriate.

*********** MASS BUG UPDATE **************

We apologize for the inconvenience. There is a large number of bugs to go through and several of them have gone stale. Due to this, we are doing a mass bug update across all of the Fedora 23 kernel bugs.

Fedora 23 has now been rebased to 4.7.4-100.fc23. Please test this kernel update (or newer) and let us know if your issue has been resolved or if it is still present with the newer kernel.

If you have moved on to Fedora 24 or 25, and are still experiencing this issue, please change the version to Fedora 24 or 25.

If you experience different issues, please open a new bug report for those.

Confirming that kernel > 4.4.7-300 fixes this.
|
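For anyone landing here with the same warning, a quick way to see whether a given kernel release string is at or past 4.4.7 is sketched below. It assumes, based only on the ChangeLog-4.4.7 link above, that 4.4.7 is the first stable release carrying the fix. A plain version comparison is an approximation, since a distribution kernel can backport a fix into an older version. The function names are illustrative.

```python
import re

# Assumption from this report: the fix was mainlined in stable release 4.4.7.
FIXED_IN = (4, 4, 7)


def kernel_version_tuple(release):
    """Extract (major, minor, patch) from a string like '4.3.5-300.fc23.x86_64'."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return tuple(int(part) for part in match.groups())


def has_fix(release):
    """True if the release is at or above the assumed fixed version."""
    return kernel_version_tuple(release) >= FIXED_IN


# Kernel releases that appear in this report.
for release in (
    "4.3.5-300.fc23.x86_64",
    "4.4.7-300.fc23.x86_64",
    "4.7.4-100.fc23.x86_64",
):
    print(release, has_fix(release))
```

To check the running system, pass `platform.release()` from the standard library to `has_fix`.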