Bug 1471459 - BUG: scheduling while atomic: rcuos/2/29/0x00000200 kernel 4.11.9 fc24, fc25
Summary: BUG: scheduling while atomic: rcuos/2/29/0x00000200 kernel 4.11.9 fc24, fc25
Keywords:
Status: CLOSED EOL
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 25
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Duplicates: 1472850
Depends On:
Blocks:
Reported: 2017-07-16 04:28 UTC by Ian Donaldson
Modified: 2017-12-12 10:22 UTC (History)
CC List: 16 users

Fixed In Version:
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-12-12 10:22:15 UTC
Type: Bug
Embargoed:


Attachments
tumeric full kernel log (fc24) (85.58 KB, text/plain)
2017-07-18 00:59 UTC, Ian Donaldson
radish full kernel log (fc25) (77.84 KB, text/plain)
2017-07-18 01:00 UTC, Ian Donaldson

Description Ian Donaldson 2017-07-16 04:28:30 UTC
Description of problem:

Since upgrading to kernel 4.11.9 two days ago, I've seen two instances
of this error on two separate Dell R330 servers (identical hardware),
one running fc24 and the other fc25.

Version-Release number of selected component (if applicable):

4.11.9-100.fc24.x86_64
4.11.9-200.fc25.x86_64

Kernel traces:

Jul 15 09:00:02 tumeric kernel: BUG: scheduling while atomic: rcuos/3/37/0x00000200
Jul 15 09:00:02 tumeric kernel: Modules linked in: nfsv3 nfs fscache binfmt_misc cfg80211 rfkill intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul ipmi_ssif crc32_pclmul ghash_clmulni_intel intel_cstate ipmi_si intel_uncore intel_rapl_perf dcdbas wdat_wdt ipmi_devintf wmi ipmi_msghandler acpi_power_meter video ie31200_edac mei_me tpm_tis tpm_tis_core i2c_i801 shpchp edac_core mei tpm intel_pch_thermal nfsd auth_rpcgss nfs_acl lockd grace sunrpc raid1 mgag200 i2c_algo_bit drm_kms_helper tg3 ttm ptp drm crc32c_intel megaraid_sas pps_core
Jul 15 09:00:02 tumeric kernel: CPU: 3 PID: 37 Comm: rcuos/3 Not tainted 4.11.9-100.fc24.x86_64 #1
Jul 15 09:00:04 tumeric kernel: Hardware name: Dell Inc. PowerEdge R330/0H5N7P, BIOS 1.4.5 08/09/2016
Jul 15 09:00:04 tumeric kernel: Call Trace:
Jul 15 09:00:04 tumeric kernel: dump_stack+0x63/0x86
Jul 15 09:00:04 tumeric kernel: __schedule_bug+0x54/0x70
Jul 15 09:00:04 tumeric kernel: __schedule+0x627/0x8a0
Jul 15 09:00:04 tumeric kernel: schedule+0x36/0x80
Jul 15 09:00:04 tumeric kernel: schedule_timeout+0x238/0x300
Jul 15 09:00:04 tumeric kernel: ? unfreeze_partials.isra.70+0x178/0x1c0
Jul 15 09:00:04 tumeric kernel: wait_for_completion+0x111/0x180
Jul 15 09:00:04 tumeric kernel: ? wait_for_completion+0x111/0x180
Jul 15 09:00:04 tumeric kernel: ? wake_up_q+0x80/0x80
Jul 15 09:00:04 tumeric kernel: __wait_rcu_gp+0xc8/0xf0
Jul 15 09:00:04 tumeric kernel: synchronize_sched+0x5d/0x80
Jul 15 09:00:04 tumeric kernel: ? __call_rcu+0x320/0x320
Jul 15 09:00:04 tumeric kernel: ? trace_raw_output_rcu_utilization+0x60/0x60
Jul 15 09:00:04 tumeric kernel: blk_queue_bypass_start+0x73/0x80
Jul 15 09:00:04 tumeric kernel: blkcg_deactivate_policy+0x110/0x120
Jul 15 09:00:04 tumeric kernel: blk_throtl_exit+0x34/0x50
Jul 15 09:00:04 tumeric kernel: blkcg_exit_queue+0x3a/0x40
Jul 15 09:00:04 tumeric kernel: blk_release_queue+0x2f/0x100
Jul 15 09:00:04 tumeric kernel: kobject_release+0x6a/0x170
Jul 15 09:00:04 tumeric kernel: kobject_put+0x2f/0x60
Jul 15 09:00:04 tumeric kernel: blk_exit_rl+0x35/0x40
Jul 15 09:00:04 tumeric kernel: blkg_free+0x60/0xc0
Jul 15 09:00:04 tumeric kernel: __blkg_release_rcu+0x5a/0xc0
Jul 15 09:00:04 tumeric kernel: rcu_nocb_kthread+0x2b4/0x4c0
Jul 15 09:00:04 tumeric kernel: kthread+0x109/0x140
Jul 15 09:00:04 tumeric kernel: ? get_state_synchronize_rcu+0x20/0x20
Jul 15 09:00:04 tumeric kernel: ? kthread_park+0x90/0x90
Jul 15 09:00:04 tumeric kernel: ret_from_fork+0x25/0x30


Jul 16 03:00:02 garlic kernel: BUG: scheduling while atomic: rcuos/2/29/0x00000200
Jul 16 03:00:02 garlic kernel: Modules linked in: nfsv3 nfs_acl nfs lockd grace fscache binfmt_misc cfg80211 rfkill sunrpc intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel ipmi_ssif intel_cstate ipmi_si intel_uncore intel_rapl_perf ipmi_devintf dcdbas wdat_wdt wmi ipmi_msghandler mei_me tpm_tis video mei acpi_power_meter ie31200_edac tpm_tis_core i2c_i801 edac_core tpm intel_pch_thermal shpchp raid1 mgag200 i2c_algo_bit drm_kms_helper tg3 ttm ptp drm crc32c_intel megaraid_sas pps_core
Jul 16 03:00:02 garlic kernel: CPU: 3 PID: 29 Comm: rcuos/2 Not tainted 4.11.9-200.fc25.x86_64 #1
Jul 16 03:00:03 garlic kernel: Hardware name: Dell Inc. PowerEdge R330/0H5N7P, BIOS 1.4.5 08/09/2016
Jul 16 03:00:03 garlic kernel: Call Trace:
Jul 16 03:00:03 garlic kernel: dump_stack+0x63/0x86
Jul 16 03:00:03 garlic kernel: __schedule_bug+0x54/0x70
Jul 16 03:00:03 garlic kernel: __schedule+0x627/0x8a0
Jul 16 03:00:03 garlic kernel: schedule+0x36/0x80
Jul 16 03:00:03 garlic kernel: schedule_timeout+0x238/0x300
Jul 16 03:00:03 garlic kernel: wait_for_completion+0x111/0x180
Jul 16 03:00:03 garlic kernel: ? wait_for_completion+0x111/0x180
Jul 16 03:00:03 garlic kernel: ? wake_up_q+0x80/0x80
Jul 16 03:00:03 garlic kernel: __wait_rcu_gp+0xc8/0xf0
Jul 16 03:00:03 garlic kernel: synchronize_sched+0x5d/0x80
Jul 16 03:00:03 garlic kernel: ? __call_rcu+0x320/0x320
Jul 16 03:00:03 garlic kernel: ? trace_raw_output_rcu_utilization+0x60/0x60
Jul 16 03:00:03 garlic kernel: blk_queue_bypass_start+0x73/0x80
Jul 16 03:00:03 garlic kernel: blkcg_deactivate_policy+0x110/0x120
Jul 16 03:00:03 garlic kernel: blk_throtl_exit+0x34/0x50
Jul 16 03:00:03 garlic kernel: blkcg_exit_queue+0x3a/0x40
Jul 16 03:00:03 garlic kernel: blk_release_queue+0x2f/0x100
Jul 16 03:00:03 garlic kernel: kobject_release+0x6a/0x170
Jul 16 03:00:03 garlic kernel: kobject_put+0x2f/0x60
Jul 16 03:00:03 garlic kernel: blk_exit_rl+0x35/0x40
Jul 16 03:00:03 garlic kernel: blkg_free+0x60/0xc0
Jul 16 03:00:03 garlic kernel: __blkg_release_rcu+0x5a/0xc0
Jul 16 03:00:03 garlic kernel: rcu_nocb_kthread+0x2b4/0x4c0
Jul 16 03:00:03 garlic kernel: kthread+0x109/0x140
Jul 16 03:00:03 garlic kernel: ? get_state_synchronize_rcu+0x20/0x20
Jul 16 03:00:03 garlic kernel: ? kthread_park+0x90/0x90
Jul 16 03:00:03 garlic kernel: ret_from_fork+0x25/0x30
Jul 16 03:00:03 garlic kernel: NOHZ: local_softirq_pending 28a


In both cases the load average was stuck at exactly 2.
Both systems were otherwise idle.
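
For readers comparing the traces above, the three slash-separated values at the end of the "scheduling while atomic" line are the task name, its PID, and the preempt count in hex. The following is a minimal sketch of decoding that line. The bit masks are an assumption based on the layout in include/linux/preempt.h for kernels of this era, and `decode_bug_line` is a hypothetical helper name, not something from the kernel or from this report.

```python
import re

# Assumed preempt_count bit layout (include/linux/preempt.h, 4.x kernels);
# verify against the header of the kernel you are actually debugging.
PREEMPT_MASK = 0x000000FF   # nested preempt_disable() depth
SOFTIRQ_MASK = 0x0000FF00   # softirq / bottom-half disable count
HARDIRQ_MASK = 0x000F0000   # hardirq nesting depth
NMI_MASK = 0x00100000       # set while handling an NMI

# The task name may itself contain '/', e.g. "rcuos/3", so the name group
# is greedy and the PID and count are anchored at the end of the match.
BUG_RE = re.compile(
    r"BUG: scheduling while atomic: "
    r"(?P<comm>.+)/(?P<pid>\d+)/(?P<count>0x[0-9a-fA-F]+)"
)


def decode_bug_line(line):
    """Split a 'scheduling while atomic' line into its fields (hypothetical helper)."""
    m = BUG_RE.search(line)
    if m is None:
        return None
    count = int(m.group("count"), 16)
    return {
        "comm": m.group("comm"),
        "pid": int(m.group("pid")),
        "preempt": count & PREEMPT_MASK,
        "softirq": (count & SOFTIRQ_MASK) >> 8,
        "hardirq": (count & HARDIRQ_MASK) >> 16,
        "nmi": (count & NMI_MASK) >> 20,
    }


if __name__ == "__main__":
    sample = ("Jul 15 09:00:02 tumeric kernel: BUG: scheduling while atomic: "
              "rcuos/3/37/0x00000200")
    print(decode_bug_line(sample))
```

Under these assumed masks, 0x00000200 decodes to a softirq field of 2 with everything else zero, which would be consistent with the task having bottom halves disabled when it tried to sleep.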



How reproducible:

seems random

Steps to Reproduce:
1. can't

Comment 1 Ian Donaldson 2017-07-16 04:46:16 UTC
Just saw it on a 3rd system, different hardware: Sun X2100M2

Jul 16 04:40:01 radish kernel: BUG: scheduling while atomic: rcuos/1/21/0x00000200
Jul 16 04:40:01 radish kernel: Modules linked in: nfsv3 nfs_acl nfs lockd grace fscache binfmt_misc forcedeth tg3 xt_set ip_set ip_vs nfnetlink nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack libcrc32c cfg80211 rfkill xt_limit nf_log_ipv4 nf_log_common xt_LOG sunrpc amd64_edac_mod edac_mce_amd edac_core ppdev kvm_amd kvm irqbypass k8temp parport_pc parport shpchp nv_tco i2c_nforce2 tpm_tis tpm_tis_core tpm raid1 ata_generic pata_acpi ast i2c_algo_bit drm_kms_helper ttm drm serio_raw sata_nv pata_amd ptp pps_core [last unloaded: forcedeth]
Jul 16 04:40:01 radish kernel: CPU: 0 PID: 21 Comm: rcuos/1 Tainted: G        W       4.11.9-200.fc25.x86_64 #1
Jul 16 04:40:01 radish kernel: Hardware name: Sun Microsystems Sun Fire X2100 M2/S40              , BIOS S40_3A21 10/30/2008
Jul 16 04:40:01 radish kernel: Call Trace:
Jul 16 04:40:01 radish kernel: dump_stack+0x63/0x86
Jul 16 04:40:01 radish kernel: __schedule_bug+0x54/0x70
Jul 16 04:40:01 radish kernel: __schedule+0x627/0x8a0
Jul 16 04:40:01 radish kernel: schedule+0x36/0x80
Jul 16 04:40:01 radish kernel: schedule_timeout+0x238/0x300
Jul 16 04:40:01 radish kernel: wait_for_completion+0x111/0x180
Jul 16 04:40:01 radish kernel: ? wait_for_completion+0x111/0x180
Jul 16 04:40:01 radish kernel: ? wake_up_q+0x80/0x80
Jul 16 04:40:01 radish kernel: __wait_rcu_gp+0xc8/0xf0
Jul 16 04:40:01 radish kernel: synchronize_sched+0x5d/0x80
Jul 16 04:40:01 radish kernel: ? __call_rcu+0x320/0x320
Jul 16 04:40:01 radish kernel: ? trace_raw_output_rcu_utilization+0x60/0x60
Jul 16 04:40:01 radish kernel: blk_queue_bypass_start+0x73/0x80
Jul 16 04:40:01 radish kernel: blkcg_deactivate_policy+0x110/0x120

Comment 2 Ian Donaldson 2017-07-16 04:48:04 UTC
Prior kernel on this system was 4.11.3-200.fc25.x86_64 and had no issues.

Comment 3 Ian Donaldson 2017-07-16 04:49:41 UTC
Load average stuck at 3 on this system.

top - 04:49:14 up 1 day, 23:15,  1 user,  load average: 3.49, 2.67, 1.41
Tasks: 136 total,   1 running, 135 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.0 us,  0.2 sy,  0.0 ni, 99.2 id,  0.7 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem :  3079392 total,  2074684 free,   191092 used,   813616 buff/cache
KiB Swap:  4194300 total,  4194300 free,        0 used.  2684764 avail Mem 

...

Comment 4 Ian Donaldson 2017-07-16 04:51:24 UTC
$ ps auxww |grep D
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root        21  0.0  0.0      0     0 ?        D    Jul14   0:30 [rcuos/1]
root       840  0.0  0.1 207992  3216 ?        Ssl  Jul14   0:00 /usr/sbin/gssproxy -D
root       850  0.0  0.2  48320  7136 ?        Ds   Jul14   0:04 /usr/lib/systemd/systemd-logind
root       977  0.0  1.2 1108140 37416 ?       Ss   Jul14   0:01 /usr/bin/abrt-dump-journal-oops -fxtD
root       979  0.0  0.7 1048816 22436 ?       Ss   Jul14   0:00 /usr/bin/abrt-dump-journal-xorg -fxtD
root      1841  0.0  0.2  98504  7268 ?        Ss   Jul14   0:00 /usr/sbin/sshd -D
root      1879  0.0  0.0 124120  2556 ?        Ss   Jul14   0:04 /usr/sbin/keepalived -D
root      1880  0.0  0.1 124120  5096 ?        S    Jul14   0:04 /usr/sbin/keepalived -D
root      1881  0.0  0.1 128484  5908 ?        S    Jul14   0:31 /usr/sbin/keepalived -D
root     13001  0.0  0.0      0     0 ?        D    04:40   0:00 [kworker/0:0]
root     13168  0.0  0.0      0     0 ?        D    04:42   0:00 [kworker/1:1]
iand     13280  0.0  0.0   9160   956 pts/0    S+   04:50   0:00 grep --color=auto D
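
Note that `grep D` matches any line containing a capital D, including the `-D` command-line flags above. A minimal sketch of filtering on the STAT column instead, assuming the 11-column `ps aux` layout shown in this comment (`uninterruptible` is a hypothetical helper name):

```python
def uninterruptible(ps_output):
    """Return (pid, stat, command) for rows whose STAT starts with 'D'.

    Assumes the `ps aux` column order:
    USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
    """
    rows = []
    for line in ps_output.strip().splitlines():
        # Split into at most 11 fields so COMMAND keeps its embedded spaces.
        fields = line.split(None, 10)
        if len(fields) < 11 or not fields[1].isdigit():
            continue  # header line or anything that is not a process row
        pid, stat, command = int(fields[1]), fields[7], fields[10]
        if stat.startswith("D"):
            rows.append((pid, stat, command))
    return rows


if __name__ == "__main__":
    import subprocess
    out = subprocess.run(["ps", "auxww"], capture_output=True, text=True).stdout
    for pid, stat, command in uninterruptible(out):
        print(pid, stat, command)
```

Tasks in uninterruptible sleep count toward the Linux load average, so a filter like this is one way to see which tasks are keeping it pinned on an otherwise idle machine.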

Comment 5 Ian Donaldson 2017-07-16 04:55:18 UTC
That last trace wasn't complete:

Jul 15 04:40:01 radish kernel: CPU: 0 PID: 21 Comm: rcuos/1 Not tainted 4.11.9-200.fc25.x86_64 #1
Jul 15 04:40:01 radish kernel: Hardware name: Sun Microsystems Sun Fire X2100 M2/S40              , BIOS S40_3A21 10/30/2008
Jul 15 04:40:01 radish kernel: Call Trace:
Jul 15 04:40:01 radish kernel: dump_stack+0x63/0x86
Jul 15 04:40:01 radish kernel: __schedule_bug+0x54/0x70
Jul 15 04:40:01 radish kernel: __schedule+0x627/0x8a0
Jul 15 04:40:01 radish kernel: schedule+0x36/0x80
Jul 15 04:40:01 radish kernel: schedule_timeout+0x238/0x300
Jul 15 04:40:01 radish kernel: wait_for_completion+0x111/0x180
Jul 15 04:40:01 radish kernel: ? wait_for_completion+0x111/0x180
Jul 15 04:40:01 radish kernel: ? wake_up_q+0x80/0x80
Jul 15 04:40:01 radish kernel: __wait_rcu_gp+0xc8/0xf0
Jul 15 04:40:01 radish kernel: synchronize_sched+0x5d/0x80
Jul 15 04:40:01 radish kernel: ? __call_rcu+0x320/0x320
Jul 15 04:40:01 radish kernel: ? trace_raw_output_rcu_utilization+0x60/0x60
Jul 15 04:40:01 radish kernel: blk_queue_bypass_start+0x73/0x80
Jul 15 04:40:01 radish kernel: blkcg_deactivate_policy+0x110/0x120
Jul 15 04:40:01 radish kernel: blk_throtl_exit+0x34/0x50
Jul 15 04:40:01 radish kernel: blkcg_exit_queue+0x3a/0x40
Jul 15 04:40:01 radish kernel: blk_release_queue+0x2f/0x100
Jul 15 04:40:01 radish kernel: kobject_release+0x6a/0x170
Jul 15 04:40:01 radish kernel: kobject_put+0x2f/0x60
Jul 15 04:40:01 radish kernel: blk_exit_rl+0x35/0x40
Jul 15 04:40:01 radish kernel: blkg_free+0x60/0xc0
Jul 15 04:40:01 radish kernel: __blkg_release_rcu+0x5a/0xc0
Jul 15 04:40:01 radish kernel: rcu_nocb_kthread+0x2b4/0x4c0
Jul 15 04:40:01 radish kernel: kthread+0x109/0x140
Jul 15 04:40:01 radish kernel: ? get_state_synchronize_rcu+0x20/0x20
Jul 15 04:40:01 radish kernel: ? kthread_park+0x90/0x90
Jul 15 04:40:01 radish kernel: ret_from_fork+0x25/0x30
Jul 15 04:40:01 radish kernel: NOHZ: local_softirq_pending 202
Jul 15 04:40:01 radish kernel: BUG: scheduling while atomic: rcuos/1/21/0x7ffffe00
Jul 15 04:40:01 radish kernel: Modules linked in: nfsv3 nfs_acl nfs lockd grace fscache binfmt_misc forcedeth tg3 xt_set ip_set ip_vs nfnetlink nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack libcrc32c cfg80211 rfkill xt_limit nf_log_ipv4 nf_log_common xt_LOG sunrpc amd64_edac_mod edac_mce_amd edac_core ppdev kvm_amd kvm irqbypass k8temp parport_pc parport shpchp nv_tco i2c_nforce2 tpm_tis tpm_tis_core tpm raid1 ata_generic pata_acpi ast i2c_algo_bit drm_kms_helper ttm drm serio_raw sata_nv pata_amd ptp pps_core [last unloaded: forcedeth]
Jul 15 04:40:01 radish kernel: CPU: 0 PID: 21 Comm: rcuos/1 Tainted: G        W       4.11.9-200.fc25.x86_64 #1
Jul 15 04:40:01 radish kernel: Hardware name: Sun Microsystems Sun Fire X2100 M2/S40              , BIOS S40_3A21 10/30/2008
Jul 15 04:40:01 radish kernel: Call Trace:
Jul 15 04:40:01 radish kernel: dump_stack+0x63/0x86
Jul 15 04:40:01 radish kernel: __schedule_bug+0x54/0x70
Jul 15 04:40:01 radish kernel: __schedule+0x627/0x8a0
Jul 15 04:40:01 radish kernel: schedule+0x36/0x80
Jul 15 04:40:01 radish kernel: rcu_nocb_kthread+0x3af/0x4c0
Jul 15 04:40:01 radish kernel: kthread+0x109/0x140
Jul 15 04:40:01 radish kernel: ? get_state_synchronize_rcu+0x20/0x20
Jul 15 04:40:01 radish kernel: ? kthread_park+0x90/0x90
Jul 15 04:40:01 radish kernel: ret_from_fork+0x25/0x30

Jul 16 04:40:01 radish kernel: BUG: scheduling while atomic: rcuos/1/21/0x00000200
Jul 16 04:40:01 radish kernel: Modules linked in: nfsv3 nfs_acl nfs lockd grace fscache binfmt_misc forcedeth tg3 xt_set ip_set ip_vs nfnetlink nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack libcrc32c cfg80211 rfkill xt_limit nf_log_ipv4 nf_log_common xt_LOG sunrpc amd64_edac_mod edac_mce_amd edac_core ppdev kvm_amd kvm irqbypass k8temp parport_pc parport shpchp nv_tco i2c_nforce2 tpm_tis tpm_tis_core tpm raid1 ata_generic pata_acpi ast i2c_algo_bit drm_kms_helper ttm drm serio_raw sata_nv pata_amd ptp pps_core [last unloaded: forcedeth]
Jul 16 04:40:01 radish kernel: CPU: 0 PID: 21 Comm: rcuos/1 Tainted: G        W       4.11.9-200.fc25.x86_64 #1
Jul 16 04:40:01 radish kernel: Hardware name: Sun Microsystems Sun Fire X2100 M2/S40              , BIOS S40_3A21 10/30/2008
Jul 16 04:40:01 radish kernel: Call Trace:
Jul 16 04:40:01 radish kernel: dump_stack+0x63/0x86
Jul 16 04:40:01 radish kernel: __schedule_bug+0x54/0x70
Jul 16 04:40:01 radish kernel: __schedule+0x627/0x8a0
Jul 16 04:40:01 radish kernel: schedule+0x36/0x80
Jul 16 04:40:01 radish kernel: schedule_timeout+0x238/0x300
Jul 16 04:40:01 radish kernel: wait_for_completion+0x111/0x180
Jul 16 04:40:01 radish kernel: ? wait_for_completion+0x111/0x180
Jul 16 04:40:01 radish kernel: ? wake_up_q+0x80/0x80
Jul 16 04:40:01 radish kernel: __wait_rcu_gp+0xc8/0xf0
Jul 16 04:40:01 radish kernel: synchronize_sched+0x5d/0x80
Jul 16 04:40:01 radish kernel: ? __call_rcu+0x320/0x320
Jul 16 04:40:01 radish kernel: ? trace_raw_output_rcu_utilization+0x60/0x60
Jul 16 04:40:01 radish kernel: blk_queue_bypass_start+0x73/0x80
Jul 16 04:40:01 radish kernel: blkcg_deactivate_policy+0x110/0x120
Jul 16 04:40:01 radish kernel: blk_throtl_exit+0x34/0x50
Jul 16 04:40:01 radish kernel: blkcg_exit_queue+0x3a/0x40
Jul 16 04:40:01 radish kernel: blk_release_queue+0x2f/0x100
Jul 16 04:40:01 radish kernel: kobject_release+0x6a/0x170
Jul 16 04:40:01 radish kernel: kobject_put+0x2f/0x60
Jul 16 04:40:01 radish kernel: blk_exit_rl+0x35/0x40
Jul 16 04:40:01 radish kernel: blkg_free+0x60/0xc0
Jul 16 04:40:01 radish kernel: __blkg_release_rcu+0x5a/0xc0
Jul 16 04:40:01 radish kernel: rcu_nocb_kthread+0x2b4/0x4c0
Jul 16 04:40:01 radish kernel: kthread+0x109/0x140
Jul 16 04:40:01 radish kernel: ? get_state_synchronize_rcu+0x20/0x20
Jul 16 04:40:01 radish kernel: ? kthread_park+0x90/0x90
Jul 16 04:40:01 radish kernel: ret_from_fork+0x25/0x30
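
To compare traces like the ones above across machines, it can help to reduce each one to its reliable frames. The sketch below assumes the syslog-style "kernel: symbol+0xoff/0xsize" format used in this report. It drops frames prefixed with "?", which the kernel prints when it is not sure the address belongs to the real call chain. `call_chain` is a hypothetical helper name.

```python
import re

# Matches e.g. "kernel: schedule+0x36/0x80" and "kernel: ? wake_up_q+0x80/0x80".
# Group 1 is set only for frames the unwinder marked as unreliable ("?").
FRAME_RE = re.compile(
    r"kernel:\s+(\?\s+)?([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+\s*$"
)


def call_chain(log_lines):
    """Return reliable frame names in printed order (innermost call first)."""
    chain = []
    for line in log_lines:
        m = FRAME_RE.search(line)
        if m and not m.group(1):
            chain.append(m.group(2))
    return chain


if __name__ == "__main__":
    import sys
    # Frames are printed innermost first, so reverse for caller -> callee order.
    print(" -> ".join(reversed(call_chain(sys.stdin))))
```

Applied to the full traces in this report, the reliable frames run from rcu_nocb_kthread through __blkg_release_rcu and blk_release_queue down to synchronize_sched and schedule: a callback invoked by the rcuos kthread ends up waiting for an RCU grace period, which can sleep.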

Comment 6 Ian Donaldson 2017-07-16 05:05:46 UTC
In addition, reboot hangs. This is what was on the console before
I power-cycled the machine:

radish login: [ ***  ] (1 of 3) A stop job is running for Login Service (10s / 2[  OK  ] Stopped Login Service.
[    **] (1 of 2) A stop job is running for ...k Time Service (2min 39s / 3min)

Comment 7 Ian Donaldson 2017-07-16 06:51:43 UTC
Another system, a Sun X4100M2:

Jul 16 06:21:42 cheddar kernel: sd 2:0:0:0: Queue depth reduced to (35)
Jul 16 06:30:01 cheddar kernel: BUG: scheduling while atomic: rcuos/0/11/0x00000200
Jul 16 06:30:01 cheddar kernel: Modules linked in: rpcsec_gss_krb5 binfmt_misc e1000 forcedeth nfnetlink_queue nfnetlink_log nfnetlink bluetooth cfg80211 rfkill nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_log_ipv4 nf_log_common xt_mac xt_limit xt_LOG nf_conntrack_ftp nf_conntrack libcrc32c powernow_k8 amd64_edac_mod edac_mce_amd edac_core kvm_amd kvm irqbypass ppdev ipmi_ssif joydev k8temp snd_mpu401_uart ipmi_si snd_rawmidi snd_seq_device snd soundcore ns558 ipmi_devintf tpm_tis tpm_tis_core ipmi_msghandler gameport parport_pc parport i2c_nforce2 shpchp tpm nfsd auth_rpcgss nfs_acl lockd grace sunrpc raid1 ata_generic pata_acpi mptsas scsi_transport_sas mptscsih mptbase serio_raw pata_amd [last unloaded: e1000]
Jul 16 06:30:01 cheddar kernel: CPU: 0 PID: 11 Comm: rcuos/0 Tainted: G        W       4.11.9-200.fc25.x86_64 #1
Jul 16 06:30:01 cheddar kernel: Hardware name: Sun Microsystems Sun Fire X4100 M2/Sun Fire X4100 M2                        , BIOS 0ABJX104 04/09/2009
Jul 16 06:30:01 cheddar kernel: Call Trace:
Jul 16 06:30:01 cheddar kernel: dump_stack+0x63/0x86
Jul 16 06:30:01 cheddar kernel: __schedule_bug+0x54/0x70
Jul 16 06:30:01 cheddar kernel: __schedule+0x627/0x8a0
Jul 16 06:30:01 cheddar kernel: schedule+0x36/0x80
Jul 16 06:30:01 cheddar kernel: schedule_timeout+0x238/0x300
Jul 16 06:30:01 cheddar kernel: ? update_load_avg+0x5d0/0xa50
Jul 16 06:30:01 cheddar kernel: ? page_counter_uncharge+0x22/0x40
Jul 16 06:30:01 cheddar kernel: wait_for_completion+0x111/0x180
Jul 16 06:30:01 cheddar kernel: ? wait_for_completion+0x111/0x180
Jul 16 06:30:01 cheddar kernel: ? wake_up_q+0x80/0x80
Jul 16 06:30:01 cheddar kernel: __wait_rcu_gp+0xc8/0xf0
Jul 16 06:30:01 cheddar kernel: synchronize_sched+0x5d/0x80
Jul 16 06:30:01 cheddar kernel: ? __call_rcu+0x320/0x320
Jul 16 06:30:01 cheddar kernel: ? trace_raw_output_rcu_utilization+0x60/0x60
Jul 16 06:30:01 cheddar kernel: blk_queue_bypass_start+0x73/0x80
Jul 16 06:30:01 cheddar kernel: blkcg_deactivate_policy+0x110/0x120
Jul 16 06:30:01 cheddar kernel: blk_throtl_exit+0x34/0x50
Jul 16 06:30:01 cheddar kernel: blkcg_exit_queue+0x3a/0x40
Jul 16 06:30:01 cheddar kernel: blk_release_queue+0x2f/0x100
Jul 16 06:30:01 cheddar kernel: kobject_release+0x6a/0x170
Jul 16 06:30:01 cheddar kernel: kobject_put+0x2f/0x60
Jul 16 06:30:01 cheddar kernel: blk_exit_rl+0x35/0x40
Jul 16 06:30:01 cheddar kernel: blkg_free+0x60/0xc0
Jul 16 06:30:01 cheddar kernel: __blkg_release_rcu+0x5a/0xc0
Jul 16 06:30:01 cheddar kernel: rcu_nocb_kthread+0x2b4/0x4c0
Jul 16 06:30:01 cheddar kernel: kthread+0x109/0x140
Jul 16 06:30:01 cheddar kernel: ? get_state_synchronize_rcu+0x20/0x20
Jul 16 06:30:01 cheddar kernel: ? kthread_park+0x90/0x90
Jul 16 06:30:01 cheddar kernel: ret_from_fork+0x25/0x30

Comment 8 Laura Abbott 2017-07-17 14:33:28 UTC
Can you share the full kernel log? Can you also test the 4.11.10 update available in bodhi?

Comment 9 Georg Sauthoff 2017-07-17 18:56:07 UTC
I see this issue with:

- F25, Thinkpad x220
- F26, Thinkpad x220
- F26, Dell Latitude E7270 (Skylake)

It occurs during reboot/shutdown - which then hangs.

Journal messages from the Dell Skylake system:

Jul 17 09:08:19 example.org systemd[1]: Stopped Configure read-only root support.
Jul 17 09:08:19 example.org kernel: BUG: scheduling while atomic: rcuos/2/29/0x00000200
Jul 17 09:08:19 example.org kernel: Modules linked in: rfcomm xfs uas usb_storage ccm tun ip_set nfnetlink bridge stp llc libcrc32c cmac bnep sunrpc uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_core videodev media btusb btrtl intel_rapl x86_pkg_temp_thermal intel_powerclamp snd_soc_skl coretemp kvm_intel snd_soc_skl_ipc snd_soc_sst_ipc snd_soc_sst_dsp kvm snd_hda_ext_core snd_hda_codec_hdmi snd_soc_sst_match snd_soc_core dell_led snd_hda_codec_realtek snd_hda_codec_generic arc4 snd_compress irqbypass snd_pcm_dmaengine intel_cstate ac97_bus iwlmvm mac80211 iTCO_wdt mei_wdt iTCO_vendor_support dell_wmi sparse_keymap ppdev iwlwifi dell_laptop dell_smbios dcdbas cfg80211 dell_smm_hwmon snd_hda_intel intel_uncore intel_rapl_perf snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device e1000e snd_pcm
Jul 17 09:08:19 example.org kernel:  joydev snd_timer mei_me i2c_i801 rtsx_pci_ms ptp snd pps_core memstick soundcore shpchp mei intel_pch_thermal wmi hci_uart btbcm btqca parport_pc pinctrl_sunrisepoint btintel intel_lpss_acpi bluetooth parport pinctrl_intel intel_lpss acpi_als int3403_thermal int3400_thermal acpi_thermal_rel kfifo_buf dell_rbtn processor_thermal_device industrialio int340x_thermal_zone tpm_tis intel_soc_dts_iosf tpm_tis_core rfkill acpi_pad tpm binfmt_misc btrfs xor raid6_pq dm_crypt i915 rtsx_pci_sdmmc mmc_core crct10dif_pclmul i2c_algo_bit crc32_pclmul drm_kms_helper crc32c_intel drm ghash_clmulni_intel serio_raw rtsx_pci video i2c_hid [last unloaded: ip6_tables]
Jul 17 09:08:19 example.org kernel: CPU: 2 PID: 29 Comm: rcuos/2 Not tainted 4.11.9-300.fc26.x86_64 #1
Jul 17 09:08:19 example.org kernel: Hardware name: Dell Inc. Latitude E7270/0T0V7J, BIOS 1.16.4 06/02/2017
Jul 17 09:08:19 example.org kernel: Call Trace:
Jul 17 09:08:19 example.org kernel:  dump_stack+0x63/0x84
Jul 17 09:08:19 example.org kernel:  __schedule_bug+0x55/0x70
Jul 17 09:08:19 example.org kernel:  __schedule+0x66e/0x8d0
Jul 17 09:08:19 example.org kernel:  schedule+0x36/0x80
Jul 17 09:08:19 example.org kernel:  schedule_timeout+0x202/0x300
Jul 17 09:08:19 example.org kernel:  ? account_entity_enqueue+0xd8/0x100
Jul 17 09:08:19 example.org kernel:  wait_for_completion+0x118/0x180
Jul 17 09:08:19 example.org kernel:  ? wait_for_completion+0x118/0x180
Jul 17 09:08:19 example.org kernel:  ? wake_up_q+0x80/0x80
Jul 17 09:08:19 example.org kernel:  __wait_rcu_gp+0xcc/0x100
Jul 17 09:08:19 example.org kernel:  synchronize_sched+0x5d/0x80
Jul 17 09:08:19 example.org kernel:  ? __call_rcu+0x310/0x310
Jul 17 09:08:19 example.org kernel:  ? trace_raw_output_rcu_utilization+0x60/0x60
Jul 17 09:08:19 example.org kernel:  blk_queue_bypass_start+0x73/0x80
Jul 17 09:08:19 example.org kernel:  blkcg_deactivate_policy+0xff/0x120
Jul 17 09:08:19 example.org kernel:  blk_throtl_exit+0x34/0x50
Jul 17 09:08:19 example.org kernel:  blkcg_exit_queue+0x3a/0x40
Jul 17 09:08:19 example.org kernel:  blk_release_queue+0x2f/0x100
Jul 17 09:08:19 example.org kernel:  kobject_release+0x67/0x170
Jul 17 09:08:19 example.org kernel:  kobject_put+0x2b/0x50
Jul 17 09:08:19 example.org kernel:  blk_exit_rl+0x3a/0x50
Jul 17 09:08:19 example.org kernel:  blkg_free.part.7+0x4f/0xc0
Jul 17 09:08:19 example.org kernel:  __blkg_release_rcu+0x61/0xd0
Jul 17 09:08:19 example.org kernel:  rcu_nocb_kthread+0x15f/0x500
Jul 17 09:08:19 example.org kernel:  kthread+0x125/0x140
Jul 17 09:08:19 example.org kernel:  ? get_state_synchronize_sched+0x20/0x20
Jul 17 09:08:19 example.org kernel:  ? kthread_park+0x90/0x90
Jul 17 09:08:19 example.org kernel:  ret_from_fork+0x25/0x30
Jul 17 09:08:19 example.org systemd[1]: Unmounted /run/user/1000.
Jul 17 09:08:19 example.org systemd[1]: Unmounted /run/user/0.
Jul 17 09:08:19 example.org systemd[1]: Unmounted Temporary Directory.
Jul 17 09:08:19 example.org systemd[1]: Stopped target Swap.

Comment 10 Ian Donaldson 2017-07-18 00:58:24 UTC
Attaching full kernel logs from tumeric (fc24) and radish (fc25).
(radish is a firewall; iptables, martian, and ll header lines have been removed.)

Comment 11 Ian Donaldson 2017-07-18 00:59:12 UTC
Created attachment 1300180
tumeric full kernel log (fc24)

Comment 12 Ian Donaldson 2017-07-18 01:00:09 UTC
Created attachment 1300181
radish full kernel log (fc25)

Comment 13 Ian Donaldson 2017-07-18 01:13:02 UTC
How do I test 4.11.10?  
(not sure what bodhi means)

It doesn't seem to be in updates-testing

dnf update --enablerepo=updates-testing 'kernel*'

just shows 4.11.9

Comment 14 Ian Donaldson 2017-07-18 01:17:12 UTC
Never mind; found it... testing soon

Comment 15 Laura Abbott 2017-07-18 17:21:02 UTC
I took another look and there was a fix for this missing from stable. I requested that it be queued up, but this kernel is probably getting rebased to 4.12 before that, so I'm not going to bring it in as a separate patch.

Comment 16 Ian Donaldson 2017-07-19 02:00:55 UTC
OK, let me know when it's ready to test. Thanks.

Comment 17 Laura Abbott 2017-07-19 14:22:00 UTC
*** Bug 1472850 has been marked as a duplicate of this bug. ***

Comment 18 Ian Donaldson 2017-07-20 03:06:26 UTC
FWIW 4.11.10 has the same issue

Jul 20 02:40:01 dill kernel: BUG: scheduling while atomic: rcuos/2/29/0x00000200
Jul 20 02:40:01 dill kernel: Modules linked in: joydev nfsv3 nfs_acl nfs lockd grace fscache binfmt_misc cfg80211 rfkill sunrpc intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel intel_cstate intel_uncore intel_rapl_perf dcdbas ipmi_ssif wdat_wdt wmi ipmi_si ipmi_devintf video ipmi_msghandler shpchp mei_me acpi_power_meter i2c_i801 intel_pch_thermal mei ie31200_edac tpm_tis edac_core tpm_tis_core tpm raid1 mgag200 i2c_algo_bit drm_kms_helper ttm crc32c_intel drm tg3 ptp pps_core megaraid_sas
Jul 20 02:40:01 dill kernel: CPU: 1 PID: 29 Comm: rcuos/2 Not tainted 4.11.10-200.fc25.x86_64 #1
Jul 20 02:40:01 dill kernel: Hardware name: Dell Inc. PowerEdge R330/0H5N7P, BIOS 2.0.8 01/12/2017
Jul 20 02:40:01 dill kernel: Call Trace:
Jul 20 02:40:01 dill kernel: dump_stack+0x63/0x86
Jul 20 02:40:01 dill kernel: __schedule_bug+0x54/0x70
Jul 20 02:40:01 dill kernel: __schedule+0x627/0x8a0
Jul 20 02:40:01 dill kernel: schedule+0x36/0x80
Jul 20 02:40:01 dill kernel: schedule_timeout+0x238/0x300
Jul 20 02:40:01 dill kernel: wait_for_completion+0x111/0x180
Jul 20 02:40:01 dill kernel: ? wait_for_completion+0x111/0x180
Jul 20 02:40:01 dill kernel: ? wake_up_q+0x80/0x80
Jul 20 02:40:01 dill kernel: __wait_rcu_gp+0xc8/0xf0
Jul 20 02:40:01 dill kernel: synchronize_sched+0x5d/0x80
Jul 20 02:40:01 dill kernel: ? __call_rcu+0x320/0x320
Jul 20 02:40:01 dill kernel: ? trace_raw_output_rcu_utilization+0x60/0x60
Jul 20 02:40:01 dill kernel: blk_queue_bypass_start+0x73/0x80
Jul 20 02:40:01 dill kernel: blkcg_deactivate_policy+0x110/0x120
Jul 20 02:40:01 dill kernel: blk_throtl_exit+0x34/0x50
Jul 20 02:40:01 dill kernel: blkcg_exit_queue+0x3a/0x40
Jul 20 02:40:01 dill kernel: blk_release_queue+0x2f/0x100
Jul 20 02:40:01 dill kernel: kobject_release+0x6a/0x170
Jul 20 02:40:01 dill kernel: kobject_put+0x2f/0x60
Jul 20 02:40:01 dill kernel: blk_exit_rl+0x35/0x40
Jul 20 02:40:01 dill kernel: blkg_free+0x60/0xc0
Jul 20 02:40:01 dill kernel: __blkg_release_rcu+0x5a/0xc0
Jul 20 02:40:01 dill kernel: rcu_nocb_kthread+0x2b4/0x4c0
Jul 20 02:40:01 dill kernel: kthread+0x109/0x140
Jul 20 02:40:01 dill kernel: ? get_state_synchronize_rcu+0x20/0x20
Jul 20 02:40:01 dill kernel: ? kthread_park+0x90/0x90
Jul 20 02:40:01 dill kernel: ret_from_fork+0x25/0x30
Jul 20 02:40:01 dill kernel: NOHZ: local_softirq_pending 282

Comment 19 Ian Donaldson 2017-07-22 09:28:22 UTC
Similarly on fc26, 4.11.10 is bad

Jul 22 03:27:00 garlic kernel: BUG: scheduling while atomic: rcuos/0/10/0x00000200
Jul 22 03:27:00 garlic kernel: Modules linked in: nfsv3 nfs_acl nfs lockd grace fscache binfmt_misc cfg80211 rfkill sunrpc intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul ipmi_ssif crc32_pclmul ghash_clmulni_intel intel_cstate wdat_wdt intel_uncore intel_rapl_perf ipmi_si dcdbas ipmi_devintf wmi ipmi_msghandler acpi_power_meter mei_me mei tpm_tis tpm_tis_core tpm video ie31200_edac edac_core i2c_i801 shpchp intel_pch_thermal raid1 mgag200 i2c_algo_bit drm_kms_helper ttm drm crc32c_intel tg3 ptp pps_core megaraid_sas
Jul 22 03:27:00 garlic kernel: CPU: 0 PID: 10 Comm: rcuos/0 Not tainted 4.11.10-300.fc26.x86_64 #1
Jul 22 03:27:00 garlic kernel: Hardware name: Dell Inc. PowerEdge R330/0H5N7P, BIOS 2.0.8 01/12/2017
Jul 22 03:27:00 garlic kernel: Call Trace:
Jul 22 03:27:00 garlic kernel: dump_stack+0x63/0x84
Jul 22 03:27:00 garlic kernel: __schedule_bug+0x55/0x70
Jul 22 03:27:00 garlic kernel: __schedule+0x66e/0x8d0
Jul 22 03:27:00 garlic kernel: schedule+0x36/0x80
Jul 22 03:27:00 garlic kernel: schedule_timeout+0x202/0x300
Jul 22 03:27:00 garlic kernel: wait_for_completion+0x118/0x180
Jul 22 03:27:00 garlic kernel: ? wait_for_completion+0x118/0x180
Jul 22 03:27:00 garlic kernel: ? wake_up_q+0x80/0x80
Jul 22 03:27:00 garlic kernel: __wait_rcu_gp+0xcc/0x100
Jul 22 03:27:00 garlic kernel: synchronize_sched+0x5d/0x80
Jul 22 03:27:00 garlic kernel: ? __call_rcu+0x310/0x310
Jul 22 03:27:00 garlic kernel: ? trace_raw_output_rcu_utilization+0x60/0x60
Jul 22 03:27:00 garlic kernel: blk_queue_bypass_start+0x73/0x80
Jul 22 03:27:00 garlic kernel: blkcg_deactivate_policy+0xff/0x120
Jul 22 03:27:00 garlic kernel: blk_throtl_exit+0x34/0x50
Jul 22 03:27:00 garlic kernel: blkcg_exit_queue+0x3a/0x40
Jul 22 03:27:00 garlic kernel: blk_release_queue+0x2f/0x100
Jul 22 03:27:00 garlic kernel: kobject_release+0x67/0x170
Jul 22 03:27:00 garlic kernel: kobject_put+0x2b/0x50
Jul 22 03:27:00 garlic kernel: blk_exit_rl+0x3a/0x50
Jul 22 03:27:00 garlic kernel: blkg_free.part.7+0x4f/0xc0
Jul 22 03:27:00 garlic kernel: __blkg_release_rcu+0x61/0xd0
Jul 22 03:27:00 garlic kernel: rcu_nocb_kthread+0x15f/0x500
Jul 22 03:27:00 garlic kernel: kthread+0x125/0x140
Jul 22 03:27:00 garlic kernel: ? get_state_synchronize_sched+0x20/0x20
Jul 22 03:27:00 garlic kernel: ? kthread_park+0x90/0x90
Jul 22 03:27:00 garlic kernel: ret_from_fork+0x25/0x30
Jul 22 03:27:00 garlic kernel: NOHZ: local_softirq_pending 28a

Comment 20 Ian Donaldson 2017-08-02 07:10:50 UTC
Seems fixed in 4.11.11 on fc26 at least; I've been running that for a week and
haven't seen any repeats (it used to show up after 2-3 days in 4.11.9 and 4.11.10).

haven't tried 4.11.11 on fc25 yet.
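
Because the reports in this bug span several kernel builds, it may help to tally which builds actually appear in a saved log. A minimal sketch, assuming the "CPU: N PID: N Comm: ... <release> #N" header line format seen in the traces in this report (`releases` is a hypothetical helper name):

```python
import re
from collections import Counter

# Pulls the kernel release out of the oops header line, whether the kernel
# is reported as "Not tainted" or "Tainted: <flags>".
CPU_RE = re.compile(
    r"CPU: \d+ PID: \d+ Comm: \S+ .*?(?P<release>\d+\.\d+\.\d+-\S+) #\d+"
)


def releases(log_lines):
    """Count how many trace headers mention each kernel release string."""
    counts = Counter()
    for line in log_lines:
        m = CPU_RE.search(line)
        if m:
            counts[m.group("release")] += 1
    return counts


if __name__ == "__main__":
    import sys
    for release, n in sorted(releases(sys.stdin).items()):
        print(n, release)
```

The release printed in the trace header is the kernel that was running when the trace was logged, which is worth checking against the installed package version before concluding that a given build is or is not affected.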

Comment 21 ita 2017-08-07 18:11:17 UTC
Just had the same problem with 4.11.11-300.fc26.x86_64 #1 SMP on fc26. Also had trouble restarting the device.

jul 29 10:32:13 kernel: BUG: scheduling while atomic: rcuos/7/69/0x00000200
jul 29 10:32:13 kernel: Modules linked in: vhost_net vhost tap tun ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c iptable_mangle iptable_raw iptable_security ebtable_filter ebtables ip6table_filter ip6_tables sunrpc vfat fat intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul igb crc32_pclmul crc32c_intel ghash_clmulni_intel intel_cstate intel_uncore intel_rapl_perf joydev iTCO_wdt iTCO_vendor_support ipmi_ssif ptp pps_core dca lpc_ich tpm_crb tpm_tis ipmi_si tpm_tis_core tpm ipmi_devintf ipmi_msghandler mei_me 
jul 29 10:32:13 kernel: CPU: 2 PID: 69 Comm: rcuos/7 Not tainted 4.11.10-300.fc26.x86_64 #1
jul 29 10:32:13 kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./E3C224D2I, BIOS P3.30 06/01/2015
jul 29 10:32:13 kernel: Call Trace: 
jul 29 10:32:13 kernel:  dump_stack+0x63/0x84
jul 29 10:32:13 kernel:  __schedule_bug+0x55/0x70
jul 29 10:32:13 kernel:  __schedule+0x66e/0x8d0
jul 29 10:32:13 kernel:  schedule+0x36/0x80
jul 29 10:32:13 kernel:  schedule_timeout+0x202/0x300
jul 29 10:32:13 kernel:  ? __free_pages+0x1f/0x40
jul 29 10:32:13 kernel:  wait_for_completion+0x118/0x180
jul 29 10:32:13 kernel:  ? wait_for_completion+0x118/0x180
jul 29 10:32:13 kernel:  ? wake_up_q+0x80/0x80
jul 29 10:32:13 kernel:  __wait_rcu_gp+0xcc/0x100
jul 29 10:32:13 kernel:  synchronize_sched+0x5d/0x80
jul 29 10:32:13 kernel:  ? __call_rcu+0x310/0x310
jul 29 10:32:13 kernel:  ? trace_raw_output_rcu_utilization+0x60/0x60
jul 29 10:32:13 kernel:  blk_queue_bypass_start+0x73/0x80
jul 29 10:32:13 kernel:  blkcg_deactivate_policy+0xff/0x120
jul 29 10:32:13 kernel:  blk_throtl_exit+0x34/0x50
jul 29 10:32:13 kernel:  blkcg_exit_queue+0x3a/0x40
jul 29 10:32:13 kernel:  blk_release_queue+0x2f/0x100
jul 29 10:32:13 kernel:  kobject_release+0x67/0x170
jul 29 10:32:13 kernel:  kobject_put+0x2b/0x50
jul 29 10:32:13 kernel:  blk_exit_rl+0x3a/0x50
jul 29 10:32:13 kernel:  blkg_free.part.7+0x4f/0xc0
jul 29 10:32:13 kernel:  __blkg_release_rcu+0x61/0xd0
jul 29 10:32:13 kernel:  rcu_nocb_kthread+0x15f/0x500
jul 29 10:32:13 kernel:  kthread+0x125/0x140
jul 29 10:32:13 kernel:  ? get_state_synchronize_sched+0x20/0x20
jul 29 10:32:13 kernel:  ? kthread_park+0x90/0x90
jul 29 10:32:13 kernel:  ret_from_fork+0x25/0x30

Comment 22 Ian Donaldson 2017-08-15 02:10:16 UTC
Just upgraded another fc25 system to fc26 and within minutes of reboot...

patriots$ rpm -q kernel
kernel-4.10.10-200.fc25.x86_64
kernel-4.11.3-200.fc25.x86_64
kernel-4.11.11-300.fc26.x86_64

patriots$ uname -a
Linux patriots.DOMAIN 4.11.11-300.fc26.x86_64 #1 SMP Mon Jul 17 16:32:11 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux


Aug 15 01:56:32 patriots kernel: BUG: scheduling while atomic: rcuos/1/21/0x00000200
Aug 15 01:56:32 patriots kernel: Modules linked in: binfmt_misc vboxpci(OE) vboxnetadp(OE) vboxnetflt(OE) vboxdrv(OE) nfnetlink_queue nfnetlink_log nfnetlink bluetooth cfg80211 rfkill intel_powerclamp coretemp kvm_intel kvm joydev iTCO_wdt irqbypass iTCO_vendor_support crct10dif_pclmul crc32_pclmul ghash_clmulni_intel intel_cstate intel_uncore i2c_i801 ipmi_ssif lpc_ich ipmi_si ipmi_devintf ipmi_msghandler tpm_infineon acpi_cpufreq ioatdma i7core_edac edac_core i5500_temp shpchp tpm_tis tpm_tis_core tpm nfsd auth_rpcgss nfs_acl lockd grace sunrpc raid1 ast drm_kms_helper ttm drm crc32c_intel serio_raw igb ptp pps_core dca i2c_algo_bit
Aug 15 01:56:32 patriots kernel: CPU: 7 PID: 21 Comm: rcuos/1 Tainted: G           OE   4.11.11-300.fc26.x86_64 #1
Aug 15 01:56:32 patriots kernel: Hardware name: Sun Microsystems SUN FIRE X2270 M2/SUN FIRE X2270 M2, BIOS 2.07    11/12/2010
Aug 15 01:56:32 patriots kernel: Call Trace:
Aug 15 01:56:32 patriots kernel: dump_stack+0x63/0x84
Aug 15 01:56:32 patriots kernel: __schedule_bug+0x55/0x70
Aug 15 01:56:32 patriots kernel: __schedule+0x66e/0x8d0
Aug 15 01:56:32 patriots kernel: ? account_entity_enqueue+0xd8/0x100
Aug 15 01:56:32 patriots kernel: schedule+0x36/0x80
Aug 15 01:56:32 patriots kernel: schedule_timeout+0x202/0x300
Aug 15 01:56:32 patriots kernel: ? check_preempt_wakeup+0x10e/0x210
Aug 15 01:56:32 patriots kernel: wait_for_completion+0x118/0x180
Aug 15 01:56:32 patriots kernel: ? wait_for_completion+0x118/0x180
Aug 15 01:56:32 patriots kernel: ? wake_up_q+0x80/0x80
Aug 15 01:56:32 patriots kernel: __wait_rcu_gp+0xcc/0x100
Aug 15 01:56:32 patriots kernel: synchronize_sched+0x5d/0x80
Aug 15 01:56:32 patriots kernel: ? __call_rcu+0x310/0x310
Aug 15 01:56:32 patriots kernel: ? trace_raw_output_rcu_utilization+0x60/0x60
Aug 15 01:56:32 patriots kernel: blk_queue_bypass_start+0x73/0x80
Aug 15 01:56:32 patriots kernel: blkcg_deactivate_policy+0xff/0x120
Aug 15 01:56:32 patriots kernel: blk_throtl_exit+0x34/0x50
Aug 15 01:56:32 patriots kernel: blkcg_exit_queue+0x3a/0x40
Aug 15 01:56:32 patriots kernel: blk_release_queue+0x2f/0x100
Aug 15 01:56:32 patriots kernel: kobject_release+0x67/0x170
Aug 15 01:56:32 patriots kernel: kobject_put+0x2b/0x50
Aug 15 01:56:32 patriots kernel: blk_exit_rl+0x3a/0x50
Aug 15 01:56:32 patriots kernel: blkg_free.part.7+0x4f/0xc0
Aug 15 01:56:32 patriots kernel: __blkg_release_rcu+0x61/0xd0
Aug 15 01:56:32 patriots kernel: rcu_nocb_kthread+0x15f/0x500
Aug 15 01:56:32 patriots kernel: kthread+0x125/0x140
Aug 15 01:56:32 patriots kernel: ? get_state_synchronize_sched+0x20/0x20
Aug 15 01:56:32 patriots kernel: ? kthread_park+0x90/0x90
Aug 15 01:56:32 patriots kernel: ret_from_fork+0x25/0x30
Aug 15 01:56:32 patriots kernel: NOHZ: local_softirq_pending 28a
Aug 15 01:56:32 patriots kernel: BUG: scheduling while atomic: rcuos/1/21/0x7ffffe00
Aug 15 01:56:32 patriots kernel: Modules linked in: binfmt_misc vboxpci(OE) vboxnetadp(OE) vboxnetflt(OE) vboxdrv(OE) nfnetlink_queue nfnetlink_log nfnetlink bluetooth cfg80211 rfkill intel_powerclamp coretemp kvm_intel kvm joydev iTCO_wdt irqbypass iTCO_vendor_support crct10dif_pclmul crc32_pclmul ghash_clmulni_intel intel_cstate intel_uncore i2c_i801 ipmi_ssif lpc_ich ipmi_si ipmi_devintf ipmi_msghandler tpm_infineon acpi_cpufreq ioatdma i7core_edac edac_core i5500_temp shpchp tpm_tis tpm_tis_core tpm nfsd auth_rpcgss nfs_acl lockd grace sunrpc raid1 ast drm_kms_helper ttm drm crc32c_intel serio_raw igb ptp pps_core dca i2c_algo_bit
Aug 15 01:56:32 patriots kernel: CPU: 7 PID: 21 Comm: rcuos/1 Tainted: G        W  OE   4.11.11-300.fc26.x86_64 #1
Aug 15 01:56:32 patriots kernel: Hardware name: Sun Microsystems SUN FIRE X2270 M2/SUN FIRE X2270 M2, BIOS 2.07    11/12/2010
Aug 15 01:56:32 patriots kernel: Call Trace:
Aug 15 01:56:32 patriots kernel: dump_stack+0x63/0x84
Aug 15 01:56:32 patriots kernel: __schedule_bug+0x55/0x70
Aug 15 01:56:32 patriots kernel: __schedule+0x66e/0x8d0
Aug 15 01:56:32 patriots kernel: schedule+0x36/0x80
Aug 15 01:56:32 patriots kernel: rcu_nocb_kthread+0xa7/0x500
Aug 15 01:56:32 patriots kernel: kthread+0x125/0x140
Aug 15 01:56:32 patriots kernel: ? get_state_synchronize_sched+0x20/0x20
Aug 15 01:56:32 patriots kernel: ? kthread_park+0x90/0x90
Aug 15 01:56:32 patriots kernel: ret_from_fork+0x25/0x30
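
For anyone comparing the two BUG lines above: the last field after the task name and PID is the preempt_count value at the time of the schedule. Below is a minimal sketch for decoding it, assuming the bit layout from include/linux/preempt.h in 4.x kernels applies to these builds. The helper names decode_preempt_count and parse_bug_line are made up for illustration and are not part of any kernel tool.

```python
# Bit layout of preempt_count, assumed from include/linux/preempt.h (4.x kernels).
PREEMPT_MASK = 0x000000FF   # preempt_disable() nesting depth
SOFTIRQ_MASK = 0x0000FF00   # softirq / local_bh_disable() count
HARDIRQ_MASK = 0x000F0000   # hard interrupt nesting
NMI_MASK     = 0x00100000   # inside an NMI handler


def decode_preempt_count(value):
    """Split a preempt_count value into its named fields."""
    known = PREEMPT_MASK | SOFTIRQ_MASK | HARDIRQ_MASK | NMI_MASK
    return {
        "preempt": value & PREEMPT_MASK,
        "softirq": (value & SOFTIRQ_MASK) >> 8,
        "hardirq": (value & HARDIRQ_MASK) >> 16,
        "nmi": (value & NMI_MASK) >> 20,
        "other_bits": value & ~known,   # anything outside the known fields
    }


def parse_bug_line(line):
    """Return (comm, pid, preempt_count) from a 'scheduling while atomic' line."""
    tail = line.split("scheduling while atomic: ", 1)[1].strip()
    # The task name itself contains a slash (e.g. rcuos/1), so split from the right.
    comm, pid, count = tail.rsplit("/", 2)
    return comm, int(pid), int(count, 16)


if __name__ == "__main__":
    for text in (
        "BUG: scheduling while atomic: rcuos/1/21/0x00000200",
        "BUG: scheduling while atomic: rcuos/1/21/0x7ffffe00",
    ):
        comm, pid, count = parse_bug_line(text)
        print(comm, pid, hex(count), decode_preempt_count(count))
```

With this layout, 0x00000200 decodes to a softirq count of 2 and nothing else, which would be consistent with the callback running with bottom halves disabled. The second value, 0x7ffffe00, does not decode to a sensible nesting state and may simply mean the count was left unbalanced after the first BUG; that reading is a guess, not something confirmed in this report.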


I see 4.12.5-300.fc26.x86_64 is just released so will try that...

Comment 23 Seth Jennings 2017-08-18 17:50:03 UTC
Twice on a fresh f26 install with 4.11.8

BUG: scheduling while atomic: rcuos/2/29/0x00000200
Modules linked in: ccm rfcomm fuse xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun nf_conntrack_netbios_ns nf_conntrack_broadcast xt_CT ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c iptable_mangle iptable_raw iptable_security ebtable_filter ebtables ip6table_filter ip6_tables cmac bnep sunrpc dm_crypt arc4 iwlmvm intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp mac80211 kvm_intel vfat fat uvcvideo kvm videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_core iTCO_wdt iwlwifi mei_wdt iTCO_vendor_support btusb
 irqbypass crct10dif_pclmul crc32_pclmul btrtl cfg80211 btbcm ghash_clmulni_intel videodev btintel intel_cstate intel_uncore bluetooth intel_rapl_perf snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic media joydev mei_me snd_hda_intel intel_pch_thermal mei i2c_i801 snd_hda_codec shpchp lpc_ich snd_hda_core snd_hwdep wmi snd_seq snd_seq_device thinkpad_acpi snd_pcm rfkill snd_timer snd soundcore tpm_tis tpm_tis_core tpm i915 i2c_algo_bit drm_kms_helper crc32c_intel drm e1000e serio_raw ptp pps_core video
CPU: 3 PID: 29 Comm: rcuos/2 Not tainted 4.11.8-300.fc26.x86_64 #1
Hardware name: LENOVO 20BTS1N700/20BTS1N700, BIOS N14ET26W (1.04 ) 01/23/2015
Call Trace:
 dump_stack+0x63/0x84
 __schedule_bug+0x55/0x70
 __schedule+0x66e/0x8d0
 schedule+0x36/0x80
 schedule_timeout+0x202/0x300
 ? account_entity_enqueue+0xd8/0x100
 wait_for_completion+0x118/0x180
 ? wait_for_completion+0x118/0x180
 ? wake_up_q+0x80/0x80
 __wait_rcu_gp+0xcc/0x100
 synchronize_sched+0x5d/0x80
 ? __call_rcu+0x310/0x310
 ? trace_raw_output_rcu_utilization+0x60/0x60
 blk_queue_bypass_start+0x73/0x80
 blkcg_deactivate_policy+0xff/0x120
 blk_throtl_exit+0x34/0x50
 blkcg_exit_queue+0x3a/0x40
 blk_release_queue+0x2f/0x100
 kobject_release+0x67/0x170
 kobject_put+0x2b/0x50
 blk_exit_rl+0x3a/0x50
 blkg_free.part.7+0x4f/0xc0
 __blkg_release_rcu+0x61/0xd0
 rcu_nocb_kthread+0x15f/0x500
 kthread+0x125/0x140
 ? get_state_synchronize_sched+0x20/0x20
 ? kthread_park+0x90/0x90
 ret_from_fork+0x2c/0x40

It only happens every once in a while, and I have no procedure to reproduce it.  It is also not reportable in abrt for some reason.

Comment 24 Todd Gill 2017-11-07 20:21:58 UTC
# rpm -q kernel
kernel-4.11.8-300.fc26.x86_64
kernel-4.11.9-300.fc26.x86_64

# uname -a
Linux localhost.localdomain 4.11.9-300.fc26.x86_64 #1 SMP Wed Jul 5 16:21:56 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux


[434468.164588] BUG: scheduling while atomic: rcuos/6/61/0x00000200
[434468.164622] Modules linked in: binfmt_misc dm_thin_pool dm_persistent_data dm_bio_prison ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_raw iptable_security ebtable_filter ebtables ip6table_filter ip6_tables sunrpc kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ppdev ghash_clmulni_intel i2c_piix4 joydev virtio_balloon parport_pc parport tpm_tis tpm_tis_core tpm xfs libcrc32c virtio_net virtio_blk cirrus drm_kms_helper ttm drm crc32c_intel serio_raw virtio_pci virtio_ring virtio ata_generic pata_acpi qemu_fw_cfg
[434468.164669] CPU: 6 PID: 61 Comm: rcuos/6 Not tainted 4.11.8-300.fc26.x86_64 #1
[434468.164670] Hardware name: Red Hat OpenStack Compute, BIOS 1.10.2-3.el7_4.1 04/01/2014
[434468.164670] Call Trace:
[434468.164687]  dump_stack+0x63/0x84
[434468.164690]  __schedule_bug+0x55/0x70
[434468.164692]  __schedule+0x66e/0x8d0
[434468.164693]  schedule+0x36/0x80
[434468.164694]  schedule_timeout+0x202/0x300
[434468.164697]  ? account_entity_enqueue+0xd8/0x100
[434468.164698]  wait_for_completion+0x118/0x180
[434468.164699]  ? wait_for_completion+0x118/0x180
[434468.164699]  ? wake_up_q+0x80/0x80
[434468.164702]  __wait_rcu_gp+0xcc/0x100
[434468.164704]  synchronize_sched+0x5d/0x80
[434468.164705]  ? __call_rcu+0x310/0x310
[434468.164706]  ? trace_raw_output_rcu_utilization+0x60/0x60
[434468.164708]  blk_queue_bypass_start+0x73/0x80
[434468.164711]  blkcg_deactivate_policy+0xff/0x120
[434468.164712]  blk_throtl_exit+0x34/0x50
[434468.164713]  blkcg_exit_queue+0x3a/0x40
[434468.164715]  blk_release_queue+0x2f/0x100
[434468.164717]  kobject_release+0x67/0x170
[434468.164718]  kobject_put+0x2b/0x50
[434468.164719]  blk_exit_rl+0x3a/0x50
[434468.164720]  blkg_free.part.7+0x4f/0xc0
[434468.164721]  __blkg_release_rcu+0x61/0xd0
[434468.164722]  rcu_nocb_kthread+0x15f/0x500
[434468.164725]  kthread+0x125/0x140
[434468.164726]  ? get_state_synchronize_sched+0x20/0x20
[434468.164727]  ? kthread_park+0x90/0x90
[434468.164729]  ret_from_fork+0x2c/0x40
[599231.142263] XFS (dm-2): Mounting V5 Filesystem
[599231.181879] XFS (dm-2): Ending clean mount
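
The traces pasted in this bug look like the same call chain each time. A minimal sketch for checking that mechanically is shown here: it pulls the function names out of a pasted trace, skipping the "?" frames that the kernel marks as unreliable, so two reports can be compared as plain lists. The function name reliable_frames and the regular expression are assumptions written for this illustration and handle only the frame format seen in these logs.

```python
import re

# Matches a frame such as "blkg_free.part.7+0x4f/0xc0" at the end of a line,
# optionally preceded by the "? " marker used for unreliable frames.
FRAME_RE = re.compile(r"(\?\s+)?([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+\s*$")


def reliable_frames(log_text):
    """Return function names of the non-'?' frames, in the order they appear."""
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.search(line)
        if match and not match.group(1):
            frames.append(match.group(2))
    return frames


if __name__ == "__main__":
    sample = """\
[434468.164704]  synchronize_sched+0x5d/0x80
[434468.164705]  ? __call_rcu+0x310/0x310
[434468.164708]  blk_queue_bypass_start+0x73/0x80
[434468.164720]  blkg_free.part.7+0x4f/0xc0
[434468.164721]  __blkg_release_rcu+0x61/0xd0
[434468.164722]  rcu_nocb_kthread+0x15f/0x500
"""
    print(reliable_frames(sample))
```

Comparing the resulting lists with == works regardless of whether a trace came from the journal (with a hostname prefix) or from dmesg (with a timestamp prefix), since only the trailing symbol is kept.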

Comment 25 Fedora End Of Life 2017-11-16 18:44:19 UTC
This message is a reminder that Fedora 25 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 25. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as EOL if it remains open with a Fedora 'version'
of '25'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version'
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not
able to fix it before Fedora 25 is end of life. If you would still like
to see this bug fixed and are able to reproduce it against a later version
of Fedora, you are encouraged to change the 'version' to a later Fedora
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's
lifetime, sometimes those efforts are overtaken by events. Often a
more recent Fedora release includes newer upstream software that fixes
bugs or makes them obsolete.

Comment 26 Fedora End Of Life 2017-12-12 10:22:15 UTC
Fedora 25 changed to end-of-life (EOL) status on 2017-12-12. Fedora 25 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

