Bug 1498969
Summary: | x86/mce: suspicious RCU usage | | |
---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Mikhail <mikhail.v.gavrilov> |
Component: | kernel | Assignee: | Kernel Maintainer List <kernel-maint> |
Status: | CLOSED ERRATA | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
Severity: | unspecified | | |
Priority: | unspecified | | |
Version: | 27 | CC: | airlied, ajax, bskeggs, eparis, esandeen, hdegoede, ichavero, itamar, jarodwilson, jeremy, jforbes, jglisse, jonathan, josef, jwboyer, kernel-maint, labbott, linville, mchehab, mjg59, nhorman, quintela, steved |
Target Milestone: | --- | Keywords: | Triaged |
Target Release: | --- | | |
Hardware: | Unspecified | | |
OS: | Unspecified | | |
Fixed In Version: | kernel-4.13.12-200.fc26 kernel-4.13.12-100.fc25 kernel-4.13.12-300.fc27 | Doc Type: | If docs needed, set a value |
Story Points: | --- | | |
Last Closed: | 2017-11-14 01:59:59 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | Category: | --- |
oVirt Team: | --- | Cloudforms Team: | --- |
Hi,

Thank you for taking the time to report this bug. I've brought it to the attention of the x86 MCE maintainers:
https://marc.info/?l=linux-kernel&m=150766207223899

kernel-4.13.12-200.fc26 has been submitted as an update to Fedora 26. https://bodhi.fedoraproject.org/updates/FEDORA-2017-31d7720d7e

kernel-4.13.12-300.fc27 has been submitted as an update to Fedora 27. https://bodhi.fedoraproject.org/updates/FEDORA-2017-abda708cee

kernel-4.13.12-100.fc25 has been submitted as an update to Fedora 25. https://bodhi.fedoraproject.org/updates/FEDORA-2017-08a350c878

kernel-4.13.12-300.fc27 has been pushed to the Fedora 27 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2017-abda708cee

kernel-4.13.12-100.fc25 has been pushed to the Fedora 25 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2017-08a350c878

kernel-4.13.12-200.fc26 has been pushed to the Fedora 26 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2017-31d7720d7e

kernel-4.13.12-200.fc26 has been pushed to the Fedora 26 stable repository. If problems still persist, please make note of it in this bug report.

kernel-4.13.12-100.fc25 has been pushed to the Fedora 25 stable repository. If problems still persist, please make note of it in this bug report.

kernel-4.13.12-300.fc27 has been pushed to the Fedora 27 stable repository. If problems still persist, please make note of it in this bug report.
Created attachment 1334917
dmesg

Description of problem:

Oct 05 11:40:17 localhost.localdomain kernel: mce: [Hardware Error]: Machine check events logged
Oct 05 11:40:17 localhost.localdomain kernel:
Oct 05 11:40:17 localhost.localdomain kernel: =============================
Oct 05 11:40:17 localhost.localdomain kernel: WARNING: suspicious RCU usage
Oct 05 11:40:17 localhost.localdomain kernel: 4.13.4-301.fc27.x86_64+debug #1 Not tainted
Oct 05 11:40:17 localhost.localdomain kernel: -----------------------------
Oct 05 11:40:17 localhost.localdomain kernel: arch/x86/kernel/cpu/mcheck/dev-mcelog.c:60 suspicious mce_log_get_idx_check() usage!
Oct 05 11:40:17 localhost.localdomain kernel: other info that might help us debug this:
Oct 05 11:40:17 localhost.localdomain kernel: rcu_scheduler_active = 2, debug_locks = 1
Oct 05 11:40:17 localhost.localdomain kernel: 3 locks held by kworker/1:2/14637:
Oct 05 11:40:17 localhost.localdomain kernel:  #0: ("events"){.+.+.+}, at: [<ffffffffaa0d2ac0>] process_one_work+0x1d0/0x6a0
Oct 05 11:40:17 localhost.localdomain kernel:  #1: ((&mce_work)){+.+...}, at: [<ffffffffaa0d2ac0>] process_one_work+0x1d0/0x6a0
Oct 05 11:40:17 localhost.localdomain kernel:  #2: ((x86_mce_decoder_chain).rwsem){++++..}, at: [<ffffffffaa0dc92f>] blocking_notifier_call_chain+0x2f/0x70
Oct 05 11:40:17 localhost.localdomain kernel: stack backtrace:
Oct 05 11:40:17 localhost.localdomain kernel: CPU: 1 PID: 14637 Comm: kworker/1:2 Not tainted 4.13.4-301.fc27.x86_64+debug #1
Oct 05 11:40:17 localhost.localdomain kernel: Hardware name: Gigabyte Technology Co., Ltd. Z87M-D3H/Z87M-D3H, BIOS F11 08/12/2014
Oct 05 11:40:17 localhost.localdomain kernel: Workqueue: events mce_gen_pool_process
Oct 05 11:40:17 localhost.localdomain kernel: Call Trace:
Oct 05 11:40:17 localhost.localdomain kernel:  dump_stack+0x8e/0xd6
Oct 05 11:40:17 localhost.localdomain kernel:  lockdep_rcu_suspicious+0xc5/0x100
Oct 05 11:40:17 localhost.localdomain kernel:  dev_mce_log+0xf6/0x1e0
Oct 05 11:40:17 localhost.localdomain kernel:  notifier_call_chain+0x39/0x90
Oct 05 11:40:17 localhost.localdomain kernel:  blocking_notifier_call_chain+0x49/0x70
Oct 05 11:40:17 localhost.localdomain kernel:  mce_gen_pool_process+0x41/0x70
Oct 05 11:40:17 localhost.localdomain kernel:  process_one_work+0x253/0x6a0
Oct 05 11:40:17 localhost.localdomain kernel:  worker_thread+0x4d/0x3b0
Oct 05 11:40:17 localhost.localdomain kernel:  kthread+0x133/0x150
Oct 05 11:40:17 localhost.localdomain kernel:  ? process_one_work+0x6a0/0x6a0
Oct 05 11:40:17 localhost.localdomain kernel:  ? kthread_create_on_node+0x70/0x70
Oct 05 11:40:17 localhost.localdomain kernel:  ret_from_fork+0x2a/0x40
Oct 05 11:40:17 localhost.localdomain mcelog[762]: Hardware event. This is not a software error.
Oct 05 11:40:17 localhost.localdomain mcelog[762]: MCE 0
Oct 05 11:40:17 localhost.localdomain mcelog[762]: CPU 1 BANK 0 TSC 71eec2000849
Oct 05 11:40:17 localhost.localdomain mcelog[762]: TIME 1507185617 Thu Oct 5 11:40:17 2017
Oct 05 11:40:17 localhost.localdomain mcelog[762]: MCG status:
Oct 05 11:40:17 localhost.localdomain mcelog[762]: MCi status:
Oct 05 11:40:17 localhost.localdomain mcelog[762]: Corrected error
Oct 05 11:40:17 localhost.localdomain mcelog[762]: Error enabled
Oct 05 11:40:17 localhost.localdomain mcelog[762]: MCA: Internal parity error
Oct 05 11:40:17 localhost.localdomain mcelog[762]: STATUS 90000040000f0005 MCGSTATUS 0
Oct 05 11:40:17 localhost.localdomain mcelog[762]: MCGCAP c09 APICID 2 SOCKETID 0
Oct 05 11:40:17 localhost.localdomain mcelog[762]: CPUID Vendor Intel Family 6 Model 60

Could anybody look into this? What does this error message mean?
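
For reference, the hardware half of the report (the mcelog lines, as opposed to the RCU warning) can be cross-checked by decoding the STATUS value by hand. The following is a minimal sketch, not mcelog's own code: the function name `decode_mci_status` is hypothetical, and the bit positions are assumed to follow the architectural IA32_MCi_STATUS layout documented by Intel (bit 63 valid, bit 61 uncorrected, bit 60 enabled, bits 15:0 MCA error code). The corrected-error count in bits 52:38 is only meaningful on CPUs that report CMCI support in MCG_CAP.

```python
def decode_mci_status(status: int) -> dict:
    """Split an IA32_MCi_STATUS value into a few architectural fields.

    Assumption: the field positions below follow the layout described in
    the Intel SDM; model-specific bits are returned raw, not interpreted.
    """
    def bit(n: int) -> bool:
        return bool((status >> n) & 1)

    return {
        "valid": bit(63),        # VAL: the register holds a logged error
        "overflow": bit(62),     # OVER: an earlier error was overwritten
        "uncorrected": bit(61),  # UC: clear means the error was corrected
        "enabled": bit(60),      # EN: reporting was enabled for this error
        "pcc": bit(57),          # PCC: processor context corrupted
        # Bits 52:38, valid only when MCG_CAP advertises CMCI support.
        "corrected_count": (status >> 38) & 0x7FFF,
        "model_specific_code": (status >> 16) & 0xFFFF,
        "mca_error_code": status & 0xFFFF,
    }


# STATUS value copied from the mcelog output in this report.
fields = decode_mci_status(0x90000040000F0005)
for name, value in fields.items():
    print(f"{name}: {value}")
```

With the value from this report, the valid and enabled bits are set and the uncorrected bit is clear, which agrees with the "Corrected error" and "Error enabled" lines printed by mcelog. The low 16 bits are 0x0005, the simple MCA error code that mcelog reports as "Internal parity error".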