Bug 1146489 - kernel 3.16 - kvm + nfs client - protection fault [NEEDINFO]
Summary: kernel 3.16 - kvm + nfs client - protection fault
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 20
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2014-09-25 10:32 UTC by Markus Stockhausen
Modified: 2014-12-10 15:00 UTC

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2014-12-10 15:00:08 UTC
Type: Bug
Embargoed:
jforbes: needinfo?



Description Markus Stockhausen 2014-09-25 10:32:09 UTC
Description of problem:

General protection fault in the kernel, followed by a reboot of the host.

Version-Release number of selected component (if applicable):

Fedora 20. 

uname -a
Linux colovn06.collogia.de 3.16.2-201.fc20.x86_64 #1 SMP Mon Sep 15 19:57:50 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

Log:

Sep 25 11:29:01 colovn06 kernel: general protection fault: 0000 [#1] SMP
Sep 25 11:29:01 colovn06 kernel: Modules linked in: vfat fat loop vhost_net vhost macvtap macvlan ebt_arp binfmt_misc nfsv3 nfs fscache ip6table_filter ip6_tables ebtable_nat ebtables scsi_transport_iscsi xt_physdev nf_conntrack_ipv4 nf_defrag_ipv4 xt_multiport xt_conntrack softdog nf_conntrack dm_service_time iTCO_wdt iTCO_vendor_support lpc_ich mfd_core igb i2c_i801 ses ptp enclosure pps_core dca shpchp ipmi_devintf coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel microcode acpi_power_meter i7core_edac edac_core acpi_cpufreq nfsd auth_rpcgss ipmi_si ipmi_msghandler nfs_acl lockd sunrpc dm_multipath 8021q garp mrp tun bridge stp llc bonding ib_umad ib_ipoib ib_cm mlx4_ib ib_sa i2c_algo_bit drm_kms_helper ttm drm i2c_core megaraid_sas mlx4_core ib_mad ib_core ib_addr dummy
Sep 25 11:29:01 colovn06 kernel: CPU: 7 PID: 668 Comm: systemd-logind Tainted: G        W I   3.16.2-201.fc20.x86_64 #1
Sep 25 11:29:01 colovn06 kernel: Hardware name: FUJITSU                          PRIMERGY RX300 S6             /D2619, BIOS 6.00 Rev. 1.13.2619.N1           01/19/2012
Sep 25 11:29:01 colovn06 kernel: task: ffff88197ab2d8e0 ti: ffff88197c2cc000 task.ti: ffff88197c2cc000
Sep 25 11:29:01 colovn06 kernel: RIP: 0010:[<ffffffff811d7005>]  [<ffffffff811d7005>] __kmalloc+0x95/0x240
Sep 25 11:29:01 colovn06 kernel: RSP: 0018:ffff88197c2cfb40  EFLAGS: 00010246
Sep 25 11:29:01 colovn06 kernel: RAX: 0000000000000000 RBX: ffff88197c2cfbd8 RCX: 0000000000000000
Sep 25 11:29:01 colovn06 kernel: RDX: 0000000000002009 RSI: 0000000000000000 RDI: 0000000000000009
Sep 25 11:29:01 colovn06 kernel: RBP: ffff88197c2cfb70 R08: 00000000000173c0 R09: ffff880d89803a00
Sep 25 11:29:01 colovn06 kernel: R10: ffffffff81219158 R11: ffff880d63627380 R12: 00dffff800008000
Sep 25 11:29:01 colovn06 kernel: R13: 00000000000000d0 R14: 000000000000004f R15: ffff880d89803a00
Sep 25 11:29:01 colovn06 kernel: FS:  00007fb1dfc968c0(0000) GS:ffff8819bfcc0000(0000) knlGS:0000000000000000
Sep 25 11:29:01 colovn06 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Sep 25 11:29:01 colovn06 kernel: CR2: 00007fb1dfca7000 CR3: 00000000bb0d1000 CR4: 00000000000027e0
Sep 25 11:29:01 colovn06 kernel: Stack:
Sep 25 11:29:01 colovn06 kernel: ffffffff81219158 ffff88197c2cfbd8 000000000000002f ffff8819549dd280
Sep 25 11:29:01 colovn06 kernel: 0000000000000000 ffffffff81196cc0 ffff88197c2cfb98 ffffffff81219158
Sep 25 11:29:01 colovn06 kernel: ffff88197c2cfbd8 0000000000000000 ffff881952f99d20 ffff88197c2cfbc8
Sep 25 11:29:01 colovn06 kernel: Call Trace:
Sep 25 11:29:01 colovn06 kernel: [<ffffffff81219158>] ? simple_xattr_alloc+0x28/0x60
Sep 25 11:29:02 colovn06 kernel: [<ffffffff81196cc0>] ? shmem_fill_super+0x1e0/0x1e0
Sep 25 11:29:02 colovn06 kernel: [<ffffffff81219158>] simple_xattr_alloc+0x28/0x60
Sep 25 11:29:02 colovn06 kernel: [<ffffffff81196d30>] shmem_initxattrs+0x70/0xe0
Sep 25 11:29:02 colovn06 kernel: [<ffffffff812ee95c>] security_inode_init_security+0xcc/0x100
Sep 25 11:29:02 colovn06 kernel: [<ffffffff811964a9>] shmem_mknod+0x89/0xe0
Sep 25 11:29:02 colovn06 kernel: [<ffffffff81196558>] shmem_create+0x18/0x20
Sep 25 11:29:07 colovn06 kernel: [<ffffffff812020dd>] vfs_create+0xcd/0x130
Sep 25 11:29:08 colovn06 kernel: [<ffffffff81202b66>] do_last+0xa26/0x1190
Sep 25 11:29:08 colovn06 kernel: [<ffffffff811fee71>] ? link_path_walk+0x81/0x890
Sep 25 11:29:08 colovn06 kernel: [<ffffffff811d6836>] ? kmem_cache_alloc_trace+0x1d6/0x200
Sep 25 11:29:08 colovn06 kernel: [<ffffffff812f5bdc>] ? selinux_file_alloc_security+0x3c/0x60
Sep 25 11:29:10 colovn06 kernel: [<ffffffff8120339d>] path_openat+0xcd/0x670
Sep 25 11:29:10 colovn06 kernel: [<ffffffff811fe8d9>] ? putname+0x29/0x40
Sep 25 11:29:10 colovn06 kernel: [<ffffffff81204082>] ? user_path_at_empty+0x72/0xc0
Sep 25 11:29:10 colovn06 kernel: [<ffffffff8120419d>] do_filp_open+0x4d/0xb0
Sep 25 11:29:10 colovn06 kernel: [<ffffffff81210d0d>] ? __alloc_fd+0x7d/0x120
Sep 25 11:29:10 colovn06 kernel: [<ffffffff811f27e7>] do_sys_open+0x137/0x240
Sep 25 11:29:10 colovn06 kernel: [<ffffffff811f290e>] SyS_open+0x1e/0x20
Sep 25 11:29:10 colovn06 kernel: [<ffffffff8170e469>] system_call_fastpath+0x16/0x1b
Sep 25 11:29:10 colovn06 kernel: Code: dc 00 00 49 8b 50 08 4d 8b 20 49 8b 40 10 4d 85 e4 0f 84 34 01 00 00 48 85 c0 0f 84 2b 01 00 00 49 63 41 20 4d 8b 01 41 f6 c0 0f <49> 8b 1c 04 0f 85 99 01 00 00 48 8d 4a 01 4c 89 e0 65 49 0f c7
Sep 25 11:29:10 colovn06 kernel: RIP  [<ffffffff811d7005>] __kmalloc+0x95/0x240
Sep 25 11:29:10 colovn06 kernel: RSP <ffff88197c2cfb40>
Sep 25 11:29:10 colovn06 systemd: systemd-logind.service: main process exited, code=killed, status=11/SEGV
Sep 25 11:29:10 colovn06 systemd: Unit systemd-logind.service entered failed state.
Sep 25 11:29:10 colovn06 systemd: systemd-logind.service holdoff time over, scheduling restart.
Sep 25 11:29:10 colovn06 systemd: Stopping Login Service...
Sep 25 11:29:10 colovn06 systemd: Starting Login Service...
Sep 25 11:29:10 colovn06 systemd: Started Login Service.
Sep 25 11:29:10 colovn06 kernel: ---[ end trace a9ed8a56a73c6498 ]---
Sep 25 11:29:10 colovn06 kernel: INFO: NMI handler (ghes_notify_nmi) took too long to run: 2.180 msecs
Sep 25 11:29:10 colovn06 kernel: INFO: NMI handler (ghes_notify_nmi) took too long to run: 2.193 msecs
Sep 25 11:29:10 colovn06 kernel: INFO: NMI handler (ghes_notify_nmi) took too long to run: 2.200 msecs
Sep 25 11:29:10 colovn06 kernel: INFO: NMI handler (ghes_notify_nmi) took too long to run: 2.208 msecs
Sep 25 11:29:10 colovn06 kernel: INFO: NMI handler (ghes_notify_nmi) took too long to run: 2.214 msecs
Sep 25 11:29:10 colovn06 kernel: INFO: NMI handler (ghes_notify_nmi) took too long to run: 2.221 msecs
Sep 25 11:29:10 colovn06 kernel: INFO: NMI handler (ghes_notify_nmi) took too long to run: 2.227 msecs
Sep 25 11:29:10 colovn06 kernel: INFO: NMI handler (ghes_notify_nmi) took too long to run: 2.236 msecs
Sep 25 11:29:10 colovn06 kernel: INFO: NMI handler (ghes_notify_nmi) took too long to run: 2.243 msecs
Sep 25 11:29:10 colovn06 kernel: INFO: NMI handler (ghes_notify_nmi) took too long to run: 2.251 msecs
Sep 25 11:29:10 colovn06 kernel: [sched_delayed] sched: RT throttling activated
Sep 25 11:29:10 colovn06 kernel: nmi_max_handler: 3 callbacks suppressed
Sep 25 11:29:10 colovn06 kernel: INFO: NMI handler (ghes_notify_nmi) took too long to run: 2.270 msecs
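
A side note on the register dump above, offered as a reading aid and not as a diagnosis. On x86-64, a general protection fault is raised when code dereferences a non-canonical address, that is, one whose upper bits are not a sign extension of bit 47 (assuming 48-bit virtual addresses). The marked instruction bytes `<49> 8b 1c 04` appear to decode to a load through R12+RAX. The sketch below checks a few register values from the dump; `is_canonical` is a hypothetical helper name, not something from the kernel.

```python
def is_canonical(addr, va_bits=48):
    """Return True if addr is a canonical x86-64 virtual address.

    Assumes va_bits-bit virtual addressing (48 on this class of CPU):
    bits 63 down to va_bits-1 must be all zeros or all ones.
    """
    top = addr >> (va_bits - 1)                 # bits 63..47 for va_bits=48
    all_ones = (1 << (64 - va_bits + 1)) - 1    # 17 one-bits for va_bits=48
    return top == 0 or top == all_ones


# Values copied from the register dump in this report.
registers = {
    "R12": 0x00DFFFF800008000,
    "RBX": 0xFFFF88197C2CFBD8,
    "FS": 0x00007FB1DFC968C0,
}

for name, value in registers.items():
    state = "canonical" if is_canonical(value) else "NON-canonical"
    print(f"{name} = {value:#018x} is {state}")
```

R12 fails the check while the other values pass. That would be consistent with a corrupted pointer being followed inside __kmalloc, although the log alone does not show where the corruption came from.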

Comment 1 Josh Boyer 2014-09-25 12:26:34 UTC
Your report already has taints set, which means something happened before this error.  Do you have logs showing the first error?
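
For readers following along, the letters in the "Tainted:" field of the oops header can be decoded mechanically. The following is a minimal sketch, assuming the line format shown in this report; `decode_taint` and the `TAINT_FLAGS` table are illustrative names, and the table lists only a subset of the letters described in the kernel's tainted-kernels documentation.

```python
import re

# Subset of kernel taint letters and their documented meanings.
TAINT_FLAGS = {
    "G": "only GPL-compatible modules loaded",
    "P": "proprietary module loaded",
    "F": "module was force loaded",
    "M": "machine check exception occurred",
    "B": "bad page referenced",
    "D": "kernel died recently (oops or BUG)",
    "W": "kernel issued a warning earlier",
    "I": "working around a severe firmware bug",
    "O": "out-of-tree module loaded",
    "E": "unsigned module loaded",
}


def decode_taint(line):
    """Return (letter, meaning) pairs for the 'Tainted:' field of an oops line."""
    match = re.search(r"Tainted: ([A-Z ]+)", line)
    if match is None:
        return []
    letters = [c for c in match.group(1) if c != " "]
    return [(c, TAINT_FLAGS.get(c, "unknown flag")) for c in letters]


header = ("CPU: 7 PID: 668 Comm: systemd-logind "
          "Tainted: G        W I   3.16.2-201.fc20.x86_64 #1")
for letter, meaning in decode_taint(header):
    print(letter, "-", meaning)
```

For the header in this report the letters are G, W and I. The W is the flag that indicates an earlier kernel warning.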

Comment 2 Markus Stockhausen 2014-09-25 14:41:48 UTC
Hello,

Sorry that I don't have more, but these are the last lines logged before the fault.
Is there anywhere else I can look?

Sep 25 11:12:01 colovn06 systemd: Started Session 90 of user root.
Sep 25 11:13:01 colovn06 systemd: Starting Session 91 of user root.
Sep 25 11:13:01 colovn06 systemd: Started Session 91 of user root.
Sep 25 11:14:01 colovn06 systemd: Starting Session 92 of user root.
Sep 25 11:14:01 colovn06 systemd: Started Session 92 of user root.
Sep 25 11:15:01 colovn06 systemd: Starting Session 93 of user root.
Sep 25 11:15:01 colovn06 systemd: Started Session 93 of user root.
Sep 25 11:16:01 colovn06 systemd: Starting Session 94 of user root.
Sep 25 11:16:01 colovn06 systemd: Started Session 94 of user root.
Sep 25 11:17:01 colovn06 systemd: Starting Session 95 of user root.
Sep 25 11:17:01 colovn06 systemd: Started Session 95 of user root.
Sep 25 11:18:01 colovn06 systemd: Starting Session 96 of user root.
Sep 25 11:18:01 colovn06 systemd: Started Session 96 of user root.
Sep 25 11:19:01 colovn06 systemd: Starting Session 97 of user root.
Sep 25 11:19:01 colovn06 systemd: Started Session 97 of user root.
Sep 25 11:20:01 colovn06 systemd: Starting Session 98 of user root.
Sep 25 11:20:01 colovn06 systemd: Started Session 98 of user root.
Sep 25 11:21:01 colovn06 systemd: Starting Session 99 of user root.
Sep 25 11:21:01 colovn06 systemd: Started Session 99 of user root.
Sep 25 11:22:01 colovn06 systemd: Starting Session 100 of user root.
Sep 25 11:22:01 colovn06 systemd: Started Session 100 of user root.
Sep 25 11:23:01 colovn06 systemd: Starting Session 101 of user root.
Sep 25 11:23:01 colovn06 systemd: Started Session 101 of user root.
Sep 25 11:24:01 colovn06 systemd: Starting Session 102 of user root.
Sep 25 11:24:01 colovn06 systemd: Started Session 102 of user root.
Sep 25 11:25:01 colovn06 systemd: Starting Session 103 of user root.
Sep 25 11:25:01 colovn06 systemd: Started Session 103 of user root.
Sep 25 11:25:01 colovn06 systemd: Starting Session 104 of user root.
Sep 25 11:25:01 colovn06 systemd: Started Session 104 of user root.
Sep 25 11:26:01 colovn06 systemd: Starting Session 105 of user root.
Sep 25 11:26:01 colovn06 systemd: Started Session 105 of user root.
Sep 25 11:27:01 colovn06 systemd: Starting Session 106 of user root.
Sep 25 11:27:01 colovn06 systemd: Started Session 106 of user root.
Sep 25 11:28:01 colovn06 systemd: Starting Session 107 of user root.
Sep 25 11:28:01 colovn06 systemd: Started Session 107 of user root.
Sep 25 11:29:01 colovn06 systemd: Starting Session 108 of user root.
Sep 25 11:29:01 colovn06 systemd: Started Session 108 of user root.
Sep 25 11:29:01 colovn06 kernel: general protection fault: 0000 [#1] SMP
Sep 25 11:29:01 colovn06 kernel: Modules linked in: vfat fat loop vhost_net vhost macvtap macvlan ebt_arp binfmt_misc nfsv3 nfs fscache ip6table_filter ip6_tables ebtable_nat ebtables scsi_transport_iscsi xt_physdev nf_conntrack_ipv4 nf_defrag_ipv4 xt_multiport xt_conntrack softdog nf_conntrack dm_service_time iTCO_wdt iTCO_vendor_support lpc_ich mfd_core igb i2c_i801 ses ptp enclosure pps_core dca shpchp ipmi_devintf coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel microcode acpi_power_meter i7core_edac edac_core acpi_cpufreq nfsd auth_rpcgss ipmi_si ipmi_msghandler nfs_acl lockd sunrpc dm_multipath 8021q garp mrp tun bridge stp llc bonding ib_umad ib_ipoib ib_cm mlx4_ib ib_sa i2c_algo_bit drm_kms_helper ttm drm i2c_core megaraid_sas mlx4_core ib_mad ib_core ib_addr dummy
Sep 25 11:29:01 colovn06 kernel: CPU: 7 PID: 668 Comm: systemd-logind Tainted: G        W I   3.16.2-201.fc20.x86_64 #1
Sep 25 11:29:01 colovn06 kernel: Hardware name: FUJITSU                          PRIMERGY RX300 S6             /D2619, BIOS 6.00 Rev. 1.13.2619.N1           01/19/2012
Sep 25 11:29:01 colovn06 kernel: task: ffff88197ab2d8e0 ti: ffff88197c2cc000 task.ti: ffff88197c2cc000
Sep 25 11:29:01 colovn06 kernel: RIP: 0010:[<ffffffff811d7005>]  [<ffffffff811d7005>] __kmalloc+0x95/0x240

Best regards 

Markus

Comment 3 Markus Stockhausen 2014-09-25 15:09:13 UTC
Hello,

I think I know what you are looking for. The kernel on this machine always ends up tainted because it raises the following warning during a normal boot:

[    7.883312] sysfs: cannot create duplicate filename '/devices/pci0000:00/0000:00:01.0/0000:01:00.0/host4/target4:0:16/4:0:16:0/enclosure/4:0:16:0/ArrayDevice03'
[    7.883313] Modules linked in: ses(+) e1000e(+) enclosure ptp pps_core tpm_tis tpm_infineon iTCO_wdt iTCO_vendor_support nfsd auth_rpcgss microcode(+) ipmi_si(+) ipmi_msghandler tpm i7core_edac lpc_ich edac_core mfd_core i2c_i801 shpchp acpi_power_meter binfmt_misc nfs_acl lockd acpi_cpufreq sunrpc dm_multipath 8021q garp mrp tun bridge stp llc bonding ib_umad ib_ipoib ib_cm mlx4_ib ib_sa i2c_algo_bit drm_kms_helper ttm drm ata_generic pata_acpi i2c_core megaraid_sas mlx4_core ib_mad ib_core ib_addr dummy
[    7.883341] CPU: 0 PID: 490 Comm: systemd-udevd Tainted: G          I  3.14.8-200.fc20.x86_64 #1
[    7.883343] Hardware name: FUJITSU                          PRIMERGY RX300 S6             /D2619, BIOS 6.00 Rev. 1.13.2619.N1           01/19/2012
[    7.883344]  0000000000000000 000000009573c8ee ffff8817b9381990 ffffffff816f0502
[    7.883383]  ffff8817b93819d8 ffff8817b93819c8 ffffffff8108a1cd ffff8817b122c000
[    7.883387]  ffff8817b122c000 ffff8817b9eb3078 ffff8817ae4d8df0 0000000000000004
[    7.883391] Call Trace:
[    7.883397]  [<ffffffff816f0502>] dump_stack+0x45/0x56
[    7.883405]  [<ffffffff8108a1cd>] warn_slowpath_common+0x7d/0xa0
[    7.883407]  [<ffffffff8108a24c>] warn_slowpath_fmt+0x5c/0x80
[    7.883409]  [<ffffffff81263d66>] sysfs_warn_dup+0x86/0xa0
[    7.883412]  [<ffffffff81263e0e>] sysfs_create_dir_ns+0x8e/0xa0
[    7.883416]  [<ffffffff81356fc0>] kobject_add_internal+0xc0/0x3f0
[    7.883418]  [<ffffffff813577b5>] kobject_add+0x75/0xd0
[    7.883422]  [<ffffffff81462583>] ? device_private_init+0x23/0x80
[    7.883427]  [<ffffffff81462705>] device_add+0x125/0x630
[    7.883429]  [<ffffffff81462c2a>] device_register+0x1a/0x20
[    7.883433]  [<ffffffffa04bd696>] enclosure_component_register+0xb6/0x100 [enclosure]
[    7.883437]  [<ffffffffa04d491d>] ses_enclosure_data_process+0x27d/0x370 [ses]
[    7.883439]  [<ffffffffa04d507d>] ses_intf_add+0x44d/0x4fc [ses]
[    7.883442]  [<ffffffff81466ab9>] class_interface_register+0xa9/0x100
[    7.883446]  [<ffffffffa0005000>] ? 0xffffffffa0004fff
[    7.883450]  [<ffffffff814920f6>] scsi_register_interface+0x16/0x20
[    7.883452]  [<ffffffffa0005013>] ses_init+0x13/0x1000 [ses]
[    7.883454]  [<ffffffffa0005000>] ? 0xffffffffa0004fff
[    7.883458]  [<ffffffff8100216a>] do_one_initcall+0xfa/0x1b0
[    7.883461]  [<ffffffff8105a953>] ? set_memory_nx+0x43/0x50
[    7.883468]  [<ffffffff81105c17>] load_module+0x1e37/0x25d0
[    7.883470]  [<ffffffff811012b0>] ? store_uevent+0x70/0x70
[    7.883474]  [<ffffffff811efa30>] ? kernel_read+0x50/0x80
[    7.883477]  [<ffffffff81106566>] SyS_finit_module+0xa6/0xd0
[    7.883481]  [<ffffffff817008e9>] system_call_fastpath+0x16/0x1b
[    7.883484] ---[ end trace 48100b02e39a6cc9 ]---
[    7.883485] ------------[ cut here ]------------
[    7.883487] WARNING: CPU: 0 PID: 490 at lib/kobject.c:240 kobject_add_internal+0x284/0x3f0()

In my opinion, the simple_xattr_alloc() protection fault is the bug I am chasing, and the taint flag comes from the boot warning.
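
Finding the first event that tainted the kernel, as was done by hand here, can also be scripted. This is a minimal sketch, assuming the log has been saved as plain text; `first_kernel_event` and the marker list are illustrative choices, not an official tool, and the markers are only the strings that appear in the logs in this report plus a few common ones.

```python
import re

# Strings that commonly mark the start of a kernel warning or oops.
MARKERS = re.compile(r"WARNING: |cut here|general protection fault|BUG: |Oops")


def first_kernel_event(lines):
    """Return (line_number, text) of the first marker line, or None if absent."""
    for number, line in enumerate(lines, start=1):
        if MARKERS.search(line):
            return number, line.rstrip("\n")
    return None


sample = [
    "[    7.883312] sysfs: cannot create duplicate filename '...'",
    "[    7.883485] ------------[ cut here ]------------",
    "[    7.883487] WARNING: CPU: 0 PID: 490 at lib/kobject.c:240",
]
print(first_kernel_event(sample))
```

The function accepts any iterable of lines, so an open file object for a saved dmesg or syslog dump can be passed in directly.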

Comment 4 Markus Stockhausen 2014-09-25 16:35:31 UTC
To keep the two issues separate, I opened BZ1146643 for the warning during machine startup. That leaves this bug for the protection fault in simple_xattr_alloc().

Comment 5 Justin M. Forbes 2014-11-13 16:00:09 UTC
*********** MASS BUG UPDATE **************

We apologize for the inconvenience.  There is a large number of bugs to go through and several of them have gone stale.  Due to this, we are doing a mass bug update across all of the Fedora 20 kernel bugs.

Fedora 20 has now been rebased to 3.17.2-200.fc20.  Please test this kernel update (or newer) and let us know if your issue has been resolved or if it is still present with the newer kernel.

If you have moved on to Fedora 21, and are still experiencing this issue, please change the version to Fedora 21.

If you experience different issues, please open a new bug report for those.

Comment 6 Justin M. Forbes 2014-12-10 15:00:08 UTC
This bug is being closed with INSUFFICIENT_DATA as there has not been a response in over 3 weeks. If you are still experiencing this issue, please reopen and attach the relevant data from the latest kernel you are running and any data that might have been requested previously.

