Bug 1597559 - OOPS inside kmem_cache_alloc (nfs4).
Summary: OOPS inside kmem_cache_alloc (nfs4).
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 28
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Duplicates: 1598229
Depends On:
Blocks:
Reported: 2018-07-03 08:30 UTC by Paweł Sikora
Modified: 2018-09-02 14:27 UTC
CC List: 36 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-09-02 14:27:17 UTC
Type: Bug
Embargoed:


Attachments
Strace output from starting nfs service, after the previous failed (1.77 KB, application/x-xz)
2018-07-30 20:03 UTC, Göran Uddeborg
no flags


Links
System: Linux Kernel, ID: 200379, Private: 0, Priority: None, Status: None, Summary: None, Last Updated: 2019-05-22 16:59:19 UTC

Description Paweł Sikora 2018-07-03 08:30:22 UTC
Hi,

I've noticed an unexpected NFSv4-related kernel oops. The machine exports /home/users to a small local network.

/etc/exports: /home/users  *(rw,sync,no_subtree_check,no_root_squash,insecure_locks)

Here's the journalctl fragment with all the oopses (timestamps use the Polish locale, where "lip" is July):

(...)
lip 03 09:53:37 dragon rpc.mountd[1038]: authenticated unmount request from 192.168.3.72:632 for /home/users (/home/users)
lip 03 09:54:43 dragon rpc.mountd[1038]: authenticated mount request from 192.168.3.171:989 for /home/users (/home/users)
lip 03 09:54:58 dragon kernel: general protection fault: 0000 [#1] SMP PTI
lip 03 09:54:58 dragon kernel: Modules linked in: rpcsec_gss_krb5 tun intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass iTCO_wdt iTCO_vendor_support crct10dif_pclmul ipmi_ssif crc
lip 03 09:54:58 dragon kernel: CPU: 7 PID: 1071 Comm: nfsd Not tainted 4.17.3-100.fc27.x86_64 #1
lip 03 09:54:58 dragon kernel: Hardware name: Supermicro Super Server/X11SSL-F, BIOS 2.0b 07/28/2017
lip 03 09:54:58 dragon kernel: RIP: 0010:prefetch_freepointer+0x10/0x20
lip 03 09:54:58 dragon kernel: RSP: 0018:ffffbd654210bc48 EFLAGS: 00010286
lip 03 09:54:58 dragon kernel: RAX: 0000000000000000 RBX: 8dd3122ac327ff2b RCX: 00000000000000ba
lip 03 09:54:58 dragon kernel: RDX: 00000000000000b9 RSI: 8dd3122ac327ff2b RDI: ffff9cb12492f680
lip 03 09:54:58 dragon kernel: RBP: ffff9cb12492f680 R08: ffff9cb1379eb2e0 R09: ffff9cb0ed2d9200
lip 03 09:54:58 dragon kernel: R10: ffffbd654210bcb0 R11: 0000000000000000 R12: 00000000014080c0
lip 03 09:54:58 dragon kernel: R13: ffffffffc0622a21 R14: ffff9cb1223d13f7 R15: ffff9cb12492f680
lip 03 09:54:58 dragon kernel: FS:  0000000000000000(0000) GS:ffff9cb1379c0000(0000) knlGS:0000000000000000
lip 03 09:54:58 dragon kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
lip 03 09:54:58 dragon kernel: CR2: 00007fc27e261218 CR3: 000000013220a004 CR4: 00000000003606e0
lip 03 09:54:58 dragon kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
lip 03 09:54:58 dragon kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
lip 03 09:54:58 dragon kernel: Call Trace:
lip 03 09:54:58 dragon kernel:  kmem_cache_alloc+0xb4/0x1c0
lip 03 09:54:58 dragon kernel:  ? nfsd4_free_file_rcu+0x20/0x20 [nfsd]
lip 03 09:54:58 dragon kernel:  nfs4_alloc_stid+0x21/0xa0 [nfsd]
lip 03 09:54:58 dragon kernel:  nfsd4_process_open2+0xb84/0x14c0 [nfsd]
lip 03 09:54:58 dragon kernel:  ? nfsd_permission+0x5a/0xf0 [nfsd]
lip 03 09:54:58 dragon kernel:  ? fh_verify+0x44b/0x600 [nfsd]
lip 03 09:54:58 dragon kernel:  ? nfsd4_open+0x2dd/0x700 [nfsd]
lip 03 09:54:58 dragon kernel:  nfsd4_open+0x2dd/0x700 [nfsd]
lip 03 09:54:58 dragon kernel:  nfsd4_proc_compound+0x4f9/0x6e0 [nfsd]
lip 03 09:54:58 dragon kernel:  nfsd_dispatch+0xf5/0x230 [nfsd]
lip 03 09:54:58 dragon kernel:  svc_process_common+0x4c3/0x720 [sunrpc]
lip 03 09:54:58 dragon kernel:  ? nfsd_destroy+0x60/0x60 [nfsd]
lip 03 09:54:58 dragon kernel:  svc_process+0xd7/0xf0 [sunrpc]
lip 03 09:54:58 dragon kernel:  nfsd+0xe3/0x150 [nfsd]
lip 03 09:54:58 dragon kernel:  kthread+0x113/0x130
lip 03 09:54:58 dragon kernel:  ? kthread_create_worker_on_cpu+0x70/0x70
lip 03 09:54:58 dragon kernel:  ret_from_fork+0x35/0x40
lip 03 09:54:58 dragon kernel: Code: 75 58 48 c7 c7 58 20 0d 8d e8 4b 7e ea ff eb 90 90 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 85 f6 74 13 8b 47 20 48 01 c6 <48> 33 36 48 33 b7 38 01 00 00 0f 18 0e f3 c
lip 03 09:54:58 dragon kernel: RIP: prefetch_freepointer+0x10/0x20 RSP: ffffbd654210bc48
lip 03 09:54:58 dragon kernel: ---[ end trace b4500dc7c46190b5 ]---
lip 03 09:55:03 dragon kernel: general protection fault: 0000 [#2] SMP PTI
lip 03 09:55:03 dragon kernel: Modules linked in: rpcsec_gss_krb5 tun intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass iTCO_wdt iTCO_vendor_support crct10dif_pclmul ipmi_ssif crc
lip 03 09:55:03 dragon kernel: CPU: 2 PID: 1072 Comm: nfsd Tainted: G      D           4.17.3-100.fc27.x86_64 #1
lip 03 09:55:03 dragon kernel: Hardware name: Supermicro Super Server/X11SSL-F, BIOS 2.0b 07/28/2017
lip 03 09:55:03 dragon kernel: RIP: 0010:prefetch_freepointer+0x10/0x20
lip 03 09:55:03 dragon kernel: RSP: 0018:ffffbd6542113c48 EFLAGS: 00010282
lip 03 09:55:03 dragon kernel: RAX: 0000000000000000 RBX: b7b21b2b5b5e9d9b RCX: 000000000000030c
lip 03 09:55:03 dragon kernel: RDX: 000000000000030b RSI: b7b21b2b5b5e9d9b RDI: ffff9cb12492f680
lip 03 09:55:03 dragon kernel: RBP: ffff9cb12492f680 R08: ffff9cb1378ab2e0 R09: ffff9cb11f76d0e0
lip 03 09:55:03 dragon kernel: R10: ffffbd6542113cb0 R11: 0000000000000000 R12: 00000000014080c0
lip 03 09:55:03 dragon kernel: R13: ffffffffc0622a21 R14: ffff9cb0f209f794 R15: ffff9cb12492f680
lip 03 09:55:03 dragon kernel: FS:  0000000000000000(0000) GS:ffff9cb137880000(0000) knlGS:0000000000000000
lip 03 09:55:03 dragon kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
lip 03 09:55:03 dragon kernel: CR2: 000055cee7a24db0 CR3: 000000013220a002 CR4: 00000000003606e0
lip 03 09:55:03 dragon kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
lip 03 09:55:03 dragon kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
lip 03 09:55:03 dragon kernel: Call Trace:
lip 03 09:55:03 dragon kernel:  kmem_cache_alloc+0xb4/0x1c0
lip 03 09:55:03 dragon kernel:  ? nfsd4_free_file_rcu+0x20/0x20 [nfsd]
lip 03 09:55:03 dragon kernel:  nfs4_alloc_stid+0x21/0xa0 [nfsd]
lip 03 09:55:03 dragon kernel:  nfsd4_process_open2+0xb84/0x14c0 [nfsd]
lip 03 09:55:03 dragon kernel:  ? nfsd_permission+0x5a/0xf0 [nfsd]
lip 03 09:55:03 dragon kernel:  ? fh_verify+0x44b/0x600 [nfsd]
lip 03 09:55:03 dragon kernel:  ? nfsd4_open+0x2dd/0x700 [nfsd]
lip 03 09:55:03 dragon kernel:  nfsd4_open+0x2dd/0x700 [nfsd]
lip 03 09:55:03 dragon kernel:  nfsd4_proc_compound+0x4f9/0x6e0 [nfsd]
lip 03 09:55:03 dragon kernel:  nfsd_dispatch+0xf5/0x230 [nfsd]
lip 03 09:55:03 dragon kernel:  svc_process_common+0x4c3/0x720 [sunrpc]
lip 03 09:55:03 dragon kernel:  ? nfsd_destroy+0x60/0x60 [nfsd]
lip 03 09:55:03 dragon kernel:  svc_process+0xd7/0xf0 [sunrpc]
lip 03 09:55:03 dragon kernel:  nfsd+0xe3/0x150 [nfsd]
lip 03 09:55:03 dragon kernel:  kthread+0x113/0x130
lip 03 09:55:03 dragon kernel:  ? kthread_create_worker_on_cpu+0x70/0x70
lip 03 09:55:03 dragon kernel:  ret_from_fork+0x35/0x40
lip 03 09:55:03 dragon kernel: Code: 75 58 48 c7 c7 58 20 0d 8d e8 4b 7e ea ff eb 90 90 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 85 f6 74 13 8b 47 20 48 01 c6 <48> 33 36 48 33 b7 38 01 00 00 0f 18 0e f3 c
lip 03 09:55:03 dragon kernel: RIP: prefetch_freepointer+0x10/0x20 RSP: ffffbd6542113c48
lip 03 09:55:03 dragon kernel: ---[ end trace b4500dc7c46190b6 ]---
lip 03 09:55:53 dragon kernel: general protection fault: 0000 [#3] SMP PTI
lip 03 09:55:53 dragon kernel: Modules linked in: rpcsec_gss_krb5 tun intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass iTCO_wdt iTCO_vendor_support crct10dif_pclmul ipmi_ssif crc
lip 03 09:55:53 dragon kernel: CPU: 2 PID: 1070 Comm: nfsd Tainted: G      D           4.17.3-100.fc27.x86_64 #1
lip 03 09:55:53 dragon kernel: Hardware name: Supermicro Super Server/X11SSL-F, BIOS 2.0b 07/28/2017
lip 03 09:55:53 dragon kernel: RIP: 0010:kmem_cache_alloc+0x82/0x1c0
lip 03 09:55:53 dragon kernel: RSP: 0018:ffffbd65420efc50 EFLAGS: 00010282
lip 03 09:55:53 dragon kernel: RAX: 0000000000000000 RBX: b7b21b2b5b5e9d9b RCX: ffff9cb11f76d9d0
lip 03 09:55:53 dragon kernel: RDX: 000000000000030c RSI: 00000000014080c0 RDI: 000000000002b2e0
lip 03 09:55:53 dragon kernel: RBP: ffff9cb12492f680 R08: ffff9cb1378ab2e0 R09: ffff9cb11f76d9e0
lip 03 09:55:53 dragon kernel: R10: ffffbd65420efcb0 R11: 0000000000000000 R12: 00000000014080c0
lip 03 09:55:53 dragon kernel: R13: ffffffffc0622a21 R14: b7b21b2b5b5e9d9b R15: ffff9cb12492f680
lip 03 09:55:53 dragon kernel: FS:  0000000000000000(0000) GS:ffff9cb137880000(0000) knlGS:0000000000000000
lip 03 09:55:53 dragon kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
lip 03 09:55:53 dragon kernel: CR2: 00007f2aec660653 CR3: 000000013220a003 CR4: 00000000003606e0
lip 03 09:55:53 dragon kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
lip 03 09:55:53 dragon kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
lip 03 09:55:53 dragon kernel: Call Trace:
lip 03 09:55:53 dragon kernel:  ? nfsd4_free_file_rcu+0x20/0x20 [nfsd]
lip 03 09:55:53 dragon kernel:  nfs4_alloc_stid+0x21/0xa0 [nfsd]
lip 03 09:55:53 dragon kernel:  nfsd4_process_open2+0xb84/0x14c0 [nfsd]
lip 03 09:55:53 dragon kernel:  ? nfsd_permission+0x5a/0xf0 [nfsd]
lip 03 09:55:53 dragon kernel:  ? fh_verify+0x44b/0x600 [nfsd]
lip 03 09:55:53 dragon kernel:  ? nfsd4_open+0x5e9/0x700 [nfsd]
lip 03 09:55:53 dragon kernel:  nfsd4_open+0x5e9/0x700 [nfsd]
lip 03 09:55:53 dragon kernel:  nfsd4_proc_compound+0x4f9/0x6e0 [nfsd]
lip 03 09:55:53 dragon kernel:  nfsd_dispatch+0xf5/0x230 [nfsd]
lip 03 09:55:53 dragon kernel:  svc_process_common+0x4c3/0x720 [sunrpc]
lip 03 09:55:53 dragon kernel:  ? nfsd_destroy+0x60/0x60 [nfsd]
lip 03 09:55:53 dragon kernel:  svc_process+0xd7/0xf0 [sunrpc]
lip 03 09:55:53 dragon kernel:  nfsd+0xe3/0x150 [nfsd]
lip 03 09:55:53 dragon kernel:  kthread+0x113/0x130
lip 03 09:55:53 dragon kernel:  ? kthread_create_worker_on_cpu+0x70/0x70
lip 03 09:55:53 dragon kernel:  ret_from_fork+0x35/0x40
lip 03 09:55:53 dragon kernel: Code: 50 08 65 4c 03 05 87 37 da 73 49 83 78 10 00 4d 8b 30 0f 84 06 01 00 00 4d 85 f6 0f 84 fd 00 00 00 41 8b 5f 20 49 8b 3f 4c 01 f3 <48> 33 1b 49 33 9f 38 01 00 00 40 f6 c7 0f 0
lip 03 09:55:53 dragon kernel: RIP: kmem_cache_alloc+0x82/0x1c0 RSP: ffffbd65420efc50
lip 03 09:55:53 dragon kernel: ---[ end trace b4500dc7c46190b7 ]---
lip 03 09:58:14 dragon rpc.mountd[1038]: authenticated unmount request from 10.0.2.191:764 for /home/users (/home/users)
lip 03 09:58:23 dragon kernel: general protection fault: 0000 [#4] SMP PTI
lip 03 09:58:23 dragon kernel: Modules linked in: rpcsec_gss_krb5 tun intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass iTCO_wdt iTCO_vendor_support crct10dif_pclmul ipmi_ssif crc
lip 03 09:58:23 dragon kernel: CPU: 7 PID: 1067 Comm: nfsd Tainted: G      D           4.17.3-100.fc27.x86_64 #1
lip 03 09:58:23 dragon kernel: Hardware name: Supermicro Super Server/X11SSL-F, BIOS 2.0b 07/28/2017
lip 03 09:58:23 dragon kernel: RIP: 0010:kmem_cache_alloc+0x82/0x1c0
lip 03 09:58:23 dragon kernel: RSP: 0018:ffffbd65420c7c50 EFLAGS: 00010286
lip 03 09:58:23 dragon kernel: RAX: 0000000000000000 RBX: 8dd3122ac327ff2b RCX: ffff9cb0ed2d8d70
lip 03 09:58:23 dragon kernel: RDX: 00000000000000ba RSI: 00000000014080c0 RDI: 000000000002b2e0
lip 03 09:58:23 dragon kernel: RBP: ffff9cb12492f680 R08: ffff9cb1379eb2e0 R09: ffff9cb0ed2d8d80
lip 03 09:58:23 dragon kernel: R10: ffffbd65420c7cb0 R11: 0000000000000000 R12: 00000000014080c0
lip 03 09:58:23 dragon kernel: R13: ffffffffc0622a21 R14: 8dd3122ac327ff2b R15: ffff9cb12492f680
lip 03 09:58:23 dragon kernel: FS:  0000000000000000(0000) GS:ffff9cb1379c0000(0000) knlGS:0000000000000000
lip 03 09:58:23 dragon kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
lip 03 09:58:23 dragon kernel: CR2: 00007f2aec660653 CR3: 000000013220a003 CR4: 00000000003606e0
lip 03 09:58:23 dragon kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
lip 03 09:58:23 dragon kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
lip 03 09:58:23 dragon kernel: Call Trace:
lip 03 09:58:23 dragon kernel:  ? nfsd4_free_file_rcu+0x20/0x20 [nfsd]
lip 03 09:58:23 dragon kernel:  nfs4_alloc_stid+0x21/0xa0 [nfsd]
lip 03 09:58:23 dragon kernel:  nfsd4_process_open2+0xb84/0x14c0 [nfsd]
lip 03 09:58:23 dragon kernel:  ? nfsd_permission+0x5a/0xf0 [nfsd]
lip 03 09:58:23 dragon kernel:  ? fh_verify+0x44b/0x600 [nfsd]
lip 03 09:58:23 dragon kernel:  ? nfsd4_open+0x2dd/0x700 [nfsd]
lip 03 09:58:23 dragon kernel:  nfsd4_open+0x2dd/0x700 [nfsd]
lip 03 09:58:23 dragon kernel:  nfsd4_proc_compound+0x4f9/0x6e0 [nfsd]
lip 03 09:58:23 dragon kernel:  nfsd_dispatch+0xf5/0x230 [nfsd]
lip 03 09:58:23 dragon kernel:  svc_process_common+0x4c3/0x720 [sunrpc]
lip 03 09:58:23 dragon kernel:  ? nfsd_destroy+0x60/0x60 [nfsd]
lip 03 09:58:23 dragon kernel:  svc_process+0xd7/0xf0 [sunrpc]
lip 03 09:58:23 dragon kernel:  nfsd+0xe3/0x150 [nfsd]
lip 03 09:58:23 dragon kernel:  kthread+0x113/0x130
lip 03 09:58:23 dragon kernel:  ? kthread_create_worker_on_cpu+0x70/0x70
lip 03 09:58:23 dragon kernel:  ret_from_fork+0x35/0x40
lip 03 09:58:23 dragon kernel: Code: 50 08 65 4c 03 05 87 37 da 73 49 83 78 10 00 4d 8b 30 0f 84 06 01 00 00 4d 85 f6 0f 84 fd 00 00 00 41 8b 5f 20 49 8b 3f 4c 01 f3 <48> 33 1b 49 33 9f 38 01 00 00 40 f6 c7 0f 0
lip 03 09:58:23 dragon kernel: RIP: kmem_cache_alloc+0x82/0x1c0 RSP: ffffbd65420c7c50
lip 03 09:58:23 dragon kernel: ---[ end trace b4500dc7c46190b8 ]---
lip 03 09:59:54 dragon kernel: general protection fault: 0000 [#5] SMP PTI
lip 03 09:59:54 dragon kernel: Modules linked in: rpcsec_gss_krb5 tun intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass iTCO_wdt iTCO_vendor_support crct10dif_pclmul ipmi_ssif crc
lip 03 09:59:54 dragon kernel: CPU: 2 PID: 1065 Comm: nfsd Tainted: G      D           4.17.3-100.fc27.x86_64 #1
lip 03 09:59:54 dragon kernel: Hardware name: Supermicro Super Server/X11SSL-F, BIOS 2.0b 07/28/2017
lip 03 09:59:54 dragon kernel: RIP: 0010:kmem_cache_alloc+0x82/0x1c0
lip 03 09:59:54 dragon kernel: RSP: 0018:ffffbd6542417c50 EFLAGS: 00010282
lip 03 09:59:54 dragon kernel: RAX: 0000000000000000 RBX: b7b21b2b5b5e9d9b RCX: ffff9cb11f76ca10
lip 03 09:59:54 dragon kernel: RDX: 000000000000030c RSI: 00000000014080c0 RDI: 000000000002b2e0
lip 03 09:59:54 dragon kernel: RBP: ffff9cb12492f680 R08: ffff9cb1378ab2e0 R09: ffff9cb11f76ca20
lip 03 09:59:54 dragon kernel: R10: ffffbd6542417cb0 R11: 0000000000000000 R12: 00000000014080c0
lip 03 09:59:54 dragon kernel: R13: ffffffffc0622a21 R14: b7b21b2b5b5e9d9b R15: ffff9cb12492f680
lip 03 09:59:54 dragon kernel: FS:  0000000000000000(0000) GS:ffff9cb137880000(0000) knlGS:0000000000000000
lip 03 09:59:54 dragon kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
lip 03 09:59:54 dragon kernel: CR2: 000055cee7a06240 CR3: 000000013220a001 CR4: 00000000003606e0
lip 03 09:59:54 dragon kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
lip 03 09:59:54 dragon kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
lip 03 09:59:54 dragon kernel: Call Trace:
lip 03 09:59:54 dragon kernel:  ? nfsd4_free_file_rcu+0x20/0x20 [nfsd]
lip 03 09:59:54 dragon kernel:  nfs4_alloc_stid+0x21/0xa0 [nfsd]
lip 03 09:59:54 dragon kernel:  nfsd4_process_open2+0xb84/0x14c0 [nfsd]
lip 03 09:59:54 dragon kernel:  ? nfsd_permission+0x5a/0xf0 [nfsd]
lip 03 09:59:54 dragon kernel:  ? fh_verify+0x44b/0x600 [nfsd]
lip 03 09:59:54 dragon kernel:  ? nfsd4_open+0x2dd/0x700 [nfsd]
lip 03 09:59:54 dragon kernel:  nfsd4_open+0x2dd/0x700 [nfsd]
lip 03 09:59:54 dragon kernel:  nfsd4_proc_compound+0x4f9/0x6e0 [nfsd]
lip 03 09:59:54 dragon kernel:  nfsd_dispatch+0xf5/0x230 [nfsd]
lip 03 09:59:54 dragon kernel:  svc_process_common+0x4c3/0x720 [sunrpc]
lip 03 09:59:54 dragon kernel:  ? nfsd_destroy+0x60/0x60 [nfsd]
lip 03 09:59:54 dragon kernel:  svc_process+0xd7/0xf0 [sunrpc]
lip 03 09:59:54 dragon kernel:  nfsd+0xe3/0x150 [nfsd]
lip 03 09:59:54 dragon kernel:  kthread+0x113/0x130
lip 03 09:59:54 dragon kernel:  ? kthread_create_worker_on_cpu+0x70/0x70
lip 03 09:59:54 dragon kernel:  ret_from_fork+0x35/0x40
lip 03 09:59:54 dragon kernel: Code: 50 08 65 4c 03 05 87 37 da 73 49 83 78 10 00 4d 8b 30 0f 84 06 01 00 00 4d 85 f6 0f 84 fd 00 00 00 41 8b 5f 20 49 8b 3f 4c 01 f3 <48> 33 1b 49 33 9f 38 01 00 00 40 f6 c7 0f 0
lip 03 09:59:54 dragon kernel: RIP: kmem_cache_alloc+0x82/0x1c0 RSP: ffffbd6542417c50
lip 03 09:59:54 dragon kernel: ---[ end trace b4500dc7c46190b9 ]---
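
One detail worth pulling out of the traces above: in every oops, the register used as the load address at the faulting instruction (RSI in prefetch_freepointer, RBX in kmem_cache_alloc) holds a value such as 8dd3122ac327ff2b, which is not a canonical x86_64 address. Dereferencing a non-canonical address raises a general protection fault rather than a page fault, which would be consistent with a corrupted slab freelist pointer. The following is only an illustrative Python sketch, assuming 48-bit virtual addressing; the helper names and the trimmed sample are not part of the original report.

```python
import re

# Lines copied from oops #1 above, with the journal prefix trimmed.
SAMPLE = """\
general protection fault: 0000 [#1] SMP PTI
RIP: 0010:prefetch_freepointer+0x10/0x20
RAX: 0000000000000000 RBX: 8dd3122ac327ff2b RCX: 00000000000000ba
RDX: 00000000000000b9 RSI: 8dd3122ac327ff2b RDI: ffff9cb12492f680
"""

# Matches "RSI: 8dd3122ac327ff2b" style pairs; "RIP: 0010:..." does not match
# because its value is not a bare 16-digit hex number.
REG_RE = re.compile(r"\b(R[A-Z0-9]{1,2}):\s*([0-9a-f]{16})\b")


def is_canonical(addr):
    """Assumes 48-bit virtual addresses: bits 63..47 must all be equal."""
    top = addr >> 47
    return top == 0 or top == 0x1FFFF


def parse_registers(text):
    """Hypothetical helper: map register names to integer values."""
    return {name: int(value, 16) for name, value in REG_RE.findall(text)}


regs = parse_registers(SAMPLE)
for name in ("RSI", "RBX", "RDI"):
    state = "canonical" if is_canonical(regs[name]) else "NON-canonical"
    print("%s = %016x (%s)" % (name, regs[name], state))
```

The same check can be run against the other oopses by pasting their register lines into SAMPLE.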

Comment 1 Ruben Vermeersch 2018-07-04 13:58:29 UTC
Same thing with slightly different exports, so probably not related to that:

/home/ruben/Projects/Ticketmatic/devmatic2 192.168.39.0/24(rw,no_subtree_check,async,all_squash,anonuid=1000,anongid=1000)

Comment 2 Ruben Vermeersch 2018-07-04 13:59:47 UTC
This is on Fedora 28 for me, and started only recently (in the past week or two). Used to work fine so definitely a regression.

Comment 3 Ruben Vermeersch 2018-07-04 14:02:45 UTC
In my case: kernel 4.17.3-200.fc28.x86_64

Comment 4 z117 2018-07-04 18:30:45 UTC
Got the same problem with kernel-4.17.2-200.fc28.x86_64. kernel-4.16.16-300.fc28.x86_64 works normally. I haven't tested kernel-4.17.3-200.fc28.x86_64 yet, but the previous comment shows it is also affected.

Comment 5 Ruben Vermeersch 2018-07-05 07:13:35 UTC
Booted back into 4.16.16-300.fc28.x86_64 and can confirm that this is indeed a 4.17 regression.

Be warned that these crashes can cause shutdown / suspend hangs. A failed suspend almost caused my laptop to fry itself in a backpack. Definitely not as harmless as it looks.

Comment 6 Frank Ch. Eigler 2018-07-08 22:38:25 UTC
*** Bug 1598229 has been marked as a duplicate of this bug. ***

Comment 7 Adam Williamson 2018-07-09 16:53:50 UTC
Also discussed in the forums: https://forums.fedoraforum.org/showthread.php?318645-houston-we-have-a-problem

Comment 8 Norman Gaywood 2018-07-11 00:27:42 UTC
Is anyone working on this? The only thing I can find that might be related is this kernel mailing list thread:

https://lkml.org/lkml/2018/7/8/183

Comment 9 Norman Gaywood 2018-07-12 01:41:06 UTC
kernel 4.17.4-200.fc28 in updates testing has a number of NFS fixes.
Has anyone tried this?

Only one of my production systems is busy enough to hit this bug and it's running 4.16 as a work-around.

Changes in 4.17.4:
https://cdn.kernel.org/pub/linux/kernel/v4.x/ChangeLog-4.17.4

Comment 10 z117 2018-07-12 15:25:35 UTC
kernel-4.17.4-200.fc28.x86_64 seems fine.

Comment 11 Norman Gaywood 2018-07-16 02:09:33 UTC
Just had another one in 4.17.5-200:
Jul 16 10:55:34 turing.une.edu.au kernel: general protection fault: 0000 [#1] SMP PTI
Jul 16 10:55:34 turing.une.edu.au kernel: Modules linked in: fuse tcp_diag inet_diag unix_diag rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute brid>
Jul 16 10:55:34 turing.une.edu.au kernel:  crc32c_intel qemu_fw_cfg virtio_console virtio_blk virtio_net ata_generic pata_acpi
Jul 16 10:55:34 turing.une.edu.au kernel: CPU: 9 PID: 1841 Comm: nfsd Not tainted 4.17.5-200.fc28.x86_64 #1
Jul 16 10:55:34 turing.une.edu.au kernel: Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
Jul 16 10:55:34 turing.une.edu.au kernel: RIP: 0010:prefetch_freepointer+0x10/0x20
Jul 16 10:55:34 turing.une.edu.au kernel: RSP: 0018:ffffb4fbc9edbc58 EFLAGS: 00010282
Jul 16 10:55:34 turing.une.edu.au kernel: RAX: 0000000000000000 RBX: ed8d1d11520786ab RCX: 0000000000009135
Jul 16 10:55:34 turing.une.edu.au kernel: RDX: 0000000000009134 RSI: ed8d1d11520786ab RDI: ffff882c31470480
Jul 16 10:55:34 turing.une.edu.au kernel: RBP: ffff882c31470480 R08: ffffd4fbbfc452d0 R09: 0000000000000004
Jul 16 10:55:34 turing.une.edu.au kernel: R10: 0000000000000000 R11: 000000000000000e R12: 00000000014080c0
Jul 16 10:55:34 turing.une.edu.au kernel: R13: ffffffffc07d3ad1 R14: ffff882aa39f9d01 R15: ffff882c31470480
Jul 16 10:55:34 turing.une.edu.au kernel: FS:  0000000000000000(0000) GS:ffff882c3f440000(0000) knlGS:0000000000000000
Jul 16 10:55:34 turing.une.edu.au kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jul 16 10:55:34 turing.une.edu.au kernel: CR2: 00007f25252df8a0 CR3: 0000000e1620a005 CR4: 00000000001606e0
Jul 16 10:55:34 turing.une.edu.au kernel: Call Trace:
Jul 16 10:55:34 turing.une.edu.au kernel:  kmem_cache_alloc+0xb4/0x1d0
Jul 16 10:55:34 turing.une.edu.au kernel:  ? nfsd4_free_file_rcu+0x20/0x20 [nfsd]
Jul 16 10:55:34 turing.une.edu.au kernel:  nfs4_alloc_stid+0x21/0xa0 [nfsd]
Jul 16 10:55:34 turing.une.edu.au kernel:  nfsd4_process_open2+0x1048/0x1360 [nfsd]
Jul 16 10:55:34 turing.une.edu.au kernel:  ? nfsd_permission+0x63/0xe0 [nfsd]
Jul 16 10:55:34 turing.une.edu.au kernel:  ? fh_verify+0x17a/0x5b0 [nfsd]
Jul 16 10:55:34 turing.une.edu.au kernel:  ? nfsd4_process_open1+0x139/0x420 [nfsd]
Jul 16 10:55:34 turing.une.edu.au kernel:  nfsd4_open+0x2b1/0x6b0 [nfsd]
Jul 16 10:55:34 turing.une.edu.au kernel:  nfsd4_proc_compound+0x33e/0x640 [nfsd]
Jul 16 10:55:34 turing.une.edu.au kernel:  nfsd_dispatch+0x9e/0x210 [nfsd]
Jul 16 10:55:34 turing.une.edu.au kernel:  svc_process_common+0x46e/0x6c0 [sunrpc]
Jul 16 10:55:34 turing.une.edu.au kernel:  ? nfsd_destroy+0x50/0x50 [nfsd]
Jul 16 10:55:34 turing.une.edu.au kernel:  svc_process+0xb7/0xf0 [sunrpc]
Jul 16 10:55:34 turing.une.edu.au kernel:  nfsd+0xe3/0x140 [nfsd]
Jul 16 10:55:34 turing.une.edu.au kernel:  kthread+0x112/0x130
Jul 16 10:55:34 turing.une.edu.au kernel:  ? kthread_create_worker_on_cpu+0x70/0x70
Jul 16 10:55:34 turing.une.edu.au kernel:  ret_from_fork+0x35/0x40
Jul 16 10:55:34 turing.une.edu.au kernel: Code: 8f 89 d3 e8 c3 c1 67 00 85 c0 0f 85 c1 77 00 00 48 83 c4 08 5b 5d 41 5c 41 5d c3 0f 1f 44 00 00 48 85 f6 74 13 8b 47 20 48 01 c6 <48> 33 36 48 33 b7 38 01 00 00 0f 18 0e c3 66 90 0f 1f 44 00 00 
Jul 16 10:55:34 turing.une.edu.au kernel: RIP: prefetch_freepointer+0x10/0x20 RSP: ffffb4fbc9edbc58
Jul 16 10:55:34 turing.une.edu.au kernel: ---[ end trace a4004dc629aa66ff ]---

Comment 12 H.J. Lu 2018-07-20 14:43:12 UTC
I also got this with 4.17.7-200.fc28.x86_64:

[160404.857873] general protection fault: 0000 [#1] SMP PTI
[160404.857924] Modules linked in: rpcsec_gss_krb5 intel_rapl sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel intel_cstate intel_uncore ipmi_ssif intel_rapl_perf ses iTCO_wdt iTCO_vendor_support enclosure ipmi_si hpilo hpwdt ipmi_devintf ioatdma i2c_i801 lpc_ich dca shpchp ipmi_msghandler wmi acpi_tad acpi_power_meter nfsd auth_rpcgss nfs_acl lockd grace sunrpc uas usb_storage mgag200 i2c_algo_bit drm_kms_helper ttm drm crc32c_intel serio_raw tg3 hpsa scsi_transport_sas
[160404.858191] CPU: 18 PID: 1856 Comm: nfsd Not tainted 4.17.7-200.fc28.x86_64 #1
[160404.858231] Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360 Gen9, BIOS P89 12/27/2015
[160404.858283] RIP: 0010:prefetch_freepointer+0x10/0x20
[160404.858313] RSP: 0000:ffffa9da89337c58 EFLAGS: 00010206
[160404.858344] RAX: 0000000000000000 RBX: 22b88d6812101a17 RCX: 000000000000013a
[160404.858387] RDX: 0000000000000139 RSI: 22b88d6812101a17 RDI: ffff8fa5d539ea00
[160404.858431] RBP: ffff8fa5d539ea00 R08: ffffc9da7f7014a0 R09: 0000000000000004
[160404.858473] R10: 0000000000000000 R11: 0000000000000032 R12: 00000000014080c0
[160404.858517] R13: ffffffffc05c2ac1 R14: ffff8fa61707e911 R15: ffff8fa5d539ea00
[160404.858560] FS:  0000000000000000(0000) GS:ffff8faddef00000(0000) knlGS:0000000000000000
[160404.858604] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[160404.858636] CR2: 0000556bd740a378 CR3: 000000084920a001 CR4: 00000000003606e0
[160404.858676] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[160404.858720] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[160404.858762] Call Trace:
[160404.858792]  kmem_cache_alloc+0xb4/0x1d0
[160404.858864]  ? nfsd4_free_file_rcu+0x20/0x20 [nfsd]
[160404.858907]  nfs4_alloc_stid+0x21/0xa0 [nfsd]
[160404.858951]  nfsd4_process_open2+0x1048/0x1360 [nfsd]
[160404.858996]  ? nfsd_permission+0x63/0xe0 [nfsd]
[160404.859033]  ? fh_verify+0x17a/0x5b0 [nfsd]
[160404.859074]  ? nfsd4_process_open1+0x139/0x420 [nfsd]
[160404.859114]  nfsd4_open+0x2b1/0x6b0 [nfsd]
[160404.859145]  nfsd4_proc_compound+0x33e/0x640 [nfsd]
[160404.859175]  ? nfsd4_read_rsize+0x20/0x20 [nfsd]
[160404.859200]  nfsd_dispatch+0x9e/0x210 [nfsd]
[160404.859264]  svc_process_common+0x46e/0x6c0 [sunrpc]
[160404.859299]  ? nfsd_destroy+0x50/0x50 [nfsd]
[160404.859328]  svc_process+0xb7/0xf0 [sunrpc]
[160404.859357]  nfsd+0xe3/0x140 [nfsd]
[160404.859390]  kthread+0x112/0x130
[160404.859411]  ? kthread_create_worker_on_cpu+0x70/0x70
[160404.859451]  ret_from_fork+0x35/0x40
[160404.859473] Code: 92 89 d3 e8 73 f6 67 00 85 c0 0f 85 c1 77 00 00 48 83 c4 08 5b 5d 41 5c 41 5d c3 0f 1f 44 00 00 48 85 f6 74 13 8b 47 20 48 01 c6 <48> 33 36 48 33 b7 38 01 00 00 0f 18 0e c3 66 90 0f 1f 44 00 00 
[160404.859580] RIP: prefetch_freepointer+0x10/0x20 RSP: ffffa9da89337c58
[160404.859641] ---[ end trace bb41997ea65e9012 ]---

Comment 13 James 2018-07-20 19:09:49 UTC
Seen here too, with the same backtrace, on 4.17.6-200.fc28.x86_64.

Comment 14 Norman Gaywood 2018-07-20 21:52:08 UTC
4.17.6-200 is known to have the problem.
4.17.7-200 was hoped to fix the problem.

So comment #12 is worrisome :-(

Comment 15 Laura Abbott 2018-07-20 23:57:36 UTC
4.17.7 was meant to address a different problem. The two backtraces weren't the same, so I'm not too surprised the NFS issue is still there. I was hoping something would trickle into stable, but failing that, it would be helpful to try the debug kernel and boot with slub_debug=PFZU nokaslr on the kernel command line. slub_debug enables additional checks in the slab memory allocator, and nokaslr turns off kernel address space randomization (one of the maintainers requested that to make debugging easier).
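
For readers unfamiliar with the flag letters in slub_debug=PFZU, here is a minimal Python sketch that decodes them. The letter meanings follow the kernel's SLUB documentation; the helper name describe_slub_debug and the lookup table are made up for illustration and cover only the four letters used in this comment.

```python
# Meanings of the slub_debug flag letters used in this comment, per the
# kernel's SLUB documentation. Other letters exist; they are not listed here.
SLUB_DEBUG_FLAGS = {
    "P": "poisoning (fill objects with a known pattern)",
    "F": "sanity checks",
    "Z": "red zoning (guard bytes around objects)",
    "U": "user tracking (record who allocated/freed)",
}


def describe_slub_debug(option):
    """Hypothetical helper: explain each letter in 'slub_debug=FLAGS[,caches]'."""
    name, _, value = option.partition("=")
    if name != "slub_debug":
        raise ValueError("not a slub_debug option: %r" % option)
    # An optional comma-separated cache list may follow; it is ignored here.
    flags, _, _caches = value.partition(",")
    return [
        "%s: %s" % (letter, SLUB_DEBUG_FLAGS.get(letter, "not covered by this sketch"))
        for letter in flags
    ]


for line in describe_slub_debug("slub_debug=PFZU"):
    print(line)
```

With these checks enabled, the allocator is more likely to report the corruption close to where it happens instead of at a later, unrelated allocation.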

Comment 16 Thomas Clark 2018-07-21 16:36:09 UTC
I'm not sure how this is related, but I get tons of the following errors on any kernel after 4.17.3, and I think (but not positive yet) that the errors only occur when writing to an NFS share.  The errors do not happen with 4.16 kernels.  The errors are sometimes, but not always, accompanied by a hard crash that requires a hard reset.  The crash leaves no evidence in logs.

Interestingly the error is for ata4.00 on multiple machines, one of which only has one drive, but one of which has eight.

[Sat Jul 21 09:10:04 2018] ata4.00: status: { DRDY }
[Sat Jul 21 09:10:04 2018] ata4.00: failed command: WRITE FPDMA QUEUED
[Sat Jul 21 09:10:04 2018] ata4.00: cmd 61/00:90:00:d0:e3/0a:00:03:00:00/40 tag 18 ncq dma 1310720 ou
                                    res 40/00:a0:00:1e:db/00:00:03:00:00/40 Emask 0x10 (ATA bus error)
[Sat Jul 21 09:10:04 2018] ata4.00: status: { DRDY }
[Sat Jul 21 09:10:04 2018] ata4.00: failed command: WRITE FPDMA QUEUED
[Sat Jul 21 09:10:04 2018] ata4.00: cmd 61/00:98:00:ca:e3/06:00:03:00:00/40 tag 19 ncq dma 786432 out
                                    res 40/00:a0:00:1e:db/00:00:03:00:00/40 Emask 0x10 (ATA bus error)
[Sat Jul 21 09:10:04 2018] ata4.00: status: { DRDY }
[Sat Jul 21 09:10:04 2018] ata4.00: failed command: WRITE FPDMA QUEUED
[Sat Jul 21 09:10:04 2018] ata4.00: cmd 61/00:a0:00:1e:db/0a:00:03:00:00/40 tag 20 ncq dma 1310720 ou
                                    res 40/00:a0:00:1e:db/00:00:03:00:00/40 Emask 0x10 (ATA bus error)
[Sat Jul 21 09:10:04 2018] ata4.00: status: { DRDY }
[Sat Jul 21 09:10:04 2018] ata4: hard resetting link
[Sat Jul 21 09:10:05 2018] ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
[Sat Jul 21 09:10:05 2018] ata4.00: configured for UDMA/133
[Sat Jul 21 09:10:05 2018] ata4: EH complete

Comment 17 Terry Barnaby 2018-07-23 08:52:22 UTC
Just another one. We are running Fedora 27 with kernel-4.17.7-100.fc27.x86_64, and it is exhibiting this bug: after every reboot there are immediate kernel faults and no NFS. We went back to kernel-4.17.2-100.fc27.x86_64 and that showed the same issue (at least while the same clients were connected; after powering the clients down and up after the server, it seems to be OK, at least for a bit).

Comment 18 Frank Ch. Eigler 2018-07-27 15:59:32 UTC
Given that nfs4 is the default, and that it is causing panics, I'm tempted to start karma=-1'ing 4.17 series kernels.

Comment 19 Laura Abbott 2018-07-27 18:40:50 UTC
https://koji.fedoraproject.org/koji/taskinfo?taskID=28651019 this is a scratch build which has a potential fix. It's been in upstream since 4.18-rc1 but didn't make it to stable yet

Comment 20 Adam Williamson 2018-07-27 21:33:08 UTC
"Given that nfs4 is the default, and that it is causing panics, I'm tempted to start karma=-1'ing 4.17 series kernels."

That would be useless and, if anything, counter-productive, as 4.17 is already stable for F27 and F28. All you'd be doing is preventing fixes within the 4.17 series to other areas from going out, to the disadvantage of others.

Comment 21 James 2018-07-28 17:04:29 UTC
Hmm... NFS may be something, but I've just seen a hard lock on .6 using the ganesha server. Right in the middle of a dnf update... anyway trying .9 now; if this messes up I'm back to .3.

Comment 22 Thomas Clark 2018-07-28 18:07:20 UTC
(In reply to Thomas Clark from comment #16)

More info on my post:  The disk errors were related, but only because of increased load secondary to NFS activity.  I ended up replacing the drive, even though it passed smartctl, and the ata errors stopped.  However, I still get crashes on 4.17 kernels within a few hours of NFS activity starting.  Performing backups over NFS is almost a guarantee of an immediate crash.

Comment 23 Thomas Clark 2018-07-29 02:21:17 UTC
(In reply to Laura Abbott from comment #19)
> https://koji.fedoraproject.org/koji/taskinfo?taskID=28651019 this is a
> scratch build which has a potential fix. It's been in upstream since
> 4.18-rc1 but didn't make it to stable yet

Does anybody know if this fix is in the 4.17.10-200.fc28 release that is now in updates-testing?

Comment 24 Thomas Clark 2018-07-30 12:22:18 UTC
(In reply to Thomas Clark from comment #23)
> (In reply to Laura Abbott from comment #19)
> > https://koji.fedoraproject.org/koji/taskinfo?taskID=28651019 this is a
> > scratch build which has a potential fix. It's been in upstream since
> > 4.18-rc1 but didn't make it to stable yet
> 
> Does anybody know if this fix is in the 4.17.10-200.fc28 release that is now
> in updates-testing?

I have successfully run 4.17.10-200.fc28 for 24 hours with no errors.

Comment 25 Gregory Lee Bartholomew 2018-07-30 14:19:22 UTC
I've been seeing this on Fedora 27 too.  It has been occurring for the last several weeks.  In my case, the NFS server will run anywhere from about 12 hours to about 4 or 5 days before crashing.  Thanks to whoever pointed out that the problem started with kernel 4.17.  I will try rolling back to a 4.16 kernel to see if that solves the problem.  In the meantime, below is a copy of the latest crash from my dmesg logs, just in case there is something in it that helps someone narrow down the problem.
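
Because a dmesg capture like this one repeats the same fault many times, it can help to tally the faults and the functions they hit before reading the traces in full. The following is a minimal Python sketch, not an established tool: the function name is hypothetical, and the regular expressions assume the line formats seen in this report ("general protection fault: 0000 [#N]" and "RIP: 0010:symbol+0xOFF/0xLEN").

```python
import re
from collections import Counter
from typing import Tuple

# First RIP line of each oops, e.g. "RIP: 0010:prefetch_freepointer+0x10/0x20".
# The trailing "RIP: symbol+... RSP: ..." summary line has no segment prefix
# and is deliberately not matched, so each oops is counted once.
RIP_RE = re.compile(r"RIP: [0-9a-f]{4}:(?P<symbol>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+")
FAULT_RE = re.compile(r"general protection fault: \d+ \[#(?P<seq>\d+)\]")


def summarize_oopses(log_text: str) -> Tuple[int, Counter]:
    """Return (number of GP faults, Counter of faulting RIP symbols)."""
    faults = 0
    rip_symbols = Counter()
    for line in log_text.splitlines():
        if FAULT_RE.search(line):
            faults += 1
        match = RIP_RE.search(line)
        if match:
            rip_symbols[match.group("symbol")] += 1
    return faults, rip_symbols
```

Feeding it the text of a saved dmesg file returns the fault count and a per-symbol tally, which makes it easy to see whether every fault lands in the same allocator path.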

[Mon Jul 30 05:01:13 2018] general protection fault: 0000 [#1] SMP PTI
[Mon Jul 30 05:01:13 2018] Modules linked in: rpcsec_gss_krb5 binfmt_misc nfsd nfs_acl lockd grace aoe xt_geoip(OE) nf_nat_tftp nf_conntrack_tftp nf_conntrack_netbios_ns nf_conntrack_broadcast xt_CT ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack mpt3sas raid_class ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c iptable_mangle iptable_raw iptable_security ebtable_filter ebtables ip6table_filter ip6_tables radeon i2c_algo_bit ttm ibmpex coretemp gpio_ich iTCO_wdt iTCO_vendor_support drm_kms_helper ipmi_ssif kvm_intel ibmaem drm kvm bnx2 lpc_ich ses i5000_edac ipmi_si e1000e enclosure irqbypass scsi_transport_sas
[Mon Jul 30 05:01:13 2018]  i2c_i801 ipmi_devintf i5k_amb shpchp ipmi_msghandler auth_rpcgss sunrpc zfs(POE) zunicode(POE) zavl(POE) icp(POE) zcommon(POE) znvpair(POE) spl(OE) megaraid_sas serio_raw ata_generic pata_acpi
[Mon Jul 30 05:01:13 2018] CPU: 2 PID: 3947 Comm: nfsd Tainted: P           OE     4.17.7-100.fc27.x86_64 #1
[Mon Jul 30 05:01:13 2018] Hardware name: IBM IBM System x3650 -[7979ac1]-/System Planar, BIOS -[GGE149AUS-1.19]- 02/11/2011
[Mon Jul 30 05:01:13 2018] RIP: 0010:prefetch_freepointer+0x10/0x20
[Mon Jul 30 05:01:13 2018] RSP: 0018:ffffb65c94707c48 EFLAGS: 00010202
[Mon Jul 30 05:01:13 2018] RAX: 0000000000000000 RBX: 0e342a5288524ea1 RCX: 000000000003d74e
[Mon Jul 30 05:01:13 2018] RDX: 000000000003d74d RSI: 0e342a5288524ea1 RDI: ffff920ac2dae600
[Mon Jul 30 05:01:13 2018] RBP: ffff920ac2dae600 R08: ffff920affcac620 R09: ffff920a3cf079e0
[Mon Jul 30 05:01:13 2018] R10: ffffb65c94707cb0 R11: 0000000000000000 R12: 00000000014080c0
[Mon Jul 30 05:01:13 2018] R13: ffffffffc0d69a21 R14: ffff920a6fc53791 R15: ffff920ac2dae600
[Mon Jul 30 05:01:13 2018] FS:  0000000000000000(0000) GS:ffff920affc80000(0000) knlGS:0000000000000000
[Mon Jul 30 05:01:13 2018] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Mon Jul 30 05:01:13 2018] CR2: 00007f047152bd58 CR3: 00000008f020a000 CR4: 00000000000006e0
[Mon Jul 30 05:01:13 2018] Call Trace:
[Mon Jul 30 05:01:13 2018]  kmem_cache_alloc+0xb4/0x1c0
[Mon Jul 30 05:01:13 2018]  ? nfsd4_free_file_rcu+0x20/0x20 [nfsd]
[Mon Jul 30 05:01:13 2018]  nfs4_alloc_stid+0x21/0xa0 [nfsd]
[Mon Jul 30 05:01:13 2018]  nfsd4_process_open2+0xb84/0x14c0 [nfsd]
[Mon Jul 30 05:01:13 2018]  ? nfsd_permission+0x5a/0xf0 [nfsd]
[Mon Jul 30 05:01:13 2018]  ? fh_verify+0x44b/0x600 [nfsd]
[Mon Jul 30 05:01:13 2018]  ? nfsd4_open+0x2dd/0x700 [nfsd]
[Mon Jul 30 05:01:13 2018]  nfsd4_open+0x2dd/0x700 [nfsd]
[Mon Jul 30 05:01:13 2018]  nfsd4_proc_compound+0x4f9/0x6e0 [nfsd]
[Mon Jul 30 05:01:13 2018]  nfsd_dispatch+0xf5/0x230 [nfsd]
[Mon Jul 30 05:01:13 2018]  svc_process_common+0x4c3/0x720 [sunrpc]
[Mon Jul 30 05:01:13 2018]  ? nfsd_destroy+0x60/0x60 [nfsd]
[Mon Jul 30 05:01:13 2018]  svc_process+0xd7/0xf0 [sunrpc]
[Mon Jul 30 05:01:13 2018]  nfsd+0xe3/0x150 [nfsd]
[Mon Jul 30 05:01:13 2018]  kthread+0x113/0x130
[Mon Jul 30 05:01:13 2018]  ? kthread_create_worker_on_cpu+0x70/0x70
[Mon Jul 30 05:01:13 2018]  ret_from_fork+0x35/0x40
[Mon Jul 30 05:01:13 2018] Code: 75 58 48 c7 c7 00 2a 0d 92 e8 9b 7c ea ff eb 90 90 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 48 85 f6 74 13 8b 47 20 48 01 c6 <48> 33 36 48 33 b7 38 01 00 00 0f 18 0e f3 c3 90 66 66 66 66 90 
[Mon Jul 30 05:01:13 2018] RIP: prefetch_freepointer+0x10/0x20 RSP: ffffb65c94707c48
[Mon Jul 30 05:01:13 2018] ---[ end trace c12ab55ea9b2c7f1 ]---
[Mon Jul 30 05:06:06 2018] general protection fault: 0000 [#2] SMP PTI
[Mon Jul 30 05:06:06 2018] Modules linked in: rpcsec_gss_krb5 binfmt_misc nfsd nfs_acl lockd grace aoe xt_geoip(OE) nf_nat_tftp nf_conntrack_tftp nf_conntrack_netbios_ns nf_conntrack_broadcast xt_CT ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack mpt3sas raid_class ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c iptable_mangle iptable_raw iptable_security ebtable_filter ebtables ip6table_filter ip6_tables radeon i2c_algo_bit ttm ibmpex coretemp gpio_ich iTCO_wdt iTCO_vendor_support drm_kms_helper ipmi_ssif kvm_intel ibmaem drm kvm bnx2 lpc_ich ses i5000_edac ipmi_si e1000e enclosure irqbypass scsi_transport_sas
[Mon Jul 30 05:06:06 2018]  i2c_i801 ipmi_devintf i5k_amb shpchp ipmi_msghandler auth_rpcgss sunrpc zfs(POE) zunicode(POE) zavl(POE) icp(POE) zcommon(POE) znvpair(POE) spl(OE) megaraid_sas serio_raw ata_generic pata_acpi
[Mon Jul 30 05:06:06 2018] CPU: 2 PID: 3946 Comm: nfsd Tainted: P      D    OE     4.17.7-100.fc27.x86_64 #1
[Mon Jul 30 05:06:06 2018] Hardware name: IBM IBM System x3650 -[7979ac1]-/System Planar, BIOS -[GGE149AUS-1.19]- 02/11/2011
[Mon Jul 30 05:06:06 2018] RIP: 0010:prefetch_freepointer+0x10/0x20
[Mon Jul 30 05:06:06 2018] RSP: 0018:ffffb65c946efc48 EFLAGS: 00010202
[Mon Jul 30 05:06:06 2018] RAX: 0000000000000000 RBX: 0e342a5288524ea1 RCX: 000000000003d760
[Mon Jul 30 05:06:06 2018] RDX: 000000000003d75f RSI: 0e342a5288524ea1 RDI: ffff920ac2dae600
[Mon Jul 30 05:06:06 2018] RBP: ffff920ac2dae600 R08: ffff920affcac620 R09: ffff920a3cf06ea0
[Mon Jul 30 05:06:06 2018] R10: ffffb65c946efcb0 R11: 0000000000000000 R12: 00000000014080c0
[Mon Jul 30 05:06:06 2018] R13: ffffffffc0d69a21 R14: ffff920a6fc52000 R15: ffff920ac2dae600
[Mon Jul 30 05:06:06 2018] FS:  0000000000000000(0000) GS:ffff920affc80000(0000) knlGS:0000000000000000
[Mon Jul 30 05:06:06 2018] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Mon Jul 30 05:06:06 2018] CR2: 00007f047152bd58 CR3: 00000008f020a000 CR4: 00000000000006e0
[Mon Jul 30 05:06:06 2018] Call Trace:
[Mon Jul 30 05:06:06 2018]  kmem_cache_alloc+0xb4/0x1c0
[Mon Jul 30 05:06:06 2018]  ? nfsd4_free_file_rcu+0x20/0x20 [nfsd]
[Mon Jul 30 05:06:06 2018]  nfs4_alloc_stid+0x21/0xa0 [nfsd]
[Mon Jul 30 05:06:06 2018]  nfsd4_process_open2+0xb84/0x14c0 [nfsd]
[Mon Jul 30 05:06:06 2018]  ? nfsd_permission+0x5a/0xf0 [nfsd]
[Mon Jul 30 05:06:06 2018]  ? fh_verify+0x44b/0x600 [nfsd]
[Mon Jul 30 05:06:06 2018]  ? nfsd4_open+0x2dd/0x700 [nfsd]
[Mon Jul 30 05:06:06 2018]  nfsd4_open+0x2dd/0x700 [nfsd]
[Mon Jul 30 05:06:06 2018]  nfsd4_proc_compound+0x4f9/0x6e0 [nfsd]
[Mon Jul 30 05:06:06 2018]  nfsd_dispatch+0xf5/0x230 [nfsd]
[Mon Jul 30 05:06:06 2018]  svc_process_common+0x4c3/0x720 [sunrpc]
[Mon Jul 30 05:06:06 2018]  ? nfsd_destroy+0x60/0x60 [nfsd]
[Mon Jul 30 05:06:06 2018]  svc_process+0xd7/0xf0 [sunrpc]
[Mon Jul 30 05:06:06 2018]  nfsd+0xe3/0x150 [nfsd]
[Mon Jul 30 05:06:06 2018]  kthread+0x113/0x130
[Mon Jul 30 05:06:06 2018]  ? kthread_create_worker_on_cpu+0x70/0x70
[Mon Jul 30 05:06:06 2018]  ret_from_fork+0x35/0x40
[Mon Jul 30 05:06:06 2018] Code: 75 58 48 c7 c7 00 2a 0d 92 e8 9b 7c ea ff eb 90 90 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 48 85 f6 74 13 8b 47 20 48 01 c6 <48> 33 36 48 33 b7 38 01 00 00 0f 18 0e f3 c3 90 66 66 66 66 90 
[Mon Jul 30 05:06:06 2018] RIP: prefetch_freepointer+0x10/0x20 RSP: ffffb65c946efc48
[Mon Jul 30 05:06:06 2018] ---[ end trace c12ab55ea9b2c7f2 ]---
[Mon Jul 30 05:07:21 2018] general protection fault: 0000 [#3] SMP PTI
[Mon Jul 30 05:07:21 2018] Modules linked in: rpcsec_gss_krb5 binfmt_misc nfsd nfs_acl lockd grace aoe xt_geoip(OE) nf_nat_tftp nf_conntrack_tftp nf_conntrack_netbios_ns nf_conntrack_broadcast xt_CT ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack mpt3sas raid_class ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c iptable_mangle iptable_raw iptable_security ebtable_filter ebtables ip6table_filter ip6_tables radeon i2c_algo_bit ttm ibmpex coretemp gpio_ich iTCO_wdt iTCO_vendor_support drm_kms_helper ipmi_ssif kvm_intel ibmaem drm kvm bnx2 lpc_ich ses i5000_edac ipmi_si e1000e enclosure irqbypass scsi_transport_sas
[Mon Jul 30 05:07:21 2018]  i2c_i801 ipmi_devintf i5k_amb shpchp ipmi_msghandler auth_rpcgss sunrpc zfs(POE) zunicode(POE) zavl(POE) icp(POE) zcommon(POE) znvpair(POE) spl(OE) megaraid_sas serio_raw ata_generic pata_acpi
[Mon Jul 30 05:07:21 2018] CPU: 2 PID: 3945 Comm: nfsd Tainted: P      D    OE     4.17.7-100.fc27.x86_64 #1
[Mon Jul 30 05:07:21 2018] Hardware name: IBM IBM System x3650 -[7979ac1]-/System Planar, BIOS -[GGE149AUS-1.19]- 02/11/2011
[Mon Jul 30 05:07:21 2018] RIP: 0010:kmem_cache_alloc+0x82/0x1c0
[Mon Jul 30 05:07:21 2018] RSP: 0018:ffffb65c9469fc50 EFLAGS: 00010202
[Mon Jul 30 05:07:21 2018] RAX: 0000000000000000 RBX: 0e342a5288524ea1 RCX: ffff920a672771f0
[Mon Jul 30 05:07:21 2018] RDX: 000000000003d760 RSI: 00000000014080c0 RDI: 000000000002c620
[Mon Jul 30 05:07:21 2018] RBP: ffff920ac2dae600 R08: ffff920affcac620 R09: ffff920a67277200
[Mon Jul 30 05:07:21 2018] R10: ffffb65c9469fcb0 R11: 0000000000000000 R12: 00000000014080c0
[Mon Jul 30 05:07:21 2018] R13: ffffffffc0d69a21 R14: 0e342a5288524ea1 R15: ffff920ac2dae600
[Mon Jul 30 05:07:21 2018] FS:  0000000000000000(0000) GS:ffff920affc80000(0000) knlGS:0000000000000000
[Mon Jul 30 05:07:21 2018] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Mon Jul 30 05:07:21 2018] CR2: 00007f047152bd58 CR3: 00000008f020a000 CR4: 00000000000006e0
[Mon Jul 30 05:07:21 2018] Call Trace:
[Mon Jul 30 05:07:21 2018]  ? nfsd4_free_file_rcu+0x20/0x20 [nfsd]
[Mon Jul 30 05:07:21 2018]  nfs4_alloc_stid+0x21/0xa0 [nfsd]
[Mon Jul 30 05:07:21 2018]  nfsd4_process_open2+0xb84/0x14c0 [nfsd]
[Mon Jul 30 05:07:21 2018]  ? nfsd_permission+0x5a/0xf0 [nfsd]
[Mon Jul 30 05:07:21 2018]  ? fh_verify+0x44b/0x600 [nfsd]
[Mon Jul 30 05:07:21 2018]  ? nfsd4_open+0x2dd/0x700 [nfsd]
[Mon Jul 30 05:07:21 2018]  nfsd4_open+0x2dd/0x700 [nfsd]
[Mon Jul 30 05:07:21 2018]  nfsd4_proc_compound+0x4f9/0x6e0 [nfsd]
[Mon Jul 30 05:07:21 2018]  nfsd_dispatch+0xf5/0x230 [nfsd]
[Mon Jul 30 05:07:21 2018]  svc_process_common+0x4c3/0x720 [sunrpc]
[Mon Jul 30 05:07:21 2018]  ? nfsd_destroy+0x60/0x60 [nfsd]
[Mon Jul 30 05:07:21 2018]  svc_process+0xd7/0xf0 [sunrpc]
[Mon Jul 30 05:07:21 2018]  nfsd+0xe3/0x150 [nfsd]
[Mon Jul 30 05:07:21 2018]  kthread+0x113/0x130
[Mon Jul 30 05:07:21 2018]  ? kthread_create_worker_on_cpu+0x70/0x70
[Mon Jul 30 05:07:21 2018]  ret_from_fork+0x35/0x40
[Mon Jul 30 05:07:21 2018] Code: 50 08 65 4c 03 05 b7 35 da 6e 49 83 78 10 00 4d 8b 30 0f 84 06 01 00 00 4d 85 f6 0f 84 fd 00 00 00 41 8b 5f 20 49 8b 3f 4c 01 f3 <48> 33 1b 49 33 9f 38 01 00 00 40 f6 c7 0f 0f 85 26 01 00 00 48 
[Mon Jul 30 05:07:21 2018] RIP: kmem_cache_alloc+0x82/0x1c0 RSP: ffffb65c9469fc50
[Mon Jul 30 05:07:21 2018] ---[ end trace c12ab55ea9b2c7f3 ]---
[Mon Jul 30 05:10:15 2018] audit: type=1130 audit(1532945415.885:8511): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=kernel msg='unit=sysstat-collect comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[Mon Jul 30 05:10:15 2018] audit: type=1131 audit(1532945415.885:8512): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=kernel msg='unit=sysstat-collect comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[Mon Jul 30 05:11:06 2018] general protection fault: 0000 [#4] SMP PTI
[Mon Jul 30 05:11:06 2018] Modules linked in: rpcsec_gss_krb5 binfmt_misc nfsd nfs_acl lockd grace aoe xt_geoip(OE) nf_nat_tftp nf_conntrack_tftp nf_conntrack_netbios_ns nf_conntrack_broadcast xt_CT ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack mpt3sas raid_class ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c iptable_mangle iptable_raw iptable_security ebtable_filter ebtables ip6table_filter ip6_tables radeon i2c_algo_bit ttm ibmpex coretemp gpio_ich iTCO_wdt iTCO_vendor_support drm_kms_helper ipmi_ssif kvm_intel ibmaem drm kvm bnx2 lpc_ich ses i5000_edac ipmi_si e1000e enclosure irqbypass scsi_transport_sas
[Mon Jul 30 05:11:06 2018]  i2c_i801 ipmi_devintf i5k_amb shpchp ipmi_msghandler auth_rpcgss sunrpc zfs(POE) zunicode(POE) zavl(POE) icp(POE) zcommon(POE) znvpair(POE) spl(OE) megaraid_sas serio_raw ata_generic pata_acpi
[Mon Jul 30 05:11:06 2018] CPU: 2 PID: 3944 Comm: nfsd Tainted: P      D    OE     4.17.7-100.fc27.x86_64 #1
[Mon Jul 30 05:11:06 2018] Hardware name: IBM IBM System x3650 -[7979ac1]-/System Planar, BIOS -[GGE149AUS-1.19]- 02/11/2011
[Mon Jul 30 05:11:06 2018] RIP: 0010:prefetch_freepointer+0x10/0x20
[Mon Jul 30 05:11:06 2018] RSP: 0018:ffffb65c9456bc48 EFLAGS: 00010202
[Mon Jul 30 05:11:06 2018] RAX: 0000000000000000 RBX: 0e342a5288524ea1 RCX: 000000000003d76a
[Mon Jul 30 05:11:06 2018] RDX: 000000000003d769 RSI: 0e342a5288524ea1 RDI: ffff920ac2dae600
[Mon Jul 30 05:11:06 2018] RBP: ffff920ac2dae600 R08: ffff920affcac620 R09: ffff920a55243440
[Mon Jul 30 05:11:06 2018] R10: ffffb65c9456bcb0 R11: 0000000000000000 R12: 00000000014080c0
[Mon Jul 30 05:11:06 2018] R13: ffffffffc0d69a21 R14: ffff920a6fc52910 R15: ffff920ac2dae600
[Mon Jul 30 05:11:06 2018] FS:  0000000000000000(0000) GS:ffff920affc80000(0000) knlGS:0000000000000000
[Mon Jul 30 05:11:06 2018] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Mon Jul 30 05:11:06 2018] CR2: 00007f047152bd58 CR3: 00000008f020a000 CR4: 00000000000006e0
[Mon Jul 30 05:11:06 2018] Call Trace:
[Mon Jul 30 05:11:06 2018]  kmem_cache_alloc+0xb4/0x1c0
[Mon Jul 30 05:11:06 2018]  ? nfsd4_free_file_rcu+0x20/0x20 [nfsd]
[Mon Jul 30 05:11:06 2018]  nfs4_alloc_stid+0x21/0xa0 [nfsd]
[Mon Jul 30 05:11:06 2018]  nfsd4_process_open2+0xb84/0x14c0 [nfsd]
[Mon Jul 30 05:11:06 2018]  ? nfsd_permission+0x5a/0xf0 [nfsd]
[Mon Jul 30 05:11:06 2018]  ? fh_verify+0x44b/0x600 [nfsd]
[Mon Jul 30 05:11:06 2018]  ? nfsd4_open+0x2dd/0x700 [nfsd]
[Mon Jul 30 05:11:06 2018]  nfsd4_open+0x2dd/0x700 [nfsd]
[Mon Jul 30 05:11:06 2018]  nfsd4_proc_compound+0x4f9/0x6e0 [nfsd]
[Mon Jul 30 05:11:06 2018]  nfsd_dispatch+0xf5/0x230 [nfsd]
[Mon Jul 30 05:11:06 2018]  svc_process_common+0x4c3/0x720 [sunrpc]
[Mon Jul 30 05:11:06 2018]  ? nfsd_destroy+0x60/0x60 [nfsd]
[Mon Jul 30 05:11:06 2018]  svc_process+0xd7/0xf0 [sunrpc]
[Mon Jul 30 05:11:06 2018]  nfsd+0xe3/0x150 [nfsd]
[Mon Jul 30 05:11:06 2018]  kthread+0x113/0x130
[Mon Jul 30 05:11:06 2018]  ? kthread_create_worker_on_cpu+0x70/0x70
[Mon Jul 30 05:11:06 2018]  ret_from_fork+0x35/0x40
[Mon Jul 30 05:11:06 2018] Code: 75 58 48 c7 c7 00 2a 0d 92 e8 9b 7c ea ff eb 90 90 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 48 85 f6 74 13 8b 47 20 48 01 c6 <48> 33 36 48 33 b7 38 01 00 00 0f 18 0e f3 c3 90 66 66 66 66 90 
[Mon Jul 30 05:11:06 2018] RIP: prefetch_freepointer+0x10/0x20 RSP: ffffb65c9456bc48
[Mon Jul 30 05:11:06 2018] ---[ end trace c12ab55ea9b2c7f4 ]---
[Mon Jul 30 05:12:02 2018] general protection fault: 0000 [#5] SMP PTI
[Mon Jul 30 05:12:02 2018] Modules linked in: rpcsec_gss_krb5 binfmt_misc nfsd nfs_acl lockd grace aoe xt_geoip(OE) nf_nat_tftp nf_conntrack_tftp nf_conntrack_netbios_ns nf_conntrack_broadcast xt_CT ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack mpt3sas raid_class ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c iptable_mangle iptable_raw iptable_security ebtable_filter ebtables ip6table_filter ip6_tables radeon i2c_algo_bit ttm ibmpex coretemp gpio_ich iTCO_wdt iTCO_vendor_support drm_kms_helper ipmi_ssif kvm_intel ibmaem drm kvm bnx2 lpc_ich ses i5000_edac ipmi_si e1000e enclosure irqbypass scsi_transport_sas
[Mon Jul 30 05:12:02 2018]  i2c_i801 ipmi_devintf i5k_amb shpchp ipmi_msghandler auth_rpcgss sunrpc zfs(POE) zunicode(POE) zavl(POE) icp(POE) zcommon(POE) znvpair(POE) spl(OE) megaraid_sas serio_raw ata_generic pata_acpi
[Mon Jul 30 05:12:02 2018] CPU: 2 PID: 3943 Comm: nfsd Tainted: P      D    OE     4.17.7-100.fc27.x86_64 #1
[Mon Jul 30 05:12:02 2018] Hardware name: IBM IBM System x3650 -[7979ac1]-/System Planar, BIOS -[GGE149AUS-1.19]- 02/11/2011
[Mon Jul 30 05:12:02 2018] RIP: 0010:prefetch_freepointer+0x10/0x20
[Mon Jul 30 05:12:02 2018] RSP: 0018:ffffb65c94523c48 EFLAGS: 00010202
[Mon Jul 30 05:12:02 2018] RAX: 0000000000000000 RBX: 0e342a5288524ea1 RCX: 000000000003d76c
[Mon Jul 30 05:12:02 2018] RDX: 000000000003d76b RSI: 0e342a5288524ea1 RDI: ffff920ac2dae600
[Mon Jul 30 05:12:02 2018] RBP: ffff920ac2dae600 R08: ffff920affcac620 R09: ffff920a55242120
[Mon Jul 30 05:12:02 2018] R10: ffffb65c94523cb0 R11: 0000000000000000 R12: 00000000014080c0
[Mon Jul 30 05:12:02 2018] R13: ffffffffc0d69a21 R14: ffff920a6fc52bc8 R15: ffff920ac2dae600
[Mon Jul 30 05:12:02 2018] FS:  0000000000000000(0000) GS:ffff920affc80000(0000) knlGS:0000000000000000
[Mon Jul 30 05:12:02 2018] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Mon Jul 30 05:12:02 2018] CR2: 00007f047152bd58 CR3: 00000008f020a000 CR4: 00000000000006e0
[Mon Jul 30 05:12:02 2018] Call Trace:
[Mon Jul 30 05:12:02 2018]  kmem_cache_alloc+0xb4/0x1c0
[Mon Jul 30 05:12:02 2018]  ? nfsd4_free_file_rcu+0x20/0x20 [nfsd]
[Mon Jul 30 05:12:02 2018]  nfs4_alloc_stid+0x21/0xa0 [nfsd]
[Mon Jul 30 05:12:02 2018]  nfsd4_process_open2+0xb84/0x14c0 [nfsd]
[Mon Jul 30 05:12:02 2018]  ? nfsd_permission+0x5a/0xf0 [nfsd]
[Mon Jul 30 05:12:02 2018]  ? fh_verify+0x44b/0x600 [nfsd]
[Mon Jul 30 05:12:02 2018]  ? nfsd4_open+0x2dd/0x700 [nfsd]
[Mon Jul 30 05:12:02 2018]  nfsd4_open+0x2dd/0x700 [nfsd]
[Mon Jul 30 05:12:02 2018]  nfsd4_proc_compound+0x4f9/0x6e0 [nfsd]
[Mon Jul 30 05:12:02 2018]  nfsd_dispatch+0xf5/0x230 [nfsd]
[Mon Jul 30 05:12:02 2018]  svc_process_common+0x4c3/0x720 [sunrpc]
[Mon Jul 30 05:12:02 2018]  ? nfsd_destroy+0x60/0x60 [nfsd]
[Mon Jul 30 05:12:02 2018]  svc_process+0xd7/0xf0 [sunrpc]
[Mon Jul 30 05:12:02 2018]  nfsd+0xe3/0x150 [nfsd]
[Mon Jul 30 05:12:02 2018]  kthread+0x113/0x130
[Mon Jul 30 05:12:02 2018]  ? kthread_create_worker_on_cpu+0x70/0x70
[Mon Jul 30 05:12:02 2018]  ret_from_fork+0x35/0x40
[Mon Jul 30 05:12:02 2018] Code: 75 58 48 c7 c7 00 2a 0d 92 e8 9b 7c ea ff eb 90 90 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 48 85 f6 74 13 8b 47 20 48 01 c6 <48> 33 36 48 33 b7 38 01 00 00 0f 18 0e f3 c3 90 66 66 66 66 90 
[Mon Jul 30 05:12:02 2018] RIP: prefetch_freepointer+0x10/0x20 RSP: ffffb65c94523c48
[Mon Jul 30 05:12:02 2018] ---[ end trace c12ab55ea9b2c7f5 ]---
[Mon Jul 30 05:14:00 2018] general protection fault: 0000 [#6] SMP PTI
[Mon Jul 30 05:14:00 2018] Modules linked in: rpcsec_gss_krb5 binfmt_misc nfsd nfs_acl lockd grace aoe xt_geoip(OE) nf_nat_tftp nf_conntrack_tftp nf_conntrack_netbios_ns nf_conntrack_broadcast xt_CT ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack mpt3sas raid_class ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c iptable_mangle iptable_raw iptable_security ebtable_filter ebtables ip6table_filter ip6_tables radeon i2c_algo_bit ttm ibmpex coretemp gpio_ich iTCO_wdt iTCO_vendor_support drm_kms_helper ipmi_ssif kvm_intel ibmaem drm kvm bnx2 lpc_ich ses i5000_edac ipmi_si e1000e enclosure irqbypass scsi_transport_sas
[Mon Jul 30 05:14:00 2018]  i2c_i801 ipmi_devintf i5k_amb shpchp ipmi_msghandler auth_rpcgss sunrpc zfs(POE) zunicode(POE) zavl(POE) icp(POE) zcommon(POE) znvpair(POE) spl(OE) megaraid_sas serio_raw ata_generic pata_acpi
[Mon Jul 30 05:14:00 2018] CPU: 2 PID: 3942 Comm: nfsd Tainted: P      D    OE     4.17.7-100.fc27.x86_64 #1
[Mon Jul 30 05:14:00 2018] Hardware name: IBM IBM System x3650 -[7979ac1]-/System Planar, BIOS -[GGE149AUS-1.19]- 02/11/2011
[Mon Jul 30 05:14:00 2018] RIP: 0010:kmem_cache_alloc+0x82/0x1c0
[Mon Jul 30 05:14:00 2018] RSP: 0018:ffffb65c9450bc50 EFLAGS: 00010202
[Mon Jul 30 05:14:00 2018] RAX: 0000000000000000 RBX: 0e342a5288524ea1 RCX: ffff920a552428f0
[Mon Jul 30 05:14:00 2018] RDX: 000000000003d76c RSI: 00000000014080c0 RDI: 000000000002c620
[Mon Jul 30 05:14:00 2018] RBP: ffff920ac2dae600 R08: ffff920affcac620 R09: ffff920a55242900
[Mon Jul 30 05:14:00 2018] R10: ffffb65c9450bcb0 R11: 0000000000000000 R12: 00000000014080c0
[Mon Jul 30 05:14:00 2018] R13: ffffffffc0d69a21 R14: 0e342a5288524ea1 R15: ffff920ac2dae600
[Mon Jul 30 05:14:00 2018] FS:  0000000000000000(0000) GS:ffff920affc80000(0000) knlGS:0000000000000000
[Mon Jul 30 05:14:00 2018] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Mon Jul 30 05:14:00 2018] CR2: 00007f047152bd58 CR3: 00000008f020a000 CR4: 00000000000006e0
[Mon Jul 30 05:14:00 2018] Call Trace:
[Mon Jul 30 05:14:00 2018]  ? nfsd4_free_file_rcu+0x20/0x20 [nfsd]
[Mon Jul 30 05:14:00 2018]  nfs4_alloc_stid+0x21/0xa0 [nfsd]
[Mon Jul 30 05:14:00 2018]  nfsd4_process_open2+0xb84/0x14c0 [nfsd]
[Mon Jul 30 05:14:00 2018]  ? nfsd_permission+0x5a/0xf0 [nfsd]
[Mon Jul 30 05:14:00 2018]  ? fh_verify+0x44b/0x600 [nfsd]
[Mon Jul 30 05:14:00 2018]  ? nfsd4_open+0x2dd/0x700 [nfsd]
[Mon Jul 30 05:14:00 2018]  nfsd4_open+0x2dd/0x700 [nfsd]
[Mon Jul 30 05:14:00 2018]  nfsd4_proc_compound+0x4f9/0x6e0 [nfsd]
[Mon Jul 30 05:14:00 2018]  nfsd_dispatch+0xf5/0x230 [nfsd]
[Mon Jul 30 05:14:00 2018]  svc_process_common+0x4c3/0x720 [sunrpc]
[Mon Jul 30 05:14:00 2018]  ? nfsd_destroy+0x60/0x60 [nfsd]
[Mon Jul 30 05:14:00 2018]  svc_process+0xd7/0xf0 [sunrpc]
[Mon Jul 30 05:14:00 2018]  nfsd+0xe3/0x150 [nfsd]
[Mon Jul 30 05:14:00 2018]  kthread+0x113/0x130
[Mon Jul 30 05:14:00 2018]  ? kthread_create_worker_on_cpu+0x70/0x70
[Mon Jul 30 05:14:00 2018]  ret_from_fork+0x35/0x40
[Mon Jul 30 05:14:00 2018] Code: 50 08 65 4c 03 05 b7 35 da 6e 49 83 78 10 00 4d 8b 30 0f 84 06 01 00 00 4d 85 f6 0f 84 fd 00 00 00 41 8b 5f 20 49 8b 3f 4c 01 f3 <48> 33 1b 49 33 9f 38 01 00 00 40 f6 c7 0f 0f 85 26 01 00 00 48 
[Mon Jul 30 05:14:00 2018] RIP: kmem_cache_alloc+0x82/0x1c0 RSP: ffffb65c9450bc50
[Mon Jul 30 05:14:00 2018] ---[ end trace c12ab55ea9b2c7f6 ]---
[Mon Jul 30 05:14:57 2018] general protection fault: 0000 [#7] SMP PTI
[Mon Jul 30 05:14:57 2018] Modules linked in: rpcsec_gss_krb5 binfmt_misc nfsd nfs_acl lockd grace aoe xt_geoip(OE) nf_nat_tftp nf_conntrack_tftp nf_conntrack_netbios_ns nf_conntrack_broadcast xt_CT ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack mpt3sas raid_class ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c iptable_mangle iptable_raw iptable_security ebtable_filter ebtables ip6table_filter ip6_tables radeon i2c_algo_bit ttm ibmpex coretemp gpio_ich iTCO_wdt iTCO_vendor_support drm_kms_helper ipmi_ssif kvm_intel ibmaem drm kvm bnx2 lpc_ich ses i5000_edac ipmi_si e1000e enclosure irqbypass scsi_transport_sas
[Mon Jul 30 05:14:57 2018]  i2c_i801 ipmi_devintf i5k_amb shpchp ipmi_msghandler auth_rpcgss sunrpc zfs(POE) zunicode(POE) zavl(POE) icp(POE) zcommon(POE) znvpair(POE) spl(OE) megaraid_sas serio_raw ata_generic pata_acpi
[Mon Jul 30 05:14:57 2018] CPU: 2 PID: 3941 Comm: nfsd Tainted: P      D    OE     4.17.7-100.fc27.x86_64 #1
[Mon Jul 30 05:14:57 2018] Hardware name: IBM IBM System x3650 -[7979ac1]-/System Planar, BIOS -[GGE149AUS-1.19]- 02/11/2011
[Mon Jul 30 05:14:57 2018] RIP: 0010:kmem_cache_alloc+0x82/0x1c0
[Mon Jul 30 05:14:57 2018] RSP: 0018:ffffb65c94503c50 EFLAGS: 00010202
[Mon Jul 30 05:14:57 2018] RAX: 0000000000000000 RBX: 0e342a5288524ea1 RCX: ffff920a55242b30
[Mon Jul 30 05:14:57 2018] RDX: 000000000003d76c RSI: 00000000014080c0 RDI: 000000000002c620
[Mon Jul 30 05:14:57 2018] RBP: ffff920ac2dae600 R08: ffff920affcac620 R09: ffff920a55242b40
[Mon Jul 30 05:14:57 2018] R10: ffffb65c94503cb0 R11: 0000000000000000 R12: 00000000014080c0
[Mon Jul 30 05:14:57 2018] R13: ffffffffc0d69a21 R14: 0e342a5288524ea1 R15: ffff920ac2dae600
[Mon Jul 30 05:14:57 2018] FS:  0000000000000000(0000) GS:ffff920affc80000(0000) knlGS:0000000000000000
[Mon Jul 30 05:14:57 2018] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Mon Jul 30 05:14:57 2018] CR2: 00007f047152bd58 CR3: 00000008f020a000 CR4: 00000000000006e0
[Mon Jul 30 05:14:57 2018] Call Trace:
[Mon Jul 30 05:14:57 2018]  ? nfsd4_free_file_rcu+0x20/0x20 [nfsd]
[Mon Jul 30 05:14:57 2018]  nfs4_alloc_stid+0x21/0xa0 [nfsd]
[Mon Jul 30 05:14:57 2018]  nfsd4_process_open2+0xb84/0x14c0 [nfsd]
[Mon Jul 30 05:14:57 2018]  ? nfsd_permission+0x5a/0xf0 [nfsd]
[Mon Jul 30 05:14:57 2018]  ? fh_verify+0x44b/0x600 [nfsd]
[Mon Jul 30 05:14:57 2018]  ? nfsd4_open+0x2dd/0x700 [nfsd]
[Mon Jul 30 05:14:57 2018]  nfsd4_open+0x2dd/0x700 [nfsd]
[Mon Jul 30 05:14:57 2018]  nfsd4_proc_compound+0x4f9/0x6e0 [nfsd]
[Mon Jul 30 05:14:57 2018]  nfsd_dispatch+0xf5/0x230 [nfsd]
[Mon Jul 30 05:14:57 2018]  svc_process_common+0x4c3/0x720 [sunrpc]
[Mon Jul 30 05:14:57 2018]  ? nfsd_destroy+0x60/0x60 [nfsd]
[Mon Jul 30 05:14:57 2018]  svc_process+0xd7/0xf0 [sunrpc]
[Mon Jul 30 05:14:57 2018]  nfsd+0xe3/0x150 [nfsd]
[Mon Jul 30 05:14:57 2018]  kthread+0x113/0x130
[Mon Jul 30 05:14:57 2018]  ? kthread_create_worker_on_cpu+0x70/0x70
[Mon Jul 30 05:14:57 2018]  ret_from_fork+0x35/0x40
[Mon Jul 30 05:14:57 2018] Code: 50 08 65 4c 03 05 b7 35 da 6e 49 83 78 10 00 4d 8b 30 0f 84 06 01 00 00 4d 85 f6 0f 84 fd 00 00 00 41 8b 5f 20 49 8b 3f 4c 01 f3 <48> 33 1b 49 33 9f 38 01 00 00 40 f6 c7 0f 0f 85 26 01 00 00 48 
[Mon Jul 30 05:14:57 2018] RIP: kmem_cache_alloc+0x82/0x1c0 RSP: ffffb65c94503c50
[Mon Jul 30 05:14:57 2018] ---[ end trace c12ab55ea9b2c7f7 ]---
[Mon Jul 30 05:15:36 2018] general protection fault: 0000 [#8] SMP PTI
[Mon Jul 30 05:15:36 2018] Modules linked in: rpcsec_gss_krb5 binfmt_misc nfsd nfs_acl lockd grace aoe xt_geoip(OE) nf_nat_tftp nf_conntrack_tftp nf_conntrack_netbios_ns nf_conntrack_broadcast xt_CT ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack mpt3sas raid_class ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c iptable_mangle iptable_raw iptable_security ebtable_filter ebtables ip6table_filter ip6_tables radeon i2c_algo_bit ttm ibmpex coretemp gpio_ich iTCO_wdt iTCO_vendor_support drm_kms_helper ipmi_ssif kvm_intel ibmaem drm kvm bnx2 lpc_ich ses i5000_edac ipmi_si e1000e enclosure irqbypass scsi_transport_sas
[Mon Jul 30 05:15:36 2018]  i2c_i801 ipmi_devintf i5k_amb shpchp ipmi_msghandler auth_rpcgss sunrpc zfs(POE) zunicode(POE) zavl(POE) icp(POE) zcommon(POE) znvpair(POE) spl(OE) megaraid_sas serio_raw ata_generic pata_acpi
[Mon Jul 30 05:15:36 2018] CPU: 2 PID: 3940 Comm: nfsd Tainted: P      D    OE     4.17.7-100.fc27.x86_64 #1
[Mon Jul 30 05:15:36 2018] Hardware name: IBM IBM System x3650 -[7979ac1]-/System Planar, BIOS -[GGE149AUS-1.19]- 02/11/2011
[Mon Jul 30 05:15:36 2018] RIP: 0010:kmem_cache_alloc+0x82/0x1c0
[Mon Jul 30 05:15:36 2018] RSP: 0018:ffffb65c944fbc50 EFLAGS: 00010202
[Mon Jul 30 05:15:36 2018] RAX: 0000000000000000 RBX: 0e342a5288524ea1 RCX: ffff920a3cc7ee90
[Mon Jul 30 05:15:36 2018] RDX: 000000000003d76c RSI: 00000000014080c0 RDI: 000000000002c620
[Mon Jul 30 05:15:36 2018] RBP: ffff920ac2dae600 R08: ffff920affcac620 R09: ffff920a3cc7eea0
[Mon Jul 30 05:15:36 2018] R10: ffffb65c944fbcb0 R11: 0000000000000000 R12: 00000000014080c0
[Mon Jul 30 05:15:36 2018] R13: ffffffffc0d69a21 R14: 0e342a5288524ea1 R15: ffff920ac2dae600
[Mon Jul 30 05:15:36 2018] FS:  0000000000000000(0000) GS:ffff920affc80000(0000) knlGS:0000000000000000
[Mon Jul 30 05:15:36 2018] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Mon Jul 30 05:15:36 2018] CR2: 00007f047152bd58 CR3: 00000008f020a000 CR4: 00000000000006e0
[Mon Jul 30 05:15:36 2018] Call Trace:
[Mon Jul 30 05:15:36 2018]  ? nfsd4_free_file_rcu+0x20/0x20 [nfsd]
[Mon Jul 30 05:15:36 2018]  nfs4_alloc_stid+0x21/0xa0 [nfsd]
[Mon Jul 30 05:15:36 2018]  nfsd4_process_open2+0xb84/0x14c0 [nfsd]
[Mon Jul 30 05:15:36 2018]  ? nfsd_permission+0x5a/0xf0 [nfsd]
[Mon Jul 30 05:15:36 2018]  ? fh_verify+0x44b/0x600 [nfsd]
[Mon Jul 30 05:15:36 2018]  ? nfsd4_open+0x2dd/0x700 [nfsd]
[Mon Jul 30 05:15:36 2018]  nfsd4_open+0x2dd/0x700 [nfsd]
[Mon Jul 30 05:15:36 2018]  nfsd4_proc_compound+0x4f9/0x6e0 [nfsd]
[Mon Jul 30 05:15:36 2018]  nfsd_dispatch+0xf5/0x230 [nfsd]
[Mon Jul 30 05:15:36 2018]  svc_process_common+0x4c3/0x720 [sunrpc]
[Mon Jul 30 05:15:36 2018]  ? nfsd_destroy+0x60/0x60 [nfsd]
[Mon Jul 30 05:15:36 2018]  svc_process+0xd7/0xf0 [sunrpc]
[Mon Jul 30 05:15:36 2018]  nfsd+0xe3/0x150 [nfsd]
[Mon Jul 30 05:15:36 2018]  kthread+0x113/0x130
[Mon Jul 30 05:15:36 2018]  ? kthread_create_worker_on_cpu+0x70/0x70
[Mon Jul 30 05:15:36 2018]  ret_from_fork+0x35/0x40
[Mon Jul 30 05:15:36 2018] Code: 50 08 65 4c 03 05 b7 35 da 6e 49 83 78 10 00 4d 8b 30 0f 84 06 01 00 00 4d 85 f6 0f 84 fd 00 00 00 41 8b 5f 20 49 8b 3f 4c 01 f3 <48> 33 1b 49 33 9f 38 01 00 00 40 f6 c7 0f 0f 85 26 01 00 00 48 
[Mon Jul 30 05:15:36 2018] RIP: kmem_cache_alloc+0x82/0x1c0 RSP: ffffb65c944fbc50
[Mon Jul 30 05:15:36 2018] ---[ end trace c12ab55ea9b2c7f8 ]---

Comment 26 H.J. Lu 2018-07-30 19:19:11 UTC
4.17.11-200.fc28.x86_64 still failed:

[   81.222513] general protection fault: 0000 [#1] SMP PTI
[   81.222570] Modules linked in: devlink ebtable_filter ebtables ip6table_filter ip6_tables intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel snd_hda_codec_realtek snd_hda_codec_hdmi snd_hda_codec_generic kvm snd_hda_intel snd_hda_codec mei_wdt irqbypass crct10dif_pclmul gpio_ich iTCO_wdt crc32_pclmul iTCO_vendor_support ppdev snd_hda_core snd_hwdep snd_seq ghash_clmulni_intel intel_cstate snd_seq_device intel_uncore snd_pcm intel_rapl_perf joydev snd_timer snd i2c_i801 mei_me lpc_ich soundcore shpchp parport_pc mei parport pcc_cpufreq nfsd auth_rpcgss nfs_acl lockd grace sunrpc i915 i2c_algo_bit drm_kms_helper r8169 drm crc32c_intel mii video
[   81.223069] CPU: 2 PID: 1047 Comm: nfsd Not tainted 4.17.11-200.fc28.x86_64 #1
[   81.223130] Hardware name: Gigabyte Technology Co., Ltd. H87M-D3H/H87M-D3H, BIOS F11 08/18/2015
[   81.223207] RIP: 0010:prefetch_freepointer+0x10/0x20
[   81.223250] RSP: 0018:ffffa73043d33c58 EFLAGS: 00010286
[   81.223296] RAX: 0000000000000000 RBX: e093175f59a73d71 RCX: 0000000000000037
[   81.223354] RDX: 0000000000000036 RSI: e093175f59a73d71 RDI: ffff9689f3bb3380
[   81.223413] RBP: ffff9689f3bb3380 R08: ffffc7303fc91960 R09: 0000000000000004
[   81.223471] R10: 0000000000000000 R11: 0000000000000000 R12: 00000000014080c0
[   81.223529] R13: ffffffffc05e2ac1 R14: ffff9689e41815c1 R15: ffff9689f3bb3380
[   81.223589] FS:  0000000000000000(0000) GS:ffff968a1e280000(0000) knlGS:0000000000000000
[   81.223655] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   81.223703] CR2: 00007f4cec584160 CR3: 000000058820a005 CR4: 00000000001606e0
[   81.223762] Call Trace:
[   81.223791]  kmem_cache_alloc+0xb4/0x1d0
[   81.223855]  ? nfsd4_free_file_rcu+0x20/0x20 [nfsd]
[   81.223916]  nfs4_alloc_stid+0x21/0xa0 [nfsd]
[   81.223973]  nfsd4_process_open2+0x1048/0x1360 [nfsd]
[   81.224031]  ? nfsd_permission+0x63/0xe0 [nfsd]
[   81.224083]  ? fh_verify+0x17a/0x5b0 [nfsd]
[   81.224137]  ? nfsd4_process_open1+0x139/0x420 [nfsd]
[   81.224196]  nfsd4_open+0x2b1/0x6b0 [nfsd]
[   81.224249]  nfsd4_proc_compound+0x33e/0x640 [nfsd]
[   81.224304]  nfsd_dispatch+0x9e/0x210 [nfsd]
[   81.224371]  svc_process_common+0x46e/0x6c0 [sunrpc]
[   81.224438]  ? nfsd_destroy+0x50/0x50 [nfsd]
[   81.224508]  svc_process+0xb7/0xf0 [sunrpc]
[   81.224566]  nfsd+0xe3/0x140 [nfsd]
[   81.224609]  kthread+0x112/0x130
[   81.224649]  ? kthread_create_worker_on_cpu+0x70/0x70
[   81.224704]  ret_from_fork+0x35/0x40
[   81.224745] Code: b7 89 d3 e8 13 fa 67 00 85 c0 0f 85 c1 77 00 00 48 83 c4 08 5b 5d 41 5c 41 5d c3 0f 1f 44 00 00 48 85 f6 74 13 8b 47 20 48 01 c6 <48> 33 36 48 33 b7 38 01 00 00 0f 18 0e c3 66 90 0f 1f 44 00 00 
[   81.224984] RIP: prefetch_freepointer+0x10/0x20 RSP: ffffa73043d33c58
[   81.225076] ---[ end trace a1b3921de0d50156 ]---

Comment 27 Göran Uddeborg 2018-07-30 20:03:03 UTC
Created attachment 1471638
Strace output from starting nfs service, after the previous failed

In case it might help in debugging:

When this happened here one of the first times, I tried to restart the nfs-server service, but it didn't want to come up.  I then tried to manually start the server, with strace to see what it did.  It turns out it hangs trying to read /proc/fs/nfsd/versions.  The "openat" system call succeeds, but the "read" call hangs forever.  The process is in uninterruptible sleep in this situation.  Trying a plain "cat /proc/fs/nfsd/versions" also hangs in the same way.  I attach the strace output.

After going back to 4.16.11-300.fc28.x86_64, things work again.
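
A minimal sketch of how the hang on /proc/fs/nfsd/versions described above could be probed without wedging the calling shell. The read runs in a child process and the parent waits only for a bounded time. The helper name `read_hangs` and the 5-second default timeout are assumptions for illustration; they are not taken from the attached strace.

```python
import subprocess
import sys


def read_hangs(path, timeout=5.0):
    """Return True if reading `path` does not finish within `timeout` seconds.

    A missing or unreadable file makes the child exit quickly, which counts
    as "did not hang" here.
    """
    # Do the read in a child process so that a read stuck in
    # uninterruptible sleep cannot block this script.
    child = subprocess.Popen(
        [sys.executable, "-c",
         "import sys; open(sys.argv[1], 'rb').read()", path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    try:
        child.wait(timeout=timeout)
        return False
    except subprocess.TimeoutExpired:
        # Best effort only: a process in uninterruptible sleep does not act
        # on signals until the blocked call returns, so do not wait for it.
        child.kill()
        return True


if __name__ == "__main__":
    target = "/proc/fs/nfsd/versions"
    print(target, "hangs" if read_hangs(target) else "did not hang")
```

On a server in the state described above, the probe would be expected to report a hang, and the child may remain stuck until the machine is rebooted.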

Comment 28 Thomas Clark 2018-07-30 23:12:05 UTC
I was too hopeful too soon.  Kernel 4.17.10-200.fc28 ran for 36 hours, then crashed with an error log almost identical to H.J. Lu's, above.  There was no apparent spike in activity that provoked it.  The only clue I saw is that the load precipitously went up to 12, then the server became unresponsive, followed by the crash.

Comment 29 H.J. Lu 2018-07-30 23:23:26 UTC
I have a workload which can trigger the kernel NFS oops in less than 10
minutes.

Comment 30 Thomas Clark 2018-07-30 23:52:41 UTC
(In reply to Thomas Clark from comment #28)
> I was too hopeful too soon.  Kernel 4.17.10-200.fc28 ran for 36 hours, then
> crashed with an error log almost identical to H. J. Lu, above.  There was no
> apparent spike in activity that provoked it.  The only clue I saw is that
> the load precipitously went up to 12, then became unresponsive, followed by
> the crash.

Just to clarify, the load was not due to increased workload.  The load of 12 was caused by whatever caused the crash.  The normal workload on this server is less than 1.

Comment 31 H.J. Lu 2018-07-31 13:29:01 UTC
4.18.0-rc1 survived my workload.

Comment 32 H.J. Lu 2018-07-31 19:33:48 UTC
I got this KASAN report on 4.17.11+:

[   38.985832] ==================================================================
[   38.986011] BUG: KASAN: use-after-free in nfsd4_process_open2+0x12c9/0x22b0 [nfsd]
[   38.986138] Read of size 16 at addr ffff8807a43a96e0 by task nfsd/1087

[   38.986279] CPU: 4 PID: 1087 Comm: nfsd Not tainted 4.17.11+ #12
[   38.986281] Hardware name: Gigabyte Technology Co., Ltd. H87M-D3H/H87M-D3H, BIOS F11 08/18/2015
[   38.986283] Call Trace:
[   38.986292]  dump_stack+0x71/0xac
[   38.986299]  print_address_description+0x6c/0x23c
[   38.986342]  ? nfsd4_process_open2+0x12c9/0x22b0 [nfsd]
[   38.986347]  kasan_report.cold.6+0x241/0x2fd
[   38.986389]  nfsd4_process_open2+0x12c9/0x22b0 [nfsd]
[   38.986433]  ? nfsd4_process_open1+0x790/0x790 [nfsd]
[   38.986469]  ? fh_verify+0x299/0x9f0 [nfsd]
[   38.986474]  ? kmem_cache_alloc+0x1bb/0x260
[   38.986478]  ? memcmp+0x45/0x70
[   38.986482]  ? __radix_tree_preload+0x30/0xd0
[   38.986518]  ? SVCFH_fmt+0xb0/0xb0 [nfsd]
[   38.986560]  nfsd4_open+0x42f/0xb70 [nfsd]
[   38.986601]  nfsd4_proc_compound+0x681/0xaf0 [nfsd]
[   38.986644]  ? nfsd4_release_compoundargs+0xb0/0xb0 [nfsd]
[   38.986678]  nfsd_dispatch+0x11b/0x350 [nfsd]
[   38.986736]  svc_process_common+0x828/0xc60 [sunrpc]
[   38.986771]  ? nfsd_svc+0x3c0/0x3c0 [nfsd]
[   38.986829]  ? svc_xprt_do_enqueue+0x1f/0x2c0 [sunrpc]
[   38.986884]  ? svc_printk+0x170/0x170 [sunrpc]
[   38.986941]  ? svc_xprt_release+0x183/0x300 [sunrpc]
[   38.986996]  svc_process+0x196/0x1f0 [sunrpc]
[   38.987032]  nfsd+0x182/0x200 [nfsd]
[   38.987068]  ? nfsd_destroy+0xb0/0xb0 [nfsd]
[   38.987072]  kthread+0x1a0/0x1c0
[   38.987078]  ? kthread_create_worker_on_cpu+0xc0/0xc0
[   38.987083]  ret_from_fork+0x35/0x40

[   38.987119] Allocated by task 1087:
[   38.987183]  kasan_kmalloc+0xbf/0xe0
[   38.987187]  kmem_cache_alloc+0x107/0x260
[   38.987228]  nfs4_alloc_stid+0x25/0x120 [nfsd]
[   38.987269]  nfsd4_process_open2+0x1c47/0x22b0 [nfsd]
[   38.987307]  nfsd4_open+0x42f/0xb70 [nfsd]
[   38.987346]  nfsd4_proc_compound+0x681/0xaf0 [nfsd]
[   38.987380]  nfsd_dispatch+0x11b/0x350 [nfsd]
[   38.987434]  svc_process_common+0x828/0xc60 [sunrpc]
[   38.987488]  svc_process+0x196/0x1f0 [sunrpc]
[   38.987522]  nfsd+0x182/0x200 [nfsd]
[   38.987526]  kthread+0x1a0/0x1c0
[   38.987530]  ret_from_fork+0x35/0x40

[   38.987562] Freed by task 1087:
[   38.987620]  __kasan_slab_free+0x12e/0x180
[   38.987624]  kmem_cache_free+0x7a/0x220
[   38.987664]  nfs4_free_deleg+0x14/0x30 [nfsd]
[   38.987705]  nfs4_put_stid+0x77/0xb0 [nfsd]
[   38.987745]  destroy_unhashed_deleg+0xc7/0x100 [nfsd]
[   38.987786]  nfsd4_process_open2+0x202d/0x22b0 [nfsd]
[   38.987824]  nfsd4_open+0x42f/0xb70 [nfsd]
[   38.987863]  nfsd4_proc_compound+0x681/0xaf0 [nfsd]
[   38.987897]  nfsd_dispatch+0x11b/0x350 [nfsd]
[   38.987951]  svc_process_common+0x828/0xc60 [sunrpc]
[   38.988005]  svc_process+0x196/0x1f0 [sunrpc]
[   38.988039]  nfsd+0x182/0x200 [nfsd]
[   38.988042]  kthread+0x1a0/0x1c0
[   38.988047]  ret_from_fork+0x35/0x40

[   38.988080] The buggy address belongs to the object at ffff8807a43a96d8
                which belongs to the cache nfsd4_delegations of size 232
[   38.988288] The buggy address is located 8 bytes inside of
                232-byte region [ffff8807a43a96d8, ffff8807a43a97c0)
[   38.988470] The buggy address belongs to the page:
[   38.988552] page:ffffea001e90ea00 count:1 mapcount:0 mapping:0000000000000000 index:0xffff8807a43ab910 compound_mapcount: 0
[   38.988732] flags: 0x17fffe00008100(slab|head)
[   38.988812] raw: 0017fffe00008100 0000000000000000 ffff8807a43ab910 00000001001c0001
[   38.988940] raw: ffffea001abff420 ffff8806ba25be50 ffff8806bd343040 0000000000000000
[   38.989065] page dumped because: kasan: bad access detected

[   38.989186] Memory state around the buggy address:
[   38.989268]  ffff8807a43a9580: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[   38.989386]  ffff8807a43a9600: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[   38.989505] >ffff8807a43a9680: fc fc fc fc fc fc fc fc fc fc fc fb fb fb fb fb
[   38.989621]                                                        ^
[   38.989744]  ffff8807a43a9700: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[   38.989863]  ffff8807a43a9780: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
[   38.989979] ==================================================================
[   38.990096] Disabling lock debugging due to kernel taint
[   38.990120] ------------[ cut here ]------------
[   38.990121] refcount_t: underflow; use-after-free.
[   38.990158] WARNING: CPU: 4 PID: 1087 at lib/refcount.c:281 refcount_dec_not_one+0x111/0x120
[   38.990159] Modules linked in: devlink ebtable_filter ebtables ip6table_filter ip6_tables intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel intel_cstate intel_uncore intel_rapl_perf iTCO_wdt gpio_ich iTCO_vendor_support mei_wdt ppdev snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec_hdmi snd_hda_intel joydev snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device snd_pcm pcspkr snd_timer i2c_i801 lpc_ich snd mei_me soundcore mei parport_pc parport pcc_cpufreq nfsd auth_rpcgss nfs_acl lockd grace sunrpc i915 i2c_algo_bit crc32c_intel drm_kms_helper r8169 drm mii video
[   38.990237] CPU: 4 PID: 1087 Comm: nfsd Tainted: G    B             4.17.11+ #12
[   38.990239] Hardware name: Gigabyte Technology Co., Ltd. H87M-D3H/H87M-D3H, BIOS F11 08/18/2015
[   38.990244] RIP: 0010:refcount_dec_not_one+0x111/0x120
[   38.990247] RSP: 0018:ffff88069bdf79b0 EFLAGS: 00010286
[   38.990251] RAX: 0000000000000000 RBX: ffff8807a43a96d8 RCX: ffffffff8129be0e
[   38.990253] RDX: 0000000000000000 RSI: 0000000000000008 RDI: ffff8807b931f910
[   38.990256] RBP: 00000000ffffffff R08: ffffed00f7263f23 R09: ffffed00f7263f22
[   38.990259] R10: ffffed00f7263f22 R11: ffff8807b931f917 R12: 1ffff100d37bef37
[   38.990261] R13: ffff8807a85c8f08 R14: ffff8807ae5d6c28 R15: ffff8807a85c8f08
[   38.990265] FS:  0000000000000000(0000) GS:ffff8807b9300000(0000) knlGS:0000000000000000
[   38.990267] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   38.990270] CR2: 00007f61b28d04a0 CR3: 0000000002c0e006 CR4: 00000000001606e0
[   38.990272] Call Trace:
[   38.990279]  ? refcount_dec_if_one+0xb0/0xb0
[   38.990284]  ? _raw_spin_unlock_irqrestore+0x1b/0x30
[   38.990289]  refcount_dec_and_lock+0x11/0x50
[   38.990332]  nfs4_put_stid+0x3a/0xb0 [nfsd]
[   38.990374]  nfsd4_process_open2+0x1324/0x22b0 [nfsd]
[   38.990421]  ? nfsd4_process_open1+0x790/0x790 [nfsd]
[   38.990457]  ? fh_verify+0x299/0x9f0 [nfsd]
[   38.990462]  ? kmem_cache_alloc+0x1bb/0x260
[   38.990466]  ? memcmp+0x45/0x70
[   38.990470]  ? __radix_tree_preload+0x30/0xd0
[   38.990509]  ? SVCFH_fmt+0xb0/0xb0 [nfsd]
[   38.990550]  nfsd4_open+0x42f/0xb70 [nfsd]
[   38.990592]  nfsd4_proc_compound+0x681/0xaf0 [nfsd]
[   38.990638]  ? nfsd4_release_compoundargs+0xb0/0xb0 [nfsd]
[   38.990673]  nfsd_dispatch+0x11b/0x350 [nfsd]
[   38.990729]  svc_process_common+0x828/0xc60 [sunrpc]
[   38.990765]  ? nfsd_svc+0x3c0/0x3c0 [nfsd]
[   38.990826]  ? svc_xprt_do_enqueue+0x1f/0x2c0 [sunrpc]
[   38.990881]  ? svc_printk+0x170/0x170 [sunrpc]
[   38.990938]  ? svc_xprt_release+0x183/0x300 [sunrpc]
[   38.990994]  svc_process+0x196/0x1f0 [sunrpc]
[   38.991033]  nfsd+0x182/0x200 [nfsd]
[   38.991068]  ? nfsd_destroy+0xb0/0xb0 [nfsd]
[   38.991073]  kthread+0x1a0/0x1c0
[   38.991078]  ? kthread_create_worker_on_cpu+0xc0/0xc0
[   38.991083]  ret_from_fork+0x35/0x40
[   38.991086] Code: ff 74 c7 83 f8 01 74 27 8d 68 ff 39 e8 73 91 80 3d 58 41 a8 01 00 75 b2 48 c7 c7 e0 9a 5e 82 c6 05 48 41 a8 01 01 e8 ef 07 9f ff <0f> 0b eb 9b 31 c0 eb 9c e8 e2 04 9f ff 66 90 41 54 49 89 f4 55
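
For readers checking the KASAN report above, the address arithmetic can be reproduced by hand. The sketch below uses only values copied from the report. The one outside assumption is the generic KASAN granularity of one shadow byte per 8 bytes of memory; in the kernel's KASAN sources of this era, a shadow byte of fb marks a freed slab object and fc a slab redzone.

```python
# Values copied from the KASAN report above.
ACCESS_ADDR = 0xffff8807a43a96e0   # "Read of size 16 at addr ..."
OBJECT_START = 0xffff8807a43a96d8  # "belongs to the object at ..."
OBJECT_SIZE = 232                  # "cache nfsd4_delegations of size 232"
ROW_START = 0xffff8807a43a9680     # shadow row marked with '>'
ROW = "fc fc fc fc fc fc fc fc fc fc fc fb fb fb fb fb".split()

# Assumption: generic KASAN, one shadow byte describes 8 bytes of memory.
SHADOW_GRANULE = 8

# Distance of the faulting read from the start of the object.
offset = ACCESS_ADDR - OBJECT_START
# Exclusive end of the object's region.
region_end = OBJECT_START + OBJECT_SIZE
# Position in the marked row of the shadow byte covering the faulting address.
shadow_index = (ACCESS_ADDR - ROW_START) // SHADOW_GRANULE

print(f"offset inside object: {offset} bytes")
print(f"region end: {region_end:#x}")
print(f"shadow byte at caret: index {shadow_index} = {ROW[shadow_index]}")
```

The results agree with the report: the read lands 8 bytes into the 232-byte object, the region ends at ffff8807a43a97c0, and the shadow byte under the caret is fb, consistent with a read of an already freed nfsd4_delegations object.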

Comment 33 H.J. Lu 2018-07-31 21:28:57 UTC
I am testing 2 backports from 4.18.0-rc on 4.17.11:

commit 692ad280bff3e81721ab138b9455948ab5289acf
Author: Andrew Elble <aweits>
Date:   Wed Apr 18 17:04:37 2018 -0400

    nfsd: fix error handling in nfs4_set_delegation()

and

commit 3171822fdcdd6e6d536047c425af6dc7a92dc585
Author: Scott Mayhew <smayhew>
Date:   Fri Jun 8 16:31:46 2018 -0400

    nfsd: fix potential use-after-free in nfsd4_decode_getdeviceinfo

They survived my workload.

Comment 34 Thomas Clark 2018-08-04 17:02:23 UTC
This morning, kernel 4.17.11-200.fc28 was moved from updates-testing into the stable repository.  Yet 4.17.11 is known not to be stable.  Worse, it has a major flaw that has a 100% chance of crashing any server that performs one of the basic (if not the most basic) functions of a Linux server.  Furthermore, this crash has a significant chance of causing data loss and even filesystem corruption.

Can anybody explain how the 4.17 kernel series has marched on through testing and validation, with each iteration having the same well-documented fatal flaw?  What is the point of having a testing system if kernels known to be flawed get moved forward anyway?

Comment 35 H.J. Lu 2018-08-04 19:40:30 UTC
4.17.12 should fix this bug since it has the 2 backports I have been
using.  I have been running 4.17.12 for more than a day without issues.
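
A rough sketch for checking whether a given kernel release string falls in the range reported as failing in this bug. The upper bound, 4.17.12, comes from the comment above. The lower bound of 4.17.0 is an assumption: the reports here cover 4.17.3 through 4.17.11, and 4.16.11 was reported to work. The function names are hypothetical.

```python
import platform
import re

FIRST_FIXED = (4, 17, 12)   # per this report: carries both backports
SERIES_START = (4, 17, 0)   # assumption: start of the 4.17 series


def parse_release(release):
    """Extract (major, minor, patch) from a string like '4.17.11-200.fc28.x86_64'."""
    m = re.match(r"(\d+)\.(\d+)\.(\d+)", release)
    if m is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return tuple(int(part) for part in m.groups())


def in_reported_range(release):
    """True if `release` is a 4.17.x kernel older than 4.17.12."""
    return SERIES_START <= parse_release(release) < FIRST_FIXED


if __name__ == "__main__":
    running = platform.release()
    print(running, "in reported range:", in_reported_range(running))
```

This only compares version numbers. A distribution kernel may carry the two patches under an older version number, so a True result is a hint to check, not proof that the bug is present.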

Comment 36 Frank Ch. Eigler 2018-09-02 14:27:17 UTC
4.17.14-102.fc27 also looking good.

