Bug 1713937 - rpc.mountd dumps core with nfs-utils 2.3.4
Summary: rpc.mountd dumps core with nfs-utils 2.3.4
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: nfs-utils
Version: 30
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Assignee: Steve Dickson
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-05-25 21:27 UTC by Bojan Smojver
Modified: 2019-11-24 01:54 UTC (History)

Fixed In Version: nfs-utils-2.3.4-2.fc30 nfs-utils-2.3.3-4.rc2.fc29
Clone Of:
Environment:
Last Closed: 2019-06-25 01:25:17 UTC
Type: Bug
Embargoed:


Description Bojan Smojver 2019-05-25 21:27:55 UTC
Description of problem:

rpc.mountd dumps core. Reverting to previous build makes the NFS server work again.

      PID: 3056 (rpc.mountd)
       UID: 0 (root)
       GID: 0 (root)
    Signal: 11 (SEGV)
 Timestamp: Fri 2019-05-24 15:43:23 AEST (10min ago)

Command Line: /usr/sbin/rpc.mountd
 Executable: /usr/sbin/rpc.mountd
 Control Group: /system.slice/nfs-mountd.service
 Unit: nfs-mountd.service
 Slice: system.slice
 Boot ID: <boot>
 Machine ID: <machine>
 Hostname: <host>
 Storage: /var/lib/systemd/coredump/core.rpc\x2emountd.0.9a81e480746b479d9c8ae9618ff17404.3056.1558676603000000.lz4 
Message: Process 3056 (rpc.mountd) of user 0 dumped core.

            Stack trace of thread 3056:
            #0  0x000055b227043f5f n/a (rpc.mountd)
            #1  0x000055b22704418d n/a (rpc.mountd)
            #2  0x000055b22703df13 n/a (rpc.mountd)
            #3  0x000055b22703e34b n/a (rpc.mountd)
            #4  0x000055b22703ae32 n/a (rpc.mountd)
            #5  0x000055b22703ce93 n/a (rpc.mountd)
            #6  0x000055b22703d3a0 n/a (rpc.mountd)
            #7  0x000055b2270380f0 n/a (rpc.mountd)
            #8  0x00007f15da422f33 __libc_start_main (libc.so.6)
            #9  0x000055b22703823e n/a (rpc.mountd)


Version-Release number of selected component (if applicable):
2.3.4-1

How reproducible:
Always.


Steps to Reproduce:
1. Attempt to mount an NFS directory.
2. rpc.mountd crashes.

Actual results:
SIGSEGV.

Expected results:
Previous RPM 2.3.3-7.rc2 works fine

Comment 1 Bojan Smojver 2019-05-27 11:12:54 UTC
I tested this version with kernel 5.1.5 as well and got the same result. Core dump on the server side and a kernel Oops on the client side. Reverting to previous version makes things work.

The client works against EFS, but not against itself on F30.

Comment 2 Bojan Smojver 2019-05-27 12:11:46 UTC
For posterity, client side Oops-es:
--------------------
May 27 17:51:14 host kernel: BUG: unable to handle kernel NULL pointer dereference at 0000000000000098
May 27 17:51:14 host kernel: #PF error: [normal kernel read fault]
May 27 17:51:14 host kernel: PGD 0 P4D 0 
May 27 17:51:14 host kernel: Oops: 0000 [#1] SMP PTI
May 27 17:51:14 host kernel: CPU: 0 PID: 3415 Comm: automount Not tainted 5.1.5-300.fc30.x86_64 #1
May 27 17:51:14 host kernel: Hardware name: LENOVO 20BXCTO1WW/20BXCTO1WW, BIOS JBET72WW (1.36 ) 02/23/2019
May 27 17:51:14 host kernel: RIP: 0010:xprt_adjust_timeout+0x9/0xe0 [sunrpc]
May 27 17:51:14 host kernel: Code: 05 00 01 00 00 48 89 83 f8 00 00 00 5b 5d 41 5c 41 5d 41 5e c3 66 66 2e 0f 1f 84 00 00 00 00 00 90 0f 1f 44 00 00 41 54 55 53 <48> 8b 87 98 00 00 00 48 89 fb 48 8b 80 a8 00 00 00 48 8b 68 78 48
May 27 17:51:14 host kernel: RSP: 0018:ffffb99a0174baf0 EFLAGS: 00010207
May 27 17:51:14 host kernel: RAX: 00000000fffffff5 RBX: ffff94505f378a00 RCX: 0000000000000003
May 27 17:51:14 host kernel: RDX: ffff94506252cac0 RSI: 00000000fffffe01 RDI: 0000000000000000
May 27 17:51:14 host kernel: RBP: ffff94505ea04500 R08: ffff94506252ca90 R09: ffff94506252cac0
May 27 17:51:14 host kernel: R10: 0000000000000003 R11: ffffe6e10752ce20 R12: ffff944ffcbd3400
May 27 17:51:14 host kernel: R13: ffff94505ea04530 R14: 0000000000004080 R15: ffffffffc07b13f0
May 27 17:51:14 host kernel: FS:  00007f262a1e1700(0000) GS:ffff945065a00000(0000) knlGS:0000000000000000
May 27 17:51:14 host kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May 27 17:51:14 host kernel: CR2: 0000000000000098 CR3: 000000021fb0a005 CR4: 00000000003606f0
May 27 17:51:14 host kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
May 27 17:51:14 host kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
May 27 17:51:14 host kernel: Call Trace:
May 27 17:51:14 host kernel:  rpc_check_timeout+0x1e/0xe0 [sunrpc]
May 27 17:51:14 host kernel:  call_decode+0x123/0x180 [sunrpc]
May 27 17:51:14 host kernel:  __rpc_execute+0x7c/0x330 [sunrpc]
May 27 17:51:14 host kernel:  ? recalibrate_cpu_khz+0x10/0x10
May 27 17:51:14 host kernel:  rpc_run_task+0x10a/0x140 [sunrpc]
May 27 17:51:14 host kernel:  nfs4_call_sync_sequence+0x68/0xa0 [nfsv4]
May 27 17:51:14 host kernel:  _nfs4_proc_getattr+0xfb/0x120 [nfsv4]
May 27 17:51:14 host kernel:  nfs4_proc_getattr+0x73/0x110 [nfsv4]
May 27 17:51:14 host kernel:  __nfs_revalidate_inode+0x11a/0x2f0 [nfs]
May 27 17:51:14 host kernel:  nfs_getattr+0x115/0x2a0 [nfs]
May 27 17:51:14 host kernel:  ? security_inode_getattr+0x3a/0x50
May 27 17:51:14 host kernel:  vfs_statx+0x94/0xf0
May 27 17:51:14 host kernel:  __do_sys_newlstat+0x39/0x70
May 27 17:51:14 host kernel:  do_syscall_64+0x5b/0x170
May 27 17:51:14 host kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xa9
May 27 17:51:14 host kernel: RIP: 0033:0x7f2642542599
May 27 17:51:14 host kernel: Code: ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 f3 0f 1e fa 49 89 f0 48 89 d6 83 ff 01 77 31 4c 89 c7 b8 06 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 07 c3 66 0f 1f 44 00 00 48 8b 15 b9 48 0d 00
May 27 17:51:14 host kernel: RSP: 002b:00007f262a1deb48 EFLAGS: 00000246 ORIG_RAX: 0000000000000006
May 27 17:51:14 host kernel: RAX: ffffffffffffffda RBX: 0000556f2b47f2b0 RCX: 00007f2642542599
May 27 17:51:14 host kernel: RDX: 00007f262a1deb70 RSI: 00007f262a1deb70 RDI: 00007f262a1dec50
May 27 17:51:14 host kernel: RBP: 0000556f2b47f2b0 R08: 00007f262a1dec50 R09: 00007f262a1deae0
May 27 17:51:14 host kernel: R10: 00007f262a1dfd20 R11: 0000000000000246 R12: 00007f262a1dec50
May 27 17:51:14 host kernel: R13: 0000000000000001 R14: 0000556f2a3a09a0 R15: 00007f262a1e0e40
May 27 17:51:14 host kernel: Modules linked in: rpcsec_gss_krb5 nfsv4 dns_resolver nfs lockd grace fscache fuse ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ebtable_nat ip6table_nat ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat iptable_mangle iptable_raw iptable_security nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c ip_set nfnetlink ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables rmi_smbus rmi_core vfat fat intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm mei_wdt mei_hdcp irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel intel_cstate intel_uncore intel_rapl_perf joydev intel_pch_thermal i2c_i801 mei_me mei pcc_cpufreq auth_rpcgss sunrpc i915 rtsx_pci_sdmmc mmc_core i2c_algo_bit drm_kms_helper e1000e crc32c_intel drm serio_raw rtsx_pci wmi video
May 27 17:51:14 host kernel: CR2: 0000000000000098
May 27 17:51:14 host kernel: ---[ end trace 9dc0e1a830bdaf41 ]---
May 27 17:51:16 host kernel: BUG: unable to handle kernel NULL pointer dereference at 0000000000000098
May 27 17:51:16 host kernel: #PF error: [normal kernel read fault]
May 27 17:51:16 host kernel: PGD 0 P4D 0 
May 27 17:51:16 host kernel: Oops: 0000 [#2] SMP PTI
May 27 17:51:16 host kernel: CPU: 3 PID: 166 Comm: kworker/u16:3 Tainted: G      D           5.1.5-300.fc30.x86_64 #1
May 27 17:51:16 host kernel: Hardware name: LENOVO 20BXCTO1WW/20BXCTO1WW, BIOS JBET72WW (1.36 ) 02/23/2019
May 27 17:51:16 host kernel: Workqueue: rpciod rpc_async_schedule [sunrpc]
May 27 17:51:16 host kernel: RIP: 0010:xprt_adjust_timeout+0x9/0xe0 [sunrpc]
May 27 17:51:16 host kernel: Code: 05 00 01 00 00 48 89 83 f8 00 00 00 5b 5d 41 5c 41 5d 41 5e c3 66 66 2e 0f 1f 84 00 00 00 00 00 90 0f 1f 44 00 00 41 54 55 53 <48> 8b 87 98 00 00 00 48 89 fb 48 8b 80 a8 00 00 00 48 8b 68 78 48
May 27 17:51:16 host kernel: RSP: 0018:ffffb99a01067d68 EFLAGS: 00010207
May 27 17:51:16 host kernel: RAX: 00000000fffffff5 RBX: ffff94505f37b600 RCX: 0000000000000003
May 27 17:51:16 host kernel: RDX: ffff94506252cac0 RSI: 00000000fffffe01 RDI: 0000000000000000
May 27 17:51:16 host kernel: RBP: ffff94505fecc800 R08: ffff94506252ca90 R09: ffff94506252cac0
May 27 17:51:16 host kernel: R10: 0000000000000003 R11: 0000000000000018 R12: ffff94505f379000
May 27 17:51:16 host kernel: R13: ffff94505fecc830 R14: 0000000000005a81 R15: ffffffffc07b13f0
May 27 17:51:16 host kernel: FS:  0000000000000000(0000) GS:ffff945065ac0000(0000) knlGS:0000000000000000
May 27 17:51:16 host kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May 27 17:51:16 host kernel: CR2: 0000000000000098 CR3: 000000014520e006 CR4: 00000000003606e0
May 27 17:51:16 host kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
May 27 17:51:16 host kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
May 27 17:51:16 host kernel: Call Trace:
May 27 17:51:16 host kernel:  rpc_check_timeout+0x1e/0xe0 [sunrpc]
May 27 17:51:16 host kernel:  call_decode+0x123/0x180 [sunrpc]
May 27 17:51:16 host kernel:  __rpc_execute+0x7c/0x330 [sunrpc]
May 27 17:51:16 host kernel:  rpc_async_schedule+0x29/0x40 [sunrpc]
May 27 17:51:16 host kernel:  process_one_work+0x19d/0x380
May 27 17:51:16 host kernel:  worker_thread+0x50/0x3b0
May 27 17:51:16 host kernel:  kthread+0xfb/0x130
May 27 17:51:16 host kernel:  ? process_one_work+0x380/0x380
May 27 17:51:16 host kernel:  ? kthread_park+0x90/0x90
May 27 17:51:16 host kernel:  ret_from_fork+0x35/0x40
May 27 17:51:16 host kernel: Modules linked in: rpcsec_gss_krb5 nfsv4 dns_resolver nfs lockd grace fscache fuse ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ebtable_nat ip6table_nat ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat iptable_mangle iptable_raw iptable_security nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c ip_set nfnetlink ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables rmi_smbus rmi_core vfat fat intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm mei_wdt mei_hdcp irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel intel_cstate intel_uncore intel_rapl_perf joydev intel_pch_thermal i2c_i801 mei_me mei pcc_cpufreq auth_rpcgss sunrpc i915 rtsx_pci_sdmmc mmc_core i2c_algo_bit drm_kms_helper e1000e crc32c_intel drm serio_raw rtsx_pci wmi video
May 27 17:51:16 host kernel: CR2: 0000000000000098
May 27 17:51:16 host kernel: ---[ end trace 9dc0e1a830bdaf42 ]---
--------------------
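
A note for readers decoding the Oops above: the bytes at the faulting RIP, marked `<48> 8b 87 98 00 00 00`, are the x86-64 encoding of `mov 0x98(%rdi),%rax`. RDI is 0 in both dumps, so the read lands at address 0x98, which is the value logged in CR2. The Python sketch below redoes that arithmetic from the logged values. The helper name is hypothetical, and it recognises only this one instruction form; it is not a general disassembler.

```python
import re

# Values copied from the first Oops in this comment (Code: line shortened).
CODE_TAIL = "41 54 55 53 <48> 8b 87 98 00 00 00 48 89 fb"
REGS = "RDX: ffff94506252cac0 RSI: 00000000fffffe01 RDI: 0000000000000000"
CR2 = 0x98


def fault_address(code: str, regs: str) -> int:
    """Return the address read by `mov disp32(%rdi),%rax` at the faulting RIP.

    Hypothetical helper: it only understands the byte pattern 48 8b 87 <disp32>.
    """
    # The kernel wraps the byte at the faulting RIP in angle brackets.
    start = code.index("<")
    insn = [int(b.strip("<>"), 16) for b in code[start:].split()[:7]]
    if insn[:3] != [0x48, 0x8B, 0x87]:
        raise ValueError("not mov disp32(%rdi),%rax")
    # The 32-bit displacement is stored little-endian and is signed.
    disp = int.from_bytes(bytes(insn[3:7]), "little", signed=True)
    rdi = int(re.search(r"RDI: ([0-9a-f]+)", regs).group(1), 16)
    return rdi + disp


addr = fault_address(CODE_TAIL, REGS)
print(hex(addr), addr == CR2)
```

In other words, xprt_adjust_timeout was handed a NULL pointer in its first argument and faulted on the first field it read from it.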

Comment 3 Steve Dickson 2019-05-27 19:08:36 UTC
I am not able to reproduce this... 


(In reply to Bojan Smojver from comment #1)
> I tested this version with kernel 5.1.5 as well and got the same result.
> Core dump on the server side and a kernel Oops on the client side. Reverting
> to previous version makes things work.
Would it be possible to run dnf debuginfo-install nfs-utils-2.3.4-1.fc30.x86_64?
With the debuginfo installed, the backtrace should be populated with symbols.

> 
> The client works against EFS, but not against itself on F30.
I don't understand what this means...

Comment 4 Bojan Smojver 2019-05-27 20:27:42 UTC
Trace with debuginfo installed:
--------------------------
           PID: 3704 (rpc.mountd)
           UID: 0 (root)
           GID: 0 (root)
        Signal: 11 (SEGV)
     Timestamp: Tue 2019-05-28 06:18:23 AEST (1min 12s ago)
  Command Line: /usr/sbin/rpc.mountd
    Executable: /usr/sbin/rpc.mountd
 Control Group: /system.slice/nfs-mountd.service
          Unit: nfs-mountd.service
         Slice: system.slice
       Boot ID: 0ecbf9e012f44868b6051b46f73caa50
    Machine ID: <machine>
      Hostname: <host>
       Storage: /var/lib/systemd/coredump/core.rpc\x2emountd.0.0ecbf9e012f44868b6051b46f73caa50.3704.1558988303000000.lz4
       Message: Process 3704 (rpc.mountd) of user 0 dumped core.
                
                Stack trace of thread 3704:
                #0  0x0000555be4766f5f DoMatch (rpc.mountd)
                #1  0x0000555be476718d wildmat (rpc.mountd)
                #2  0x0000555be4760f13 check_wildcard (rpc.mountd)
                #3  0x0000555be476134b client_compose (rpc.mountd)
                #4  0x0000555be475de32 auth_unix_ip (rpc.mountd)
                #5  0x0000555be475fe93 cache_process_req (rpc.mountd)
                #6  0x0000555be47603a0 my_svc_run (rpc.mountd)
                #7  0x0000555be475b0f0 main (rpc.mountd)
                #8  0x00007fea3deb0f33 __libc_start_main (libc.so.6)
                #9  0x0000555be475b23e _start (rpc.mountd)
--------------------------

>> The client works against EFS, but not against itself on F30.
> I don't understand what this means...

Client F30, nfs-utils 2.3.4 --> Server F30, nfs-utils 2.3.4: server core dumps, client kernel Oops-es
Client F30, nfs-utils 2.3.4 --> Server Amazon EFS (i.e. NFSv4 essentially), works fine on both ends

I tried both of the above scenarios with kernel 5.1.4-1.300 and 5.1.5-1.300. Same.

My exports file (F30), if it matters:
--------------------------
#
/home/groups *.<my-domain>(rw,sync,sec=krb5)
/home/users *.<my-domain>(rw,sync,sec=krb5)
--------------------------

I tried without sec=krb5 too. Same result, if I remember correctly. The FS on the client side is normally mounted through autofs, but I also tried mounting it by hand. Same result.
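
The symbolized trace in this comment puts the fault in the wildcard-matching path (check_wildcard, then wildmat, then DoMatch), and both exports use a wildcard client, `*.<my-domain>`. As a rough illustration of what that path decides, and not a copy of the nfs-utils code, the Python sketch below matches a client hostname against a wildcard export client. The function name, the example.com domain, and the explicit guard for a missing hostname are all assumptions made for illustration; the thread does not say what the upstream patch actually changed.

```python
from fnmatch import fnmatchcase
from typing import Optional


def wildcard_client_matches(pattern: str, hostname: Optional[str]) -> bool:
    """Return True if hostname matches a wildcard client such as '*.example.com'.

    Hypothetical helper for illustration only; rpc.mountd does this in C.
    """
    if not hostname:
        # Assumption: with no name to compare (for example, no hostname was
        # resolved for the client address), report "no match" instead of
        # handing an empty value to the matcher.
        return False
    # Hostnames compare case-insensitively, so normalise both sides first.
    return fnmatchcase(hostname.lower(), pattern.lower())


print(wildcard_client_matches("*.example.com", "client1.example.com"))
print(wildcard_client_matches("*.example.com", None))
```

A C matcher has no equivalent of Python's None check for free, which is why a guard like the one above is the kind of thing worth looking for when a matcher segfaults.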

Comment 5 Bojan Smojver 2019-05-27 20:37:18 UTC
(In reply to Bojan Smojver from comment #4)

> I tried both of the above scenarios with kernel 5.1.4-1.300 and 5.1.5-1.300.

Er, I meant -300 there, not -1.300, of course.

Comment 6 Steve Dickson 2019-05-28 14:19:04 UTC
Could you please try this scratch build of nfs-utils
   https://koji.fedoraproject.org/koji/taskinfo?taskID=35111648 

It contains an upstream patch that I believe fixes the problem.

Comment 7 Bojan Smojver 2019-05-28 22:28:44 UTC
(In reply to Steve Dickson from comment #6)
> Could you please try this scratch build of nfs-utils
>    https://koji.fedoraproject.org/koji/taskinfo?taskID=35111648 
> 
> It contains an upstream patch that I believe fixes the problem.

Thanks for the quick turnaround, it works. No core dumps on the server.

I did see a kernel Oops once since the upgrade of the client, but not after I rebooted it. If it happens again, I'll let you know.

Comment 8 Fedora Update System 2019-05-29 18:31:33 UTC
FEDORA-2019-06f611666c has been submitted as an update to Fedora 30. https://bodhi.fedoraproject.org/updates/FEDORA-2019-06f611666c

Comment 9 Fedora Update System 2019-05-29 18:31:37 UTC
FEDORA-2019-06f611666c has been submitted as an update to Fedora 30. https://bodhi.fedoraproject.org/updates/FEDORA-2019-06f611666c

Comment 10 Fedora Update System 2019-05-29 18:42:45 UTC
FEDORA-2019-4cefd3161a has been submitted as an update to Fedora 29. https://bodhi.fedoraproject.org/updates/FEDORA-2019-4cefd3161a

Comment 11 Fedora Update System 2019-05-29 18:42:48 UTC
FEDORA-2019-4cefd3161a has been submitted as an update to Fedora 29. https://bodhi.fedoraproject.org/updates/FEDORA-2019-4cefd3161a

Comment 12 Fedora Update System 2019-05-30 13:57:29 UTC
nfs-utils-2.3.4-2.fc30 has been pushed to the Fedora 30 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2019-06f611666c

Comment 13 Fedora Update System 2019-05-30 15:34:41 UTC
nfs-utils-2.3.3-4.rc2.fc29 has been pushed to the Fedora 29 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2019-4cefd3161a

Comment 14 Bojan Smojver 2019-06-06 23:22:23 UTC
Steve,

Can I ask a general question here about those kernel Oops that I posted in this bug and that have been reported using abrt. Surely, a buggy nfs-utils package should not be able to cause those in the kernel, right? These are kernel bugs, correct?

Comment 15 J. Bruce Fields 2019-06-07 00:19:50 UTC
(In reply to Bojan Smojver from comment #14)
> Can I ask a general question here about those kernel Oops that I posted in
> this bug and that have been reported using abrt. Surely, a buggy nfs-utils
> package should not be able to cause those in the kernel, right? These are
> kernel bugs, correct?

Yes, that's a kernel bug probably unrelated to the mountd bug.  I don't recognize it.

Comment 16 Fedora Update System 2019-06-25 01:25:17 UTC
nfs-utils-2.3.4-2.fc30 has been pushed to the Fedora 30 stable repository. If problems still persist, please make note of it in this bug report.

Comment 17 Fedora Update System 2019-11-24 01:54:58 UTC
nfs-utils-2.3.3-4.rc2.fc29 has been pushed to the Fedora 29 stable repository. If problems still persist, please make note of it in this bug report.

