Bug 848867

Summary: 3.5.1-1.fc17.x86_64 kernel panic
Product: [Fedora] Fedora Reporter: H.J. Lu <hongjiu.lu>
Component: kernelAssignee: Jeff Layton <jlayton>
Status: CLOSED ERRATA QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 17CC: bfields, gansalmon, htl10, itamar, jaroslav.sykora, jlayton, jonathan, kernel-maint, madhu.chinakonda, steved
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2012-12-05 15:18:21 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description H.J. Lu 2012-08-16 15:51:58 UTC
On Fedora 17 with 24 cores, when I restart nfs server,
I got

[443503.679360] BUG: unable to handle kernel NULL pointer dereference at 0000000000000014
[443503.679386] IP: [<ffffffffa01e20aa>] svc_destroy+0x1a/0x130 [sunrpc]
[443503.679413] PGD 0 
[443503.679420] Oops: 0000 [#1] SMP 
[443503.679429] CPU 11 
[443503.679434] Modules linked in: nfs fscache tpm_bios igb ptp pps_core i7core_edac ioatdma edac_core dca lpc_ich coretemp kvm_intel i2c_i801 snd_hda_codec_realtek shpchp mfd_core kvm snd_hda_intel snd_hda_codec snd_hwdep snd_pcm nfsd snd_page_alloc snd_timer nfs_acl snd auth_rpcgss microcode soundcore lockd sunrpc uinput crc32c_intel ghash_clmulni_intel firewire_ohci firewire_core crc_itu_t nouveau mxm_wmi wmi video i2c_algo_bit drm_kms_helper ttm drm i2c_core [last unloaded: scsi_wait_scan]
[443503.679565] 
[443503.679568] Pid: 13391, comm: rpc.nfsd Not tainted 3.5.1-1.fc17.x86_64 #1 Intel Corporation S5520SC/S5520SC
[443503.679584] RIP: 0010:[<ffffffffa01e20aa>]  [<ffffffffa01e20aa>] svc_destroy+0x1a/0x130 [sunrpc]
[443503.679607] RSP: 0018:ffff880322801e58  EFLAGS: 00010246
[443503.679625] RAX: 00000000ffffff91 RBX: 0000000000000000 RCX: 0000000000000100
[443503.679633] RDX: 0000000000000100 RSI: 0000000050251f41 RDI: 0000000000000000
[443503.679642] RBP: ffff880322801e60 R08: ffff88065c1cd000 R09: 0000000000000000
[443503.679650] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000002
[443503.679659] R13: 0000000000000004 R14: ffff880322801f58 R15: 00007f2d0d0212ae
[443503.679668] FS:  00007f2d0cff4740(0000) GS:ffff88066fca0000(0000) knlGS:0000000000000000
[443503.679684] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[443503.679698] CR2: 0000000000000014 CR3: 000000065be8d000 CR4: 00000000000007e0
[443503.679739] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[443503.679796] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[443503.679837] Process rpc.nfsd (pid: 13391, threadinfo ffff880322800000, task ffff88035d701710)
[443503.679878] Stack:
[443503.679895]  ffff880382fae008 ffff880322801ee0 ffffffffa0285b7f 00007f2dffffff91
[443503.679961]  ffff880382fae009 ffff880322801e8f 0034fffffffffff4 0000000000000002
[443503.680023]  ffff880322801ea8 ffffffff81128586 ffff880322801ee0 ffffffff811a89d5
[443503.680103] Call Trace:
[443503.680133]  [<ffffffffa0285b7f>] write_ports+0x2cf/0x3e0 [nfsd]
[443503.680185]  [<ffffffff81128586>] ? get_zeroed_page+0x16/0x20
[443503.680229]  [<ffffffff811a89d5>] ? simple_transaction_get+0xc5/0xe0
[443503.680264]  [<ffffffffa02858b0>] ? write_gracetime+0x60/0x60 [nfsd]
[443503.680295]  [<ffffffffa0285027>] nfsctl_transaction_write+0x57/0x90 [nfsd]
[443503.680327]  [<ffffffff811861bc>] vfs_write+0xac/0x180
[443503.680348]  [<ffffffff811864ea>] sys_write+0x4a/0x90
[443503.680365]  [<ffffffff8160a26d>] system_call_fastpath+0x1a/0x1f
[443503.680385] Code: 31 c0 e8 82 63 41 e1 eb be e8 23 60 e7 e0 0f 1f 00 55 48 89 e5 53 66 66 66 66 90 f6 05 6c 1c 02 00 02 48 89 fb 0f 85 ef 00 00 00 <8b> 43 14 85 c0 0f 84 ce 00 00 00 83 e8 01 85 c0 89 43 14 0f 85 
[443503.680724] RIP  [<ffffffffa01e20aa>] svc_destroy+0x1a/0x130 [sunrpc]
[443503.680761]  RSP <ffff880322801e58>
[443503.680777] CR2: 0000000000000014
[443503.685860] ---[ end trace 7a679bc1c3066273 ]---
[root@gnu-mic-2 gcc]#

It doesn't always happen.

Comment 1 Jeff Layton 2012-08-21 17:56:12 UTC
(gdb) list *(svc_destroy+0x1a)
0xe1aa is in svc_destroy (net/sunrpc/svc.c:540).
535	 * protect the sv_nrthreads, sv_permsocks and sv_tempsocks.
536	 */
537	void
538	svc_destroy(struct svc_serv *serv)
539	{
540		dprintk("svc: svc_destroy(%s, %d)\n",
541					serv->sv_program->pg_name,
542					serv->sv_nrthreads);
543	
544		if (serv->sv_nrthreads) {

...which was called from:

(gdb) list *(write_ports+0x2cf)
0x1b9f is in write_ports (fs/nfsd/nfsctl.c:679).
674		err = svc_addsock(nfsd_serv, fd, buf, SIMPLE_TRANSACTION_LIMIT);
675		if (err < 0) {
676			if (nfsd_serv->sv_nrthreads == 1)
677				svc_shutdown_net(nfsd_serv, net);
678			svc_destroy(nfsd_serv);
679			return err;
680		}
681	
682		/* Decrease the count, but don't shut down the service */
683		nfsd_serv->sv_nrthreads--;


(gdb) disassemble svc_destroy
Dump of assembler code for function svc_destroy:
   0x000000000000e190 <+0>:	push   %rbp
   0x000000000000e191 <+1>:	mov    %rsp,%rbp
   0x000000000000e194 <+4>:	push   %rbx
   0x000000000000e195 <+5>:	sub    $0x8,%rsp
   0x000000000000e199 <+9>:	callq  0xe19e <svc_destroy+14>
   0x000000000000e19e <+14>:	testb  $0x2,0x0(%rip)        # 0xe1a5 <svc_destroy+21>
   0x000000000000e1a5 <+21>:	mov    %rdi,%rbx
   0x000000000000e1a8 <+24>:	jne    0xe2a5 <svc_destroy+277>
   0x000000000000e1ae <+30>:	mov    0x14(%rbx),%eax

...so I think what happened was that nfsd_serv got set to NULL just before svc_destroy got called on it. IOW, a race of some sort (ugh).

Comment 2 Jaroslav Sykora 2012-09-19 06:46:57 UTC
I've got the same:
Fedora 17, 3.5.3-1.fc17.x86_64

Sep 19 07:48:14 nibbler kernel: [   36.710360] nfsd: last server has exited, flushing export cache
Sep 19 07:48:14 nibbler kernel: [   36.710394] BUG: unable to handle kernel NULL pointer dereference at 0000000000000014
Sep 19 07:48:14 nibbler kernel: [   36.710418] IP: [<ffffffffa028c1ae>] svc_destroy+0x1e/0x140 [sunrpc]
Sep 19 07:48:14 nibbler kernel: [   36.710447] PGD 0 
Sep 19 07:48:14 nibbler kernel: [   36.710453] Oops: 0000 [#1] SMP 
Sep 19 07:48:14 nibbler kernel: [   36.710461] CPU 2 
Sep 19 07:48:14 nibbler kernel: [   36.710465] Modules linked in: ebtable_nat ebtables ipt_MASQUERADE iptable_nat nf_nat xt_CHECKSUM iptable_mangle vboxpci(O) bridge stp llc vboxnetadp(O) bnep bluetooth rfkill vboxnetflt(O) vboxdrv(O) be2iscsi iscsi_boot_sysfs bnx2i cnic uio cxgb4i cxgb4 cxgb3i cxgb3 mdio libcxgbi ib_iser rdma_cm ib_addr iw_cm ib_cm ib_sa ib_mad ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi max6650 f71882fg ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables xt_physdev nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack snd_hda_codec_realtek nfsd snd_hda_intel nfs_acl snd_hda_codec auth_rpcgss snd_hwdep snd_pcm snd_page_alloc snd_timer snd r8169 lpc_ich mfd_core e1000e soundcore mii i2c_i801 coretemp serio_raw microcode lockd vhost_net tun macvtap macvlan sunrpc kvm_intel kvm binfmt_misc uinput firewire_ohci firewire_core crc_itu_t nouveau mxm_wmi wmi video i2c_algo_bit drm_kms_helper ttm drm i2c_core [last unloaded: scsi_wait_scan]
Sep 19 07:48:14 nibbler kernel: [   36.710686] 
Sep 19 07:48:14 nibbler kernel: [   36.710689] Pid: 939, comm: rpc.nfsd Tainted: G         C O 3.5.3-1.fc17.x86_64 #1 Hewlett-Packard HP Compaq dx7500 Microtower/2A84h
Sep 19 07:48:14 nibbler kernel: [   36.710705] RIP: 0010:[<ffffffffa028c1ae>]  [<ffffffffa028c1ae>] svc_destroy+0x1e/0x140 [sunrpc]
Sep 19 07:48:14 nibbler kernel: [   36.710726] RSP: 0018:ffff88020547de28  EFLAGS: 00010246
Sep 19 07:48:14 nibbler kernel: [   36.710735] RAX: 00000000fffffff3 RBX: 0000000000000000 RCX: 0000000000000100
Sep 19 07:48:14 nibbler kernel: [   36.710743] RDX: 0000000000000100 RSI: 0000000050595c7a RDI: 0000000000000000
Sep 19 07:48:14 nibbler kernel: [   36.710751] RBP: ffff88020547de38 R08: ffff8802125c7000 R09: 0000000000000000
Sep 19 07:48:14 nibbler kernel: [   36.710758] R10: 0000000000000000 R11: 0000000000040000 R12: 0000000000000002
Sep 19 07:48:14 nibbler kernel: [   36.710766] R13: 0000000000000004 R14: ffff88020547df50 R15: 00007fc6681cd2ae
Sep 19 07:48:14 nibbler kernel: [   36.710774] FS:  00007fc668186740(0000) GS:ffff88021fd00000(0000) knlGS:0000000000000000
Sep 19 07:48:14 nibbler kernel: [   36.710782] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Sep 19 07:48:14 nibbler kernel: [   36.710789] CR2: 0000000000000014 CR3: 000000020e41a000 CR4: 00000000000407e0
Sep 19 07:48:14 nibbler kernel: [   36.710797] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Sep 19 07:48:14 nibbler kernel: [   36.710804] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Sep 19 07:48:14 nibbler kernel: [   36.710811] Process rpc.nfsd (pid: 939, threadinfo ffff88020547c000, task ffff8802055d2e20)
Sep 19 07:48:14 nibbler kernel: [   36.710819] Stack:
Sep 19 07:48:14 nibbler kernel: [   36.710823]  ffff880210d62008 ffff880210d62008 ffff88020547dec8 ffffffffa0398b9f
Sep 19 07:48:14 nibbler kernel: [   36.710836]  ffff88020547df50 00007fc6fffffff3 ffff880210d62009 ffff88020547de6f
Sep 19 07:48:14 nibbler kernel: [   36.710850]  0034fffffffffff4 0000000000000002 ffff88020547de88 ffffffff81129a36
Sep 19 07:48:14 nibbler kernel: [   36.710863] Call Trace:
Sep 19 07:48:14 nibbler kernel: [   36.710878]  [<ffffffffa0398b9f>] write_ports+0x2cf/0x3e0 [nfsd]
Sep 19 07:48:14 nibbler kernel: [   36.710890]  [<ffffffff81129a36>] ? get_zeroed_page+0x16/0x20
Sep 19 07:48:14 nibbler kernel: [   36.710899]  [<ffffffff811aae95>] ? simple_transaction_get+0xc5/0xe0
Sep 19 07:48:14 nibbler kernel: [   36.710911]  [<ffffffffa03988d0>] ? write_gracetime+0x60/0x60 [nfsd]
Sep 19 07:48:14 nibbler kernel: [   36.710923]  [<ffffffffa0398047>] nfsctl_transaction_write+0x57/0x90 [nfsd]
Sep 19 07:48:14 nibbler kernel: [   36.710934]  [<ffffffff811881bc>] vfs_write+0xac/0x180
Sep 19 07:48:14 nibbler kernel: [   36.710941]  [<ffffffff811884ea>] sys_write+0x4a/0x90
Sep 19 07:48:14 nibbler kernel: [   36.710953]  [<ffffffff81614ae9>] system_call_fastpath+0x16/0x1b
Sep 19 07:48:14 nibbler kernel: [   36.710959] Code: 68 37 e1 eb b9 e8 23 c1 dc e0 0f 1f 00 55 48 89 e5 53 48 83 ec 08 66 66 66 66 90 f6 05 68 1b 02 00 02 48 89 fb 0f 85 f7 00 00 00 <8b> 43 14 85 c0 0f 84 d6 00 00 00 83 e8 01 85 c0 89 43 14 0f 85 
Sep 19 07:48:14 nibbler kernel: [   36.711066] RIP  [<ffffffffa028c1ae>] svc_destroy+0x1e/0x140 [sunrpc]
Sep 19 07:48:14 nibbler kernel: [   36.711084]  RSP <ffff88020547de28>
Sep 19 07:48:14 nibbler kernel: [   36.711089] CR2: 0000000000000014
Sep 19 07:48:14 nibbler kernel: [   36.734295] ---[ end trace db1e88ecfb895970 ]---
Sep 19 07:48:14 nibbler systemd[1]: nfs-server.service: main process exited, code=killed, status=9
Sep 19 07:48:14 nibbler systemd[1]: Job nfs-idmap.service/start failed with result 'dependency'.
Sep 19 07:48:14 nibbler systemd[1]: Job nfs-rquotad.service/start failed with result 'dependency'.
Sep 19 07:48:14 nibbler systemd[1]: Job nfs-mountd.service/start failed with result 'dependency'.
Sep 19 07:48:14 nibbler systemd[1]: Unit nfs-server.service entered failed state.

Comment 3 Jaroslav Sykora 2012-09-19 06:51:19 UTC
related: bug 858489

Comment 4 Jeff Layton 2012-12-05 13:38:34 UTC
*** Bug 858489 has been marked as a duplicate of this bug. ***

Comment 5 Jeff Layton 2012-12-05 15:18:21 UTC
Ok, I think I see the bug here. It was a regression created by 074d0f67cfe.

In nfsd_create_serv, we create the struct svc_serv by calling svc_create_pooled(). That function is passed an argument that sets the sv_shutdown parameter to nfsd_last_thread(). That function is what sets nfsd_serv to NULL.

It then tries to svc_bind the service to register it with rpcbind. This fails, and we end up calling svc_destroy() on the service. It used to be that svc_destroy() would call sv_shutdown but in early 3.5 kernels, that was moved to svc_shutdown_net().

So now we have nfsd_serv set to a non-NULL value, but its refcount has gone to zero and it has been destroyed. On the next call into 

I think all of this code was fixed upstream in commit 57c8b13e3c, which unfortunately relies on a number of previous commits.

I'm going to close this with a resolution of ERRATA since any kernel 3.6 and up shouldn't be vulnerable to the problem.