Bug 1131551

Summary: Kernel oops in nfs3_list_one_acl with kernel version 3.15.8 and later
Product: [Fedora] Fedora
Component: kernel
Version: 20
Hardware: x86_64
OS: Linux
Status: CLOSED ERRATA
Severity: medium
Priority: unspecified
Type: Bug
Reporter: wolf
Assignee: nfs-maint
QA Contact: Fedora Extras Quality Assurance <extras-qa>
CC: bfields, dabbill, gansalmon, gbarzini, iglesias, itamar, john.ellson, jonathan, kernel-maint, madhu.chinakonda, mchehab, sengend
Fixed In Version: kernel-3.14.19-100.fc19
Doc Type: Bug Fix
Last Closed: 2014-08-30 03:58:06 UTC

Description wolf 2014-08-19 14:40:33 UTC
Description of problem:

While copying a file in Nautilus from and to an NFSv3 share, a kernel oops occurs and Nautilus hangs.

Version-Release number of selected component (if applicable):
Verified with kernel versions 3.15.8-200.fc20.x86_64 and 3.15.10-200.fc20.x86_64.
The problem does not occur on the same system running kernel 3.15.7-200.fc20.x86_64.

How reproducible:
each time

Steps to Reproduce:
1. mount a nfs share using nfs version 3
2. open nautilus
3. copy a file on this share to a subdirectory on this share

Actual results:
nautilus hangs, kernel oops is logged

Expected results:
file is copied


Additional info:
[14665.429102] BUG: unable to handle kernel paging request at ffffffffffffffa1
[14665.429133] IP: [<ffffffffa03b030b>] nfs3_list_one_acl+0x2b/0x80 [nfsv3]
[14665.429159] PGD 1c14067 PUD 1c16067 PMD 0 
[14665.429177] Oops: 0002 [#1] SMP 
[14665.429192] Modules linked in: tcp_lp bnep bluetooth rfkill fuse nfsv3 rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache binfmt_misc vfat fat iTCO_wdt iTCO_vendor_support x86_pkg_temp_thermal snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic dcdbas coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel snd_hda_intel snd_hda_controller snd_hda_codec snd_hwdep snd_pcsp snd_pcm snd_timer microcode snd serio_raw soundcore lpc_ich i2c_i801 mfd_core nfsd auth_rpcgss nfs_acl lockd sunrpc i915 e1000e ptp pps_core i2c_algo_bit drm_kms_helper drm i2c_core video
[14665.429419] CPU: 0 PID: 17329 Comm: pool Not tainted 3.15.8-200.fc20.x86_64 #1
[14665.429441] Hardware name: Dell Inc. OptiPlex 7010/0KRC95, BIOS A14 06/10/2013
[14665.429463] task: ffff8800c3b13160 ti: ffff8800367dc000 task.ti: ffff8800367dc000
[14665.429486] RIP: 0010:[<ffffffffa03b030b>]  [<ffffffffa03b030b>] nfs3_list_one_acl+0x2b/0x80 [nfsv3]
[14665.429515] RSP: 0018:ffff8800367dfe80  EFLAGS: 00010282
[14665.429532] RAX: ffffffffffffffa1 RBX: ffff8800367dfeb8 RCX: 0000000000000000
[14665.429553] RDX: 0000000000000012 RSI: 0000000000008000 RDI: ffff8800367dfe20
[14665.429574] RBP: ffff8800367dfea8 R08: 0000000000000000 R09: ffff8800367dfeb8
[14665.429595] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
[14665.429616] R13: ffffffffa03b1c2a R14: 0000000000000000 R15: 0000000000000000
[14665.429638] FS:  00007f08cbfff700(0000) GS:ffff88021e200000(0000) knlGS:0000000000000000
[14665.429662] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[14665.429679] CR2: ffffffffffffffa1 CR3: 000000002e861000 CR4: 00000000001407f0
[14665.429700] Stack:
[14665.429707]  ffff8801e57fc648 0000000000000000 0000000000000000 0000000000000000
[14665.429734]  00007f089000eed0 ffff8800367dfee0 ffffffffa03b0901 0000000000000000
[14665.429760]  0000000043f3bd19 ffff8801e453b9c0 0000000000000000 0000000000000000
[14665.429786] Call Trace:
[14665.429798]  [<ffffffffa03b0901>] nfs3_listxattr+0x51/0xa8 [nfsv3]
[14665.429820]  [<ffffffff8120aef2>] vfs_listxattr+0x42/0x70
[14665.429838]  [<ffffffff8120b1fd>] listxattr+0x10d/0x120
[14665.429855]  [<ffffffff8120c03a>] SyS_flistxattr+0x5a/0xc0
[14665.429874]  [<ffffffff816ff9e9>] system_call_fastpath+0x16/0x1b
[14665.429892] Code: 0f 1f 44 00 00 55 48 89 e5 41 57 49 89 cf 41 56 4d 8b 31 41 55 49 89 d5 41 54 4d 89 c4 53 4c 89 cb e8 9a 3c e9 e0 48 85 c0 74 32 <f0> ff 08 74 40 4c 89 ef e8 88 50 fa e0 48 03 03 4d 85 e4 48 8d 
[14665.430034] RIP  [<ffffffffa03b030b>] nfs3_list_one_acl+0x2b/0x80 [nfsv3]
[14665.430057]  RSP <ffff8800367dfe80>
[14665.430068] CR2: ffffffffffffffa1
[14665.437996] ---[ end trace 9ddd160c7bf81e5b ]---
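
Note on the fault address: read as a signed 64-bit value, CR2 (ffffffffffffffa1) is -95, which on Linux x86_64 is the value of -EOPNOTSUPP. RAX holds the same value, and the faulting bytes "<f0> ff 08" decode as a locked decrement through RAX. This suggests, although the log alone does not prove it, that an error code encoded as a pointer was dereferenced instead of a valid address. The following is a minimal user-space sketch of that decoding; errno_from_fault_addr is a hypothetical helper written for this note, and MAX_ERRNO is assumed to mirror the 4095 bound the kernel uses for error pointers.

```c
#include <stdint.h>

/* Assumed to match the bound the kernel uses for error pointers. */
#define MAX_ERRNO 4095

/*
 * Hypothetical helper: returns the positive errno encoded in addr,
 * or 0 if addr is not within the top MAX_ERRNO values of the
 * 64-bit address space (the range used for error pointers).
 */
static long long errno_from_fault_addr(uint64_t addr)
{
    if (addr >= (uint64_t)-MAX_ERRNO)
        return -(long long)(int64_t)addr;
    return 0;
}
```

Passing the CR2 value from the log (0xffffffffffffffa1) yields 95, while an ordinary kernel address such as the RSP value (0xffff8800367dfe80) yields 0.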

Comment 1 J. Bruce Fields 2014-08-19 15:48:15 UTC
This appears to be fixed upstream by commit 7a9e75a185e6 ("nfs3_list_one_acl(): check get_acl() result with IS_ERR_OR_NULL"), which is in v3.17-rc1. I haven't checked whether it has made it to stable kernels yet.
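
For illustration only, here is a user-space sketch of the idiom named in that commit title. It is not the upstream patch. The ERR_PTR and IS_ERR_OR_NULL definitions below are simplified stand-ins for the kernel helpers of the same names, and struct fake_acl, fake_get_acl and list_one_acl_checked are hypothetical names invented for this sketch. The point it tries to show is that a lookup can return NULL, a valid pointer, or an error code encoded as a pointer, and that a plain NULL check lets the third case through.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Simplified stand-ins for the kernel helpers of the same names. */
static inline void *ERR_PTR(long error)
{
    return (void *)(intptr_t)error;
}

static inline int IS_ERR_OR_NULL(const void *ptr)
{
    return ptr == NULL || (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical reference-counted object standing in for an ACL. */
struct fake_acl {
    int refcount;
};

/*
 * Hypothetical lookup: returns an error pointer when err is non-zero,
 * otherwise takes a reference on backing (which may be NULL).
 */
static struct fake_acl *fake_get_acl(struct fake_acl *backing, int err)
{
    if (err)
        return ERR_PTR(-err);
    if (backing)
        backing->refcount++;
    return backing;
}

/*
 * Returns 1 if an ACL was found and its reference dropped again,
 * 0 if the lookup returned NULL or an error pointer.
 */
static int list_one_acl_checked(struct fake_acl *backing, int err)
{
    struct fake_acl *acl = fake_get_acl(backing, err);

    /* A bare "if (!acl)" would miss the error-pointer case. */
    if (IS_ERR_OR_NULL(acl))
        return 0;

    acl->refcount--;    /* drop the reference taken by the lookup */
    return 1;
}
```

With only a NULL check, the EOPNOTSUPP case would reach the refcount decrement with a pointer value of -95, which is consistent with the faulting address in the oops above.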

Comment 2 Josh Boyer 2014-08-19 15:52:08 UTC
It hasn't yet.  Thanks for the pointer, I'll look at getting it rolled into updates later today.

Comment 3 Josh Boyer 2014-08-19 16:16:10 UTC
Applied on all branches.  Thanks again!

Comment 4 Josh Boyer 2014-08-19 18:17:15 UTC
*** Bug 1131617 has been marked as a duplicate of this bug. ***

Comment 5 Josh Boyer 2014-08-22 17:18:29 UTC
*** Bug 1133095 has been marked as a duplicate of this bug. ***

Comment 6 Josh Boyer 2014-08-26 21:41:18 UTC
*** Bug 1134086 has been marked as a duplicate of this bug. ***

Comment 7 Ryan Gillette 2014-08-26 23:11:35 UTC
(In reply to Josh Boyer from comment #3)
> Applied on all branches.  Thanks again!

Any idea when this will be pushed out, or is there any way to patch this manually ourselves?
Thanks

Comment 8 John Ellson 2014-08-27 20:34:54 UTC
My fix was to roll back the kernel to kernel-3.15.4-200.fc20, but I agree, it's not good to leave fc20 in this broken state for this long.

Comment 9 Ryan Gillette 2014-08-27 20:37:11 UTC
(In reply to John Ellson from comment #8)
> My fix was to roll back the kernel to kernel-3.15.4-200.fc20, but I agree,
> it's not good to leave fc20 in this broken state for this long.

kernel-3.15.7-200.fc20 works as well if you would like to run a newer kernel. I'm just using an annoying workaround right now: running the newest FC20 kernel and using scp to move files back and forth.

Comment 10 Fedora Update System 2014-08-28 12:17:49 UTC
kernel-3.15.10-201.fc20 has been submitted as an update for Fedora 20.
https://admin.fedoraproject.org/updates/kernel-3.15.10-201.fc20

Comment 11 Brian Daniels 2014-08-28 22:36:10 UTC
I am also seeing this bug with 3.14.17-100.fc19.x86_64 on Fedora 19.  3.14.13-100.fc19.x86_64 does not exhibit the issue.  Will this fix be backported to F19?

Comment 12 Josh Boyer 2014-08-28 23:08:48 UTC
It's been backported and will be in the next build.

Comment 13 Fedora Update System 2014-08-30 03:58:06 UTC
kernel-3.15.10-201.fc20 has been pushed to the Fedora 20 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 14 John Ellson 2014-09-04 14:04:07 UTC
Still seeing this bug with kernel-3.14.17-100.fc19.x86_64 on Fedora 19.

Comment 15 Fedora Update System 2014-09-09 21:17:39 UTC
kernel-3.14.18-100.fc19 has been submitted as an update for Fedora 19.
https://admin.fedoraproject.org/updates/kernel-3.14.18-100.fc19

Comment 16 Fedora Update System 2014-09-18 13:24:48 UTC
kernel-3.14.19-100.fc19 has been submitted as an update for Fedora 19.
https://admin.fedoraproject.org/updates/kernel-3.14.19-100.fc19

Comment 17 Fedora Update System 2014-09-30 01:58:55 UTC
kernel-3.14.19-100.fc19 has been pushed to the Fedora 19 stable repository.  If problems still persist, please make note of it in this bug report.