Bug 1683382

Summary: [Patch] NFS4 hangs in all 4.20 kernels
Product: [Fedora] Fedora
Reporter: Jason Tibbitts <j>
Component: kernel
Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED ERRATA
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 29
CC: airlied, bskeggs, hdegoede, ichavero, itamar, jarodwilson, jeremy, jglisse, john.j5live, jonathan, josef, kernel-maint, linville, mchehab, mjg59, steved
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: kernel-4.20.13-200.fc29 kernel-4.20.14-100.fc28
Doc Type: If docs needed, set a value
Last Closed: 2019-03-02 01:47:02 UTC
Type: Bug
Attachments:
Patch backported to 4.20 (flags: none)

Description Jason Tibbitts 2019-02-26 18:16:36 UTC
I have worked with upstream to debug this but I figured I would file a Fedora ticket to see if we would be willing to carry a patch to fix this.

Since the first 4.20 series kernel release in F29, I've been seeing NFS client hangs.  Our servers are running CentOS 7.6 (kernel 3.10.0-957.1.3 or thereabouts) and we use kerberized NFS4.2.  I have not managed to come up with a reproducer, but some usage pattern exhibited by several of my users triggers this.  Sometimes they can use the machine for hours, sometimes only minutes.  Eventually, though, any process which accesses something on the mounted volume goes into the D state and makes no further progress.  Nothing at all is logged when this happens.
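
Since nothing is logged, one way to notice the hang on a client is to look for tasks stuck in uninterruptible sleep. The following is only a minimal sketch and was not part of the original debugging: it assumes a Linux /proc filesystem, and the helper names parse_stat and find_d_state are made up for illustration. It reports every D-state process, whether or not NFS is the cause.

```python
import os


def parse_stat(stat_text):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    open_paren = stat_text.index("(")
    close_paren = stat_text.rindex(")")
    pid = int(stat_text[:open_paren])
    comm = stat_text[open_paren + 1:close_paren]
    # The single-letter state is the first field after the closing paren.
    state = stat_text[close_paren + 1:].split()[0]
    return pid, comm, state


def find_d_state(proc_root="/proc"):
    """List (pid, comm) for every process currently in the D state."""
    blocked = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited mid-scan, or the file was unreadable
        if state == "D":
            blocked.append((pid, comm))
    return blocked


if __name__ == "__main__":
    for pid, comm in find_d_state():
        print(pid, comm)
```

A process that shows up here on every run over several minutes, and whose files live on the NFS mount, would match the symptom described above; a process that appears only briefly is probably just waiting on ordinary I/O.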

I talked to folks on linux-nfs and sent them some traces.  The entire thread is available at https://marc.info/?t=154835359800001.  In the end, a patch was produced which I believe is queued for submission, though I don't know when.  (It's a regression in 4.20 that will also be in 5.0 so I would think they'd want to fix it immediately, but perhaps it will just go in via the stable trees.)

Trond's tree is http://git.linux-nfs.org/?p=trondmy/linux-nfs.git;a=shortlog;h=refs/heads/linux-next and the patch fixing the issue is http://git.linux-nfs.org/?p=trondmy/linux-nfs.git;a=commit;h=3453d5708b33efe76f40eca1c0ed60923094b971

So I don't know if this meets the criteria for being carried in the Fedora kernels, but I would like to ask that it be considered.  I have tested this about as much as I'm able to, on 50 machines on my network which happen to have secure boot disabled.  Obviously I will be getting rid of secure boot on everything as soon as I can but for now it would help me greatly to have this in a signed kernel release.

Thanks!

Comment 1 Jason Tibbitts 2019-02-27 18:14:48 UTC
Created attachment 1539233 [details]
Patch backported to 4.20

I forgot that I had to slightly modify Trond's patch to apply against the 4.20 stable series.  Sorry about that; I should have attached it originally.

Comment 2 Fedora Update System 2019-02-28 13:21:18 UTC
kernel-headers-4.20.13-200.fc29 kernel-4.20.13-200.fc29 has been submitted as an update to Fedora 29. https://bodhi.fedoraproject.org/updates/FEDORA-2019-b3969845cf

Comment 3 Fedora Update System 2019-02-28 13:22:24 UTC
kernel-headers-4.20.13-100.fc28 kernel-4.20.13-100.fc28 has been submitted as an update to Fedora 28. https://bodhi.fedoraproject.org/updates/FEDORA-2019-1b70dbc980

Comment 4 Fedora Update System 2019-03-01 03:35:56 UTC
kernel-4.20.13-200.fc29, kernel-headers-4.20.13-200.fc29 has been pushed to the Fedora 29 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2019-b3969845cf

Comment 5 Fedora Update System 2019-03-01 22:23:10 UTC
kernel-4.20.13-100.fc28, kernel-headers-4.20.13-100.fc28 has been pushed to the Fedora 28 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2019-1b70dbc980

Comment 6 Fedora Update System 2019-03-02 01:47:02 UTC
kernel-4.20.13-200.fc29, kernel-headers-4.20.13-200.fc29 has been pushed to the Fedora 29 stable repository. If problems still persist, please make note of it in this bug report.

Comment 7 Fedora Update System 2019-03-06 13:08:59 UTC
kernel-headers-4.20.14-100.fc28 kernel-4.20.14-100.fc28 has been submitted as an update to Fedora 28. https://bodhi.fedoraproject.org/updates/FEDORA-2019-196ab64d65

Comment 8 Fedora Update System 2019-03-08 21:29:39 UTC
kernel-4.20.14-100.fc28, kernel-headers-4.20.14-100.fc28 has been pushed to the Fedora 28 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2019-196ab64d65

Comment 9 Fedora Update System 2019-03-11 20:20:00 UTC
kernel-4.20.14-100.fc28, kernel-headers-4.20.14-100.fc28 has been pushed to the Fedora 28 stable repository. If problems still persist, please make note of it in this bug report.