Bug 1037793

Summary: Missing info file in dummy gssd entry in rpc_pipefs
Product: [Fedora] Fedora
Component: kernel
Version: 20
Hardware: x86_64
OS: Linux
Status: CLOSED ERRATA
Severity: unspecified
Priority: unspecified
Keywords: Reopened
Reporter: nmorey <nicolas>
Assignee: Steve Dickson <steved>
QA Contact: Fedora Extras Quality Assurance <extras-qa>
CC: bastian_knight, bfields, gansalmon, itamar, jlayton, jonathan, kernel-maint, luc.lalonde, madhu.chinakonda, steved
Fixed In Version: kernel-3.12.6-200.fc19
Doc Type: Bug Fix
Type: Bug
Last Closed: 2014-01-03 08:31:59 UTC

Description nmorey 2013-12-03 20:03:40 UTC
Description of problem:
When starting the rpc.gssd service and/or trying to access an NFS4 + krb5 mounted partition, rpc.gssd fills /var/log/messages with:

Dec  3 20:19:59 sat rpc.gssd[2014]: ERROR: can't open /var/lib/nfs/rpc_pipefs/gssd/clntXX/info: No such file or directory
Dec  3 20:19:59 sat rpc.gssd[2014]: ERROR: failed to read service info
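
For anyone counting these messages across a log file, here is a minimal Python sketch that pulls the PID, path and reason out of the "can't open" line. It assumes the message format is exactly as in the two lines above; the names GSSD_OPEN_ERROR and parse_gssd_open_error are made up for this illustration and are not part of nfs-utils.

```python
import re

# Pattern modelled on the log line quoted in this report (an assumption:
# other nfs-utils versions may word the message differently).
GSSD_OPEN_ERROR = re.compile(
    r"rpc\.gssd\[(?P<pid>\d+)\]: ERROR: can't open (?P<path>\S+): (?P<reason>.+)$"
)


def parse_gssd_open_error(line):
    """Return a dict with pid, path and reason, or None if the line does not match."""
    match = GSSD_OPEN_ERROR.search(line)
    if match is None:
        return None
    return {
        "pid": int(match.group("pid")),
        "path": match.group("path"),
        "reason": match.group("reason"),
    }


if __name__ == "__main__":
    sample = (
        "Dec  3 20:19:59 sat rpc.gssd[2014]: ERROR: can't open "
        "/var/lib/nfs/rpc_pipefs/gssd/clntXX/info: No such file or directory"
    )
    print(parse_gssd_open_error(sample))
```

The second message ("failed to read service info") carries no path, so the sketch returns None for it.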


From a "quick" analysis, this is due to the dummy gssd entry in rpc_pipefs introduced by this patch: http://permalink.gmane.org/gmane.linux.redhat.fedora.extras.cvs/1139312

Either nfs-utils should be updated to handle a clntXX directory without an info file, or a valid info file should also be generated for the dummy client.
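
To see which client directories trigger the error, something like the following Python sketch could be used. It is only an illustration of the condition described above (a clnt* directory with no info file), not how rpc.gssd itself scans rpc_pipefs; the function name clients_missing_info is hypothetical, and the demo runs against a throwaway directory rather than a real rpc_pipefs mount.

```python
import os
import tempfile


def clients_missing_info(pipefs_dir):
    """Return sorted names of clnt* subdirectories that contain no 'info' entry."""
    missing = []
    for entry in sorted(os.listdir(pipefs_dir)):
        path = os.path.join(pipefs_dir, entry)
        if entry.startswith("clnt") and os.path.isdir(path):
            if not os.path.exists(os.path.join(path, "info")):
                missing.append(entry)
    return missing


if __name__ == "__main__":
    # Imitate the layout: one normal client and one dummy client without info.
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "clnt0"))
        os.makedirs(os.path.join(root, "clnt1"))
        with open(os.path.join(root, "clnt1", "info"), "w") as f:
            f.write("placeholder\n")
        print(clients_missing_info(root))  # ['clnt0']
```

On an affected system the directory to pass would presumably be /var/lib/nfs/rpc_pipefs/gssd, going by the path in the error message.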

Version-Release number of selected component (if applicable):
kernel-3.11.10-300.fc20.x86_64
nfs-utils-1.2.8-6.0.fc20.x86_64

How reproducible:
Every time

Steps to Reproduce:
1. Run rpc.gssd

Actual results:


Expected results:


Additional info:
Works with kernel-3.11.9-300.fc20.x86_64
(At least no error in /var/log/messages)

Comment 1 Josh Boyer 2013-12-03 20:08:06 UTC
Moving to nfs-utils for now.  Steve and Jeff asked us to add those patches to the kernel.

Comment 2 Jeff Layton 2013-12-03 20:13:02 UTC
Makes sense. We could certainly fix it with a kernel patch that adds a dummy "info" file, but...ick. Maybe it'd be best to just make gssd silence these ERROR messages by default. They aren't particularly helpful, IMO...

Steve, thoughts?

Comment 3 Jeff Layton 2013-12-05 15:02:39 UTC
Ok I sent out a patch for it this morning. We'll see what Trond says:

    http://marc.info/?l=linux-nfs&m=138624689302466&w=2

...we'll probably also want to take this patch:

    http://marc.info/?l=linux-nfs&m=138624684502447&w=2

...and I'm working on a 3rd patch to fix some leaks that can happen when the mount of rpc_pipefs fails due to the notifier failing.

Comment 4 Steve Dickson 2013-12-09 15:11:25 UTC
This is now fixed in the latest f20 kernel release.

Comment 5 Josh Boyer 2013-12-09 15:18:38 UTC
Er... it is?  We didn't bring any of the patches Jeff mentioned in comment #3 into the kernel.  Can you point to the Fedora kernel commit that fixed it?

Comment 6 Steve Dickson 2013-12-09 15:25:23 UTC
(In reply to Josh Boyer from comment #5)
> Er... it is?  We didn't bring any of the patches Jeff mentioned in comment
> #3 into the kernel.  Can you point to the Fedora kernel commit that fixed it?
My bad... I updated my f19 kernel to 3.11.10-200.fc19 and it
appeared the 15 second delay was gone, so I just assumed 
they went into F20 as well... 

Reopening...

Comment 7 Josh Boyer 2013-12-09 15:32:05 UTC
(In reply to Steve Dickson from comment #6)
> (In reply to Josh Boyer from comment #5)
> > Er... it is?  We didn't bring any of the patches Jeff mentioned in comment
> > #3 into the kernel.  Can you point to the Fedora kernel commit that fixed it?
> My bad... I updated my f19 kernel to 3.11.10-200.fc19 and it
> appeared the 15 second delay was gone, so I just assumed 
> they went into F20 as well... 

We added the patches to get rid of the 15 second delay in all releases, so you're correct there.  This bug is tracking some fallout of adding those patches though.

Comment 8 Steve Dickson 2013-12-09 16:03:19 UTC
Here is the upstream patch that will most likely fix this problem:

http://marc.info/?l=linux-nfs&m=138659989103371&w=2

Also changing the component from nfs-utils to kernel.

Comment 9 Josh Boyer 2013-12-09 20:21:34 UTC
Jeff, do you think we need all 3 patches here?

Comment 10 Jeff Layton 2013-12-09 20:43:58 UTC
Yeah, I think that would be best.

Comment 11 Josh Boyer 2013-12-10 14:15:50 UTC
Could someone test this scratch build when it completes and see if the issue is sufficiently resolved?

http://koji.fedoraproject.org/koji/taskinfo?taskID=6276121

I'd like to get confirmation before I commit the patches.

Comment 12 Michal Piotrowski 2013-12-19 15:21:10 UTC
I have the same problem. I have tested kernels: 
kernel-3.11.10-301.3.fc20.src.rpm from above address and 
kernel-3.12.5-301.fc20.x86_64 from updates testing.

The 3.11.10-301.3.fc20 kernel gets rid of the error message, but Kerberos authentication fails on every kernel with an access denied message for the NFS4 exported share. The share worked without problems in FC 19.

Comment 13 Luc Lalonde 2013-12-19 15:33:36 UTC
I tested kernel-3.11.10-301.3.fc20.x86_64 on a F19 system and it solves the problem for me...

The last known kernel that worked properly for me is:

kernel-3.11.6-200.fc19.x86_64

However, after updating to latest F19 kernel (kernel-3.11.10-200.fc19.x86_64) I get the above GSSD error messages and am unable to initiate an NFSv4 kerberos mount.

I'll be sticking with kernel-3.11.6-200.fc19.x86_64 for now until a fix filters down to F19.

Comment 14 Jeff Layton 2013-12-19 16:55:42 UTC
Strange....I seem to be able to mount with sec=krb5 with 3.11.10-301.3.fc20.x86_64+debug just fine.

My client is a rawhide box though, so maybe the nfs-utils differences are making a difference there. I'll see if I can roll up a f19 VM and do some testing with it.

Comment 15 Jeff Layton 2013-12-19 18:49:56 UTC
Ok, I set up an f19 VM and patched to current updates. Installed the kernel from comment #11. When I first booted, I wasn't able to mount either, but that was because rpc.gssd wasn't running.

Once I started it, it worked fine. Can you confirm whether gssd was running on your system after booting to the new kernel?
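
A quick way to answer that question from a script is to look for a process named rpc.gssd under /proc. The Python sketch below is one hedged way to do it; find_processes is a made-up helper, and it relies on the Linux /proc/<pid>/comm file. The demo uses a fake /proc tree so it runs anywhere.

```python
import os
import tempfile


def find_processes(name, proc_root="/proc"):
    """Return sorted PIDs whose <proc_root>/<pid>/comm equals `name`."""
    if not os.path.isdir(proc_root):
        return []
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a per-process directory
        try:
            with open(os.path.join(proc_root, entry, "comm")) as f:
                if f.read().strip() == name:
                    pids.append(int(entry))
        except OSError:
            continue  # process exited or comm is not readable
    return sorted(pids)


if __name__ == "__main__":
    # Fake /proc with a single rpc.gssd process, PID taken from the log above.
    with tempfile.TemporaryDirectory() as fake_proc:
        os.makedirs(os.path.join(fake_proc, "2014"))
        with open(os.path.join(fake_proc, "2014", "comm"), "w") as f:
            f.write("rpc.gssd\n")
        print(find_processes("rpc.gssd", proc_root=fake_proc))  # [2014]
```

On a live client, find_processes("rpc.gssd") returning an empty list would match the situation described here, where the mount failed only because gssd was not running.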

Comment 16 Michal Piotrowski 2013-12-19 19:35:16 UTC
Yes. gssd (started by nfs-secure.service) is running on my system and I get "Permission denied" message.

Comment 17 Jeff Layton 2013-12-19 21:24:12 UTC
...and if you just boot to the older kernel w/o changing anything else. Does it work just fine then? What nfs-utils version do you have?

Comment 18 Jeff Layton 2013-12-19 21:29:36 UTC
...and to be clear, what's the latest kernel version that does work for you?

Comment 19 Michal Piotrowski 2013-12-19 21:37:09 UTC
I have currently nfs-utils-1.2.8-6.0.fc20.x86_64.

The kerberized share worked correctly on Fedora 19. After upgrading to F20 none of the kernel versions allows me to access the share (initial one after upgrade, the one from comment 11 and the one from updates testing).

To be clear the share is still accessible on F19 boxes.

Comment 20 nmorey 2013-12-19 21:41:49 UTC
I had some similar problems after upgrading to fc20. However, I cannot say whether it's due to fc20. I've been upgrading since fc14 and had some very old config files. I tried to use the rpmnew ones and it all went haywire. After a couple of hours playing with old and new ones it got working again.

At least from a fresh install, nfs4+krb seems OK according to my IT guy.

Comment 21 Jeff Layton 2013-12-19 21:47:26 UTC
(In reply to Michal Piotrowski from comment #19)
> I have currently nfs-utils-1.2.8-6.0.fc20.x86_64.
> 
> The kerberized share worked correctly on Fedora 19. After upgrading to F20
> none of the kernel versions allows me to access the share (initial one after
> upgrade, the one from comment 11 and the one from updates testing).
> 
> To be clear the share is still accessible on F19 boxes.

I'm afraid that doesn't tell me anything helpful. You'll likely need to do some debugging to figure out why it isn't working for you and open a new bug if it turns out to be one. I suspect however that that is not at all related to what this bug is about (which is spurious warnings in the logs).

Comment 22 Jeff Layton 2013-12-19 21:50:00 UTC
(In reply to Josh Boyer from comment #11)
> Could someone test this scratch build when it completes and see if the issue
> is sufficiently resolved?
> 
> http://koji.fedoraproject.org/koji/taskinfo?taskID=6276121
> 
> I'd like to get confirmation before I commit the patches.

As best I can tell, the ERROR: messages originally reported go away with this kernel. I think you should go ahead and merge these for the next f19/f20 kernels. FWIW, it looks like the upstream patches are probably going to make 3.14.

Comment 23 Josh Boyer 2013-12-20 13:31:29 UTC
(In reply to Jeff Layton from comment #22)
> (In reply to Josh Boyer from comment #11)
> > Could someone test this scratch build when it completes and see if the issue
> > is sufficiently resolved?
> > 
> > http://koji.fedoraproject.org/koji/taskinfo?taskID=6276121
> > 
> > I'd like to get confirmation before I commit the patches.
> 
> As best I can tell, the ERROR: messages originally reported go away with
> this kernel. I think you should go ahead and merge these for the next
> f19/f20 kernels. FWIW, it looks like the upstream patches are probably going
> to make 3.14.

OK, I'll add them.  They'll be on top of 3.12.y at this point, but I don't believe that will matter.

Comment 24 Josh Boyer 2013-12-20 13:50:52 UTC
OK, added them to F19 and F20, and added these 3 and the 3 for fixing the 15sec hang to rawhide since none of them will land until 3.14.

Comment 25 Fedora Update System 2013-12-23 20:37:07 UTC
kernel-3.12.6-300.fc20 has been submitted as an update for Fedora 20.
https://admin.fedoraproject.org/updates/kernel-3.12.6-300.fc20

Comment 26 Fedora Update System 2013-12-23 20:37:36 UTC
kernel-3.12.6-200.fc19 has been submitted as an update for Fedora 19.
https://admin.fedoraproject.org/updates/kernel-3.12.6-200.fc19

Comment 27 Fedora Update System 2013-12-25 02:32:46 UTC
Package kernel-3.12.6-200.fc19:
* should fix your issue,
* was pushed to the Fedora 19 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing kernel-3.12.6-200.fc19'
as soon as you are able to, then reboot.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2013-23935/kernel-3.12.6-200.fc19
then log in and leave karma (feedback).

Comment 28 Fedora Update System 2014-01-03 08:31:59 UTC
kernel-3.12.6-300.fc20 has been pushed to the Fedora 20 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 29 Fedora Update System 2014-01-03 08:43:45 UTC
kernel-3.12.6-200.fc19 has been pushed to the Fedora 19 stable repository.  If problems still persist, please make note of it in this bug report.
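
To check whether a given kernel package is at least as new as the builds pushed above, a simplified Python sketch follows. It is an assumption-laden illustration: it compares only the numeric version and the leading release number, it is not a full RPM version comparison, and the names FIXED_BUILDS, parse_kernel and has_fix are invented here.

```python
import re

# First stable builds named in this report as carrying the fix.
FIXED_BUILDS = {
    "fc19": ((3, 12, 6), 200),
    "fc20": ((3, 12, 6), 300),
}

# Matches e.g. "kernel-3.11.10-301.3.fc20.x86_64" or "3.12.6-300.fc20".
_KERNEL_NVR = re.compile(r"^(?:kernel-)?(\d+(?:\.\d+)*)-(\d+)(?:\.\d+)*\.(fc\d+)")


def parse_kernel(nvr):
    """Split a kernel package string into (version tuple, release number, dist tag)."""
    match = _KERNEL_NVR.match(nvr)
    if match is None:
        raise ValueError("unrecognised kernel string: %r" % nvr)
    version = tuple(int(part) for part in match.group(1).split("."))
    return version, int(match.group(2)), match.group(3)


def has_fix(nvr):
    """True/False for fc19 and fc20 kernels, None for any other dist tag."""
    version, release, dist = parse_kernel(nvr)
    fixed = FIXED_BUILDS.get(dist)
    if fixed is None:
        return None
    return (version, release) >= fixed


if __name__ == "__main__":
    for name in ("kernel-3.11.10-300.fc20.x86_64", "kernel-3.12.6-300.fc20"):
        print(name, has_fix(name))
```

Note that the scratch build from comment 11 (3.11.10-301.3.fc20) carried the patches but would be reported as unfixed by this check, since it only knows about the stable updates.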