Bug 1452881 - gssproxy fails to start because /proc/net/rpc/use-gss-proxy is missing
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: gssproxy   
Version: rawhide
Hardware: Unspecified Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Robbie Harwood
QA Contact: Fedora Extras Quality Assurance
URL: https://pagure.io/gssproxy/pull-reque...
Whiteboard:
Keywords:
Depends On:
Blocks: 1449238
 
Reported: 2017-05-19 20:59 UTC by Zbigniew Jędrzejewski-Szmek
Modified: 2017-05-31 17:19 UTC (History)

Fixed In Version: gssproxy-0.7.0-9.fc27
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-05-31 17:19:40 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---



Description Zbigniew Jędrzejewski-Szmek 2017-05-19 20:59:18 UTC
Description of problem:

May 19 16:55:05 rawhide systemd[1]: Starting GSSAPI Proxy Daemon...
May 19 16:55:05 rawhide gssproxy[601]: GSS-Proxy is not supported by this kernel since file /proc/net/rpc/use-gss-proxy could not be found: 2 (No such file or directory)
May 19 16:55:05 rawhide systemd[1]: gssproxy.service: Control process exited, code=exited status=1
May 19 16:55:05 rawhide systemd[1]: Failed to start GSSAPI Proxy Daemon.
May 19 16:55:05 rawhide systemd[1]: gssproxy.service: Unit entered failed state.
May 19 16:55:05 rawhide systemd[1]: gssproxy.service: Failed with result 'exit-code'.

Version-Release number of selected component (if applicable):
kernel-core-4.12.0-0.rc0.git9.1.fc27.x86_64
gssproxy-0.7.0-5.fc27.x86_64

How reproducible:
100%

Comment 1 Robbie Harwood 2017-05-19 21:11:23 UTC
proc-fs-nfsd.mount needs Before=gssproxy.service

Comment 2 Zbigniew Jędrzejewski-Szmek 2017-05-19 21:14:29 UTC
I don't think that's the solution. On my machine, proc-fs-nfsd.mount is not started at all, so just adding Before= will not change anything.

Comment 3 Robbie Harwood 2017-05-19 21:19:24 UTC
nfs-utils is responsible for setting up that socket.  Your machine most likely has nfs-utils installed because gssproxy doesn't check for that socket otherwise (unless I've broken something else).  I don't actually know what the service component to set it up is (if there even is one).

Comment 4 Zbigniew Jędrzejewski-Szmek 2017-05-19 21:33:11 UTC
This is a freshly installed (~3 days old) rawhide server machine (a VM, but this shouldn't matter). I didn't make any customizations that would matter here; everything is at defaults.

nfs-utils is in @standard in comps, so it's installed by default.

If I do 'systemctl start proc-fs-nfsd.mount && systemctl start gssproxy.service', then the latter starts without issue. I don't know enough about the relationship between various nfs services to say whether gssproxy.service should be started. But gssproxy.service is WantedBy auth-rpcgss-module.service, which in turn is WantedBy nfs-client.target, which in turn is WantedBy remote-fs.target, which is WantedBy multi-user.target. So either gssproxy.service needs to grow a dep on something, or, alternatively, the dep chain needs to be broken, so that gssproxy.service is not started.

Comment 5 Justin Mitchell 2017-05-22 09:23:57 UTC
The use-gss-proxy file that gssproxy is failing on is provided by the auth-rpcgss module. 

Adding 'Wants=auth-rpcgss-module.service' to gssproxy.service fixes the issue that bug 1449238 has, but I do not understand the interdependencies well enough to know whether this has any undesirable side effects.

Comment 6 Robbie Harwood 2017-05-22 14:23:28 UTC
(In reply to Zbigniew Jędrzejewski-Szmek from comment #4)
> nfs-utils is in @standard in comps, so it's installed by default.

Only in standard, I believe, not in minimal.  We need to work when nfs-utils isn't installed, and can't depend on its presence.

My understanding is that this means the proc-fs-nfsd.mount needs to indicate that it's a requirement to start gssproxy, but it sounds like you two know more about systemd than I do, so please correct me if that's not right.

Comment 7 Zbigniew Jędrzejewski-Szmek 2017-05-22 23:46:44 UTC
OK, with gssproxy-0.7.0-7.fc27.x86_64, the warning does not appear any more. I guess the bug could be closed.

Nevertheless, I don't think that the way the units are currently arranged is OK:

1. normally, dependencies are declared in the unit that needs something, not in the unit that provides something. Here the reverse is done, and auth-rpcgss-module.service has
  Before=gssproxy.service rpc-svcgssd.service rpc-gssd.service
  Wants=gssproxy.service rpc-svcgssd.service rpc-gssd.service
The problem is that gssproxy.service itself declares no dependencies, so for example, typing
   systemctl start gssproxy.service
does not work as expected, unless the auth-rpcgss-module.service has already been loaded for other reasons. But this makes things brittle: starting gssproxy will work sometimes, and other times not, confusingly. gssproxy.service should have
  Requires=auth-rpcgss-module.service
  After=auth-rpcgss-module.service
and auth-rpcgss-module.service should only have
  Wants=gssproxy.service

2. auth-rpcgss-module.service has ConditionPathExists=/etc/krb5.keytab, which would be OK, but Conditions work in such a way that only the start job for that unit is skipped, but other units which it pulls in through Wants and Requires are still started. So if /etc/krb5.keytab is not present, auth-rpcgss-module.service/start is skipped because ConditionPathExists is not satisfied, and then gssproxy.service is started and (with gssproxy-0.7.0-7.fc27.x86_64) silently exits because it does not see /proc/net/rpc/use-gss-proxy. Since nfs-utils already has systemd generators, I think it'd be better to use a generator here too, and pull either gssproxy.service or rpc-gssd.service into the transaction, without starting both and having them exit.
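
As a minimal sketch of the arrangement proposed in point 1, the dependency could be declared on the gssproxy side with a drop-in. The drop-in file name below is an assumption chosen for illustration, and nothing in this report confirms that the eventual fix took this form:

```ini
# Hypothetical drop-in: /etc/systemd/system/gssproxy.service.d/10-rpcgss.conf
# (the file name is an assumption; any *.conf file in that directory is read)
[Unit]
# Pull in the unit that loads the auth_rpcgss kernel module
Requires=auth-rpcgss-module.service
# Order gssproxy after it, so /proc/net/rpc/use-gss-proxy should exist by then
After=auth-rpcgss-module.service
```

After adding such a file, 'systemctl daemon-reload' is needed before the change takes effect. Note that Requires= makes starting gssproxy.service fail when auth-rpcgss-module.service is not installed at all, which matters given the constraint from comment 6 that gssproxy must work without nfs-utils; Wants= in place of Requires= would be the softer variant.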

Comment 8 Simo Sorce 2017-05-23 15:39:10 UTC
Zbigniew, it is a complicated dependency.

Let me explain a little.

Gssproxy does not *need* kernel modules to work. However, because of the way auth-rpcgss works, we need to touch that file to make it call out to gssproxy.

So when we initially built gss-proxy, we decided to touch that file at startup if, and only if, a specific configuration item is provided in gssproxy.

Since then we have modularized its configuration and moved the NFS server configuration to the nfs packages.

What happens now is that merely installing those packages tells gssproxy to go and touch the kernel interface.

What we can do is make the failure to touch this kernel file non-fatal for gssproxy, but then we would need a way to instruct systemd to reload gssproxy right after the unit that loads the modules is executed and before auth-rpcgss-module is started.

Is there such a facility?

The problem is that once that file is touched, the kernel module is locked into a specific behavior, so we have to get it right or the system will misbehave.

I hope this explanation helps in understanding the complexity of this problem.

Comment 9 Simo Sorce 2017-05-25 14:04:00 UTC
Moving back to gssproxy. I really think we need to make this failure non-fatal for gssproxy, and open a new nfs-utils bug to make sure that either gssproxy is restarted if the modules are loaded after gssproxy starts, or the unit that loads the modules has a Before=gssproxy.service statement.

Comment 10 Robbie Harwood 2017-05-31 17:19:40 UTC
Forgot to close this.

