Bug 1868696

Summary: sssd_kcm loops on NFS server occasionally after an NFSv4.0 mount using Kerberos
Product: [Fedora] Fedora
Component: sssd
Version: 32
Hardware: All
OS: Linux
Status: CLOSED EOL
Severity: high
Priority: unspecified
Keywords: Triaged
Type: Bug
Reporter: Chuck Lever <chuck.lever>
Assignee: jstephen
QA Contact: Fedora Extras Quality Assurance <extras-qa>
CC: abokovoy, atikhono, jhrozek, lslebodn, mzidek, pbrezina, rharwood, sbose, ssorce, sssd-maintainers
Last Closed: 2021-05-25 17:27:00 UTC

Description Chuck Lever 2020-08-13 13:56:34 UTC
Description of problem:
Multi-homed Fedora 32 NFS client and server, both with keytabs. Sometimes after performing an NFSv4.0 mount, the sssd_kcm daemon on the server runs at 100% CPU and fills up the sssd_kcm.log file (and eventually the root partition) with the following messages, repeated continuously:

(2020-08-10 15:03:04:680265): [kcm] [kcm_input_parse] (0x1000): Received message with length 0
(2020-08-10 15:03:04:680284): [kcm] [kcm_input_parse] (0x0020): Illegal zero-length message
(2020-08-10 15:03:04:680302): [kcm] [kcm_recv] (0x0010): Failed to parse data (74, Bad message), aborting client
(2020-08-10 15:03:04:680319): [kcm] [kcm_reply_error] (0x0040): KCM operation returs failure [74]: Bad message
(2020-08-10 15:03:04:680353): [kcm] [kcm_failbuf_construct] (0x1000): Sent reply with error -1765328188
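
To confirm from the log that the daemon is looping rather than hitting an occasional bad request, something like the following Python sketch could count "Illegal zero-length message" entries per second. It assumes each entry occupies one line in the real sssd_kcm.log and that the timestamp layout matches the excerpt above; the function name is hypothetical and not part of sssd.

```python
import re
from collections import Counter

# Timestamp layout taken from the excerpt: (YYYY-MM-DD HH:MM:SS:microseconds)
# followed by the "[kcm]" tag and the name of the logging function.
ENTRY = re.compile(
    r"^\((\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}):\d+\): \[kcm\] \[(\w+)\]"
)


def zero_length_errors_per_second(lines):
    """Count 'Illegal zero-length message' entries, grouped by second."""
    counts = Counter()
    for line in lines:
        match = ENTRY.match(line)
        if match is None:
            continue  # not a KCM log entry in the expected layout
        second, function = match.groups()
        if function == "kcm_input_parse" and "Illegal zero-length message" in line:
            counts[second] += 1
    return counts
```

Passing an open log file (for example `open(path)` with the path of the sssd_kcm.log file) yields a per-second tally; a sustained high count per second would be consistent with the loop described here.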

Version-Release number of selected component (if applicable):

How reproducible:
Happens every second or third mount operation.

Steps to Reproduce:
1. Set up NFS client and server with keytabs
2. Repeat: "mount -o vers=4.0,sec=sys" / do some operations / umount
3. Watch the server with "top"
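
Step 2 above can be scripted roughly as follows. This is only a sketch: the export `server.example:/export` and the mount point `/mnt/test` are placeholders, the `workload` hook stands in for "do some operations", and the script defaults to a dry run that only lists the commands. Running it for real requires root on the client.

```python
import subprocess


def cycle_commands(source, target, options="vers=4.0,sec=sys"):
    """Return the argv lists for one mount/umount cycle (step 2)."""
    return [
        ["mount", "-o", options, source, target],
        ["umount", target],
    ]


def run_cycles(source, target, count, dry_run=True, workload=None):
    """Run, or with dry_run only collect, `count` mount/umount cycles."""
    issued = []
    for _ in range(count):
        mount_cmd, umount_cmd = cycle_commands(source, target)
        for argv in (mount_cmd, umount_cmd):
            if not dry_run:
                subprocess.run(argv, check=True)
                # Placeholder for "do some operations" on the mounted export.
                if argv is mount_cmd and workload is not None:
                    workload(target)
            issued.append(argv)
    return issued


if __name__ == "__main__":
    # Placeholder export and mount point; dry run prints the commands only.
    for argv in run_cycles("server.example:/export", "/mnt/test", count=3):
        print(" ".join(argv))
```

Since the problem shows up only on every second or third mount, several cycles are likely needed while watching the server as in step 3.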

Actual results:
Sometimes sssd_kcm goes to 100% of a CPU and must be killed.

Expected results:
No looping.

Additional info:
The NFSv4.0 backchannel, in this case, is secured with GSS krb5i. It appears that gssd on the server is accessing the Kerberos ticket cache while setting up the backchannel, and sometimes this triggers the kcm loop.

Comment 1 Fedora Program Management 2021-04-29 16:56:09 UTC
This message is a reminder that Fedora 32 is nearing its end of life.
Fedora will stop maintaining and issuing updates for Fedora 32 on 2021-05-25.
It is Fedora's policy to close all bug reports from releases that are no longer
maintained. At that time this bug will be closed as EOL if it remains open with a
Fedora 'version' of '32'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 32 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora 
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 2 Ben Cotton 2021-05-25 17:27:00 UTC
Fedora 32 changed to end-of-life (EOL) status on 2021-05-25. Fedora 32 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.