Bug 1781129 - long-lasting automount lookup failures if any one lookup was attempted while network offline
Summary: long-lasting automount lookup failures if any one lookup was attempted while ...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: autofs
Version: 32
Hardware: Unspecified
OS: Unspecified
unspecified
unspecified
Target Milestone: ---
Assignee: Ian Kent
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2019-12-09 11:50 UTC by Ian Collier
Modified: 2021-03-17 02:37 UTC (History)
1 user (show)

Fixed In Version: autofs-5.1.6-11.fc33 autofs-5.1.6-11.fc32
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-09-25 16:39:35 UTC
Type: Bug
Embargoed:


Attachments (Terms of Use)
logs from automount after setting logging=debug (51.32 KB, text/plain)
2020-07-28 16:34 UTC, Ian Collier
no flags Details
automount log from test build (61.71 KB, text/plain)
2020-08-04 15:49 UTC, Ian Collier
no flags Details
Patch - fix ldap sasl reconnect problem (10.04 KB, patch)
2020-08-31 05:49 UTC, Ian Kent
no flags Details | Diff

Description Ian Collier 2019-12-09 11:50:46 UTC
We have systems configured as IPA and automount clients where the auto maps are stored in IPA.  If the machine's network connection goes offline for some reason and a mount is attempted, then all further mounts will fail even if the network is back online.

I can reliably replicate this on a VM running Fedora 30, as follows.

- Boot the machine

- Test the automounter is working
  $ ls -ld /auto/users/user1
  FS is successfully mounted, and syslog contains the following messages:
  automount[1111]: sasl_log_func: GSSAPI client step 1
  automount[1111]: sasl_log_func: GSSAPI client step 1
  automount[1111]: sasl_log_func: GSSAPI client step 2

- Disconnect the VM's network interface
  (NetworkManager issues some messages about the interface going offline)

- $ ls -ld /auto/users/user2
  ls: cannot access '/auto/users/user2': No such file or directory
  syslog contains the following messages:
  automount[1111]: sasl_log_func: GSSAPI client step 1
  automount[1111]: do_sasl_bind: Error sending sasl_bind request to the server: Can't contact LDAP server
  automount[1111]: sasl_do_kinit: krb5_get_init_creds_keytab failed with error -1765328228
  automount[1111]: sasl_log_func: GSSAPI client step 1
  automount[1111]: sasl_log_func:128: GSSAPI Error: Unspecified GSS failure.  Minor code may provide more information (Credential cache is empty)
  automount[1111]: do_reconnect: lookup(ldap): failed to find available server
  automount[1111]: key "user2" not found in map source(s).

- Reconnect the VM's network interface
  (NetworkManager issues some messages about the interface coming back online
  and a DHCPREQUEST transaction happens)

- $ ls -ld /auto/users/user2
  ls: cannot access '/auto/users/user2': No such file or directory
  (no messages are issued; negative lookup was presumably cached - it is
  40 seconds since the first attempt)

- $ ls -ld /auto/users/user3
  ls: cannot access '/auto/users/user3': No such file or directory
  syslog contains the following messages:
  automount[1111]: sasl_log_func: GSSAPI client step 1
  automount[1111]: sasl_log_func:128: GSSAPI Error: Unspecified GSS failure.  Minor code may provide more information (Credential cache is empty)
  automount[1111]: sasl_log_func: GSSAPI client step 1
  automount[1111]: sasl_log_func:128: GSSAPI Error: Unspecified GSS failure.  Minor code may provide more information (Credential cache is empty)
  automount[1111]: sasl_log_func: GSSAPI client step 1
  automount[1111]: sasl_log_func:128: GSSAPI Error: Unspecified GSS failure.  Minor code may provide more information (Credential cache is empty)
  automount[1111]: do_reconnect: lookup(ldap): failed to find available server
  automount[1111]: key "user3" not found in map source(s).

- Results for /auto/users/user4 same as for user3

- /auto/users/user1 is still mounted

- Results for /auto/users/user2 now same as for user3 and user4 (it is now
  65 seconds since the first attempt)

- 10 minutes later, /auto/users/user1 is automatically mounted.

- ls -ld /auto/users/user1
  Mount successful (lookup was presumably cached), but the following messages
  are issued:
  automount[1111]: sasl_log_func: GSSAPI client step 1
  automount[1111]: sasl_log_func:128: GSSAPI Error: Unspecified GSS failure.  Minor code may provide more information (Credential cache is empty)
  automount[1111]: sasl_log_func: GSSAPI client step 1
  automount[1111]: sasl_log_func:128: GSSAPI Error: Unspecified GSS failure.  Minor code may provide more information (Credential cache is empty)
  automount[1111]: sasl_log_func: GSSAPI client step 1
  automount[1111]: sasl_log_func:128: GSSAPI Error: Unspecified GSS failure.  Minor code may provide more information (Credential cache is empty)
  automount[1111]: do_reconnect: lookup(ldap): failed to find available server

- 28 minutes after the network outage, problem still persists.
  $ ls -ld /auto/users/user2
  ls: cannot access '/auto/users/user2': No such file or directory
  (same syslog messages as for user3 above)

Versions of packages tested:
autofs-5.1.5-10.fc30.x86_64
krb5-libs-1.17-15.fc30.x86_64
sssd-2.2.2-3.fc30.x86_64
kernel-5.3.13-200.fc30.x86_64

Comment 1 Ben Cotton 2020-04-30 20:39:05 UTC
This message is a reminder that Fedora 30 is nearing its end of life.
Fedora will stop maintaining and issuing updates for Fedora 30 on 2020-05-26.
It is Fedora's policy to close all bug reports from releases that are no longer
maintained. At that time this bug will be closed as EOL if it remains open with a
Fedora 'version' of '30'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 30 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged  change the 'version' to a later Fedora 
version prior this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 2 Ben Cotton 2020-05-26 18:00:01 UTC
Fedora 30 changed to end-of-life (EOL) status on 2020-05-26. Fedora 30 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

Comment 3 Ian Kent 2020-05-27 01:57:58 UTC
I don't remember seeing this bug?
Do you still have this problem?

Comment 4 Ian Collier 2020-06-05 13:30:38 UTC
Yes, I can still duplicate this exactly as in the original report.
Versions tested just now:

autofs-5.1.6-4.fc32.x86_64
krb5-libs-1.18-1.fc32.x86_64
sssd-2.3.0-1.fc32.x86_64
kernel-5.6.15-300.fc32.x86_64

Comment 5 Ian Kent 2020-06-06 05:22:25 UTC
(In reply to Ian Collier from comment #4)
> Yes, I can still duplicate this exactly as in the original report.
> Versions tested just now:
> 
> autofs-5.1.6-4.fc32.x86_64
> krb5-libs-1.18-1.fc32.x86_64
> sssd-2.3.0-1.fc32.x86_64
> kernel-5.6.15-300.fc32.x86_64

Can you post a full debug log from start up until after those errors please.
And post /etc/nsswitch.conf, maybe later I might need more ...

I don't understand where sssd comes into this although it's probably not
related since, from what you posted, it looks like your using the ldap
lookup module not sssd.

Comment 6 Ian Kent 2020-06-09 06:48:37 UTC
Could you test something for me?

I's like to get a debug log output of what a patch I've added to autofs
does.

Take care, this build is unlikely to fix the problem or autofs might
not even work properly but I'd like to see the debug output.

The build is located at:
https://koji.fedoraproject.org/koji/taskinfo?taskID=45567438

Ian

Comment 7 Ian Collier 2020-07-28 16:34:16 UTC
Created attachment 1702687 [details]
logs from automount after setting logging=debug

Sorry for the delay in replying to this.  I'm attaching a log file which
covers: automount startup, successful mount, network down, unsuccessful mount,
network up, a couple more unsuccessful mounts.

The version tested is autofs-5.1.6-7.fc32.x86_64.  It is indeed configured
for ldap (nsswitch.conf says "automount:ldap"), so you're right that the sssd
version is probably irrelevant.

Regarding the test build in your last comment, I'm afraid I can't find any
downloadable packages from that build.

Comment 8 Ian Kent 2020-07-29 01:05:43 UTC
(In reply to Ian Collier from comment #7)
> Created attachment 1702687 [details]
> logs from automount after setting logging=debug
> 
> Sorry for the delay in replying to this.  I'm attaching a log file which
> covers: automount startup, successful mount, network down, unsuccessful
> mount,
> network up, a couple more unsuccessful mounts.
> 
> The version tested is autofs-5.1.6-7.fc32.x86_64.  It is indeed configured
> for ldap (nsswitch.conf says "automount:ldap"), so you're right that the sssd
> version is probably irrelevant.

Unfortunately I needed logs from the scratch build, it has a change that
might fix the problem (unlikely on first attempt) and I wanted to see the
effect in log output.

> 
> Regarding the test build in your last comment, I'm afraid I can't find any
> downloadable packages from that build.

Yeah, the scratch build will have been removed by now.
I build another.

Ian

Comment 9 Ian Kent 2020-07-29 07:18:08 UTC
(In reply to Ian Kent from comment #8)
> 
> > 
> > Regarding the test build in your last comment, I'm afraid I can't find any
> > downloadable packages from that build.
> 
> Yeah, the scratch build will have been removed by now.
> I build another.

Please try scratch build:
https://koji.fedoraproject.org/koji/taskinfo?taskID=48086384

Comment 10 Ian Collier 2020-08-04 15:49:12 UTC
Created attachment 1710354 [details]
automount log from test build

Using the test build I was able to successfully mount directories after
a network interruption.  This is the log it produced.

Comment 11 Ian Kent 2020-08-18 09:00:31 UTC
(In reply to Ian Collier from comment #10)
> Created attachment 1710354 [details]
> automount log from test build
> 
> Using the test build I was able to successfully mount directories after
> a network interruption.  This is the log it produced.

Sorry I haven't got back to you sooner I've been focusing on a time critical
priority task.

So from the log it appears my guess was fairly accurate.

If you can confirm (once again) that what I'm seeing in the log is what's
needed I'll have another look at the patch, clean it up if needed, and
update autofs. The update will require another test when the package gets
pushed to the testing repo so if I do make any mistakes we will catch them
there.

Comment 12 Fedora Update System 2020-08-31 05:46:49 UTC
FEDORA-2020-61398cc84f has been submitted as an update to Fedora 33. https://bodhi.fedoraproject.org/updates/FEDORA-2020-61398cc84f

Comment 13 Fedora Update System 2020-08-31 05:46:49 UTC
FEDORA-2020-c64e5cadad has been submitted as an update to Fedora 32. https://bodhi.fedoraproject.org/updates/FEDORA-2020-c64e5cadad

Comment 14 Ian Kent 2020-08-31 05:49:11 UTC
Created attachment 1713103 [details]
Patch - fix ldap sasl reconnect problem

Comment 15 Fedora Update System 2020-08-31 15:55:30 UTC
FEDORA-2020-c64e5cadad has been pushed to the Fedora 32 testing repository.
In short time you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --advisory=FEDORA-2020-c64e5cadad`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2020-c64e5cadad

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 16 Fedora Update System 2020-08-31 18:57:29 UTC
FEDORA-2020-61398cc84f has been pushed to the Fedora 33 testing repository.
In short time you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --advisory=FEDORA-2020-61398cc84f`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2020-61398cc84f

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 17 Ian Collier 2020-09-16 10:38:50 UTC
I'm afraid I didn't understand your question, but I can confirm that
after updating to 5.1.6-11 from updates-testing I no longer have the bug.

Comment 18 Ian Kent 2020-09-17 13:32:54 UTC
(In reply to Ian Collier from comment #17)
> I'm afraid I didn't understand your question, but I can confirm that
> after updating to 5.1.6-11 from updates-testing I no longer have the bug.

Ha, I was just asking if the updated autofs still resolves the problem.
Hmm ... I think I've forgotten to check if it's been pushed to stable ...

Comment 19 Fedora Update System 2020-09-25 16:39:35 UTC
FEDORA-2020-61398cc84f has been pushed to the Fedora 33 stable repository.
If problem still persists, please make note of it in this bug report.

Comment 20 Fedora Update System 2021-03-17 02:37:51 UTC
FEDORA-2020-c64e5cadad has been pushed to the Fedora 32 stable repository.
If problem still persists, please make note of it in this bug report.


Note You need to log in before you can comment on or make changes to this bug.