Bug 1873965 - rpc.idmap will not start after upgrade
Summary: rpc.idmap will not start after upgrade
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: nfs-utils
Version: 31
Hardware: x86_64
OS: Linux
unspecified
high
Target Milestone: ---
Assignee: Steve Dickson
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2020-08-31 08:33 UTC by dag
Modified: 2020-09-25 17:46 UTC (History)
3 users (show)

Fixed In Version: nfs-utils-2.5.1-4.rc4.fc32 nfs-utils-2.5.1-4.rc4.fc33 nfs-utils-2.5.1-4.rc4.fc31
Clone Of:
Environment:
Last Closed: 2020-09-12 16:34:25 UTC
Type: Bug
Embargoed:


Attachments (Terms of Use)

Description dag 2020-08-31 08:33:55 UTC
Description of problem:
/usr/sbin/rpc.idmap will fail after update to 2.5.1-2 from 2.4.1-1

Version-Release number of selected component (if applicable):
2.5.1-2

How reproducible:
Always

Steps to Reproduce:
1.Do a dnf update nfs-utils
2.Check the nfs-idmap service status
3.Notice that it is in failed state

Actual results:
The startup of rpc.idmap fails

Expected results:
The service should start

Additional info:
Running the command /usr/sbin/rpc.idmapd -f -vvv by hand will give a very weird errormessage containing part of /etc/nsswitch.conf. Like this:

% /usr/sbin/rpc.idmapd -f -vvv
rpc.idmapd: Could not find user "turn] files"

Deducted that this is just a seemingly random part of /etc/nsswitch.conf and is a substring of [NOTFOUND=return] files. Checked this by removing all those lines from the file and running again gave me:

% /usr/sbin/rpc.idmapd -f -vvv
rpc.idmapd: Could not find user "us tells us..."

Which is just part of a comment in the file.

Something is very wrong here.
Downgrading nfs-utils made everything work again

Comment 1 Steve Dickson 2020-08-31 14:30:53 UTC
(In reply to dag from comment #0)
> Description of problem:
> /usr/sbin/rpc.idmap will fail after update to 2.5.1-2 from 2.4.1-1
> 
> Version-Release number of selected component (if applicable):
> 2.5.1-2
> 
> How reproducible:
> Always
> 
> Steps to Reproduce:
> 1.Do a dnf update nfs-utils
> 2.Check the nfs-idmap service status
> 3.Notice that it is in failed state
> 
> Actual results:
> The startup of rpc.idmap fails
> 
> Expected results:
> The service should start
> 
> Additional info:
> Running the command /usr/sbin/rpc.idmapd -f -vvv by hand will give a very
> weird errormessage containing part of /etc/nsswitch.conf. Like this:
> 
> % /usr/sbin/rpc.idmapd -f -vvv
> rpc.idmapd: Could not find user "turn] files"
> 
THe problem is rpc.idmapd can not find the "nobody" user account.
The odd thing is this "turn] files" That is the user it is looking 
for or there is something wrong wit the printf. 

What is the output of 'id nobody nfsnobody'

Comment 2 dag 2020-08-31 14:45:57 UTC
Hi

% id nobody nfsnobody
uid=99(nobody) gid=99(nobody) grupper=99(nobody),1011(nagiocmd),1021(video)
uid=65534(nfsnobody) gid=65534(nfsnobody) grupper=65534(nfsnobody)

And why is everything ok in 2.4.1-1 ?

Comment 3 dag 2020-08-31 14:59:55 UTC
Additional note:

I did strace the execution of /usr/sbin/rpc.idmapd while looking for the problem here and can see rpc.idmap sending a request to sssd to look for user "turn] files\0". So it seems like printf is doing exactly what it is supposed to do but rpc.idmapd is really confused about the name of nfsnobody (or nobody) and extracting the name from /etc/nsswitch.conf for some reason.

Here from the strace:

[pid 11282] connect(5, {sa_family=AF_UNIX, sun_path="/var/lib/sss/pipes/nss"}, 110) = 0
[pid 11282] fstat(5, {st_mode=S_IFSOCK|0777, st_size=0, ...}) = 0
[pid 11282] poll([{fd=5, events=POLLOUT}], 1, 300000) = 1 ([{fd=5, revents=POLLOUT}])
[pid 11282] sendto(5, "\24\0\0\0\1\0\0\0\0\0\0\0\0\0\0\0", 16, MSG_NOSIGNAL, NULL, 0) = 16
[pid 11282] poll([{fd=5, events=POLLOUT}], 1, 300000) = 1 ([{fd=5, revents=POLLOUT}])
[pid 11282] sendto(5, "\1\0\0\0", 4, MSG_NOSIGNAL, NULL, 0) = 4
[pid 11282] poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5, revents=POLLIN}])
[pid 11282] read(5, "\24\0\0\0\1\0\0\0\0\0\0\0\0\0\0\0", 16) = 16
[pid 11282] poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5, revents=POLLIN}])
[pid 11282] read(5, "\1\0\0\0", 4)      = 4
[pid 11282] poll([{fd=5, events=POLLOUT}], 1, 300000) = 1 ([{fd=5, revents=POLLOUT}])
[pid 11282] sendto(5, "\34\0\0\0\21\0\0\0\0\0\0\0\0\0\0\0", 16, MSG_NOSIGNAL, NULL, 0) = 16
[pid 11282] poll([{fd=5, events=POLLOUT}], 1, 300000) = 1 ([{fd=5, revents=POLLOUT}])
[pid 11282] sendto(5, "turn] files\0", 12, MSG_NOSIGNAL, NULL, 0) = 12
[pid 11282] poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5, revents=POLLIN}])
[pid 11282] read(5, "\30\0\0\0\21\0\0\0\0\0\0\0\0\0\0\0", 16) = 16
[pid 11282] poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5, revents=POLLIN}])
[pid 11282] read(5, "\0\0\0\0\0\0\0\0", 8) = 8

Comment 4 Steve Dickson 2020-08-31 17:42:23 UTC
(In reply to dag from comment #2)
> Hi
> 
> % id nobody nfsnobody
> uid=99(nobody) gid=99(nobody) grupper=99(nobody),1011(nagiocmd),1021(video)
> uid=65534(nfsnobody) gid=65534(nfsnobody) grupper=65534(nfsnobody)
> 
Well when I do a 'id nobody nfsnobody' I get
uid=65534(nobody) gid=65534(nobody) groups=65534(nobody)
id: 'nfsnobody': no such user

which is expected. A while back the systemd folks wanted the
65534 uid to be nobody instead of nfsnobody. I assumed since 
the actually uid didn't change and only the name things
would not break... maybe that is not the case. 

The question is where the "turn] files" is coming from.
Looking at the code the variable used to get the password 
info (which is failing), is set one of two ways via a 
static string ("nobody") or via the /etc/idmapd.conf 
(thru the [Mapping] section). I'm assuming nothing
has changed in idmapd.conf... 

Out of curiosity here is the latest f31 nfs-utils build 
https://koji.fedoraproject.org/koji/taskinfo?taskID=50510306

Would you mind giving that a run to see if the problem 
still exists?

Comment 5 dag 2020-08-31 17:51:31 UTC
No, I did not change /etc/idmap.conf between the runs.

Downloaded f31 nfs-utils candidate and this one fails in a more reasonable way, but still fails:

% ./rpc.idmapd -f -vvv
rpc.idmapd: Could not find user ""

Has the syntax of idmap.conf changed?

Comment 6 Steve Dickson 2020-09-01 18:24:08 UTC
(In reply to dag from comment #5)
> No, I did not change /etc/idmap.conf between the runs.
> 
> Downloaded f31 nfs-utils candidate and this one fails in a more reasonable
> way, but still fails:
Thanks!

> 
> % ./rpc.idmapd -f -vvv
> rpc.idmapd: Could not find user ""
Now the variable is NULL which is odd since the
first thing rpc.idmap does is set the variable
to "nobody"... Then it process /etc/idmap.conf.
Then it process the command line args... 

So from the time the variable is initialized
to when variable use used to look up the 
password info... It becomes NULL. 

So I'm thinking its got to be something 
in your environment causing the variable
to become NULL.   

> 
> Has the syntax of idmap.conf changed?
Well... not for the last 100 years ;-)

Comment 7 dag 2020-09-01 20:33:19 UTC
Well. There are no secrets in my idmap.conf so attaching it here.

Also checked for control characters in the file - None.

There is a stale LDAP setting there, but that should be ignored as the Method doesn't mention it. Or?

So here the file:

[General]
#Verbosity = 0
# The following should be set to the local NFSv4 domain name
# The default is the host's DNS domain name.
#Domain = local.domain.edu
Domain = localdomain

# The following is a comma-separated list of Kerberos realm
# names that should be considered to be equivalent to the
# local realm, such that <user>@REALM.A can be assumed to
# be the same user as <user>@REALM.B
# If not specified, the default local realm is the domain name,
# which defaults to the host's DNS domain name,
# translated to upper-case.
# Note that if this value is specified, the local realm name
# must be included in the list!
#Local-Realms = 

[Mapping]

#Nobody-User = nobody
#Nobody-Group = nobody
Nobody-User = nfsnobody
Nobody-Group = nfsnobody

[Translation]

# Translation Method is an comma-separated, ordered list of
# translation methods that can be used.  Distributed methods
# include "nsswitch", "umich_ldap", and "static".  Each method
# is a dynamically loadable plugin library.
# New methods may be defined and inserted in the list.
# The default is "nsswitch".
Method = nsswitch

# Optional.  This is a comma-separated, ordered list of
# translation methods to be used for translating GSS
# authenticated names to ids.
# If this option is omitted, the same methods as those
# specified in "Method" are used.
#GSS-Methods = <alternate method list for translating GSS names>
 
#-------------------------------------------------------------------#
# The following are used only for the "static" Translation Method.
#-------------------------------------------------------------------#
[Static]

# A "static" list of GSS-Authenticated names to
# local user name mappings

#someuser@REALM = localuser


#-------------------------------------------------------------------#
# The following are used only for the "umich_ldap" Translation Method.
#-------------------------------------------------------------------#

[UMICH_SCHEMA]

# server information (REQUIRED)
LDAP_server = ldap.newtech.fi

# the default search base (REQUIRED)
LDAP_base = dc=local,dc=newtech,dc=fi

#-----------------------------------------------------------#
# The remaining options have defaults (as shown)
# and are therefore not required.
#-----------------------------------------------------------#

# whether or not to perform canonicalization on the
# name given as LDAP_server
#LDAP_canonicalize_name = true

# absolute search base for (people) accounts
#LDAP_people_base = <LDAP_base>

# absolute search base for groups
#LDAP_group_base = <LDAP_base>

# Set to true to enable SSL - anything else is not enabled
#LDAP_use_ssl = false

# You must specify a CA certificate location if you enable SSL
#LDAP_ca_cert = /etc/ldapca.cert

# Objectclass mapping information

# Mapping for the person (account) object class
#NFSv4_person_objectclass = NFSv4RemotePerson

# Mapping for the nfsv4name attribute the person object
#NFSv4_name_attr = NFSv4Name

# Mapping for the UID number
#NFSv4_uid_attr = UIDNumber

# Mapping for the GSSAPI Principal name
#GSS_principal_attr = GSSAuthName

# Mapping for the account name attribute (usually uid)
# The value for this attribute must match the value of 
# the group member attribute - NFSv4_member_attr
#NFSv4_acctname_attr = uid

# Mapping for the group object class
#NFSv4_group_objectclass = NFSv4RemoteGroup

# Mapping for the GID attribute
#NFSv4_gid_attr = GIDNumber

# Mapping for the Group NFSv4 name
#NFSv4_group_attr = NFSv4Name

# Mapping for the Group member attribute (usually memberUID)
# The value of this attribute must match the value of NFSv4_acctname_attr
#NFSv4_member_attr = memberUID

Comment 8 dag 2020-09-01 21:13:32 UTC
As this was a mystery I could not resist some debugging:

in idmap.c on line 310 there is a call
to conf_cleanup() which will assumingly delete the conf file parsing structure.

As nobodyuser is just a pointer into this space this cleanup
will delete the variable that is pointed to.

Confirmed this by commenting out the conf_cleanup() and everything works.
There will be no problem iff /etc/idmapd.conf doesn't contain any setting for Nobody-User or Nobody-Group.

So mystery solved.

Didn't take a closer look at CONF_SAVE() nor conf_cleanup() for a real solution but I guess that is trivial for someone with more familiarity with the code?

Comment 9 Steve Dickson 2020-09-03 18:49:31 UTC
(In reply to dag from comment #8)
> As this was a mystery I could not resist some debugging:
> 
> in idmap.c on line 310 there is a call
> to conf_cleanup() which will assumingly delete the conf file parsing
> structure.
> 
> As nobodyuser is just a pointer into this space this cleanup
> will delete the variable that is pointed to.
> 
> Confirmed this by commenting out the conf_cleanup() and everything works.
> There will be no problem iff /etc/idmapd.conf doesn't contain any setting
> for Nobody-User or Nobody-Group.
> 
> So mystery solved.
Nice work!!! Thanks you!

> 
> Didn't take a closer look at CONF_SAVE() nor conf_cleanup() for a real
> solution but I guess that is trivial for someone with more familiarity with
> the code?
Yes... I will look into this...

Comment 10 Steve Dickson 2020-09-08 14:46:11 UTC
commit 086e6fdce887dd68e51b7bac4a2f21cea9a4fe01
Author: Steve Dickson <steved>
Date:   Fri Sep 4 14:15:53 2020 -0400

    rpc.idmapd: Do not free config variables

Comment 11 Fedora Update System 2020-09-09 14:28:41 UTC
FEDORA-2020-1ce9fdf311 has been submitted as an update to Fedora 31. https://bodhi.fedoraproject.org/updates/FEDORA-2020-1ce9fdf311

Comment 12 Fedora Update System 2020-09-09 14:28:41 UTC
FEDORA-2020-64f1b7b1e8 has been submitted as an update to Fedora 33. https://bodhi.fedoraproject.org/updates/FEDORA-2020-64f1b7b1e8

Comment 13 Fedora Update System 2020-09-09 14:28:42 UTC
FEDORA-2020-1a4950ff9c has been submitted as an update to Fedora 32. https://bodhi.fedoraproject.org/updates/FEDORA-2020-1a4950ff9c

Comment 14 Fedora Update System 2020-09-10 15:48:22 UTC
FEDORA-2020-64f1b7b1e8 has been pushed to the Fedora 33 testing repository.
In short time you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --advisory=FEDORA-2020-64f1b7b1e8`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2020-64f1b7b1e8

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 15 Fedora Update System 2020-09-10 18:25:54 UTC
FEDORA-2020-1a4950ff9c has been pushed to the Fedora 32 testing repository.
In short time you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --advisory=FEDORA-2020-1a4950ff9c`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2020-1a4950ff9c

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 16 Fedora Update System 2020-09-10 18:28:55 UTC
FEDORA-2020-1ce9fdf311 has been pushed to the Fedora 31 testing repository.
In short time you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --advisory=FEDORA-2020-1ce9fdf311`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2020-1ce9fdf311

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 17 Fedora Update System 2020-09-12 16:34:25 UTC
FEDORA-2020-1a4950ff9c has been pushed to the Fedora 32 stable repository.
If problem still persists, please make note of it in this bug report.

Comment 18 Fedora Update System 2020-09-25 16:53:34 UTC
FEDORA-2020-64f1b7b1e8 has been pushed to the Fedora 33 stable repository.
If problem still persists, please make note of it in this bug report.

Comment 19 Fedora Update System 2020-09-25 17:46:15 UTC
FEDORA-2020-1ce9fdf311 has been pushed to the Fedora 31 stable repository.
If problem still persists, please make note of it in this bug report.


Note You need to log in before you can comment on or make changes to this bug.