Bug 847878 - find -nouser reports files owned by valid users
Status: CLOSED INSUFFICIENT_DATA
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: glibc
Version: 6.1
Hardware: x86_64 Linux
Priority: unspecified
Severity: medium
Target Milestone: rc
Assigned To: Carlos O'Donell
QA Contact: qe-baseos-tools
Reported: 2012-08-13 17:51 EDT by Pedro
Modified: 2016-11-24 07:19 EST (History)
Doc Type: Bug Fix
Last Closed: 2014-06-22 23:14:42 EDT
Type: Bug


Description Pedro 2012-08-13 17:51:14 EDT
Description of problem:

Running the command "find / -mount \( -nouser -o -nogroup \)" returns a *long* list of files. On closer inspection, those files are owned by valid users. For example, "ls -l" on the files shows that they are owned by the "oracle" user (it also happens with other users). What's more, some files are listed as owned by "1041", which turns out to be "oracle" once I do a "ypcat".

I took a look at system-config-authentication, and things seem to be in order. Errors are not being reported to /var/log/messages. I can actually "su" to the users that find says don't exist.

Version-Release number of selected component (if applicable):
find --version:
find (GNU findutils) 4.4.2
Features enabled: D_TYPE O_NOFOLLOW(enabled) LEAF_OPTIMISATION SELINUX FTS() CBO(level=0)

How reproducible:
It happens every time I run the find command as specified above.


Comment 2 Kamil Dudka 2012-08-26 18:41:51 EDT
Could you please attach the contents of your /etc/nsswitch.conf?
Comment 3 Pedro 2012-08-29 09:27:38 EDT
(In reply to comment #2)
> Could you please attach the contents of your /etc/nsswitch.conf?

I can't attach the file directly, but I can paste its contents:

passwd: files nis
shadow: files nis
group: files nis
hosts: files nis dns
bootparams: nis [NOTFOUND=return] files
ethers: files
netmasks: files
networks: files
protocols: files
rpc: files
services: files
netgroup: files nis
publickey: nis
automount: files nis
aliases: files nis


Thanks,
Pedro
Comment 4 Kamil Dudka 2012-09-02 10:05:27 EDT
What does the following command say on your box?

getent passwd 1041; echo $?
Comment 5 Pedro 2012-09-04 17:26:45 EDT
oracle:xxx:1041:1001:Oracle Admin:/home/oracle:/bin/csh
0

(where xxx is the shadow password)


Thanks again,
Pedro
Comment 6 Kamil Dudka 2012-09-04 18:26:06 EDT
Then uid 1041 is likely not the reason why those files were listed...

Please try the following two commands:

find / -xdev -nouser -printf "%U\n"
find / -xdev -nogroup -printf "%G\n"
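
For a quick tally instead of a long listing, the two commands above can be approximated in Python. This is only a sketch: the names has_entry, on_device and tally_unresolved are made up for illustration, the directory walk is a rough stand-in for -xdev rather than an exact match, and pwd.getpwuid/grp.getgrgid are assumed to consult the same NSS configuration that find does.

```python
import grp
import os
import pwd
from collections import Counter


def has_entry(lookup, ident):
    """Return True if lookup(ident) finds an entry (pwd/grp raise KeyError if not)."""
    try:
        lookup(ident)
    except KeyError:
        return False
    return True


def on_device(path, dev):
    """Return True if path lives on the device with number dev."""
    try:
        return os.lstat(path).st_dev == dev
    except OSError:
        return False


def tally_unresolved(root, user_lookup=pwd.getpwuid, group_lookup=grp.getgrgid):
    """Count, per numeric ID, entries under root whose UID or GID has no entry."""
    uids, gids = Counter(), Counter()
    root_dev = os.lstat(root).st_dev
    for dirpath, dirnames, filenames in os.walk(root):
        # Rough equivalent of -xdev: do not descend into other file systems.
        dirnames[:] = [d for d in dirnames
                       if on_device(os.path.join(dirpath, d), root_dev)]
        for path in [dirpath] + [os.path.join(dirpath, f) for f in filenames]:
            try:
                st = os.lstat(path)
            except OSError:
                continue  # entry vanished or is unreadable
            if not has_entry(user_lookup, st.st_uid):
                uids[st.st_uid] += 1
            if not has_entry(group_lookup, st.st_gid):
                gids[st.st_gid] += 1
    return uids, gids
```

Calling tally_unresolved() on the affected mount point returns two counters keyed by numeric ID, so it is easy to see whether one ID or many are involved. The lookup functions are parameters so the walk can be exercised without real accounts.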
Comment 7 Pedro 2012-09-05 09:12:47 EDT
I think some work was done on the system since I first posted (I haven't really been working on them for a while now). There used to be a bunch of files under /tmp with this issue, but now there are none. It seems only one mount has the issue now.

Running the first command returns a LOT of results, all saying "1041", and the second one also returns a lot of results, all "1001".
Comment 8 Kamil Dudka 2012-09-05 09:50:57 EDT
The implementation of the -nouser predicate is so trivial that it cannot be wrong.  It does exactly what POSIX asks for:

    "The primary shall evaluate as true if the file belongs to a user ID for which the getpwuid() function defined in the System Interfaces volume of POSIX.1-2008 (or equivalent) returns NULL."

The only problem I can think of is that getpwuid() returns inconsistent results on subsequent calls in your case.  Please attach the following ltrace output:

ltrace -e getpwuid find /tmp -xdev -nouser -prune
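
If ltrace is not convenient, the "inconsistent results" idea can also be checked by repeating the same lookup many times and counting the outcomes. The sketch below is an illustration, not part of find or glibc: probe_uid is a hypothetical helper, and it assumes that Python's pwd.getpwuid (which raises KeyError when no entry exists) takes the same NSS path as the C getpwuid() call.

```python
import pwd
import time


def probe_uid(uid, attempts=100, delay=0.0, lookup=pwd.getpwuid):
    """Look up uid repeatedly and return (found, missing) counts."""
    found = missing = 0
    for _ in range(attempts):
        try:
            lookup(uid)
            found += 1
        except KeyError:
            # The Python counterpart of getpwuid() returning NULL.
            missing += 1
        if delay:
            time.sleep(delay)
    return found, missing
```

A result with both counts non-zero for a single UID, for example from probe_uid(1041, attempts=1000), would be consistent with lookups that fail intermittently; all hits or all misses would point elsewhere.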
Comment 9 RHEL Product and Program Management 2012-09-07 01:33:47 EDT
This request was evaluated by Red Hat Product Management for
inclusion in the current release of Red Hat Enterprise Linux.
Because the affected component is not scheduled to be updated
in the current release, Red Hat is unable to address this
request at this time.

Red Hat invites you to ask your support representative to
propose this request, if appropriate, in the next release of
Red Hat Enterprise Linux.
Comment 10 Pedro 2012-09-07 09:50:51 EDT
I don't know if this is what you mean by inconsistent results, but after running your command in the directories where I was seeing the issue, I got one of two results:

getpwuid(1041, 0x7fffeb17b010, 0x9f83c0, -1, 256)     = 0x387cf8dda0
getpwuid(1041, 0x7fffeb17b010, 0x9f83c0, -1, 0)     = 0x387cf8dda0

From what I could tell, most of the results were the "256" kind.

I am sorry I can't provide the actual file with results, but the hosts in question are on a separate network with no internet access, and it's not an easy task to copy the things over. :/

Thanks,
Pedro

(In reply to comment #8)
> The implementation of the -nouser predicate is so trivial that it cannot be
> wrong.  It does exactly what POSIX asks for:
> 
>     "The primary shall evaluate as true if the file belongs to a user ID for
> which the getpwuid() function defined in the System Interfaces volume of
> POSIX.1-2008 (or equivalent) returns NULL."
> 
> The only problem I can think of is that getpwuid() returns inconsistent
> results on subsequent calls in your case.  Please attach the following
> ltrace output:
> 
> ltrace -e getpwuid find /tmp -xdev -nouser -prune
Comment 11 Kamil Dudka 2012-09-07 10:26:50 EDT
Both calls you captured return a non-zero value, which means that a user entry was found.  You can easily check what the result for a non-existent user looks like:

# install -o 777 -d xxx
# ltrace -e getpwuid find xxx -xdev -nouser -prune
getpwuid(777, 0x7fff3ebee350, 0x82f590, 0x7fff3ebee350, 0)       = 0
xxx
+++ exited (status 0) +++
Comment 12 Kamil Dudka 2012-09-08 17:40:33 EDT
As for the extra four arguments given to getpwuid, they are just unrelated values grabbed from the stack.  You can get rid of them by running the following command before ltrace:

echo 'addr getpwuid(uint);' >> ~/.ltrace.conf
Comment 13 Pedro 2012-09-13 13:46:57 EDT
(In reply to comment #11)
> Both the calls you captured return non-zero value, which means that a user
> entry was found.  You can easily check how a result for the non-existing
> user looks like:
> 
> # install -o 777 -d xxx
> # ltrace -e getpwuid find xxx -xdev -nouser -prune
> getpwuid(777, 0x7fff3ebee350, 0x82f590, 0x7fff3ebee350, 0)       = 0
> xxx
> +++ exited (status 0) +++

That's the weird part. It claims to have files that are "unowned", but I am clearly able to find the users that it says don't exist.

I'm grasping at straws here, but could it have something to do with the NIS server, and the fact that it's running Solaris? It wouldn't make much sense, since the files are local and it doesn't happen with ALL files (it doesn't complain about all files owned by me, for example), but I thought it wouldn't hurt to ask.
Comment 14 Pedro 2012-09-13 14:10:43 EDT
Actually, never mind. I was able to do the same test on a RHEL 5.6 server that we have (using the same NIS server), and the same files were not reported there.
Comment 15 Kamil Dudka 2012-09-13 14:40:54 EDT
It does not really matter whether the files are local or remote, since the file system provides only their UIDs anyway.  It is the getpwuid() function that looks up the corresponding user entry.  Since both the NSS infrastructure and the NIS lookup provider belong to the glibc package, I am switching the component to glibc.
Comment 16 Jeff Law 2012-09-13 16:10:45 EDT
This sounds similar to something we fixed in RHEL 6.3.  Any chance you could try your test on a RHEL 6.3 system talking to the Solaris NIS server?
Comment 17 Pedro 2012-09-14 10:43:50 EDT
Unfortunately, I don't think that's an option. :/ If you could tell me what packages need updating (and where I could find them) I can try passing that on to the system owners. I think maybe in the future we'll move on to a newer RH release, but for now we're locked onto 6.1.

Thanks,
Pedro
Comment 18 Jeff Law 2012-09-14 14:24:55 EDT
I wasn't suggesting you update any machines to a newer release-- just to run the test from a RHEL 6.3 box (if you have one) to confirm/deny that you're bumping against the same problem we fixed in 6.3.

If you don't have a 6.3 machine handy, you might be able to try using a scratch 6.1 machine after first confirming the incorrect behaviour, then updating glibc* and nscd* and testing again.  Note however, that if you take this approach, make sure it's a scratch machine.

While I'm not immediately aware of any problems running a RHEL 6.3 glibc on a RHEL 6.1 system, it's not a configuration we test or support.
Comment 19 Jean-PIerre Melkonian 2013-07-31 02:25:46 EDT
I have had a similar problem since I started using Red Hat 6.
I have noticed with netstat that the number of ports in the TIME_WAIT state increases dramatically during find -nouser (foreign address 127.0.0.1:111 and also 127.0.0.1:148; my host is its own NIS server). I suspect that a new port is opened for each file scanned, to check the user in NIS.
find does not report errors, but evidently the owner of the file is not found in NIS.

Setting
net.ipv4.tcp_tw_recycle = 1
seems to correct the problem, but it is not always sufficient.

This does not occur on Red Hat 5, 4, 3 or other OSes.
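
The TIME_WAIT growth described here could be watched while find runs by sampling /proc/net/tcp. This is a minimal sketch, assuming the usual Linux layout of that file (remote address in the third column as hex IP:port, state in the fourth, with 06 meaning TIME_WAIT); time_wait_by_remote_port is a name chosen for this example.

```python
from collections import Counter

TIME_WAIT = "06"  # state code for TIME_WAIT in /proc/net/tcp on Linux


def time_wait_by_remote_port(proc_net_tcp_text):
    """Count TIME_WAIT sockets per remote port from /proc/net/tcp contents."""
    counts = Counter()
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4 or fields[3] != TIME_WAIT:
            continue
        # The remote address looks like "0100007F:006F"; the port is hex.
        remote_port = int(fields[2].rsplit(":", 1)[1], 16)
        counts[remote_port] += 1
    return counts
```

Feeding it the text of /proc/net/tcp every few seconds during the find run and watching the count for remote port 111 would show whether the number of sockets really grows with the number of files scanned.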
Comment 21 Carlos O'Donell 2014-06-22 23:14:42 EDT
We don't have sufficient data to figure out what's going on here. I have never seen a system behave as described by the original bug reporter. We would need a lot more data and testing to determine what's wrong. For starters you want to isolate yourself from the NIS server and try using local accounts only, and then work up slowly to determine exactly what causes the problem with the users. 
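
As a first step toward that isolation, it may help to know which IDs resolve from local files and which only through another NSS source such as NIS. This is a rough sketch, assuming passwd(5)-format text like the contents of /etc/passwd; parse_passwd and classify are hypothetical helpers, not existing tools.

```python
import pwd


def parse_passwd(text):
    """Map uid -> name from passwd(5)-format text (first entry per uid wins)."""
    entries = {}
    for line in text.splitlines():
        parts = line.split(":")
        # Skip comments, malformed lines and NIS compat "+"/"-" entries.
        if len(parts) < 7 or line.startswith(("#", "+", "-")):
            continue
        try:
            entries.setdefault(int(parts[2]), parts[0])
        except ValueError:
            continue
    return entries


def classify(uid, local_entries, lookup=pwd.getpwuid):
    """Return 'local', 'non-local' or 'missing' for a numeric uid."""
    if uid in local_entries:
        return "local"
    try:
        lookup(uid)
    except KeyError:
        return "missing"
    return "non-local"  # resolved, but not from the local file
```

With local_entries built from the local passwd file, a UID that classifies as "non-local" depends on a source other than files, which narrows down where to look when a lookup for it fails.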

If this is still a problem please reopen the bug.
Comment 22 Pedro 2015-03-02 12:26:35 EST
I've moved to a different company so I can't provide further info on this bug. I still get emails from bugzilla in regards to this bug, so I'll try closing this to see if the emails go away. Thanks!
