Bug 110895

Summary: running processes are not listed in /proc, with ps or top
Product: Red Hat Enterprise Linux 3 Reporter: Martin Grimm <martin.grimm>
Component: kernel Assignee: Pete Zaitcev <zaitcev>
Status: CLOSED ERRATA QA Contact: Brian Brock <bbrock>
Severity: high Docs Contact:
Priority: high    
Version: 3.0 CC: petrides, pstadt, riel
Target Milestone: ---   
Target Release: ---   
Hardware: s390x   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2004-11-08 16:55:49 UTC Type: ---
Attachments:
Description                Flags
Simplistic fix (tested)    none
Take 2, telldir works      none

Description Martin Grimm 2003-11-25 12:16:44 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.5)
Gecko/20031120 Firebird/0.7

Description of problem:
After installation of the system, some (not all) running processes are
not listed by ps aux or top. They are not visible in the /proc
filesystem either. However, they are running, as netstat shows:

[root@host proc]# netstat -anp
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 0.0.0.0:32768           0.0.0.0:*               LISTEN      -
tcp        0      0 0.0.0.0:111             0.0.0.0:*               LISTEN      -
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      -
tcp        0      0 0.0.0.0:631             0.0.0.0:*               LISTEN      -
tcp        0      0 127.0.0.1:25            0.0.0.0:*               LISTEN      -
tcp        0      0 172.17.15.2:22          172.17.15.1:2095        ESTABLISHED 766/0
udp        0      0 0.0.0.0:32768           0.0.0.0:*                           -
udp        0      0 0.0.0.0:772             0.0.0.0:*                           -
udp        0      0 0.0.0.0:111             0.0.0.0:*                           -
udp        0      0 0.0.0.0:631             0.0.0.0:*                           -
Active UNIX domain sockets (servers and established)
Proto RefCnt Flags       Type       State         I-Node PID/Program name    Path
unix  7      [ ]         DGRAM                    761    -                   /dev/log
unix  2      [ ]         DGRAM                    1096   742/crond
unix  2      [ ]         DGRAM                    1079   -
unix  2      [ ]         DGRAM                    1065   -
unix  2      [ ]         DGRAM                    813    -
unix  2      [ ]         DGRAM                    769    -


As you can see, there are no PIDs for the listening processes. But
they are listening and answering on the open ports (verified with cupsd
and ssh).
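
One way to check for this symptom without netstat is to compare the PIDs
that a listing of /proc returns with the PIDs whose directories resolve
when opened by name. The sketch below is only an illustration: the
function names and the max_pid default are made up for this example, and
it assumes that a direct lookup of /proc/<pid> still succeeds for a
process that the listing omits, which this report does not state.

```python
import os


def listed_pids(proc_root="/proc"):
    """PIDs that appear as numeric names in a directory listing."""
    return {int(name) for name in os.listdir(proc_root) if name.isdigit()}


def probed_pids(proc_root="/proc", max_pid=32768):
    """PIDs whose directory resolves by name, without using the listing.

    max_pid is an assumed upper bound, not read from the kernel.
    """
    return {
        pid
        for pid in range(1, max_pid + 1)
        if os.path.isdir(os.path.join(proc_root, str(pid)))
    }


def hidden_pids(proc_root="/proc", max_pid=32768):
    """PIDs that resolve by name but are absent from the listing."""
    return sorted(probed_pids(proc_root, max_pid) - listed_pids(proc_root))


if __name__ == "__main__":
    print("resolvable but not listed:", hidden_pids())
```

A non-empty result is only a hint. Processes that start between the two
scans show up as false positives, and on current kernels the IDs of
non-leader threads also resolve by name without being listed.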


Version-Release number of selected component (if applicable):
kernel-2.4.21-4.EL

How reproducible:
Always

Steps to Reproduce:
1. install RHAS3 64bit on IBM zSeries
2. select only 'system-tools', 'admin-tools', 'network-server'
3. reboot as often as you want


Additional info:

Comment 1 Pete Zaitcev 2003-11-25 20:48:48 UTC
*** Bug 110943 has been marked as a duplicate of this bug. ***

Comment 2 Pete Zaitcev 2003-11-26 05:20:01 UTC
strace ls /proc

open("/proc", O_RDONLY|O_NONBLOCK|O_DIRECTORY) = 3
fstat(3, {st_mode=S_IFDIR|0555, st_size=0, ...}) = 0
fcntl(3, F_SETFD, FD_CLOEXEC)           = 0
getdents(3, /* 35 entries */, 993)      = 992
lseek(3, 28, SEEK_SET)                  = 28
getdents(3, /* 37 entries */, 993)      = 984
lseek(3, 267, SEEK_SET)                 = 267
getdents(3, /* 20 entries */, 993)      = 480
getdents(3, /* 1 entries */, 993)       = 24
getdents(3, /* 0 entries */, 993)       = 0
close(3)                                = 0


Comment 3 Pete Zaitcev 2003-11-26 09:26:17 UTC
The root cause is that the root directory of /proc defaults to the
generic lseek, which simply changes filp->f_pos. However,
fs/proc/base.c:get_pid_list caches the current position in a
private cursor, which the generic lseek does not clear.

Seeking in the root of /proc is thus broken on all architectures,
but only s390x seeks while reading. Why it does so remains to be
investigated.

A simplistic fix would be to reset the cursor on any seek,
but it may have performance implications.
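
The effect can be illustrated with a toy model. This is not the kernel
code and not the attached patch: the class, its method names and the
reader loop are hypothetical, and the model assumes that the reader
discards the tail of each batch and seeks back to re-read it, which is
one reading of the getdents/lseek pattern in the strace output above.

```python
class ProcRootModel:
    """Toy directory reader with a private cursor next to f_pos."""

    def __init__(self, entries, reset_cursor_on_seek):
        self.entries = list(entries)
        self.f_pos = 0    # position that lseek() changes
        self.cursor = 0   # private cached position (hypothetical)
        self.reset_cursor_on_seek = reset_cursor_on_seek

    def lseek(self, pos):
        # A generic seek only updates f_pos.
        self.f_pos = pos
        if self.reset_cursor_on_seek:
            # Model of the "simplistic fix": resync the cursor on seek.
            self.cursor = pos

    def getdents(self, count):
        # Reads continue from the private cursor, not from f_pos.
        batch = self.entries[self.cursor:self.cursor + count]
        self.cursor += len(batch)
        self.f_pos = self.cursor
        return batch


def list_entries(reader, batch_size, keep):
    """Read in batches, keep `keep` entries per batch, seek back for the rest."""
    seen = []
    pos = 0
    while True:
        batch = reader.getdents(batch_size)
        if not batch:
            break
        used = batch[:keep]
        seen.extend(used)
        pos += len(used)
        if len(used) < len(batch):
            # Ask to re-read the entries that were discarded.
            reader.lseek(pos)
    return seen


if __name__ == "__main__":
    entries = list(range(10))
    print(list_entries(ProcRootModel(entries, False), 4, 3))
    print(list_entries(ProcRootModel(entries, True), 4, 3))
```

With ten entries, batches of four and three entries kept per batch, the
model without the reset returns [0, 1, 2, 4, 5, 6, 8, 9]: the entries
that the reader tried to re-read are lost. With the reset it returns all
ten entries.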


Comment 4 Pete Zaitcev 2003-11-26 09:27:40 UTC
Created attachment 96206 [details]
Simplistic fix (tested)

Comment 5 Pete Zaitcev 2003-11-26 21:46:28 UTC
Created attachment 96218 [details]
Take 2, telldir works

Comment 6 Pete Zaitcev 2003-12-03 00:38:28 UTC
Modified in 2.4.21-5.EL


Comment 9 Martin Grimm 2003-12-08 13:01:55 UTC
Where can I find 2.4.21-5.EL to test / when and where will it be
available?

Comment 10 Pete Zaitcev 2003-12-10 22:26:54 UTC
All errata tests are available in the "Sushi" RHN channel.
Please contact your TAM for the access details.

Additionally, this fix is included in the periodic update
RHEL 3 U1, which will be available shortly.

If urgent resolution is needed and RHN access is not practical,
I suggest rebuilding from source, using the patch attached
to comment #5.


Comment 11 Martin Grimm 2004-11-08 16:55:49 UTC
Tested: Problem is fixed in RHEL 3 U1.
Sorry for the long delay, thought this was closed long ago :-(

Comment 12 Ernie Petrides 2004-12-03 02:08:30 UTC
An errata has been issued which should help the problem 
described in this bug report. This report is therefore being 
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files, 
please follow the link below. You may reopen this bug report 
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2004-017.html