Bug 140410 (IT#54817)

Summary: 2.1SA: ps sometimes shows a wrong value for VSZ
Product: Red Hat Enterprise Linux 2.1
Reporter: Steve Conklin <sconklin>
Component: procps
Assignee: Karel Zak <kzak>
Status: CLOSED ERRATA
QA Contact: Brian Brock <bbrock>
Severity: medium
Docs Contact:
Priority: medium
Version: 2.1
CC: tao
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Last Closed: 2005-04-28 15:51:44 UTC Type: ---
Bug Blocks: 132992    

Description Steve Conklin 2004-11-22 20:11:34 UTC
*** This bug has been split off bug 140409 ***

From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; rv:1.7.3) Gecko/20041001 Firefox/0.10.1

Description of problem:
From IT#54817:

The customer sees a wrong value for VSZ in ps output:
USER       PID %CPU %MEM   VSZ  RSS TTY      STAT START   TIME COMMAND
...
mmsse     1216  0.0  0.0 589505315 0 ?       ZL   04:17   0:00 [cat <defunct>]
...

The wrong value is always 589505315 (= 0x23232323 = "####").
This must be a defect in the ps command. The reasoning is as follows.

The ps command fills the buffer it uses to collect /proc information with
'#' in simple_spew().
It then reads the /proc information through ps_readproc(), which obtains the
values as follows:

    if ((file2str(path, "stat", sbuf, sizeof sbuf)) == -1)
        goto next_proc;                 /* error reading /proc/#/stat */
    stat2proc(sbuf, p);                 /* parse /proc/#/stat */
...
        if ((file2str(path, "statm", sbuf, sizeof sbuf)) != -1 )
            statm2proc(sbuf, p);        /* ignore statm errors here */
...
        if ((file2str(path, "status", sbuf, sizeof sbuf)) != -1 ){
            status2proc(sbuf, p, 0 /*FIXME*/);
        }
...

stat2proc() sets p->vsize, and status2proc() sets p->vm_size, which is used as the VSZ value.

Under some conditions, ps_readproc() does not read the values from
/proc/#/statm and /proc/#/status, so the fields they would set keep the '#' fill.
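
The arithmetic behind the observed value can be checked with a minimal sketch. It is not the procps code: struct proc_sketch and vsz_after_read() are hypothetical names, uint32_t is assumed in order to model a 32-bit unsigned long on i686, and the values stored on the "parsed" paths are arbitrary.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the two fields discussed here; the real
 * per-process struct in procps has many more members. */
struct proc_sketch {
    uint32_t vsize;    /* would be set from /proc/#/stat   */
    uint32_t vm_size;  /* would be set from /proc/#/status */
};

/* Fill the record with '#', then set only the fields whose source was
 * read. status_ok == 0 models a skipped read of /proc/#/status. */
static uint32_t vsz_after_read(int status_ok)
{
    struct proc_sketch p;

    memset(&p, '#', sizeof p);          /* every byte becomes 0x23 */
    assert(p.vm_size == 0x23232323u);   /* four bytes of '#' */

    p.vsize = 4096;                     /* stat was parsed (arbitrary value) */
    if (status_ok)
        p.vm_size = 4;                  /* status was parsed (arbitrary value) */

    return p.vm_size;                   /* what a vm_size-based VSZ prints */
}
```

Under these assumptions, vsz_after_read(0) returns 589505315, the value seen in the ps output above.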

However, the ps command always prints the VSZ value from vm_size:

    static int pr_vsz(void){
      return sprintf(outbuf, "%lu", pp->vm_size);
    }

Therefore, this problem is caused by a bug in the ps command:
pr_vsz() should use vsize, not vm_size.

I've checked the latest procps source for both RHEL3 and RHEL2.1, and the bug is
still present.

Version-Release number of selected component (if applicable):
procps-2.0.17-10

How reproducible:
Sometimes

Steps to Reproduce:
1. execute ps ux

Actual Results:  Sometimes, vsz is incorrect

Additional info:

Comment 1 Steve Conklin 2004-11-22 20:12:55 UTC
The procps version for 2.1 is currently procps-2.0.7-11.21AS.4

Comment 2 Karel Zak 2004-11-23 10:56:49 UTC
I checked the code and it seems to be free of bugs. If the files are in
/proc, it always reads the data or fills the vm_xxx fields with zeroes. For
processes with 'Z' status (like your "[cat <defunct>]"), the kernel
should keep zeroes in the statm file, and the "VmXXX" lines should be
missing from the status file.
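
As an illustration of the behaviour described above, here is a minimal sketch of a parser that defaults to zero when the "VmSize:" line is absent. It is not the procps status2proc(): vm_size_from_status() is a hypothetical name, and the status text used to exercise it is made up.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical parser, not the procps status2proc(): returns VmSize in kB,
 * or 0 when the status text has no usable "VmSize:" line. */
static unsigned long vm_size_from_status(const char *status)
{
    unsigned long kb = 0;                 /* default when the line is absent */
    const char *line;

    assert(status != NULL);
    line = strstr(status, "VmSize:");
    if (line != NULL && sscanf(line, "VmSize: %lu", &kb) != 1)
        kb = 0;                           /* malformed line: keep zero */
    return kb;
}
```

With this sketch, a zombie-style status text that has no VmXXX lines yields 0, while a text containing "VmSize:\t    1528 kB" yields 1528. Whether the field is zeroed before parsing is the detail that decides if a '#' fill can survive.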

I think the basic question is why ps_readproc() doesn't read the statm and
status files. That can happen only if the kernel removes the process's
information from /proc while ps_readproc() is reading it. But this
function checks for that with stat().

Could you please send the output of
  cat /proc/#/status 
  cat /proc/#/statm
for your [cat <defunct>]?

I think I can add a small patch that zeroes vm_xxx and the other fields if
file2str() fails for stat and statm.
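
A rough sketch of what such a patch could look like, assuming nothing about the actual fix shipped in the erratum: struct vm_sketch, read_file_sketch() (a stand-in for file2str()) and load_vm_sketch() are all hypothetical names, and parsing of the buffer is left out.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical subset of the per-process record. */
struct vm_sketch {
    unsigned long vm_size;
    unsigned long vm_rss;
};

/* Hypothetical stand-in for file2str(): reads a file into buf and
 * returns the byte count, or -1 if the file cannot be opened. */
static int read_file_sketch(const char *path, char *buf, size_t size)
{
    FILE *f;
    size_t n;

    assert(size > 0);
    f = fopen(path, "r");
    if (f == NULL)
        return -1;
    n = fread(buf, 1, size - 1, f);
    fclose(f);
    buf[n] = '\0';
    return (int)n;
}

/* Returns 0 if the file was read, -1 otherwise. On failure the vm_
 * fields are zeroed so that no stale fill pattern reaches the output. */
static int load_vm_sketch(const char *path, struct vm_sketch *p)
{
    char sbuf[512];

    if (read_file_sketch(path, sbuf, sizeof sbuf) == -1) {
        p->vm_size = 0;
        p->vm_rss = 0;
        return -1;
    }
    /* Parsing of sbuf is omitted in this sketch. */
    return 0;
}
```

Calling load_vm_sketch() on a path that does not exist leaves both fields at 0, even if the record was previously filled with '#'.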

I'm sure that we want to use vm_size from /proc/#/status in "ps". It has
been used for a long time and is in all versions (the latest too). "top"
uses vsize from /proc/#/statm only.

Comment 4 John Flanagan 2005-04-28 15:51:44 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2005-024.html