*** This bug has been split off bug 140409 ***

From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; rv:1.7.3) Gecko/20041001 Firefox/0.10.1

Description of problem:
From IT#54817: the customer sees a wrong value for VSZ in ps output.

    USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
    ...
    mmsse 1216 0.0 0.0 589505315 0 ? ZL 04:17 0:00 [cat <defunct>]
    ...

The wrong value is always 589505315 (= 0x23232323 = "####"). This must be a defect in the ps command, for the following reason.

The ps command initializes the buffer it uses to collect /proc information with '#' in simple_spew(). It then reads the /proc information with ps_readproc(), which gets the values as follows:

    if ((file2str(path, "stat", sbuf, sizeof sbuf)) == -1)
        goto next_proc;                    /* error reading /proc/#/stat */
    stat2proc(sbuf, p);                    /* parse /proc/#/stat */
    ...
    if ((file2str(path, "statm", sbuf, sizeof sbuf)) != -1 )
        statm2proc(sbuf, p);               /* ignore statm errors here */
    ...
    if ((file2str(path, "status", sbuf, sizeof sbuf)) != -1 ){
        status2proc(sbuf, p, 0 /*FIXME*/);
    }
    ...

stat2proc() sets p->vsize, and status2proc() sets p->vm_size, as the VSZ value. Depending on how things go, ps_readproc() does not read the values from /proc/##/statm and /proc/##/status. However, the ps command always displays VSZ using vm_size:

    static int pr_vsz(void){
        return sprintf(outbuf, "%lu", pp->vm_size);
    }

Therefore, this problem is caused by a bug in the ps command: pr_vsz() should use vsize, not vm_size. I've checked the latest procps source for both RHEL3 and RHEL2.1, and the bug is still present.

Version-Release number of selected component (if applicable):
procps-2.0.17-10

How reproducible:
Sometimes

Steps to Reproduce:
1. Execute ps ux

Actual Results:
Sometimes, VSZ is incorrect.

Additional info:
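The arithmetic behind the "####" observation can be checked with a small sketch. This is not procps code: struct poison_demo and poisoned_vm_size() are hypothetical names, and vm_size is modelled as a 32-bit field on the assumption that unsigned long is 32 bits wide on the i686 system in the report.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the per-process record. Only the one field of
   interest is modelled, as 32 bits (assumed width of unsigned long on i686). */
struct poison_demo {
    uint32_t vm_size;
};

/* Fill a buffer with '#' (0x23), the poison byte described in the report. */
static void poison(void *buf, size_t n)
{
    assert(buf != NULL);
    memset(buf, '#', n);
}

/* Return vm_size from a record that was poisoned and never updated,
   which is the situation the report describes for the defunct process. */
uint32_t poisoned_vm_size(void)
{
    struct poison_demo p;
    poison(&p, sizeof p);
    return p.vm_size;   /* four 0x23 bytes, so 0x23232323 in any byte order */
}
```

Because all four bytes are identical, the result does not depend on endianness, which would be consistent with the value always being exactly 589505315.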
procps version for 2.1 is currently procps-2.0.7-11.21AS.4
I checked the code and it seems to be without bugs. If the files are present in /proc, it always either reads the data or fills the vm_xxx fields with zeroes. For processes in the 'Z' state (like your "[cat <defunct>]"), the kernel should keep zeroes in the statm file, and the "VmXXX" lines should be missing from the status file.

I think the basic question is why ps_readproc() doesn't read the statm and status files. That can happen only if the kernel removes the information about the process from /proc while ps_readproc() is reading it. But this function checks for that with stat().

Please, can you send the results of

    cat /proc/#/status
    cat /proc/#/statm

for your [cat <defunct>]?

I think I can add a small patch that zeroes vm_xxx and other fields if file2str() for stat and statm failed.

I'm sure that we want to use vm_size from /proc/#/status in "ps". It has been used for a long time and it's in all versions (the latest too). The "top" uses vsize from /proc/#/statm only.
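A rough sketch of the zeroing idea proposed above, not the actual patch: every name here (struct vm_demo, demo_reader, fill_vm_fields() and the two demo readers) is hypothetical, and the reader callback only imitates the "-1 on failure" convention of file2str() seen in the quoted code. The status text used by demo_reader_ok() is made-up sample data.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical subset of the per-process record (vm_xxx fields only). */
struct vm_demo {
    unsigned long vm_size;
    unsigned long vm_rss;
};

/* Hypothetical reader: fills buf and returns -1 on failure. */
typedef int (*demo_reader)(char *buf, size_t len);

/* Demo reader that imitates a status file that could not be read. */
int demo_reader_fail(char *buf, size_t len)
{
    (void)buf; (void)len;
    return -1;
}

/* Demo reader that supplies made-up status text. */
int demo_reader_ok(char *buf, size_t len)
{
    return snprintf(buf, len, "VmSize:\t    1234 kB\nVmRSS:\t      56 kB\n");
}

/* Return the number following key, or 0 if the key is missing
   (as the "VmXXX" lines are expected to be for a zombie). */
static unsigned long field_or_zero(const char *text, const char *key)
{
    unsigned long v = 0;
    const char *hit = strstr(text, key);
    if (hit != NULL)
        sscanf(hit + strlen(key), "%lu", &v);
    return v;
}

/* Fill the vm fields; on a read failure, zero them instead of leaving
   whatever the record was initialized with. Returns 1 if data was read. */
int fill_vm_fields(struct vm_demo *p, demo_reader read_status)
{
    char sbuf[512];
    assert(p != NULL && read_status != NULL);
    if (read_status(sbuf, sizeof sbuf) == -1) {
        p->vm_size = 0;
        p->vm_rss = 0;
        return 0;
    }
    p->vm_size = field_or_zero(sbuf, "VmSize:");
    p->vm_rss = field_or_zero(sbuf, "VmRSS:");
    return 1;
}
```

With this shape, a record that was pre-filled with '#' ends up holding either parsed values or zeroes, never the fill pattern.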
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2005-024.html