Bug 433266

Summary: negative ETIME field in ps
Product: Red Hat Enterprise Linux 4 Reporter: Jose Plans <jplans>
Component: procpsAssignee: Daniel Novotny <dnovotny>
Status: CLOSED ERRATA QA Contact: Brian Brock <bbrock>
Severity: medium Docs Contact:
Priority: medium    
Version: 4.8CC: albert, psplicha, rvokal, syeghiay, tao
Target Milestone: rc   
Target Release: ---   
Hardware: All   
OS: Linux   
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 556762 580877 871819 (view as bug list) Environment:
Last Closed: 2009-05-18 16:04:13 EDT Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Bug Depends On:    
Bug Blocks: 391511, 456484    
Description Flags
a new patch none

Description Jose Plans 2008-02-18 03:44:37 EST
Description of problem:

This is the description of the problem from a customer, after verifying it I
couldn't find anything wrong.
ETIME field is not obtained from procps but rather computed as a difference
between current time and process start time.

/* elapsed wall clock time, [[dd-]hh:]mm:ss format (not same as "time") */
static int pr_etime(char *restrict const outbuf, const proc_t *restrict const
 unsigned long t;
 unsigned dd,hh,mm,ss;
 char *cp = outbuf;
 t = seconds_since_boot - (unsigned long)(pp->start_time / Hertz);
 ss = t%60;
 t /= 60;
 mm = t%60;
 t /= 60;
 hh = t%24;
 t /= 24;
 dd = t;
 cp +=(     dd      ?  snprintf(cp, COLWID, "%u-", dd)           :  0 );
 cp +=( (dd || hh)  ?  snprintf(cp, COLWID, "%02u:", hh)         :  0 );
 cp +=                 snprintf(cp, COLWID, "%02u:%02u", mm, ss)       ;
 return (int)(cp-outbuf);

If t = -1, it will print exactly the value customer sees. I copied the sources
above to a standalone C-program, set t=-1 and the output is

{alexs 10:17:55} etime

How can we obtain t=-1? The value 'seconds_since_boot' is global and obtained
after parsing the arguments:

ps/global.c:  seconds_since_boot    = uptime(0,0);

After that, 'ps' starts to read from /proc directory the processes info and
parse/format/print it. I can see a possibility of the following race:

(a) ps starts running and sets seconds_since_boot
(b) just after that a new process is created. As a result, it will have
  start_time > seconds_since_boot

(c) when we eventually obtain and print info for that process,

seconds_since_boot - (unsigned long)(pp->start_time / Hertz)

is negative. Obviously, for this to occur we need a heavily loaded system (so
that there are many processes - this will make 'ps' run for a longer time) and
many new processes per second.

An obvious fix would be to add an explicit test to pr_etime() so that if we
see a negative value, we'll replace it by 0. E.g.:

 t = seconds_since_boot - (unsigned long)(pp->start_time / Hertz);
 if (t < 0) t = 0;

The problem is not RH-specific and is even workarounded in some tools, e.g.
       if (hyphcnt > 0) {
               sscanf(etime, "%d-%d:%d:%d",
                               &days, &hours, &minutes, &seconds);
               /* linux 2.6.5/2.6.6 reporting some processes with infinite
                * elapsed times for some reason */
               if (days == 49710) {
                       return 0;

Version-Release number of selected component (if applicable):

How reproducible:
Not easily, not often. Please read below a comment explaining the way it's
suspected to be reproduced:

"Obviously, for this to occur we need a heavily loaded system (so
that there are many processes - this will make 'ps' run for a longer time) and
many new processes per second."

Steps to Reproduce:
1. Unknown
2. ps -eo uname,pid,ppid,stime,etime,cmd
Actual results:

% ps -eo uname,pid,ppid,stime,etime,cmd
ETIME shows: 49710-06:28:15

Expected results:
Correct ETIME.

Additional info:
Proposed patch attached.
Comment 1 Jose Plans 2008-02-18 03:48:14 EST
Created attachment 295142 [details]
Comment 2 RHEL Product and Program Management 2008-09-05 13:22:46 EDT
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
Comment 8 Daniel Novotny 2009-03-20 08:13:39 EDT
Created attachment 336033 [details]
a new patch
Comment 12 errata-xmlrpc 2009-05-18 16:04:13 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on therefore solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.