Bug 143634

Summary: zombie process or segmentation fault in command spawned in popen()
Product: Red Hat Enterprise Linux 2.1 Reporter: Isabel Lin <isabel_lin2003>
Component: kernel Assignee: Jim Paradis <jparadis>
Status: CLOSED WONTFIX QA Contact: Brian Brock <bbrock>
Severity: medium Docs Contact:
Priority: medium    
Version: 2.1 CC: peterm, riel
Target Milestone: ---   
Target Release: ---   
Hardware: i686   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2006-09-14 20:17:45 UTC Type: ---

Description Isabel Lin 2004-12-23 03:18:14 UTC
Description of problem:
Got a zombie process or a segmentation fault in a command spawned by popen().

Version-Release number of selected component (if applicable):
Red Hat Linux Advanced Server release 2.1AS/i686
2.4.9-e.3smp #1 SMP Fri May 3 16:48:54 EDT 2002 i686 unknown
8 cpus

How reproducible:
If a process (e.g. 12345) no longer exists, then I get [sh <defunct>]
in "top" when the following function calls popen()
on "cat /proc/12345/stat 2> /dev/null". Why would popen() cause a
zombie process? Sometimes I even see a core dump generated with the
following info:
"Core was generated by `cat /proc/6101/stat'.
Program terminated with signal 11, Segmentation fault.
#0  0x40009e42 in ?? ()"

static
int getstates(struct ux_processor *new)
{
    FILE *fdir = NULL;
    char *dn = NULL;
    char dstr[256];
    int total_procs=0;
    int process_states[NPROCSTATES] = {0};

    if (!(fdir = popen("ls /proc/[1-9]*/stat 2> /dev/null", "r"))) {
      printf("Problem opening the directory /proc\n");
      return 0;  /* this stands for error for this function */
    }
      
    while ((dn = fgets(dstr, sizeof(dstr)-1, fdir)) != NULL) {
        FILE *fp = NULL;
        char buffer[1024];
	char filename[256];
	char stat = 0;
	int n = 0;
	unsigned int size = 0;
 
        if (dn[strlen(dn)-1] == '\n') dn[strlen(dn)-1] = 0;
        sprintf(filename, "cat %s 2> /dev/null", dn);

==>        fp = popen(filename, "r");
	if (fp == NULL){
	  continue;
	}

        if (fgets(buffer, sizeof(buffer)-1, fp)) {
	  
           if ((n=sscanf(buffer, "%*d %*s %c %*d %*d %*d %*d %*d %*u %*u %*u %*u %*u %*d %*d %*d %*d %*d"\
                            " %*d %*u %*u %*d %*u %u %*u", &stat, &size)) !=2 ) continue;
           switch (stat) {
              case 'R': n = size>0 ? 1:6; break;        /* runnable (on run queue) */
              case 'S': n = 2; break;                   /* sleeping */
              case 'D': n = size>0 ? 3:6; break;        /* uninterruptible sleep (usually IO) */
              case 'Z': n = 4; break;                   /* a defunct ("zombie") process */
              case 'T': n = 5; break;                   /* traced or stopped */
              case 'W': n = 6; break;                   /* has no resident pages - bucket for swapped */
              }
           total_procs++;
           process_states[n]++;

         }
	 pclose(fp);
    }
    pclose(fdir);

    new->RunQL = process_states[1];
    new->BlkQL = process_states[3]+process_states[4]+process_states[5];
    new->SwpQL = process_states[6];

    return (total_procs);
}




Comment 1 Jakub Jelinek 2004-12-23 09:12:51 UTC
If sscanf doesn't return 2, you forgot to call pclose (fp), so that can explain
the zombies.  With that fix, I certainly don't see any zombies or crashes.

Comment 2 Isabel Lin 2004-12-23 19:14:52 UTC
I had already tried the code with pclose(fp) added for the case where sscanf()
doesn't return 2, and I still see the same zombie. The only workaround is to use
fopen() and fclose(), since "cat /proc/12345/stat" just opens the file of
interest underneath anyway.
The reason I am opening this bug is to find out why popen() would behave that way.
Another thing I observed is that printf() doesn't print any error to stdout
after the zombie has occurred.

fp = popen(filename, "r");
if (fp == NULL){
   printf("error in popen(): errno=%d(%s)\n", errno, strerror(errno)); <===
   continue;
}
memset(buffer, 0, sizeof(buffer));
if (fgets(buffer, sizeof(buffer)-1, fp)) {
   if ((n=sscanf(buffer, "%*d %*s %c %*d %*d %*d %*d %*d %*u %*u %*u %*u %*u %*d %*d %*d %*d %*d"\
                            " %*d %*u %*u %*d %*u %u %*u", &stat, &size)) !=2 ) {
     printf("sscanf for \"%s\" returned n=%d\n", buffer, n);      <===
     pclose(fp);
     continue;
   }
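
The fopen()/fclose() workaround mentioned above can be sketched roughly as follows. It reads the stat file directly, so no shell or cat child process is created and there is nothing to reap. The helper name read_state_via_fopen() is hypothetical, and the format string is again shortened to the first three fields; this is an illustration under those assumptions, not the reporter's actual replacement code.

```c
#include <stdio.h>

/* Sketch of the fopen() workaround: read the state character straight
 * from a stat file such as "/proc/12345/stat".
 * Returns 1 and stores the state in *state on success, 0 if the file
 * cannot be opened (e.g. the process has already exited) or parsed. */
int read_state_via_fopen(const char *path, char *state)
{
    char buffer[1024];
    int ok = 0;
    FILE *fp = fopen(path, "r");

    if (fp == NULL)
        return 0;   /* process is gone or the file is unreadable */

    if (fgets(buffer, sizeof buffer, fp) != NULL &&
        sscanf(buffer, "%*d %*s %c", state) == 1)
        ok = 1;

    fclose(fp);
    return ok;
}
```

A process that disappears between listing /proc and opening its stat file then simply shows up as a failed fopen(), which the caller can skip.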


Comment 4 Jim Paradis 2006-09-14 20:17:45 UTC
Since this issue is beyond the scope of the current support status of RHEL2.1, I
am closing it as WONTFIX.