Description of problem:
Got zombie process or segmentation fault in command spawned in popen().

Version-Release number of selected component (if applicable):
Red Hat Linux Advanced Server release 2.1AS/i686
2.4.9-e.3smp #1 SMP Fri May 3 16:48:54 EDT 2002 i686 unknown
8 cpus

How reproducible:
If a process (e.g. 12345) no longer exists, then I get [sh <defunct>] in "top" when the following function does popen() to "cat /proc/12345/stat 2> /dev/null". Why would popen() cause a zombie process?

Sometimes I even see core dump generated with the following info:
"Core was generated by `cat /proc/6101/stat'.
Program terminated with signal 11, Segmentation fault.
#0 0x40009e42 in ?? ()"

static int getstates(struct ux_processor *new)
{
    FILE *fdir = NULL;
    char *dn = NULL;
    char dstr[256];
    int total_procs=0;
    int process_states[NPROCSTATES] = {0};

    if (!(fdir = popen("ls /proc/[1-9]*/stat 2> /dev/null", "r"))) {
        printf("Problem opening the directory /proc\n");
        return 0; /* this stands for error for this function */
    }

    while ((dn = fgets(dstr, sizeof(dstr)-1, fdir)) != NULL) {
        FILE *fp = NULL;
        char buffer[1024];
        char filename[256];
        char stat = 0;
        int n = 0;
        unsigned int size = 0;

        if (dn[strlen(dn)-1] == '\n')
            dn[strlen(dn)-1] = 0;
        sprintf(filename, "cat %s 2> /dev/null", dn);
==>     fp = popen(filename, "r");
        if (fp == NULL){
            continue;
        }
        if (fgets(buffer, sizeof(buffer)-1, fp)) {
            if ((n=sscanf(buffer, "%*d %*s %c %*d %*d %*d %*d %*d %*u %*u %*u %*u %*u %*d %*d %*d %*d %*d"\
                          " %*d %*u %*u %*d %*u %u %*u", &stat, &size)) !=2 )
                continue;
            switch (stat) {
            case 'R': n = size>0 ? 1:6; break; /* runnable (on run queue) */
            case 'S': n = 2; break;            /* sleeping */
            case 'D': n = size>0 ? 3:6; break; /* uninterruptible sleep (usually IO)*/
            case 'Z': n = 4; break;            /* a defunct ("zombie") process */
            case 'T': n = 5; break;            /* traced or stopped */
            case 'W': n = 6; break;            /* has no resident pages - bucket for swapped */
            }
            total_procs++;
            process_states[n]++;
        }
        pclose(fp);
    }
    pclose(fdir);

    new->RunQL = process_states[1];
    new->BlkQL = process_states[3]+process_states[4]+process_states[5];
    new->SwpQL = process_states[6];
    return (total_procs);
}

Steps to Reproduce:
1.
2.
3.

Actual results:
If sscanf doesn't return 2, the "continue" skips pclose(fp), so the child is never waited for; that can explain the zombies. With that fix, I certainly don't see any zombies or crashes.
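A child started by popen() stays defunct until pclose() waits for it, so every path that follows a successful popen() has to reach pclose(). A minimal sketch of that pattern follows; first_line_of() is a hypothetical helper, not part of the reporter's code, and it simply funnels all paths through a single pclose():

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from the report): run cmd through popen(),
 * copy its first output line into buf, and always reap the child.
 * Returns the pclose() status, or -1 if popen() itself failed. */
int first_line_of(const char *cmd, char *buf, size_t len)
{
    FILE *fp;

    assert(cmd != NULL && buf != NULL && len > 0);
    buf[0] = '\0';

    fp = popen(cmd, "r");
    if (fp == NULL)
        return -1;

    if (fgets(buf, (int)len, fp) != NULL)
        buf[strcspn(buf, "\n")] = '\0';   /* strip the trailing newline */

    /* Single exit path: pclose() runs whether or not a line was read,
     * so the shell child is waited for and cannot linger as a zombie. */
    return pclose(fp);
}
```

Parsing the line with sscanf() would then happen after the helper returns, so a parse failure can no longer skip the pclose().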
I already tried the code with pclose(fp) added for the case where sscanf() doesn't return 2, and I still see the same zombie. The only workaround is to use fopen() and fclose(), since "cat /proc/12345/stat" is just opening the file of interest underneath. The reason I am opening this bug is to find out why popen() would behave that way.

Another thing I observed is that the printf() calls marked below don't print any error on stdout after the zombie has occurred.

        fp = popen(filename, "r");
        if (fp == NULL){
            printf("error in popen(): errno=%d(%s)\n", errno, strerror(errno)); <===
            continue;
        }
        memset(buffer, 0, sizeof(buffer));
        if (fgets(buffer, sizeof(buffer)-1, fp)) {
            if ((n=sscanf(buffer, "%*d %*s %c %*d %*d %*d %*d %*d %*u %*u %*u %*u %*u %*d %*d %*d %*d %*d"\
                          " %*d %*u %*u %*d %*u %u %*u", &stat, &size)) !=2 ) {
                printf("sscanf for \"%s\" returned n=%d\n", buffer, n); <===
                pclose(fp);
                continue;
            }
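The fopen()/fclose() workaround mentioned above could look roughly like the sketch below. The names parse_stat_line() and read_stat_file() are hypothetical; the sscanf() format is the one from the report, which picks up field 3 (state) and field 24 of the stat line. No shell or cat child is forked, so there is nothing to reap when the process has already exited.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: extract the state character and field 24 from
 * one /proc/<pid>/stat line, using the format string from the report.
 * Returns 0 on success, -1 if the line does not parse. */
int parse_stat_line(const char *line, char *state, unsigned int *size)
{
    int n;

    assert(line != NULL && state != NULL && size != NULL);
    n = sscanf(line,
               "%*d %*s %c %*d %*d %*d %*d %*d %*u %*u %*u %*u %*u %*d %*d %*d %*d %*d"
               " %*d %*u %*u %*d %*u %u %*u", state, size);
    return n == 2 ? 0 : -1;
}

/* Hypothetical helper: read the stat file directly instead of
 * spawning "cat" through popen(). Returns 0 on success, -1 otherwise. */
int read_stat_file(const char *path, char *state, unsigned int *size)
{
    char buffer[1024];
    FILE *fp = fopen(path, "r");
    int rc = -1;

    if (fp == NULL)
        return -1;              /* e.g. the process no longer exists */

    if (fgets(buffer, sizeof(buffer), fp) != NULL)
        rc = parse_stat_line(buffer, state, size);

    fclose(fp);                 /* closed on every path after a successful fopen() */
    return rc;
}
```

In getstates(), the directory listing could likewise be read without popen(), but that change is outside what the report describes.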
Since this issue is beyond the scope of the current support status of RHEL2.1, I am closing it as WONTFIX.