Bug 10985 - smp, pthreads and stdio
Status: CLOSED CURRENTRELEASE
Product: Red Hat Linux
Classification: Retired
Component: kernel
Version: 4.2
Hardware: i386 Linux
Priority: medium  Severity: medium
Assigned To: Michael K. Johnson
Reported: 2000-04-22 09:42 EDT by simra
Modified: 2008-05-01 11:37 EDT
Doc Type: Bug Fix
Last Closed: 2000-05-02 14:20:48 EDT
Description simra 2000-04-22 09:42:05 EDT
Hi,

I have a fresh RH 6.2 installation on a dual-cpu P3.
simra@Chess:[simra] 3>rpm -q glibc
glibc-2.1.3-15
simra@Chess:[simra] 4>ls -l /lib/libpthread*
-rwxr-xr-x    1 root     root       289906 Feb 29 16:58 /lib/libpthread-0.8.so*
lrwxrwxrwx    1 root     root           17 Apr 17 17:46 /lib/libpthread.so.0 -> libpthread-0.8.so*
simra@Chess:[simra] 5>uname -a
Linux Chess.McRCIM.McGill.EDU 2.2.14-5.0smp #1 SMP Tue Mar 7 21:01:40 EST 2000 i686 unknown


The following short program works fine on a single-processor machine
but aborts on a dual-processor one.  I haven't tried it yet with open(2)
and read(2), so I'm not sure whether it's a libc problem or a kernel
problem.  In any case, it's a serious problem.

Btw, if I guard the inside of the while loop with a mutex, the same
problem occurs.  That seems to point to the thread migrating between
processors (if that does indeed occur), and it also suggests that it's
not a libc problem.



/************************************************************************
gcc -Wall -o threadtest -D_REENTRANT threadtest.c -lpthread

Usage: ./threadtest

***********************************************************************/

#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <string.h>

#define maxthreads 4
void * runthread(void* args);

int
main(int argc, char ** argv) {
  int t;
  pthread_t threads[maxthreads];

  t=0;
  for (t=0; t<maxthreads; t++) {
    fprintf(stderr,"Spawning thread %d\n", t);
    pthread_create(&threads[t], NULL,
                   runthread, argv[0]);
  }
  sleep(10000);
  exit(0);
}

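/* Each thread repeatedly reopens this source file and checks that the
   first line read back starts with the file's opening comment marker. */
void * runthread(void* args) {
  char buffer[1024];
  while (1) {
    FILE* fp=fopen("threadtest.c","r");
    if (!fp) {
      perror((char*)args);
      exit(1);
    }
    if (!fgets(buffer, 1023, fp)) {
      perror("Failed fgets");
      exit(1);
    }
    /* fgets reported success, so a mismatch here means bad buffer contents */
    if (strncmp(buffer, "/*",2)) {
      fprintf(stderr, "Failed on *%s*\n",buffer);
      abort();
    } else fprintf(stderr,"%ld Success..\n",(long)fp);
    fclose(fp);
  }
  pthread_exit(0);  /* not reached */
}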
Comment 1 simra 2000-04-22 16:15:59 EDT
Note:
I have since reimplemented the program using open(2) and read(2), and that version
does not abort, so the problem is with fgets in libc.  I have also seen
this problem with the GNU extension 'getline'.
Comment 2 simra 2000-05-02 12:25:59 EDT
I'm posting some additional comments from Ulrich Drepper re my bug report to the
libc people.  A second individual was unable to reproduce the bug with
libc-2.1.2 and his own custom-built 2.2.14 kernel.  I'm beginning to wonder if
it's a problem with the RH prebuilt SMP kernel or a HW problem.  If it's a HW
problem, it's not specific to a single machine: I can reproduce the bug on 10
identical machines in our lab.

Date: 30 Apr 2000 12:18:50 -0700
Subject: Re: libc/1706: libpthread, multiprocessor linux and fgets/getline

Robert Sim  <simra@cim.mcgill.ca> writes:

> >Description:
>
> Compile the program supplied below as per the comment line and execute on a
> multi-processor machine.  It will eventually abort on the abort() instruction
> because fgets failed to read the expected bytes into the buffer (in spite of
> returning a success).  The program executes fine on a single-processor
> machine, and also works fine if I replace fopen and fgets with the equivalent
> open(2) and read(2) calls.  I have also observed this behaviour using the gnu
> getline extension.

I cannot reproduce this.  The setup is almost identical.  The only
notable difference is that I'm using a 2.3.99pre6 kernel.

I have the process running for more than an hour and everything works
fine.
Comment 3 simra 2000-05-02 14:20:59 EDT
RESOLVED: by upgrading the kernel to 2.2.14-6.1.1

It disturbs me that Red Hat is doing its own kernel patches. I can't report a bug
like this to the linux-kernel people because I'm not using a standard 2.2.14
kernel but some specially patched version of 2.2.14.
Comment 4 Alan Cox 2000-08-08 17:00:23 EDT
And yes, there are folks at Red Hat who deal with both RH and standard tree bugs.
We pretty much have to ship a non-default tree.
