Bug 10985 - smp, pthreads and stdio
Summary: smp, pthreads and stdio
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Linux
Classification: Retired
Component: kernel
Version: 4.2
Hardware: i386
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Michael K. Johnson
Reported: 2000-04-22 13:42 UTC by simra
Modified: 2008-05-01 15:37 UTC (History)
Doc Type: Bug Fix
Last Closed: 2000-05-02 18:20:48 UTC

Description simra 2000-04-22 13:42:05 UTC
Hi,

I have a fresh RH 6.2 installation on a dual-cpu P3.
simra@Chess:[simra] 3>rpm -q glibc
glibc-2.1.3-15
simra@Chess:[simra] 4>ls -l /lib/libpthread*
-rwxr-xr-x    1 root     root       289906 Feb 29 16:58 /lib/libpthread-0.8.so*
lrwxrwxrwx    1 root     root           17 Apr 17 17:46 /lib/libpthread.so.0 -> libpthread-0.8.so*
simra@Chess:[simra] 5>uname -a
Linux Chess.McRCIM.McGill.EDU 2.2.14-5.0smp #1 SMP Tue Mar 7 21:01:40 EST 2000 i686 unknown


The following short program works fine on a single processor machine,
but aborts on a dual.  I haven't tried it yet w/ open(2) and read(2), so
I'm not sure if it's a libc problem or kernel problem.  In any case it's
a serious problem.

btw, if I guard the inside of the while loop with a mutex, the same
problem occurs, so it seems to have something to do with the thread
migrating between processors (if this does indeed occur), and also suggests
that it's not a libc problem.



/************************************************************************
gcc -Wall -o threadtest -D_REENTRANT threadtest.c -lpthread

Usage: ./threadtest

***********************************************************************/

#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <string.h>

#define maxthreads 4
void * runthread(void* args);

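int
main(int argc, char ** argv) {
  int t;
  pthread_t threads[maxthreads];

  for (t=0; t<maxthreads; t++) {
    int err;
    fprintf(stderr,"Spawning thread %d\n", t);
    /* pthread_create returns an error number rather than setting errno */
    err = pthread_create(&threads[t], NULL,
                         runthread, argv[0]);
    if (err) {
      fprintf(stderr, "pthread_create: %s\n", strerror(err));
      exit(1);
    }
  }
  /* keep the process alive while the worker threads loop */
  sleep(10000);
  exit(0);
}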

void * runthread(void* args) {
  char buffer[1024];
  while (1) {
    /* reopen this source file and read its first line on every pass */
    FILE* fp=fopen("threadtest.c","r");
    if (!fp) {
      perror((char*)args);
      exit(1);
    }
    if (!fgets(buffer, sizeof buffer, fp)) {
      perror("Failed fgets");
      exit(1);
    }
    /* the file starts with a comment, so any other prefix is a bad read */
    if (strncmp(buffer, "/*",2)) {
      fprintf(stderr, "Failed on *%s*\n",buffer);
      abort();
    } else fprintf(stderr,"%p Success..\n",(void*)fp);
    fclose(fp);
  }
  pthread_exit(0);
}
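
The description mentions that guarding the inside of the while loop with a mutex did not help, but the guarded version is not shown. A minimal sketch of what it might look like follows. The names io_lock, check_first_line and guarded_worker are hypothetical, and the loop is bounded here only so the example terminates.

```c
#include <pthread.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical lock; the report does not show the reporter's mutex code. */
static pthread_mutex_t io_lock = PTHREAD_MUTEX_INITIALIZER;

/* Read the first line of `path` while holding the lock and compare its
   start with `prefix`.  Returns 0 on match, 1 on mismatch, -1 on I/O error. */
static int check_first_line(const char *path, const char *prefix) {
  char buffer[1024];
  int result = -1;
  FILE *fp;

  pthread_mutex_lock(&io_lock);
  fp = fopen(path, "r");
  if (fp) {
    if (fgets(buffer, sizeof buffer, fp))
      result = strncmp(buffer, prefix, strlen(prefix)) ? 1 : 0;
    fclose(fp);
  }
  pthread_mutex_unlock(&io_lock);
  return result;
}

/* Thread body: repeat the check a bounded number of times and hand the
   first non-zero result back through the thread's return value. */
static void *guarded_worker(void *arg) {
  const char *path = arg;
  long status = 0;
  int i;

  for (i = 0; i < 1000 && status == 0; i++)
    status = check_first_line(path, "/*");
  return (void *)status;
}
```

With the whole fopen/fgets/fclose sequence serialized like this, only one thread is inside stdio at a time, which is why a failure in the guarded version pointed away from a simple locking problem in libc.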

Comment 1 simra 2000-04-22 20:15:59 UTC
Note:
I have since reimplemented the program using open(2) and read(2) and the program
does not abort- therefore the problem is with fgets in libc.  Also, I have seen
this problem with the gnu extension 'getline'.
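
The open(2)/read(2) reimplementation is not attached to this report. The following is only a sketch of how the same first-line check could be done without stdio. The helper names read_head and head_matches are hypothetical.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Read up to len-1 bytes from the start of `path` with open(2)/read(2),
   bypassing stdio buffering.  Returns the byte count, or -1 on error. */
static ssize_t read_head(const char *path, char *buf, size_t len) {
  ssize_t n;
  int fd;

  if (len == 0)
    return -1;
  fd = open(path, O_RDONLY);
  if (fd < 0)
    return -1;
  n = read(fd, buf, len - 1);
  close(fd);
  if (n >= 0)
    buf[n] = '\0';   /* terminate so strncmp can be used safely */
  return n;
}

/* Same check as the stdio version: does the file start with `prefix`?
   Returns 0 on match, 1 on mismatch, -1 on I/O error. */
static int head_matches(const char *path, const char *prefix) {
  char buf[1024];

  if (read_head(path, buf, sizeof buf) < 0)
    return -1;
  return strncmp(buf, prefix, strlen(prefix)) ? 1 : 0;
}
```

Calling head_matches("threadtest.c", "/*") from each thread in place of the fopen/fgets pair would give a variant of the kind described in this comment.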

Comment 2 simra 2000-05-02 16:25:59 UTC
I'm posting some additional comments from Ulrich Drepper re my bug report to the
libc people.  A second individual was unable to reproduce the bug with
libc-2.1.2 and his own custom-built 2.2.14 kernel.  I'm beginning to wonder if
it's a problem with the RH prebuilt SMP kernel or a HW problem.  If it's a HW
problem it's not specific to a single machine- I can reproduce the bug on 10
identical machines in our lab.

Date: 30 Apr 2000 12:18:50 -0700
Subject: Re: libc/1706: libpthread, multiprocessor linux and fgets/getline

Robert Sim  <simra.ca> writes:

> >Description:
>
> Compile the program supplied below as per the comment line and execute on a
> multi-processor machine.  It will eventually abort on the abort() instruction
> because fgets failed to read the expected bytes into the buffer (in spite of
> returning a success).  The program executes fine on a single-processor
> machine, and also works fine if I replace fopen and fgets with the equivalent
> open(2) and read(2) calls.  I have also observed this behaviour using the gnu
> getline extension.

I cannot reproduce this.  The setup is almost identical.  The only
notable difference is that I'm using a 2.3.99pre6 kernel.

I have the process running for more than an hour and everything works
fine.

Comment 3 simra 2000-05-02 18:20:59 UTC
RESOLVED: by upgrading the kernel to 2.2.14-6.1.1

It disturbs me that Red Hat is doing its own kernel patches. I can't report a bug
like this to the linux-kernel people because I'm not using a standard 2.2.14
kernel but some specially patched version of 2.2.14.

Comment 4 Alan Cox 2000-08-08 21:00:23 UTC
And yes there are folks at Red Hat who deal with both RH and standard tree bugs.
We pretty much have to ship a non-default tree.


