Bug 57443

Summary: multithreaded program aborts when exception thrown
Product: [Retired] Red Hat Linux
Reporter: hackenyo
Component: gcc
Assignee: Jakub Jelinek <jakub>
Status: CLOSED ERRATA
QA Contact: Brian Brock <bbrock>
Severity: medium
Priority: medium
Version: 7.1
CC: pbrown
Hardware: ia64
OS: Linux
Doc Type: Bug Fix
Last Closed: 2002-07-26 21:47:26 UTC

Description hackenyo 2001-12-12 17:40:31 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:0.9.3)
Gecko/20010801

Description of problem:
A program that spins off a number of threads, where each thread loops a
number of times and each iteration throws an exception that is immediately
caught, will randomly abort.

Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
Compile and run the following program:

#include <unistd.h> //looking for sleep()
#include <pthread.h>

extern "C" {
  void* routine1(void*);
}

int count;
pthread_mutex_t threadControl = PTHREAD_MUTEX_INITIALIZER;

pthread_t threadHandles[50];

void* routine1(void *)
{
  pthread_mutex_lock(&threadControl);
  count++;
  pthread_mutex_unlock(&threadControl);
  
  for (unsigned j=0; j < 3100; j++) {
    try
    {
      if(0)
      {
        /*should never get in here*/
      }
      else
      {
        throw(j);
      }
    }
    catch( unsigned&  j)
    {
      /*do nothing*/
    }   
  } 

  pthread_mutex_lock(&threadControl);
  count--;
  pthread_mutex_unlock(&threadControl);
  
  return NULL;
}

int main()
{
  int thread_cnt  = 49;
   
  pthread_mutex_lock(&threadControl);
  count = 0;
  pthread_mutex_unlock(&threadControl);
  
  //spin off 49 threads, each with its own handle
  for (int i = 0; i < thread_cnt; i++) {
    pthread_create(&threadHandles[i], 0, routine1, NULL);
  }

  // wait for all the threads to finish
  for (int i = 0; i < thread_cnt; i++) {
    pthread_join(threadHandles[i], NULL);
  }
  
  return 0;
}

	

Actual Results: Over repeated runs, the program aborts in a random
iteration of a random thread.

Expected Results:  Should run to completion with no abort.

Additional info:

The same program runs successfully when built with the 32-bit gcc 2.96 compiler.

Comment 1 Preston Brown 2002-01-07 20:17:31 UTC
Are you using the original gcc that shipped with Red Hat Linux 7.1/ia64 or the
errata compiler?

Have you tried using Red Hat Linux 7.2/ia64?


Comment 2 hackenyo 2002-01-07 21:38:50 UTC
  We are using the original compiler shipped with Red Hat Linux 7.1. By "errata
compiler", are you referring to the ia64-specific patches in the 2001-10-19
glibc-common (RHBA-2001-121) GNU C Library bugfix update listed at
http://www.redhat.com/support/errata/rh71-errata.html? If not, could you point
out where I might find it?

Thanks for responding,
Dave

Comment 3 Preston Brown 2002-01-11 21:17:08 UTC
My mistake, there has not been an errata compiler release for Red Hat Linux 7.1
on the Itanium platform.

The gcc engineers are looking into this problem.



Comment 4 hackenyo 2002-01-11 21:25:18 UTC
We are moving ahead with installing and supporting Red Hat Linux 7.2. I will let
you know if the newer compiler fixes this multithreading problem.
Dave

Comment 5 hackenyo 2002-01-25 17:31:35 UTC
We have installed Red Hat Linux 7.2 and the test case still fails. Please let us
know if a public patch will be provided for this problem. It has a direct
bearing on our ability to support multithreaded builds on this platform.

Comment 6 Jakub Jelinek 2002-03-13 20:58:32 UTC
The following patch should fix it:
2002-03-13  Jakub Jelinek  <jakub>

        * config/ia64/frame-ia64.c (execute_one_ia64_descriptor): Don't
        use static variables.
        (__build_ia64_frame_state): Add automatic region_header variable,
        initialize it and pass address of it to execute_one_ia64_descriptor.

--- gcc/config/ia64/frame-ia64.c.jj     Wed Mar 13 14:18:10 2002
+++ gcc/config/ia64/frame-ia64.c        Wed Mar 13 15:42:13 2002
@@ -650,20 +650,18 @@ init_ia64_unwind_frame (frame)
    the return value is a pointer to the start of the next descriptor.  */

 static void *
-execute_one_ia64_descriptor (addr, frame, len)
+execute_one_ia64_descriptor (addr, frame, len, header)
      void *addr;
      ia64_frame_state *frame;
      long *len;
+     unwind_record *header;
 {
   unwind_record r;
-  /* The last region_header.  Needed to distinguish between prologue and body
-     descriptors.  Also needed for length of P4 format.  */
-  static unwind_record region_header;
   ia64_reg_loc *loc_ptr = NULL;
   int grmask = 0, frmask = 0;

   *len = -1;
-  addr = get_unwind_record (&region_header, &r, addr);
+  addr = get_unwind_record (header, &r, addr);

   /* Process it in 2 phases, the first phase will either do the work,
      or set up a pointer to the records we care about
@@ -674,7 +672,7 @@ execute_one_ia64_descriptor (addr, frame
       case prologue:
       case body:
        *len = r.record.r.rlen;
-       memcpy (&region_header, &r, sizeof (unwind_record));
+       memcpy (header, &r, sizeof (unwind_record));
        break;
       case prologue_gr:
         {
@@ -707,7 +705,7 @@ execute_one_ia64_descriptor (addr, frame
              frame->pr.loc_type  = IA64_UNW_LOC_TYPE_GR;
              frame->pr.l.regno = reg++;
            }
-         memcpy (&region_header, &r, sizeof (unwind_record));
+         memcpy (header, &r, sizeof (unwind_record));
          break;
        }
       case mem_stack_f:
@@ -1263,6 +1261,7 @@ __build_ia64_frame_state (pc, frame, bsp
   void *pc_base;
   int pc_offset;
   struct unwind_info_ptr *unw_info_ptr;
+  unwind_record region_header;

   entry = find_fde (pc, &pc_base);
   if (!entry)
@@ -1277,6 +1276,7 @@ __build_ia64_frame_state (pc, frame, bsp
   init_ia64_unwind_frame (frame);
   frame->my_bsp = bsp;
   frame->my_sp = sp;
+  region_header.type = prologue;

   /* Stop when we get to the end of the descriptor list, or if we
      encounter a region whose initial offset is already past the
@@ -1285,7 +1285,7 @@ __build_ia64_frame_state (pc, frame, bsp
   while (addr < end && pc_offset > region_offset)
     {
       /* First one must be a record header.  */
-      addr = execute_one_ia64_descriptor (addr, frame, &len);
+      addr = execute_one_ia64_descriptor (addr, frame, &len, &region_header);
       if (len > 0)
         {
          region_offset += last_region_size;

I'll build gcc-2.96-107 later tonight.
Note that this file was removed shortly before gcc 3.0 was released,
so this is not relevant to 3.0.x or 3.1.
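
For readers who want the idea behind the patch without the unwinder details,
here is a minimal sketch. It is not the frame-ia64.c code: Record,
parse_shared, parse_reentrant and walk_descriptors are hypothetical names
made up for illustration. The point it tries to show is that a function-local
static is a single object shared by every thread, so two threads unwinding at
the same time can overwrite each other's "last header", whereas a header owned
by the caller lives on that caller's stack and is private to the thread.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the unwinder's record type (an assumption,
// not the real unwind_record layout).
struct Record {
    int type;   // e.g. 0 = prologue, 1 = body
    long rlen;  // region length
};

// Before: the last header is kept in a function-local static. There is
// only one such object in the whole process, so concurrent callers race.
long parse_shared(const Record &next) {
    static Record last_header = {0, 0};  // shared by all threads
    long previous_len = last_header.rlen;
    last_header = next;                  // unsynchronized write
    return previous_len;
}

// After: the caller owns the header and passes its address in, which is
// the same shape as the extra "header" parameter added by the patch.
long parse_reentrant(const Record &next, Record *header) {
    long previous_len = header->rlen;
    *header = next;
    return previous_len;
}

// Each walk gets its own automatic header, so each thread that calls
// this function works on private state.
long walk_descriptors(const Record *records, std::size_t n) {
    assert(records != NULL || n == 0);
    Record header = {0, 0};  // automatic storage: one per call
    long total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        parse_reentrant(records[i], &header);
        total += header.rlen;
    }
    return total;
}
```

With this shape, two threads calling walk_descriptors at once never touch the
same header, so no locking is needed. parse_shared is kept only to show the
pattern the patch removes and should not be called from more than one thread.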

Comment 7 hackenyo 2002-03-18 23:31:40 UTC
Hello,
  This is great news. I have some questions about availability. Will we have to
wait for the next release of the Red Hat installation package (7.3?) to see the
fix for this multithreading issue, or can we download the fixed compiler? If we
can get the fixed compiler, what version would it be (gcc-2.96-107?) and when
would it be available? We are interested in providing our customers support for
multithreaded builds as soon as possible. Please let us know when and how this
multithread-capable ia64 compiler will be made available.
Thanks for your help in this matter,
Dave

Comment 8 Bill Nottingham 2002-07-26 21:47:26 UTC
An errata has been issued which should help the problem described in this bug report. 
This report is therefore being closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files, please follow the link below. You may reopen 
this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2002-055.html