Bug 82433

Summary: Pthread program brings Seg.fault on only IA64.
Product: [Retired] Red Hat Linux
Reporter: Shinya Narahara <naraha_s>
Component: glibc
Assignee: Jakub Jelinek <jakub>
Status: CLOSED CURRENTRELEASE
QA Contact: Brian Brock <bbrock>
Severity: medium
Priority: medium
Version: 7.2
CC: drepper, fweimer
Hardware: ia64   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2003-11-10 23:24:36 UTC

Description Shinya Narahara 2003-01-22 03:14:58 UTC

Description of problem:
A simple program using the pthread library crashes with a segmentation fault.
The program is both multithreaded and multi-process: every thread calls fork().


Version-Release number of selected component (if applicable):
glibc-2.2.4-19.3

How reproducible:
Always

Steps to Reproduce:
1. Compile the test program.
2. Run it.

Actual Results:  Segmentation fault in malloc() or free().

Expected Results:  The program runs indefinitely without crashing.

Additional info:

The test program is appended at the end of this report.
The components are:
    kernel: 2.4.9 or 2.4.18
    glibc : 2.2.4-19.3
    glibc-devel : 2.2.4-19.3

This program works fine on the i386 platform, even on a machine with 4 processors.
If you comment out the call to DummyFork() in this program,
it appears to work fine.
We have many large products that use both multithreading and multiple processes,
but we cannot port them to the IA64 platform because of this issue.
Is this issue specific to this test program?
Do you have any countermeasures for this issue?

#include <pthread.h>
#include <stdlib.h>
#include <stdio.h>
#include <time.h>
#include <sys/types.h>
#include <unistd.h>
#include <sys/wait.h>
#include <string.h>

#define THREADS 10
#define _DEBUG_BLOCKS	0x100
#define LOCAL_RAND_MAX  0x10000

void DummyFork( void );		// fork(),waitpid(),sleep()
void ChunkTest( void );		// malloc(),free(),memset()/new,delete
void *TestThread(void *pVoid);	// thread function

int main()
{
	int iTest = 0;

	// Make test threads
	for(iTest = 0; iTest < THREADS; iTest++)
	{
		int iIsCreate = 0;
		pthread_t tTestThread;
		// Cast through long so the int-to-pointer conversion is size-safe on 64-bit targets
		iIsCreate = pthread_create(&tTestThread, 0, TestThread, (void *)(long)iTest);
		if(iIsCreate != 0)
			printf("%dth Thread create error.\n", iTest);
	}

	// Main loop, just do malloc/free and fork.
	TestThread( (void *)(long)-1 );
	return 0;
}


void *TestThread(void *pVoid)
{
	while(1)
	{
		printf( "thread No= %d\n", (int)(long)pVoid );
		ChunkTest();  // do malloc/memset/free temporarily
		DummyFork();  // do fork/wait temporarily
	}
	return NULL;  // Dummy
}


// This is just a dummy function.
// Fork a child that sleeps 1 sec and exits; the parent returns once the child is done.
void DummyFork()
{
	pid_t tPid = 0;
	tPid = fork();
	if(tPid == 0)
	{
		// Child
		sleep(1);
		exit(0);
	}
	if(tPid > 0)
		waitpid(tPid, NULL, 0);
}


// This function calls malloc() with random sizes, then free()s every block.
void ChunkTest( void )
{
	int iIndex = 0;
	void *pTest[_DEBUG_BLOCKS] = { [0 ... _DEBUG_BLOCKS-1] = NULL };

	srand((unsigned int)time(NULL));	// Initialize
	for(iIndex = 0; iIndex < _DEBUG_BLOCKS; iIndex++)
	{
		unsigned long uSize = 0;
		uSize = (unsigned long)(1.0*LOCAL_RAND_MAX*(double)rand()/RAND_MAX+1.0);
		pTest[iIndex] = (void *)malloc(uSize);
		if(pTest[iIndex] != NULL)
			memset(pTest[iIndex], 0x00, uSize);
	}

	for(iIndex = 0; iIndex < _DEBUG_BLOCKS; iIndex++)
		if(pTest[iIndex] != NULL) {
			free(pTest[iIndex]);
		}
}

Comment 1 Ulrich Drepper 2003-04-23 18:20:05 UTC
What release are you using?  What updates?  This is obviously not a retail
product, so I need more information.

Comment 2 Shinya Narahara 2003-04-24 00:07:15 UTC
I have already stated the version of the retail product:
Red Hat Linux 7.2 for the Itanium processor.
In fact, this issue occurs on retail RH 7.2 and with
all of the updated kernel/glibc packages from
ftp://updates.redhat.com/7.2/en/os/ia64/

Just try the test program above and you will see the issue quickly.

If you have tested the program and not seen any errors,
please tell us which retail product version, or which kernel/glibc, you used.

We guess this issue is caused by a system call that is not thread-safe.
So our crude countermeasure is to add pthread_mutex_lock()/pthread_mutex_unlock()
before and after every system call that the test program makes.
With that change it works fine, although the performance is not very good...
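
For readers who want to try the same countermeasure, here is a minimal
sketch of what such a coarse lock might look like. It is not the code we
actually used: the names g_call_lock, locked_malloc(), locked_free(),
locked_fork() and locked_fork_and_wait() are hypothetical, and which calls
to wrap is an assumption (here malloc(), free() and fork(); waitpid() is
left unlocked so that a blocked wait does not stall every other thread).

```c
/* Sketch only: every name below is hypothetical, not taken from the report. */
#include <pthread.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* One process-wide lock that serializes every wrapped call. */
static pthread_mutex_t g_call_lock = PTHREAD_MUTEX_INITIALIZER;

void *locked_malloc(size_t size)
{
	void *p;

	pthread_mutex_lock(&g_call_lock);
	p = malloc(size);
	pthread_mutex_unlock(&g_call_lock);
	return p;
}

void locked_free(void *p)
{
	pthread_mutex_lock(&g_call_lock);
	free(p);
	pthread_mutex_unlock(&g_call_lock);
}

pid_t locked_fork(void)
{
	pid_t pid;

	pthread_mutex_lock(&g_call_lock);
	pid = fork();
	/* Runs in both parent and child: each releases its own copy of the lock. */
	pthread_mutex_unlock(&g_call_lock);
	return pid;
}

/* Fork a child that exits at once, then wait for it.
   Returns the child's exit status, or -1 on any failure. */
int locked_fork_and_wait(void)
{
	int status = 0;
	pid_t pid = locked_fork();

	if (pid == 0)
		_exit(0);          /* child: leave without running atexit handlers */
	if (pid < 0)
		return -1;         /* fork failed */
	if (waitpid(pid, &status, 0) < 0)
		return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

In the test program, ChunkTest() would call locked_malloc()/locked_free()
and DummyFork() would call locked_fork(). Because only one thread at a time
can be inside a wrapped call, this trades throughput for safety, which
matches the slowdown described above.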

I have received questions about this Bugzilla #82433 from others
who have the same issue, so this is not an isolated problem.


Comment 3 Ulrich Drepper 2003-10-03 21:37:14 UTC
Have you tried AS2.1 or even better, RHEL3?  RHEL3 features a new thread
library, the only one for which we can aim at providing standard compliance.
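
To check which thread library a given system is actually running, something
like the following sketch may help. It assumes a glibc recent enough to
define _CS_GNU_LIBPTHREAD_VERSION for confstr(); the helper name
thread_lib_version() is made up for this example, and on systems without
that constant it simply reports that the information is unavailable.

```c
/* Sketch only: thread_lib_version() is a hypothetical helper name. */
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Copy the thread library's version string into buf (always NUL-terminated).
   Returns 0 on success, -1 if the information is not available. */
int thread_lib_version(char *buf, size_t len)
{
	if (len == 0)
		return -1;
	buf[0] = '\0';
#ifdef _CS_GNU_LIBPTHREAD_VERSION
	/* confstr() returns 0 when the name is not supported at run time. */
	if (confstr(_CS_GNU_LIBPTHREAD_VERSION, buf, len) > 0)
		return 0;
#endif
	return -1;
}
```

A caller could print the result with
printf("%s\n", buf) after a successful call. The new thread library is
expected to identify itself with a string beginning "NPTL", and the older
one with a string beginning "linuxthreads"; the exact text depends on the
installed glibc.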

Comment 4 Shinya Narahara 2003-10-07 00:41:10 UTC
We have tried it on RHEL3 Beta2 with kernel-2.4.21-1.1931.2.399.ent,
glibc-2.3.2-74, and gcc-3.2.3-16. It seems to work fine. Thanks for
the information.

However, the program with 100 threads is very heavy, even on an
8-CPU machine... ;-)



Comment 5 Ulrich Drepper 2003-11-10 23:24:36 UTC
What do you mean by "heavy"?  Do you see delays?

After beta 2, I fixed one bug which is especially noticeable on machines
with >= 4 processors.  The fix is in the RHEL release.  Please give
that version a try.  In any case, the original problem is fixed in
RHEL3, which is the first version for which we can truly say threads are
working (especially on ia64).  We won't backport any of the changes, so I
am closing the bug.