Bug 1825159

Summary: ltrace occasionally reports unexpected SIGILL on multi-threaded testcase on rhel-8 aarch64
Product: Red Hat Enterprise Linux 8
Reporter: Edjunior Barbosa Machado <emachado>
Component: ltrace
Assignee: DJ Delorie <dj>
ltrace sub component: system-version
QA Contact: qe-baseos-tools-bugs
Status: CLOSED WONTFIX
Severity: unspecified
Priority: unspecified
CC: emachado, mcermak, ohudlick
Version: 8.3
Keywords: Triaged
Target Milestone: rc
Target Release: 8.0
Hardware: aarch64
OS: Unspecified
Last Closed: 2021-10-04 18:12:00 UTC
Type: Bug

Description Edjunior Barbosa Machado 2020-04-17 09:05:12 UTC
Description of problem:
On aarch64, rhel-8 ltrace (ltrace-0.7.91-28.el8) intermittently reports an unexpected SIGILL, together with "Couldn't read registers" messages, when tracing the testcase below:

[root@hpe-apollo-cn99xx-14-vm-10 random-apps]# cat mthd2.c 
#include <stdio.h>
#include <pthread.h>
#include <unistd.h>
#include <sys/types.h>
#include <signal.h>

#define NTHD 60

int thd_no;


long sub_func(void)
{
	uid_t uid;

	printf("sub-thread created: %d\n", ++thd_no);

	for (;;) {
		uid = getuid();
		sleep(1);
	}

	return (long)uid;
}

void *sub_thd(void *c)
{
	sub_func();
}

main(int argc, char *argv[])
{
	int i;
	pthread_t thd[NTHD];

	printf("test start...\n");

	for (i = 0; i < NTHD; i++) {
		pthread_create(&thd[i], NULL, sub_thd, NULL);
	}
}
[root@hpe-apollo-cn99xx-14-vm-10 random-apps]# gcc mthd2.c -lpthread -o mthd2
mthd2.c:31:1: warning: return type defaults to ‘int’ [-Wimplicit-int]
 main(int argc, char *argv[])
 ^~~~
[root@hpe-apollo-cn99xx-14-vm-10 random-apps]# ltrace -f ./mthd2 
(...)
[pid 1646438] pthread_create(0xffffe89b1830, 0, 0x4007e8, 0 <unfinished ...>
[pid 1646497] printf(0x400928, 59, 0xffff6c41f1e0, 0x6ccc1f9df9b51a41 <unfinished ...>
[pid 1646496] getuid(23, 0, 0xebf26c795e99b600, 0xffff6cc2f8e0sub-thread created: 59
 <unfinished ...>
[pid 1646497] <... printf resumed> )                                                                                                = 23
[pid 1646497] getuid(23, 0, 0xebf26c795e99b600, 0xffff6c41f8e0 <unfinished ...>
[pid 1646438] <... pthread_create resumed> )                                                                                        = 0
get_instruction_pointer: Couldn't read registers of 1646498.
[pid 1646498] --- SIGILL (Illegal instruction) ---
get_instruction_pointer: Couldn't read registers of 1646488.
syscall_p: Couldn't read registers of 1646488.
[pid 1646496] +++ exited (status 0) +++
[pid 1646494] +++ exited (status 0) +++
[pid 1646491] +++ exited (status 0) +++
[pid 1646487] +++ exited (status 0) +++
(...)
[pid 1646438] +++ exited (status 0) +++
[root@hpe-apollo-cn99xx-14-vm-10 random-apps]#

Version-Release number of selected component (if applicable):
ltrace-0.7.91-28.el8.aarch64
RHEL-8.3.0-20200415.n.0

How reproducible:
Intermittently

Steps to Reproduce:
1. gcc mthd2.c -lpthread -o mthd2
2. ltrace -f ./mthd2
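
Because the failure is intermittent, a single run of step 2 may not show it. A minimal sketch of repeating the run and counting how often the SIGILL line appears is given below. The helper name count_sigill_runs is hypothetical and not part of ltrace or of the original report; it assumes that ltrace writes its trace to stderr (hence the 2>&1) and that the signal is reported with the "--- SIGILL" marker seen in the output above.

```shell
# Hypothetical helper (not part of ltrace): run a command N times and
# print how many runs produced a "--- SIGILL" line on stdout or stderr.
count_sigill_runs() {
    runs=$1
    shift
    hits=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        # Merge stderr into stdout so the trace can be searched.
        if "$@" 2>&1 | grep -q -e '--- SIGILL'; then
            hits=$((hits + 1))
        fi
        i=$((i + 1))
    done
    echo "$hits"
}
```

For example, `count_sigill_runs 50 ltrace -f ./mthd2` would print the number of runs out of 50 in which the signal was reported. The count is only a rough indication, since the frequency may depend on machine load and CPU count.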

Comment 6 DJ Delorie 2021-10-04 18:12:00 UTC
I doubt we will ever have the resources to fix this, and given the lack of upstream resources, I doubt anyone else will fix it either.  As there doesn't appear to be a customer case attached to this, I am closing it as WONTFIX.  Please update the test case to account for this spurious failure, which may happen again in the future, so this bug doesn't get continuously re-opened ;-)
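
One possible way to make a test tolerate the spurious lines, assuming the test compares or greps ltrace's textual output, is to filter them out before checking. The sketch below is only an illustration: the function name filter_known_spurious is hypothetical, and the patterns are taken from the output quoted in the description rather than from any ltrace specification.

```shell
# Hypothetical helper: read ltrace output on stdin and drop the lines
# this bug is known to produce spuriously.
# Caveat: a genuine SIGILL would be hidden as well, so this is only
# suitable for test programs that are never expected to raise one.
filter_known_spurious() {
    grep -v \
        -e "Couldn't read registers of [0-9]*\.$" \
        -e '--- SIGILL (Illegal instruction) ---'
}
```

It could be used as `ltrace -f ./mthd2 2>&1 | filter_known_spurious`. Since the traced program's own stdout can interleave with the trace (as in the getuid line above), a line-based filter like this may still miss a mangled line.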