Description of problem:
DB2 is crashing intermittently on the machine. The problem occurs only when DB2 is configured to use Windows authentication via winbind. If Windows authentication is disabled, the problem does not occur.
How reproducible:
Intermittently and at random, multiple times per day.
Steps to Reproduce:
Customer is running DB2 9.5 on a RHEL 5.4 machine. They have it configured to use winbind for Windows authentication. During group lookups, DB2 will crash intermittently.
Actual results:
DB2 crashes. The stack trace indicates that it was in the pthread library, called from the winbind library.
Expected results:
DB2 should not crash.
Additional info: Our DB2 support team got a stack trace at the time of the failure:
Signal #7 (SIGBUS): si_addr is 0x0000000000000000, si_code is 0x00000080
(SI_KERNEL:Send by kernel.)
---FUNC-ADDR---- ------FUNCTION + OFFSET------
00002B7B3EADAF7F ossDumpStackTraceEx + 0x01f7
00002B7B3EAD6ABA _ZN11OSSTrapFile6dumpExEmiP7siginfoPvm + 0x00b4
00002B7B3EAD6B81 _ZN11OSSTrapFile4dumpEmiP7siginfoPv + 0x0009
00002B7B3B04D103 sqlo_trce + 0x03f3
00002B7B3B08CFC5 sqloEDUCodeTrapHandler + 0x0101
0000003303E0E930 address: 0x0000003303E0E930 ; dladdress: 0x0000003303E00000 ; offset in lib: 0x000000000000E930 ;
00002B7B5C1426B0 address: 0x00002B7B5C1426B0 ; dladdress: 0x00002B7B5C141000 ; offset in lib: 0x00000000000016B0 ;
00002B7B5C14276D read_reply + 0x003d
00002B7B5C142803 winbindd_get_response + 0x0033
00002B7B5C142E70 winbindd_request_response + 0x0040
00002B7B5C143E65 _nss_winbind_getpwnam_r + 0x00a5
0000003303299215 getpwnam_r + 0x00a5
00002B7B39DC6C9E sqloGetUserAttribByName + 0x00ec
_Z30sqloGetUserAttribByNameWrapperPcP21SQLO_USER_ATTRIB_DATA + 0x0006
It appears that the SIGBUS is raised after the winbind code calls into the pthread library. This led to concerns that the nss_winbind library is not thread-safe (DB2 is multithreaded). Hits on the Samba mailing list seem to indicate that it may not be.
This problem is impacting the customer greatly as they are forced to restart DB2 numerous times during the day.
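The thread-safety concern could in principle be exercised outside DB2 with a small stress program that calls getpwnam_r() from several threads at once, mirroring the getpwnam_r -> _nss_winbind_getpwnam_r path in the trace. The following is only a sketch under assumptions: the function name run_lookup_stress, the thread limit, and the buffer size are made up for illustration, and it has not been confirmed to reproduce this particular crash.

```c
/* Hypothetical stress test: concurrent getpwnam_r() calls.
 * Build (assumed flags): gcc -std=gnu99 -pthread stress.c */
#include <assert.h>
#include <pthread.h>
#include <pwd.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

#define MAX_THREADS 64            /* arbitrary cap chosen for this sketch */

struct worker_args {
    const char *name;             /* account name to look up */
    int iterations;               /* lookups performed by this thread */
    int completed;                /* calls that returned, found or not */
};

static void *lookup_worker(void *p)
{
    struct worker_args *a = p;
    char buf[16384];              /* scratch buffer for the passwd strings */
    struct passwd pw, *res;

    for (int i = 0; i < a->iterations; i++) {
        res = NULL;
        /* Returns 0 on success or "not found", an errno value on error.
         * The result is ignored here; the point is only to survive the call. */
        (void)getpwnam_r(a->name, &pw, buf, sizeof buf, &res);
        a->completed++;
    }
    return NULL;
}

/* Runs nthreads workers, each doing `iterations` lookups of `name`.
 * Returns the total number of lookups that returned to the caller. */
int run_lookup_stress(const char *name, int nthreads, int iterations)
{
    pthread_t tids[MAX_THREADS];
    struct worker_args args[MAX_THREADS];
    int started = 0, total = 0;

    assert(nthreads > 0 && nthreads <= MAX_THREADS);

    for (int i = 0; i < nthreads; i++) {
        args[i] = (struct worker_args){ name, iterations, 0 };
        if (pthread_create(&tids[i], NULL, lookup_worker, &args[i]) != 0)
            break;                /* stop starting threads on failure */
        started++;
    }
    for (int i = 0; i < started; i++) {
        pthread_join(tids[i], NULL);
        total += args[i].completed;
    }
    return total;
}
```

Calling run_lookup_stress() from a small main() with a domain account that is resolved through winbind (with winbind listed in /etc/nsswitch.conf) would drive the same NSS entry point from multiple threads. If the process dies with a signal inside read_reply or winbindd_get_response, that would support the thread-safety theory; a clean run would not rule it out, since the failure is intermittent.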
Version-Release number of selected component (if applicable):
Can you please test with the samba3x packages?
In samba3x we have thread safety patches for nss_winbind.
This bug can't be addressed in samba, only in samba3x.
We created a separate bug for that: https://bugzilla.redhat.com/show_bug.cgi?id=599051
While testing with the samba3x packages, could we get more information as to why this bug can't be addressed in samba and only in samba3x? IBM will need to explain in detail to the customer why that is.
Thanks for the responses!
We do not have enough infrastructure in 3.0.x to address this problem.