Bug 506772 - SAF Test lck: saLckResourceUnlockAsync/3
Status: CLOSED WONTFIX
Product: Fedora
Classification: Fedora
Component: openais
Version: 14
Hardware: All Linux
Priority: low
Severity: low
Assigned To: Ryan O'Hara
QA Contact: Fedora Extras Quality Assurance
Keywords: Triaged
Depends On:
Blocks: 561190
Reported: 2009-06-18 11:56 EDT by Jan Friesse
Modified: 2011-10-03 11:33 EDT

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 561190 (view as bug list)
Environment:
Last Closed: 2011-10-03 11:33:51 EDT


Attachments
3.c (7.04 KB, text/x-csrc)
2009-06-18 11:56 EDT, Jan Friesse
Return SA_AIS_OK when unlocking a pending lock request. (473 bytes, patch)
2009-06-29 19:10 EDT, Ryan O'Hara

Description Jan Friesse 2009-06-18 11:56:49 EDT
Created attachment 348509
3.c

Description of problem:
The test named in the subject (saLckResourceUnlockAsync/3) does not work, and it fails when run under valgrind.

Version-Release number of selected component (if applicable):
Trunk

How reproducible:
Run the attached file (3.c).

Actual results:
The test fails under valgrind.

Expected results:
The test passes.

Additional info:
This test appears to have two problems:
1. saLckResourceLock returns 5 (TIMEOUT). I really don't know why.
2. Valgrind detects an error in saLckResourceUnlockAsync.
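
For reading raw return codes such as the 5 above, a small decoder can help. This is a minimal sketch: the enum is a local mirror of the first few SaAisErrorT values (real code should include saAis.h and use the SA_AIS_* constants directly), and ais_error_name is a hypothetical helper, not part of openais.

```c
#include <assert.h>
#include <string.h>

/* Local mirror of the first few SaAisErrorT values, for illustration only.
 * Real code should include saAis.h and use the SA_AIS_* constants. */
enum ais_error {
	AIS_OK            = 1,
	AIS_ERR_LIBRARY   = 2,
	AIS_ERR_VERSION   = 3,
	AIS_ERR_INIT      = 4,
	AIS_ERR_TIMEOUT   = 5,
	AIS_ERR_TRY_AGAIN = 6
};

/* Hypothetical helper: map a numeric return code to a readable name. */
static inline const char *ais_error_name(int code)
{
	switch (code) {
	case AIS_OK:            return "SA_AIS_OK";
	case AIS_ERR_LIBRARY:   return "SA_AIS_ERR_LIBRARY";
	case AIS_ERR_VERSION:   return "SA_AIS_ERR_VERSION";
	case AIS_ERR_INIT:      return "SA_AIS_ERR_INIT";
	case AIS_ERR_TIMEOUT:   return "SA_AIS_ERR_TIMEOUT";
	case AIS_ERR_TRY_AGAIN: return "SA_AIS_ERR_TRY_AGAIN";
	default:                return "UNKNOWN";
	}
}
```

A test could then log ais_error_name(error) next to the numeric value instead of printing the bare number.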
Comment 1 Ryan O'Hara 2009-06-29 17:44:13 EDT
1. I fixed the problem with the return value. The patch will be attached below.
2. What error(s) do you see when using valgrind? I don't see any valgrind errors related to any of the lock service API calls.
Comment 2 Ryan O'Hara 2009-06-29 19:10:29 EDT
Created attachment 349889
Return SA_AIS_OK when unlocking a pending lock request.

This should fix the problem with the return value.
Comment 3 Jan Friesse 2009-06-30 06:04:14 EDT
Ryan,
Second needinfo, second piece of bad news: valgrind still shows an error:

[root@node-06 saLckResourceUnlockAsync]# valgrind ./3.test
==5275== Memcheck, a memory error detector.
==5275== Copyright (C) 2002-2008, and GNU GPL'd, by Julian Seward et al.
==5275== Using LibVEX rev 1884, a library for dynamic binary translation.
==5275== Copyright (C) 2004-2008, and GNU GPL'd, by OpenWorks LLP.
==5275== Using valgrind-3.4.1, a dynamic binary instrumentation framework.
==5275== Copyright (C) 2000-2008, and GNU GPL'd, by Julian Seward et al.
==5275== For more details, rerun with: -v
==5275==
[DEBUG]: saLckInitialize
==5275== Syscall param socketcall.sendmsg(msg.msg_iov[i]) points to uninitialised byte(s)
==5275==    at 0x4046451: sendmsg (in /lib/libpthread-2.10.1.so)
==5275==    by 0x41C8132: coroipcc_service_connect (coroipcc.c:642)
==5275==    by 0x4035753: saLckInitialize (lck.c:194)
==5275==    by 0x8048C14: main (3.c:125)
==5275==  Address 0xbec5221c is on thread 1's stack
[DEBUG]: saLckResourceOpen
[DEBUG]: saLckResourceLock
==5275==
==5275== Syscall param socketcall.sendmsg(msg.msg_iov[i]) points to uninitialised byte(s)
==5275==    at 0x4046451: sendmsg (in /lib/libpthread-2.10.1.so)
==5275==    by 0x41C8132: coroipcc_service_connect (coroipcc.c:642)
==5275==    by 0x4036955: saLckResourceLock (lck.c:913)
==5275==    by 0x8048DC3: main (3.c:155)
==5275==  Address 0xbec5207c is on thread 1's stack
[DEBUG]: saLckResourceLock
==5275==
==5275== Thread 2:
==5275== Syscall param socketcall.sendmsg(msg.msg_iov[i]) points to uninitialised byte(s)
==5275==    at 0x4046478: sendmsg (in /lib/libpthread-2.10.1.so)
==5275==    by 0x41C8132: coroipcc_service_connect (coroipcc.c:642)
==5275==    by 0x4036955: saLckResourceLock (lck.c:913)
==5275==    by 0x8048A03: lock_thread (3.c:48)
==5275==    by 0x403E934: start_thread (in /lib/libpthread-2.10.1.so)
==5275==    by 0x413282D: clone (in /lib/libc-2.10.1.so)
==5275==  Address 0x53ccfbc is on thread 2's stack
[DEBUG]: saLckResourceUnlockAsync
==5275==
==5275== Thread 1:
==5275== Invalid write of size 4
==5275==    at 0x403546B: list_del (list.h:71)
==5275==    by 0x4035543: lckLockIdInstanceFinalize (lck.c:113)
==5275==    by 0x4035E0A: saLckDispatch (lck.c:494)
==5275==    by 0x8049083: main (3.c:213)
==5275==  Address 0x4 is not stack'd, malloc'd or (recently) free'd
==5275==
==5275== Process terminating with default action of signal 11 (SIGSEGV)
==5275==  Access not within mapped region at address 0x4
==5275==    at 0x403546B: list_del (list.h:71)
==5275==    by 0x4035543: lckLockIdInstanceFinalize (lck.c:113)
==5275==    by 0x4035E0A: saLckDispatch (lck.c:494)
==5275==    by 0x8049083: main (3.c:213)
==5275==  If you believe this happened as a result of a stack overflow in your
==5275==  program's main thread (unlikely but possible), you can try to increase
==5275==  the size of the main thread stack using the --main-stacksize= flag.
==5275==  The main thread stack size used in this run was 10485760.
==5275==
==5275== ERROR SUMMARY: 4 errors from 4 contexts (suppressed: 19 from 1)
==5275== malloc/free: in use at exit: 844 bytes in 11 blocks.
==5275== malloc/free: 14 allocs, 3 frees, 952 bytes allocated.
==5275== For counts of detected errors, rerun with: -v
==5275== Use --track-origins=yes to see where uninitialised values come from
==5275== searching for pointers to 11 not-freed blocks.
==5275== checked 18,961,252 bytes.
==5275==
==5275== LEAK SUMMARY:
==5275==    definitely lost: 0 bytes in 0 blocks.
==5275==      possibly lost: 136 bytes in 1 blocks.
==5275==    still reachable: 708 bytes in 10 blocks.
==5275==         suppressed: 0 bytes in 0 blocks.
==5275== Rerun with --leak-check=full to see details of leaked memory.
Killed

The problem is that it is not 100% reproducible (I had to run it 20 times before I hit this).
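
One possible reading of the "Invalid write of size 4" at address 0x4 in list_del: on a 32-bit build, the prev field of a two-pointer list node sits at offset 4, so unlinking a node whose next pointer is NULL writes to address 0x4. The sketch below assumes list_del has the conventional two-pointer form; the struct and list_del_checked are illustrative and not copied from corosync's list.h. A NULL next pointer could mean, for example, that the lock id instance was never linked or was unlinked twice, but the trace alone does not say which.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative two-pointer list node (assumed layout, not corosync's code). */
struct list_head {
	struct list_head *next;
	struct list_head *prev;
};

static inline void list_init(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

static inline void list_add(struct list_head *node, struct list_head *head)
{
	assert(node != NULL && head != NULL);
	node->next = head->next;
	node->prev = head;
	head->next->prev = node;
	head->next = node;
}

/* Unlink a node. Returns 0 on success, or -1 if the node is not linked.
 * An unguarded version would execute node->next->prev = ... with
 * node->next == NULL, i.e. a write to offsetof(struct list_head, prev),
 * which is 4 on i386. */
static inline int list_del_checked(struct list_head *node)
{
	if (node->next == NULL || node->prev == NULL)
		return -1;
	node->next->prev = node->prev;
	node->prev->next = node->next;
	node->next = NULL;	/* mark as unlinked so a second delete is caught */
	node->prev = NULL;
	return 0;
}
```

With this guard a second unlink of the same node returns -1 instead of crashing, which would at least turn the intermittent SIGSEGV into a detectable error.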
Comment 4 Steven Dake 2009-09-28 11:43:28 EDT
Honza,

Please retry with openais 1.1.0.

Regards
-steve
Comment 5 Jan Friesse 2009-09-29 05:03:06 EDT
Retried with today's trunk of corosync and openais:
[root@node-06 saLckResourceUnlockAsync]# valgrind ./3.test
==23815== Memcheck, a memory error detector.
==23815== Copyright (C) 2002-2008, and GNU GPL'd, by Julian Seward et al.
==23815== Using LibVEX rev 1884, a library for dynamic binary translation.
==23815== Copyright (C) 2004-2008, and GNU GPL'd, by OpenWorks LLP.
==23815== Using valgrind-3.4.1, a dynamic binary instrumentation framework.
==23815== Copyright (C) 2000-2008, and GNU GPL'd, by Julian Seward et al.
==23815== For more details, rerun with: -v
==23815==
==23815== Syscall param socketcall.sendmsg(msg.msg_iov[i]) points to uninitialised byte(s)
==23815==    at 0x4046451: sendmsg (in /lib/libpthread-2.10.1.so)
==23815==    by 0x41C8035: coroipcc_service_connect (coroipcc.c:697)
==23815==    by 0x4035715: saLckInitialize (lck.c:191)
==23815==    by 0x8048C14: main (3.c:125)
==23815==  Address 0xbeab9204 is on thread 1's stack
==23815==
==23815== Syscall param socketcall.sendmsg(msg.msg_iov[i]) points to uninitialised byte(s)
==23815==    at 0x4046451: sendmsg (in /lib/libpthread-2.10.1.so)
==23815==    by 0x41C8035: coroipcc_service_connect (coroipcc.c:697)
==23815==    by 0x40368E5: saLckResourceLock (lck.c:884)
==23815==    by 0x8048DC3: main (3.c:155)
==23815==  Address 0xbeab9064 is on thread 1's stack
==23815==
==23815== Thread 2:
==23815== Syscall param socketcall.sendmsg(msg.msg_iov[i]) points to uninitialised byte(s)
==23815==    at 0x4046478: sendmsg (in /lib/libpthread-2.10.1.so)
==23815==    by 0x41C8035: coroipcc_service_connect (coroipcc.c:697)
==23815==    by 0x40368E5: saLckResourceLock (lck.c:884)
==23815==    by 0x8048A03: lock_thread (3.c:48)
==23815==    by 0x403E934: start_thread (in /lib/libpthread-2.10.1.so)
==23815==    by 0x413282D: clone (in /lib/libc-2.10.1.so)
==23815==  Address 0x53ccfc4 is on thread 2's stack
==23815==
==23815== Thread 1:
==23815== Invalid write of size 4
==23815==    at 0x403543B: list_del (list.h:71)
==23815==    by 0x4035513: lckLockIdInstanceFinalize (lck.c:113)
==23815==    by 0x4035DD2: saLckDispatch (lck.c:491)
==23815==    by 0x8049083: main (3.c:213)
==23815==  Address 0x4 is not stack'd, malloc'd or (recently) free'd
==23815==
==23815== Process terminating with default action of signal 11 (SIGSEGV)
==23815==  Access not within mapped region at address 0x4
==23815==    at 0x403543B: list_del (list.h:71)
==23815==    by 0x4035513: lckLockIdInstanceFinalize (lck.c:113)
==23815==    by 0x4035DD2: saLckDispatch (lck.c:491)
==23815==    by 0x8049083: main (3.c:213)
==23815==  If you believe this happened as a result of a stack overflow in your
==23815==  program's main thread (unlikely but possible), you can try to increase
==23815==  the size of the main thread stack using the --main-stacksize= flag.
==23815==  The main thread stack size used in this run was 10485760.
==23815==
==23815== ERROR SUMMARY: 4 errors from 4 contexts (suppressed: 19 from 1)
==23815== malloc/free: in use at exit: 828 bytes in 11 blocks.
==23815== malloc/free: 14 allocs, 3 frees, 928 bytes allocated.
==23815== For counts of detected errors, rerun with: -v
==23815== Use --track-origins=yes to see where uninitialised values come from
==23815== searching for pointers to 11 not-freed blocks.
==23815== checked 18,961,268 bytes.
==23815==
==23815== LEAK SUMMARY:
==23815==    definitely lost: 0 bytes in 0 blocks.
==23815==      possibly lost: 136 bytes in 1 blocks.
==23815==    still reachable: 692 bytes in 10 blocks.
==23815==         suppressed: 0 bytes in 0 blocks.
==23815== Rerun with --leak-check=full to see details of leaked memory.
Killed

So yes, the bug is still there.
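
As for the repeated "points to uninitialised byte(s)" warnings on sendmsg: valgrind reports that the buffer is on the calling thread's stack, which is what one would see if a request structure is filled field by field and some bytes (padding or unset fields) are never written. A common way to silence this is to zero the whole structure first. The sketch below uses a hypothetical struct lock_request; it is not the real openais request layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical request layout: the compiler inserts padding between
 * lock_mode and lock_id to align the 64-bit field. */
struct lock_request {
	uint8_t  lock_mode;
	uint64_t lock_id;
};

/* Zero the whole structure before setting fields, so padding bytes are
 * defined before the buffer is handed to sendmsg(). */
static inline void lock_request_init(struct lock_request *req,
				     uint8_t mode, uint64_t id)
{
	assert(req != NULL);
	memset(req, 0, sizeof(*req));
	req->lock_mode = mode;
	req->lock_id = id;
}

/* Return 1 if every padding byte between the two fields is zero. */
static inline int padding_is_zero(const struct lock_request *req)
{
	const unsigned char *bytes = (const unsigned char *)req;
	size_t i;

	for (i = offsetof(struct lock_request, lock_mode) + sizeof(req->lock_mode);
	     i < offsetof(struct lock_request, lock_id); i++) {
		if (bytes[i] != 0)
			return 0;
	}
	return 1;
}
```

These warnings are separate from the crash in list_del; they appear in every run, while the SIGSEGV is intermittent.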
Comment 6 Ryan O'Hara 2009-09-29 14:30:05 EDT
It is possible that this is due to differences in the type definitions in saAis.h. Steve and I discussed this while I was testing the MSG service with saftest. If I recall correctly, saftest uses its own type definitions for various integers, etc., and I believe they differed from the type definitions that the openais services are compiled with. This caused problems with a few tests, and Steve and I wondered whether it might also be the cause of subtle problems like this one.

The problem seen under valgrind seems to exist only on the i386 architecture. In other words, I cannot recreate it on x86_64. Steve, do you remember what we fixed in the saftest header file to make this work, and how?
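
To illustrate how mismatched integer typedefs could produce an i386-only failure, here is a minimal sketch. It assumes, purely hypothetically, that one header defines a 64-bit id with a type that is only 32 bits wide on i386 (for example unsigned long); this is modeled with uint32_t so the effect is visible on any host. All names here are invented for the example and are not taken from saftest or openais.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: the id type the library is compiled with. */
typedef uint64_t lib_lock_id_t;

/* Hypothetical: a test-suite header whose id type ends up 32 bits wide
 * on i386 (modeled here with uint32_t). */
typedef uint32_t test_lock_id_t;

struct lib_lock_args {
	lib_lock_id_t lock_id;
	uint32_t flags;
};

struct test_lock_args {
	test_lock_id_t lock_id;
	uint32_t flags;
};

static_assert(sizeof(lib_lock_id_t) == 8, "library id is 64 bits");

/* Return 1 only if both sides agree on size and field offsets. */
static inline int layouts_match(void)
{
	return sizeof(struct lib_lock_args) == sizeof(struct test_lock_args) &&
	       offsetof(struct lib_lock_args, flags) ==
	       offsetof(struct test_lock_args, flags);
}
```

If the two sides disagree like this, the library reads fields at offsets the caller never wrote, which could surface as the kind of subtle, architecture-specific corruption described above. On x86_64 the same hypothetical typedef would be 64 bits wide and the layouts would agree.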
Comment 7 Bug Zapper 2009-11-16 05:14:22 EST
This bug appears to have been reported against 'rawhide' during the Fedora 12 development cycle.
Changing version to '12'.

More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Comment 8 Bug Zapper 2010-07-30 06:41:16 EDT
This bug appears to have been reported against 'rawhide' during the Fedora 14 development cycle.
Changing version to '14'.

More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Comment 9 Ryan O'Hara 2011-10-03 11:33:51 EDT
Closing WONTFIX since openais will be going away in F17.
