Bug 579081

Summary: Openais client blocked indefinitely on semaphore when the server goes down
Product: Red Hat Enterprise Linux 5
Reporter: lech.pofelski
Component: openais
Assignee: Jan Friesse <jfriesse>
Status: CLOSED ERRATA
QA Contact: Cluster QE <mspqa-list>
Severity: urgent
Docs Contact:
Priority: urgent
Version: 5.5
CC: bzeranski, cluster-maint, iannis, jkortus, philippe.eveque, sdake, snagar, tao
Target Milestone: rc
Keywords: ZStream
Target Release: 5.6
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version: openais-0.80.6-20.el5
Doc Type: Bug Fix
Doc Text:
When an Openais client process sent an event to the server (aisexec), the server stopped working. When the client process tried to send the event using the saEvtEventPublish() function, it hung indefinitely on a semaphore (semop()) operation. Even when the server was restarted, the blocked client process was still blocked. With this update, the client process no longer hangs indefinitely and is unblocked after a set period of time.
Clone Of:
: 596359 (view as bug list) Environment:
Last Closed: 2011-01-13 23:56:12 UTC
Bug Depends On:    
Bug Blocks: 596359, 603615, 604795    
Attachments:
Description | Flags
semop replaced by timeouted semop | none
Fix for segmentation violation in saEvtDispatch() | none
Proposed patch for Z-stream | none
Proposed patch for Z-stream - try 2 | none
Proposed patch for evt dispatch | none
Proposed patch for Z-stream - try 3 | none
Improved proposal for evt dispatch fix - please remove printf(s) after testing! | none
evt.c - testing version | none

Description lech.pofelski 2010-04-02 13:59:58 UTC
Description of problem:
An Openais client process is about to send an event to the server (aisexec).
The communication channel with the server has been opened via saEvtChannelOpen().
Before the event is sent, the server goes down (stopped
or killed).
When the client process then tries to send the event using saEvtEventPublish(),
it hangs indefinitely on a semop() operation in openais_msg_send_reply_receive().
Even when the server aisexec is restarted, the blocked client process is NOT
unblocked.

This situation is not acceptable, esp. for real-time client processes, but 
even a non-real-time process should not block indefinitely.

The client stack shows the following blocked thread:
...
Thread 15 (Thread 0xb7f43b90 (LWP 23834)):
#0  0x003e5410 in __kernel_vsyscall ()
#1  0x00a265bb in semop () from /lib/libc.so.6
#2  0x001fd94a in openais_msg_send_reply_receive () from /usr/lib/openais/libSaEvt.so.2
#3  0x002011b0 in saEvtEventPublish () from /usr/lib/openais/libSaEvt.so.2
...

Version-Release number of selected component (if applicable):
openais-0.80.6-6 (RHEL5.4) and openais-0.80.6-16 (RHEL5.5)

How reproducible:
 e.g. using the openais test program publish.c

Steps to Reproduce:
- edit the openais test program publish.c
- add sleep(10) before saEvtEventPublish(), to be able to stop/kill 
  aisexec before this function is executed
- start aisexec
- start 'publish' executable
- when 'publish' executable is waiting on sleep(10), kill -9 aisexec process
- the 'publish' executable will hang 
- restart aisexec
- the 'publish' will still remain hung
- check details of 'publish' threads stack using a debugger (e.g. gdb)

  
Actual results:
 client hung indefinitely on semop(), even when the server is restarted

Expected results:
 - client must never hang indefinitely
   - either unblocked after some (short) timeout (e.g. 1 second) - in
     such a case it should close and reopen the communication channel
     with the server
   - or unblocked when the server is restarted, the communication 
     channel with the server is still valid and the blocked event is finally
     sent to the server without any additional action of the client
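
The first option above (unblocking after a short timeout) can be sketched with the Linux-specific semtimedop() call. This is only an illustration under stated assumptions, not the openais code: the helper names sem_wait_with_timeout() and wait_on_dead_server() are hypothetical, and the private semaphore that nobody ever posts merely stands in for a server that has gone away.

```c
#define _GNU_SOURCE /* exposes semtimedop() in glibc; must precede all includes */
#include <errno.h>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <sys/types.h>
#include <time.h>

/* Linux requires the caller to define union semun for semctl(). */
union semun {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/*
 * Hypothetical helper: decrement semaphore 0 of semid, giving up after
 * timeout_ms milliseconds.
 * Returns 0 on success, or -1 with errno set (EAGAIN on timeout).
 */
static int sem_wait_with_timeout(int semid, long timeout_ms)
{
    struct sembuf op = { .sem_num = 0, .sem_op = -1, .sem_flg = 0 };
    struct timespec timeout = {
        .tv_sec = timeout_ms / 1000,
        .tv_nsec = (timeout_ms % 1000) * 1000000L
    };

    return semtimedop(semid, &op, 1, &timeout);
}

/*
 * Hypothetical demo: simulate a dead server with a private semaphore whose
 * value stays at 0 because no peer ever posts it.
 * Returns the errno seen by the waiter (0 if the wait succeeded),
 * or -1 if the semaphore could not be set up.
 */
static int wait_on_dead_server(long timeout_ms)
{
    union semun arg = { .val = 0 };
    int semid = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);
    int observed;

    if (semid == -1)
        return -1;
    if (semctl(semid, 0, SETVAL, arg) == -1) {
        semctl(semid, 0, IPC_RMID);
        return -1;
    }
    observed = (sem_wait_with_timeout(semid, timeout_ms) == -1) ? errno : 0;
    semctl(semid, 0, IPC_RMID); /* always clean up the System V semaphore */
    return observed;
}
```

With a plain semop() the same wait would never return; here the caller gets control back after the timeout and can decide to close and reopen its channel.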

Additional info:

Comment 1 Steven Dake 2010-04-02 17:01:14 UTC
openais does not support restart of the server with connection recovery.  In the case aisexec is stopped and restarted, clients should not block but return immediately, returning SA_AIS_ERR_LIBRARY.

I can't think of a valid use case for the type of model described in expected results except live upgrade of running cluster nodes.  After long discussion at Red Hat, we don't support online upgrades of cluster software, but only rolling upgrades in Y versions.

Even in upstream corosync + openais service engines, this is not a model the community supports.  We require a new <area>_initialize operation of applications using corosync if the corosync executive is restarted. 
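
As an illustration of the model described above (the application re-initializes after the executive is restarted), a client-side sketch could look like the following. Everything here is a stand-in: the status codes, the client_ops callbacks and the fake_* server are hypothetical names invented for this example, not the SA Forum or openais API. Real code would call the relevant <area>_initialize / finalize functions and check for SA_AIS_ERR_LIBRARY.

```c
#include <stdbool.h>

/* Hypothetical status codes standing in for SA_AIS_OK / SA_AIS_ERR_LIBRARY. */
enum status { ST_OK, ST_ERR_LIBRARY };

/* Hypothetical callbacks standing in for the real initialize/finalize/publish calls. */
struct client_ops {
    enum status (*initialize)(void *ctx);
    void (*finalize)(void *ctx);
    enum status (*publish)(void *ctx, const char *event);
};

/*
 * Publish an event; if the library reports a broken connection, tear the
 * connection down, initialize again and retry, at most max_reinit times.
 */
static enum status publish_with_reinit(const struct client_ops *ops, void *ctx,
                                       const char *event, int max_reinit)
{
    enum status st = ops->publish(ctx, event);
    int attempt;

    for (attempt = 0; st == ST_ERR_LIBRARY && attempt < max_reinit; attempt++) {
        ops->finalize(ctx);
        if (ops->initialize(ctx) != ST_OK)
            continue; /* server still down: count the attempt and try again */
        st = ops->publish(ctx, event);
    }
    return st;
}

/* Hypothetical in-memory "server" used only to exercise the sketch. */
struct fake_server {
    bool up;        /* is the executive running? */
    bool connected; /* does the client hold a valid connection? */
    int published;  /* number of events accepted */
};

static enum status fake_initialize(void *ctx)
{
    struct fake_server *s = ctx;
    s->connected = s->up;
    return s->up ? ST_OK : ST_ERR_LIBRARY;
}

static void fake_finalize(void *ctx)
{
    struct fake_server *s = ctx;
    s->connected = false;
}

static enum status fake_publish(void *ctx, const char *event)
{
    struct fake_server *s = ctx;
    (void)event;
    if (!s->up || !s->connected)
        return ST_ERR_LIBRARY; /* stale or missing connection */
    s->published++;
    return ST_OK;
}

static const struct client_ops fake_ops = {
    fake_initialize, fake_finalize, fake_publish
};
```

A server that was restarted behind the client's back corresponds to up = true, connected = false: the first publish fails, the re-initialization succeeds, and the retry goes through.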

I will fix the blocking semop though, you're right that it is unacceptable.

Regards
-steve

Comment 2 lech.pofelski 2010-04-27 09:08:19 UTC
Created attachment 409403 [details]
semop replaced by timeouted semop

Comment 3 lech.pofelski 2010-04-27 09:09:57 UTC
Created attachment 409404 [details]
Fix for segmentation violation in saEvtDispatch()

Comment 4 lech.pofelski 2010-04-27 09:13:49 UTC
(In reply to comment #1)


Hello Steve,

OK for your proposal of the fix, i.e. return of SA_AIS_ERR_LIBRARY by clients
when the server is stopped or killed.

I have "simulated" this fix by using semop() with timeout.
This was successful for the client "producer", 
but in the client "consumer" I have encountered a problem
with segmentation violation in saEvtDispatch(), when
the dispatcher was trying to dispatch an empty message,
just after the server was killed. I have made
a quick fix for this (a test of a NULL pointer in saEvtDispatch()). 
Of course, my "fixes" are just workarounds and your fix
will probably have a clean behavior, without any impact on dispatch.
Anyway, I attach the source files I have modified in case
you want to have a look at them. 

Regards,

Lech

Comment 5 Steven Dake 2010-05-04 20:36:16 UTC
Honza,

Please look at resolving this for whitetank.

Comment 8 Jan Friesse 2010-05-24 13:36:26 UTC
Created attachment 416124 [details]
Proposed patch for Z-stream

The patch is based on the very same idea (semop with a timeout). Of course, semtimedop is not available on *BSD systems, so the new code is #ifdefed.

Patch is intended for Z-stream
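
The #ifdef approach described in this comment could look roughly like the sketch below. It is not the attached patch: the guard macro (__linux__), the function name timed_sem_wait(), the result codes and the polling fallback are all assumptions made for illustration. The idea is to use semtimedop() where it exists and otherwise poll with a non-blocking semop().

```c
#define _GNU_SOURCE /* exposes semtimedop() in glibc; must precede all includes */
#include <errno.h>
#include <stddef.h>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <sys/types.h>
#include <time.h>

/* Hypothetical result codes for this sketch (not SaAisErrorT values). */
enum wait_result { WAIT_OK = 0, WAIT_TIMEOUT = 1, WAIT_ERROR = 2 };

/* Decrement one semaphore, waiting at most timeout_ms milliseconds. */
static enum wait_result timed_sem_wait(int semid, unsigned short semnum,
                                       int timeout_ms)
{
#ifdef __linux__
    /* Linux: the kernel enforces the timeout for us. */
    struct sembuf op = { .sem_num = semnum, .sem_op = -1, .sem_flg = 0 };
    struct timespec ts = {
        .tv_sec = timeout_ms / 1000,
        .tv_nsec = (timeout_ms % 1000) * 1000000L
    };

    for (;;) {
        if (semtimedop(semid, &op, 1, &ts) == 0)
            return WAIT_OK;
        if (errno == EINTR)
            continue; /* restarts the full timeout; acceptable for a sketch */
        return errno == EAGAIN ? WAIT_TIMEOUT : WAIT_ERROR;
    }
#else
    /* Elsewhere: poll with a non-blocking semop() in 10 ms steps. */
    struct sembuf op = { .sem_num = semnum, .sem_op = -1, .sem_flg = IPC_NOWAIT };
    const struct timespec step = { 0, 10 * 1000000L };
    int waited_ms = 0;

    for (;;) {
        if (semop(semid, &op, 1) == 0)
            return WAIT_OK;
        if (errno != EAGAIN && errno != EINTR)
            return WAIT_ERROR;
        if (waited_ms >= timeout_ms)
            return WAIT_TIMEOUT;
        nanosleep(&step, NULL);
        waited_ms += 10;
    }
#endif
}
```

A caller in the library would map WAIT_TIMEOUT and WAIT_ERROR to an error return for the application instead of blocking forever.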

Comment 9 lech.pofelski 2010-05-26 12:52:16 UTC
Hello, 

Thanks for providing the unified diff for changes. 
I would like to perform some tests with the new source - could you also provide me 
with the URL of the sources on which the fix was made? 
Or maybe you have an rpm installable on RHEL 5.5?

By the way, I have noticed that in your patch only util.c is modified,
while I had also reported a problem with saEvtDispatch() in evt.c.
Does your fix also solve the problem with saEvtDispatch()?

Thanks in advance,

Best Regards,

Lech Pofelski

Comment 10 Jan Friesse 2010-05-26 13:30:07 UTC
Created attachment 416832 [details]
Proposed patch for Z-stream - try 2

The patch is based on the very same idea (semop with a timeout). Of course, semtimedop is not available on
*BSD systems, so the new code is #ifdefed.

Patch is intended for Z-stream 

This one also handles a very low-probability problem in openais_dispatch_recv.

Comment 11 Jan Friesse 2010-05-26 13:31:46 UTC
Created attachment 416834 [details]
Proposed patch for evt dispatch

Fixes the second problem found in 579081

The same fix is applied to the AMF, CFG, EVT, LCK and MSG services. Others (CPG, ...) seem to handle dispatch_avail == -1 correctly.

Comment 12 Jan Friesse 2010-05-26 13:41:12 UTC
(In reply to comment #9)

Hi,
it is based on top of the whitetank branch of upstream openais. I was able to apply it to the openais 0.80.6-16 srpm without problems (you can try honzaf.fedorapeople.org/openais-0.80.6-16.1.jf.src.rpm).

Two new patches are included. The first is a better version of the first patch, and the second one should fix the dispatch problem you had (although I was never able to get the segfault).

Comment 13 Jan Friesse 2010-05-26 13:57:54 UTC
Created attachment 416843 [details]
Proposed patch for Z-stream - try 3

The patch is based on the very same idea (semop with a timeout). Of course, semtimedop is not available on
*BSD systems, so the new code is #ifdefed.

Patch is intended for Z-stream 

This one also handles a very low-probability problem in openais_dispatch_recv.

(Previous patch was incorrect old version)

Comment 14 Steven Dake 2010-05-26 17:33:06 UTC
patch merged upstream.

Comment 15 lech.pofelski 2010-05-28 14:02:48 UTC
I have installed a patched source rpm (using make install) and run my application-level tests.
Unfortunately, my "consumer" was no longer receiving the events from the "producer".
So I tried, using a dichotomic (bisection) approach, to find out which of the 30 patches defined in the spec file
is responsible for this regression.
I found that up to and including patch 28 the regression does not appear, i.e. the consumer receives the events
sent by the producer. Patch 29 is the first one that introduces the regression.
Please try to understand where the problem comes from.
Let me know if you need me to activate some openais traces or run some additional tests.

Regards,

Lech Pofelski

Comment 16 Jan Friesse 2010-05-31 08:08:58 UTC
Hi,
since you are able to compile and install openais yourself, can you please try installing the latest openais whitetank branch from svn (http://openais.org/doku.php?id=developers)? Patch 29 is included there and, for me, everything works as expected. I'm using publish/subscription from the test directory for testing.

Also note that "Proposed patch for Z-stream - try 2" was incorrect and had this behavior (my src rpm contains the correct try 3).

Regards,
  Honza


Comment 17 lech.pofelski 2010-05-31 13:25:15 UTC
Hi,

I have retrieved the latest whitetank release from 

http://svn.fedorahosted.org/svn/openais/branches/whitetank/

Then I ran make and make install as usual (just changing
PREFIX in Makefile.inc from /usr/local to /usr, the same change I have made for
other releases).
Unfortunately, aisexec does not start. 
After:
 /usr/sbin/aisexec

there is no process aisexec started and no diagnostics appear in the command line.  I have even rebooted my test bed - no change.
How can I get some information from aisexec to know why it does not start?

Regards,

Lech

Comment 18 Jan Friesse 2010-05-31 13:33:52 UTC
I'm pretty sure that make install replaced your openais.conf file. Please try to restore that file with correct log output and run aisexec -f.

Regards,
  Honza


Comment 19 lech.pofelski 2010-05-31 14:17:51 UTC
Hi,

You are right, openais.conf was overridden.
So, I have replaced it by the one I used previously.
I tried to start /usr/sbin/aisexec -f, but it blocks indefinitely;
however, /usr/sbin/aisexec works.
Some good news: the 1st phase of my tests is OK (i.e. before killing aisexec).
After killing aisexec, I get the core dump (the same one I have already seen), pointing
to saEvtDispatch:

"Thread-0" id=10 idx=0x38 tid=23418 lastJavaFrame=0xb58890f0

Stack 0: start=0xb5868000, end=0xb588a000, guards=0xb586d000 (ok), forbidden=0xb586b000
Thread Stack Trace:
    at saEvtDispatch+664()@0xb7196f58
    -- Java stack --
    at com/hp/opencall/coam/ntf/consumerImpl/ConsumerImpl.dispatchNativeC(Lcom/hp/opencall/ais/DispatchFlags;)I(Native Method)
    at com/hp/opencall/coam/ntf/consumerImpl/ConsumerImpl.dispatch(ConsumerImpl.java:209)
    at com/hp/opencall/coam/ntf/test/Subscribe4jWithFailure.receiveNotifications(Subscribe4jWithFailure.java:134)
    at com/hp/opencall/coam/ntf/test/Subscribe4jWithFailure.run(Subscribe4jWithFailure.java:162)
    at jrockit/vm/RNI.c2java(IIIII)V(Native Method)
    -- end of trace

Extended, platform specific info:
libc release: 2.5-stable
Elf headers:
libc       ehdrs: EI: 7f454c46010101000000000000000000 ET: 3 EM: 3 V: 1 ENTRY: 00968fe0 PHOFF: 00000034 SHOFF: 00188b74 EF: 0x0 HS: 52 PS: 32 PHN; 10 SS: 40 SHN: 75 STIDX: 74
libpthread ehdrs: EI: 7f454c46010101000000000000000000 ET: 3 EM: 3 V: 1 ENTRY: 00aa5840 PHOFF: 00000034 SHOFF: 0001f474 EF: 0x0 HS: 52 PS: 32 PHN; 9 SS: 40 SHN: 40 STIDX: 39
libjvm     ehdrs: EI: 7f454c46010101000000000000000000 ET: 3 EM: 3 V: 1 ENTRY: 0004c460 PHOFF: 00000034 SHOFF: 012caf68 EF: 0x0 HS: 52 PS: 32 PHN; 4 SS: 40 SHN: 29 STIDX: 26

    **********************************************************
    *  If you see this dump, please go to                    *
    *  http://edocs.bea.com/jrockit/go2troubleshooting.html  *
    *  for troubleshooting information.                      *
    **********************************************************

===== END DUMP ===============================================================


Regards,

Lech

Comment 20 Jan Friesse 2010-05-31 14:41:05 UTC
Hi,
good news.

Please try applying https://bugzilla.redhat.com/attachment.cgi?id=409404 and see whether it helps.

By the way, did you change Makefile.inc? By default, openais installs to /usr/local/sbin, so you may be running an old version (I hope not, but just to be sure).

Regards,
  Honza


Comment 21 Jan Friesse 2010-05-31 14:43:06 UTC
Hi,
sorry, I selected the wrong link.

Can you please apply https://bugzilla.redhat.com/attachment.cgi?id=416834 and test?

Regards,
  Honza


Comment 22 lech.pofelski 2010-05-31 15:24:56 UTC
Created attachment 418327 [details]
Improved proposal for evt dispatch fix - please remove printf(s) after testing!

This improvement prevents an infinite loop in dispatching in some situations.

Comment 23 lech.pofelski 2010-05-31 15:30:28 UTC
Hello,

It is better now, but I still had a problem with infinite looping in evt dispatching, so I have added a line to evt.c to prevent it (see my new attachment). Also, please remove the printf() calls, which were used just for debugging, once the tests are finished. 
With the source evt.c as in the new attachment, my tests are OK.
Please let me know once you have put this fix into the official rpm archive, so that I can redo the tests with the rpm as it will be in the official delivery.

Thanks a lot.

Regards,

Lech

Comment 24 Jan Friesse 2010-05-31 15:50:36 UTC
Hi,
the problem is that I'm not able to find my proposed patch in the file you sent. I'm pretty sure the main problem is that nowhere in the code is dispatch_avail tested for -1. A value of -1 means that something bad happened in openais_dispatch_recv. Because it is never tested, the dispatch code will loop forever.

I think that with my patch https://bugzilla.redhat.com/attachment.cgi?id=416834 and correct test of value, everything should work as expected.
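
To illustrate the point about testing dispatch_avail, here is a minimal sketch of a dispatch loop that stops on -1. It is not the code from evt.c: the receive callback, its return convention (1 = message available, 0 = nothing pending, -1 = failure) and the scripted stub are assumptions made for this example.

```c
#include <stddef.h>

/* Hypothetical results; a real library would return SaAisErrorT values. */
enum dispatch_result { DISPATCH_OK, DISPATCH_ERR_LIBRARY };

/*
 * Hypothetical receive callback standing in for openais_dispatch_recv.
 * Assumed convention: 1 = message stored in *msg_out, 0 = nothing pending,
 * -1 = the connection to the server failed.
 */
typedef int (*recv_msg_fn)(void *ctx, int *msg_out);

/* Dispatch all pending messages, counting them in *handled. */
static enum dispatch_result dispatch_all(recv_msg_fn recv_msg, void *ctx,
                                         int *handled)
{
    *handled = 0;
    for (;;) {
        int msg = 0;
        int dispatch_avail = recv_msg(ctx, &msg);

        if (dispatch_avail == -1)
            return DISPATCH_ERR_LIBRARY; /* without this test the loop never ends */
        if (dispatch_avail == 0)
            return DISPATCH_OK;          /* nothing left to dispatch */
        (*handled)++;                    /* a real loop would invoke the callback for msg */
    }
}

/* Scripted stub: replays a fixed list of return values, then reports failure. */
struct script {
    const int *results;
    size_t len;
    size_t pos;
};

static int scripted_recv(void *ctx, int *msg_out)
{
    struct script *s = ctx;

    if (s->pos >= s->len)
        return -1;
    *msg_out = (int)s->pos;
    return s->results[s->pos++];
}
```

With a script of two messages followed by -1 (the server was killed), the loop handles both messages and then returns an error instead of spinning.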

I'm not saying that simply applying your patch doesn't work, but I'm trying to find the real reason why it is looping.

I don't understand how it is possible to get evt == NULL.

Regards,
  Honza


Comment 26 Jan Friesse 2010-06-07 11:26:37 UTC
Created attachment 421791 [details]
evt.c - testing version

Hi,
can you please try the attached evt.c and send me the output from your application? I would really like to find out where the REAL problem is.

Thanks,
  Honza

Comment 27 lech.pofelski 2010-06-07 13:11:38 UTC
Hi,

Last Saturday (i.e. 2010-06-05) I performed a number of basic functional tests of our application using openais, with the patch you recommended in your Comment 24, i.e.:

https://bugzilla.redhat.com/attachment.cgi?id=416834

The good news is that the tests were OK.
Later this week I plan to perform large-scale tests using our application integrated with some telecom applications.
If they pass, these tests will be for me the ultimate confirmation that the latest fixes to openais are correct.
I will keep you informed about the results of these large-scale tests.

By the way, in today's Comment 26 you ask me to perform tests with the attached file evt.c

(https://bugzilla.redhat.com/attachment.cgi?id=421791)

but note that this file contains the temporary fixes I proposed in (https://bugzilla.redhat.com/attachment.cgi?id=418327), so I don't see
any point in testing it once again, esp. since the patch you proposed in Comment 24 seems to be working.

Will keep you informed.

Regards,

Lech

Comment 28 Jan Friesse 2010-06-07 14:25:46 UTC
Hi,
since I hadn't seen any reaction, I took your patched file and added one printf of dispatch_avail plus my dispatch_avail patch, to check whether and how it works.

But since it looks like the dispatch_avail == -1 patch works for you, please just ignore the evt.c I sent.

Thanks for good news (and hopefully for future good news),
  Honza


Comment 29 lech.pofelski 2010-06-11 09:02:05 UTC
Hello,

I have performed large-scale tests using openais with the provided fixes, integrated with a telecom application. The tests were successful.

Regards,

Lech

Comment 30 Jan Friesse 2010-06-11 09:23:42 UTC
(In reply to comment #29)

Thanks for good news.

It's now up to Steve to build a new package with both patches applied.

Regards,
  Honza

Comment 44 Douglas Silas 2011-01-11 23:15:39 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
When the openais server was stopped, server clients may have encountered a segmentation fault because of an invalid return code in an internal function. This crash no longer occurs.

Comment 45 Douglas Silas 2011-01-11 23:20:24 UTC
    Technical note updated. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    Diffed Contents:
@@ -1 +1 @@
-When the openais server was stopped, server clients may have encountered a segmentation fault because of an invalid return code in an internal function. This crash no longer occurs.
+When an Openais client process sent an event to the server (aisexec), the server stopped working. When the client process tried to send the event using the saEvtEventPublish() function, it hung indefinitely on a semaphore (semop()) operation. Even when the server was restarted, the blocked client process was still blocked. With this update, the client process no longer hangs indefinitely and is unblocked after a set period of time.

Comment 47 errata-xmlrpc 2011-01-13 23:56:12 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-0100.html