Description of problem:
SSIA

Version-Release number of selected component (if applicable):
Everywhere

How reproducible:
0.01%

Steps to Reproduce:
1. Apply the patches from https://bugzilla.redhat.com/show_bug.cgi?id=830799. Because they introduce additional traffic in the sync_* functions, sync is a little slower, so the chance of calling a sync_* function on an already unloaded service is higher, resulting in a SEGFAULT in the CPG service.
2. Run the CTD StopAll test (or an equivalent, i.e. something that just starts corosync on many nodes and stops corosync on many nodes in a cycle).

Actual results:
Segfault when unloading a service (corosync exit).

Expected results:
No segfault.

Additional info:
Patches fed7fc23e14e098dbb52842a4c79879a376f6ded and 6f6988afff632c6c5068becc855aa4a37a656183 are in upstream. They must be backported.
Created attachment 629238
2012-10-18-0003-Make-service_build-contain-correct-number-of-msgs

Make service_build contain correct number of msgs

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
(backported from commit a273be58ae97f192712661b2f5a19f8d89183065)
Created attachment 629239
2012-10-18-0002-Handle-sync-and-service-unload-correctly

Handle sync and service unload correctly

When sync has started and a service is unloaded in the meantime, sync can end up calling sync_* functions on the unloaded service.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
(backported from commit 6f6988afff632c6c5068becc855aa4a37a656183)
Created attachment 629240
2012-10-18-0001-Don-t-call-sync_-funcs-for-unloaded-services

Don't call sync_* funcs for unloaded services

When a service is unloaded, sync shouldn't call its sync_init|process|activate and abort functions. It happens very rarely, but while all services are being unloaded, totem can recreate the membership and bad things can happen (the service is unloaded, so there may be access to already freed memory, ...).

The solution is to fetch the services' sync handlers every time the service list is built instead of using a precreated list.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
(backported from commit fed7fc23e14e098dbb52842a4c79879a376f6ded)
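The idea described in the commit message above can be sketched roughly as follows. This is a simplified illustration, not the actual corosync code: `struct sync_handlers`, `loaded_services`, `build_sync_list`, `unload_service` and `run_sync_round` are hypothetical names made up for this example. It assumes that an unloaded service is represented by a NULL slot, and it rebuilds the handler list at the start of every sync round, so a service unloaded between rounds is never called.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SERVICES 4

/* Hypothetical, simplified stand-in for a service's sync callbacks. */
struct sync_handlers {
    const char *name;
    void (*sync_init)(void);
};

/* Assumption: a slot is NULL once the service has been unloaded. */
static struct sync_handlers *loaded_services[MAX_SERVICES];

/* Counts sync_init calls so the effect can be observed. */
static int init_calls;
static void demo_sync_init(void) { init_calls++; }

/* Build the list from the services loaded right now; returns the count. */
static size_t build_sync_list(struct sync_handlers **out, size_t cap)
{
    size_t n = 0;
    for (size_t i = 0; i < MAX_SERVICES && n < cap; i++) {
        /* Skip unloaded services and services without a sync handler. */
        if (loaded_services[i] != NULL && loaded_services[i]->sync_init != NULL) {
            out[n++] = loaded_services[i];
        }
    }
    return n;
}

/* Mark a service as unloaded; later sync rounds will not see it. */
static void unload_service(size_t idx)
{
    assert(idx < MAX_SERVICES);
    loaded_services[idx] = NULL;
}

/* One sync round: fetch a fresh list instead of reusing a precreated one. */
static size_t run_sync_round(void)
{
    struct sync_handlers *list[MAX_SERVICES];
    size_t n = build_sync_list(list, MAX_SERVICES);
    for (size_t i = 0; i < n; i++) {
        list[i]->sync_init();
    }
    return n;
}
```

With a list created once at startup, the second round in this model would still hold a pointer to the unloaded service. Rebuilding the list per round avoids that, at the cost of a short scan over the service table. The sketch does not cover a service being unloaded in the middle of a round.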
"Unit" test: https://github.com/jfriesse/csts/commit/0ce085de54ddefb249a684baf2079bbd815f5135 Before unit test, apply https://bugzilla.redhat.com/show_bug.cgi?id=830799 patches to easily reproduce segfault. Test is quiet reliable, but depends on HW (no cores, ...) because it's race condition. On one HW (VMs) success rate is 50%, on other (again VM) success rate is ~ 10%.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHBA-2013-0497.html