Bug 784675

Summary: corosync master: regression(s) in service shutdown
Product: [Retired] Corosync Cluster Engine
Reporter: Fabio Massimo Di Nitto <fdinitto>
Component: unknown
Assignee: Fabio Massimo Di Nitto <fdinitto>
Status: CLOSED NEXTRELEASE
Severity: urgent
Priority: urgent
Version: 1.4
CC: asalkeld, jfriesse, sdake
Hardware: All
OS: All
Doc Type: Bug Fix
Last Closed: 2012-01-31 09:21:34 UTC

Description Fabio Massimo Di Nitto 2012-01-25 19:18:52 UTC
Description of problem:

exec_exit_fn is never executed when corosync is shut down.

Version-Release number of selected component (if applicable):

git master
HEAD: f25d5829f25442e6364228ae65e6f74c020ee00b

How reproducible:

Always

Steps to Reproduce:
1. Either define an exec_exit_fn that does something, or notice the missing log entries on shutdown, such as:
Jan 25 20:09:27 notice  [SERV  ] Service engine unloaded: corosync vote quorum service v1.0

2. Start corosync in foreground/debug mode
3. Hit Ctrl+C
  
Actual results:

^C
Jan 25 20:09:27 notice  [SERV  ] Unloading all Corosync service engines.
Jan 25 20:09:27 notice  [MAIN  ] Corosync Cluster Engine exiting normally


Expected results:

Jan 25 20:09:27 notice  [SERV  ] Unloading all Corosync service engines.
Jan 25 20:09:27 notice  [SERV  ] Service engine unloaded: corosync extended virtual synchrony service
[several services unload messages]
Jan 25 20:09:27 notice  [MAIN  ] Corosync Cluster Engine exiting normally

Additional info:

The problem is in corosync_service_unlink_priority:

-                       snprintf(key_name, ICMAP_KEYNAME_MAXLEN,
-                                       "internal_configuration.service.%u.handle",
-                                       corosync_service[*current_service_engine]->id);
-                       if (icmap_get_uint64(key_name, &found_service_handle) == CS_OK) {

The .handle icmap key does not exist anymore, so this lookup always fails and the check is no longer required.

With the check commented out, exec_exit_fn is executed again. The check is redundant because we only walk a list of known objects (the walk itself could also be improved to avoid unnecessary looping).
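
For illustration only, here is a minimal standalone sketch of why the stale check suppresses exec_exit_fn. It is not the corosync code: struct service_engine, lookup_handle_key and both unload functions are hypothetical stand-ins, and the lookup is hard-wired to fail, which is my assumption about how a lookup of a key that no longer exists behaves.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SERVICES 8

/* Hypothetical, simplified stand-in for a service engine entry. */
struct service_engine {
    const char *name;
    int (*exec_exit_fn)(void);      /* optional shutdown hook */
};

static int exit_calls;              /* number of hooks that actually ran */

static int count_exit(void)
{
    exit_calls++;
    return 0;
}

/* Stand-in for the lookup of the removed ".handle" key: always fails. */
static int lookup_handle_key(unsigned int id)
{
    (void)id;
    return -1;
}

/* Current behaviour: the hook is gated on a lookup that never succeeds. */
static int unload_with_stale_check(struct service_engine *table[], size_t n)
{
    int unloaded = 0;

    assert(table != NULL);
    for (size_t i = 0; i < n; i++) {
        if (table[i] == NULL)
            continue;               /* empty slot */
        if (lookup_handle_key((unsigned int)i) != 0)
            continue;               /* always taken: hook is skipped */
        if (table[i]->exec_exit_fn != NULL)
            table[i]->exec_exit_fn();
        table[i] = NULL;
        unloaded++;
    }
    return unloaded;
}

/* Without the check: every known entry gets its hook called. */
static int unload_without_check(struct service_engine *table[], size_t n)
{
    int unloaded = 0;

    assert(table != NULL);
    for (size_t i = 0; i < n; i++) {
        if (table[i] == NULL)
            continue;               /* empty slot */
        if (table[i]->exec_exit_fn != NULL)
            table[i]->exec_exit_fn();
        table[i] = NULL;
        unloaded++;
    }
    return unloaded;
}
```

In this model the first function unloads nothing and never calls a hook, which matches the missing "Service engine unloaded" log lines, while the second one calls the hook of every entry that defines one.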

Also note that the service shutdown order is not correct.

If we load services 1, 2, 3, 4, the expected shutdown order is 4, 3, 2, 1. The logs show that the unload order is the same as the load order.

This can lead to errors when shutting down service 4 if it depends on service 3 or ...
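
As a rough sketch of the expected ordering (again not the corosync implementation; struct service_slot and unload_reverse are hypothetical names, and the slot index is assumed to reflect load order), the unload loop would walk the table from the last loaded service down to the first:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical slot: index in the array is assumed to be the load order. */
struct service_slot {
    const char *name;
    int loaded;
};

/*
 * Unload every loaded slot, highest index first, and record the names
 * in the order they were unloaded. Returns the number of slots unloaded.
 * The caller must provide room for at least n entries in order[].
 */
static size_t unload_reverse(struct service_slot *slots, size_t n,
                             const char *order[])
{
    size_t count = 0;

    assert(slots != NULL && order != NULL);
    for (size_t i = n; i-- > 0; ) {     /* n-1, n-2, ..., 0 */
        if (!slots[i].loaded)
            continue;
        slots[i].loaded = 0;
        order[count++] = slots[i].name;
    }
    return count;
}
```

With services 1, 2, 3, 4 loaded in that order, this records 4, 3, 2, 1, so a service is always unloaded before anything it may depend on that was loaded earlier.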