Bug 490449 - domU's restart after cluster.conf update
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: rgmanager
Hardware: All
OS: Linux
Priority: urgent
Severity: high
Target Milestone: rc
Target Release: ---
Assigned To: Lon Hohberger
QA Contact: Cluster QE
Keywords: ZStream
Blocks: 491654
Reported: 2009-03-16 10:27 EDT by Shane Bradley
Modified: 2010-10-23 04:21 EDT (History)
12 users

Doc Type: Bug Fix
Last Closed: 2009-09-02 07:03:37 EDT

Attachments
Patch ported from stable3 which disables status checks during reconfiguration (4.10 KB, patch)
2009-03-17 18:10 EDT, Lon Hohberger
debug log from ncldl38076 (54.10 KB, text/plain)
2009-03-18 12:38 EDT, Samuel Kielek
Patch encompassing 3 additional patches: (6.34 KB, patch)
2009-03-18 17:25 EDT, Lon Hohberger
debug log from ncldl38077 (26.19 KB, text/plain)
2009-03-18 18:26 EDT, Samuel Kielek
debug log from ncldl38077 (33.60 KB, text/plain)
2009-03-18 20:05 EDT, Samuel Kielek
Fake virtual machine agent which can be used to reproduce/test issue (10.76 KB, text/plain)
2009-03-20 13:27 EDT, Lon Hohberger
Simple utility to bump configuration version and print out the new version (1.13 KB, text/plain)
2009-03-20 13:28 EDT, Lon Hohberger
Script which can be used to reproduce above behavior; edit as necessary (182 bytes, text/plain)
2009-03-20 13:29 EDT, Lon Hohberger

Description Shane Bradley 2009-03-16 10:27:52 EDT
User-Agent:       Mozilla/5.0 (X11; U; Linux i686; en-US; rv: Gecko/2009030503 Fedora/3.0.7-1.fc10 Firefox/3.0.7

It's a 3-node cluster used exclusively for dom0 clustering. All nodes
are running RHEL 5.3 AP. The server hardware is HP DL-580: 4-socket
quad-core (16 cores total), 64 GB of RAM, and two 4 Gb/s Emulex HBAs.
All the VMs use block-device pass-through (no file-backed guests).

If you look at the logs from one of the nodes (NCLDL58016), you will
see that the reconfig occurred on March 4th at 23:09:26:

Mar 4 23:09:26 ncldl58016 ccsd[19819]: Update of cluster.conf complete
(version 1 -> 2).

And then 33 seconds later the vifs and block devices all start coming down.

And finally at 23:11:09 the VM services start recovering.

In our test cluster, which very closely mirrors the production cluster
except for the number of guests, we have found that we can reproduce
this issue by simply incrementing the config_version in cluster.conf
and then updating the cluster with that config. So *any* cluster
reconfig will cause the VMs to get restarted.

Reproducible: Always

Steps to Reproduce:
1. Update cluster.conf
2. Save the changes and send them to all nodes
3. The VMs will restart

Actual Results:
The VM service restarts.

Expected Results:
cluster.conf should be saved and sent to all nodes, and running VMs should not restart.
Comment 4 Lon Hohberger 2009-03-17 17:41:00 EDT

Here's something worth noting:

Mar  4 23:09:41 ncldl58016 clurgmgrd[21222]: <info> Stopping changed resources.
[VMs stop]
Mar  4 23:12:15 ncldl58016 clurgmgrd[21222]: <info> Restarting changed resources.
Mar  4 23:12:15 ncldl58016 clurgmgrd[21222]: <info> Starting changed resources.
Mar  4 23:12:15 ncldl58016 clurgmgrd[21222]: <notice> Initializing vm:lnxp0006
Mar  4 23:12:15 ncldl58016 clurgmgrd[21222]: <notice> vm:lnxp0006 was added to the config, but I am not initializing it.

That's 2.5 minutes of 'idling'.  If there's nothing to do in the stop-phase after a configuration update, rgmanager immediately moves to 'restarting' followed by 'starting'.

rgmanager *must* have gotten confused about the configuration update -- the question is how and why, and why does rg_test work?
Comment 5 Lon Hohberger 2009-03-17 18:09:44 EDT
There's a possibility that this particular issue was caused by a status-check vs. reconfiguration ordering problem, which I have fixed in the master and stable3 branches already.
Comment 6 Lon Hohberger 2009-03-17 18:10:42 EDT
Created attachment 335614 [details]
Patch ported from stable3 which disables status checks during reconfiguration
Comment 8 Lon Hohberger 2009-03-18 09:33:51 EDT

^^ Explanation of previously-attached patch.

There is another patch which is in stable3 (and I think this customer already has it) which also helped somewhat:

Comment 10 Samuel Kielek 2009-03-18 12:38:04 EDT
Created attachment 335728 [details]
debug log from ncldl38076
Comment 13 Lon Hohberger 2009-03-18 17:25:36 EDT
Created attachment 335766 [details]
Patch encompassing 3 additional patches:

* Fix rare segfault in -USR1 dump code
* Ensure we don't reconfig on the same new config version twice
* Ensure we always call init_resource_groups() with the right # of parameters
* Block signals in worker threads so SIGHUP/SIGINT/etc. do not abort status checks erroneously.
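The second bullet above (not reconfiguring on the same new config version twice) amounts to a version guard. A minimal sketch, with hypothetical function and variable names since the actual patch is not shown here:

```c
#include <assert.h>

/* Hypothetical sketch of a "don't reconfig on the same new config
 * version twice" guard: remember the last version we acted on and
 * refuse to reconfigure on it again.  Names are illustrative, not
 * the actual rgmanager symbols. */
static int last_config_version = 1;

/* Returns 1 if a reconfiguration is performed, 0 if it is skipped. */
static int maybe_reconfig(int new_version)
{
    if (new_version <= last_config_version)
        return 0;               /* already acted on this version */
    last_config_version = new_version;
    /* ... compute tree delta, stop/restart/start changed resources ... */
    return 1;
}
```

Without such a guard, two notifications for the same cluster.conf version would run the stop/start delta twice.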
Comment 14 Lon Hohberger 2009-03-18 17:49:25 EDT
Buildable .src.rpm:

Comment 15 Samuel Kielek 2009-03-18 18:26:49 EDT
Created attachment 335774 [details]
debug log from ncldl38077
Comment 16 Lon Hohberger 2009-03-18 19:31:31 EDT
Updated SRPM


This one includes a scratch debugging patch to see if rgmanager actually believed it should restart/stop a VM after a reconfiguration.
Comment 17 Samuel Kielek 2009-03-18 20:05:54 EDT
Created attachment 335786 [details]
debug log from ncldl38077
Comment 18 Lon Hohberger 2009-03-18 21:46:27 EDT
Ok, so the current status here is as follows:

With all the above (working) patches, rgmanager still occasionally thinks it needs to stop a VM (or VMs).  It literally thinks that the VM has changed configuration after the configuration update, and for some reason flags it as needing a restart (or worse: just a stop).
Comment 20 Lon Hohberger 2009-03-19 08:14:00 EDT
I added a patch here which has debugging log messages.

Additionally, the patch zaps all flags during the tree delta in case any were set that we weren't aware of.  This -should- prevent the last instance of the VMs restarting, but why the 'RF_NEEDSTOP' flag was set in the first place is still not known.

Comment 21 Lon Hohberger 2009-03-19 10:31:54 EDT
I'm pretty sure the NEEDSTOP flag is getting set because of this:

* node A starts vm:foo.  Before starting vm:foo, it asks the rest of the cluster if they have seen vm:foo

* node B receives a status inquiry request from node A.  It then executes a status check on that VM to see if it is running.  It's not, so status returns 1.  At this point, node B sets a NEEDSTOP flag.

* Suppose you disable the VM on node A and start it on node B now.  At this point, the NEEDSTOP flag is still persisted on node B, but is ignored by the start/status checks.

* If you then do a configuration update, the NEEDSTOP flag is -still- there.  After a configuration update (or during a special "recover" operation), the NEEDSTOP flag is used by rgmanager to decide which resources need to be stopped.  Presence of this flag does NOT alter service state.

* Rgmanager does its reconfiguration, sees the NEEDSTOP flag, and stops the virtual machine.  Because the state has not actually changed according to rgmanager (NEEDSTOP is succeeded by NEEDSTART if a resource's parameters have changed, for example), the next status check causes a recovery of the VM and the VM is restarted.

The previous patch masks this issue, but in the wrong way - it clears the NEEDSTOP flag during reconfiguration.  The NEEDSTOP flag should either be cleared on resource start or not set during the special status inquiry operation.
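The lifecycle described above can be sketched as follows. The flag name matches the discussion; everything else (struct layout, function names) is illustrative, not the real rgmanager internals:

```c
#include <assert.h>

/* Hypothetical sketch of the NEEDSTOP lifecycle from this comment. */
#define RF_NEEDSTOP 0x1

struct resource {
    int flags;
    int running;
};

/* Pre-start status inquiry from another node: the resource is not
 * running here, so status "fails".  The bug: this also left
 * RF_NEEDSTOP set on a node that never ran the resource. */
static void status_inquiry(struct resource *r)
{
    if (!r->running)
        r->flags |= RF_NEEDSTOP;    /* stale flag persists (the bug) */
}

/* The minimal fix: clear the stale flag when the resource actually
 * starts on this node. */
static void resource_start(struct resource *r)
{
    r->flags &= ~RF_NEEDSTOP;
    r->running = 1;
}

/* Reconfiguration consults the flag to decide what to stop. */
static int reconfig_wants_stop(const struct resource *r)
{
    return (r->flags & RF_NEEDSTOP) != 0;
}
```

With the fix, a VM that was disabled on node A and later enabled on node B no longer carries a leftover NEEDSTOP into the next reconfiguration, so it is left alone.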
Comment 23 Lon Hohberger 2009-03-19 11:19:36 EDT

Same package which *only* fixes the problem described in comment #21
Comment 24 Samuel Kielek 2009-03-19 11:45:55 EDT
Confirming that the minimal patch resolved the issue on our test cluster.
Comment 25 Lon Hohberger 2009-03-19 11:50:14 EDT
This is a general problem which can occur on any cluster following the steps in comment #21.
Comment 29 Lon Hohberger 2009-03-20 13:27:03 EDT
Created attachment 336081 [details]
Fake virtual machine agent which can be used to reproduce/test issue

This agent fakes out rgmanager to make it think it's controlling a virtual machine.  This is not for use in production environments.  The only use for this attachment is testing the errant behavior present within the internals of rgmanager.

You can install this test utility in the following way:

  chmod -x /usr/share/cluster/vm.sh
  cp vm-test.sh /usr/share/cluster
  chmod +x /usr/share/cluster/vm-test.sh

You must distribute ssh keys between all hosts in the cluster in order for 'migrate' to work.

How To Test:

* create a virtual machine on at least a 2 node cluster (or use the fake script provided)
* Start rgmanager on all nodes.  The virtual machine will be started on one node.
* Disable the virtual machine using 'clusvcadm -d'.
* Enable the virtual machine explicitly on another node using 'clusvcadm -e'
* Change the configuration file by incrementing the config_version.
* Distribute the configuration file using ccs_tool update
* Ensure configuration version consistency in cman by running 'cman_tool version -r <new_config_version>'

Old (broken) behavior:

* Virtual machine which was disabled and enabled on another node will restart.

Corrected behavior:

* Virtual machine which was disabled and enabled on another node will not restart.
Comment 30 Lon Hohberger 2009-03-20 13:28:30 EDT
Created attachment 336082 [details]
Simple utility to bump configuration version and print out the new version
Comment 31 Lon Hohberger 2009-03-20 13:29:10 EDT
Created attachment 336083 [details]
Script which can be used to reproduce above behavior; edit as necessary
Comment 32 Lon Hohberger 2009-03-20 13:30:12 EDT
To compile attachment noted in comment #30, perform the following:

   gcc -o config-bump-xml config-bump-xml.c -I/usr/include/libxml2 -lxml2 -ggdb

You will need the libxml2-devel package installed.
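The attachment itself is not reproduced here; the real config-bump-xml.c presumably parses the XML with libxml2, as the compile line implies. As a self-contained illustration of what a version bump does, here is a simplified sketch using plain string handling (it only covers the case where the incremented number keeps the same digit count):

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Simplified, illustrative sketch of bumping config_version in a
 * cluster.conf fragment.  Rewrites the config_version="N" attribute
 * in place and returns the new version, or -1 on failure. */
static int bump_config_version(char *conf)
{
    char *p = strstr(conf, "config_version=\"");
    if (!p)
        return -1;
    p += strlen("config_version=\"");

    int old_ver = atoi(p);
    int new_ver = old_ver + 1;

    char old_s[16], new_s[16];
    snprintf(old_s, sizeof(old_s), "%d", old_ver);
    snprintf(new_s, sizeof(new_s), "%d", new_ver);
    if (strlen(old_s) != strlen(new_s))
        return -1;              /* digit-count change not handled here */

    memcpy(p, new_s, strlen(new_s));  /* rewrite the attribute in place */
    return new_ver;
}
```

A proper implementation would parse and re-serialize the XML rather than patch bytes, which is exactly why the real utility links against libxml2.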
Comment 34 Marc Grimme 2009-04-08 03:49:11 EDT
I can also confirm that the minimal RPM seems to fix this problem.

Will this bug fix get a hotfix or go into Z-Stream, or is it targeted for 5.4?

It's not yet flagged in any such way.

Comment 35 Lon Hohberger 2009-04-09 17:29:44 EDT
There's a z-stream bugzilla here:

Comment 38 Chris Ward 2009-07-03 14:27:17 EDT
~~ Attention - RHEL 5.4 Beta Released! ~~

RHEL 5.4 Beta has been released! There should be a fix present in the Beta release that addresses this particular request. Please test and report back results here, at your earliest convenience. RHEL 5.4 General Availability release is just around the corner!

If you encounter any issues while testing Beta, please describe the issues you have encountered and set the bug into NEED_INFO. If you encounter new issues, please clone this bug to open a new issue and request it be reviewed for inclusion in RHEL 5.4 or a later update, if it is not of urgent severity.

Please do not flip the bug status to VERIFIED. Only post your verification results, and if available, update Verified field with the appropriate value.

Questions can be posted to this bug or your customer or partner representative.
Comment 40 errata-xmlrpc 2009-09-02 07:03:37 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

