Bug 807628

Summary: lvm2-cluster package update must restart clvmd
Product: Red Hat Enterprise Linux 6
Reporter: Milan Broz <mbroz>
Component: lvm2
Assignee: Peter Rajnoha <prajnoha>
Status: CLOSED ERRATA
QA Contact: Cluster QE <mspqa-list>
Severity: unspecified
Priority: high
Version: 6.3
CC: agk, cmarthal, dwysocha, heinzm, jbrassow, mbroz, msnitzer, nperic, prajnoha, prockai, pvrabec, thornber, zkabelac
Target Milestone: rc
Hardware: Unspecified
OS: Unspecified
Fixed In Version: lvm2-2.02.95-3.el6
Doc Type: Bug Fix
Doc Text: No documentation needed.
Last Closed: 2012-06-20 15:03:22 UTC

Description Milan Broz 2012-03-28 11:26:41 UTC
Description of problem:

Unlike in the older release (RHEL 5), clvmd is not restarted during a package update.

With the update to 6.3 there is a serious problem caused by an on-wire protocol change:
the VG_CLUSTER bit was moved, and an old clvmd does not interpret the new command properly.

For example, after lvchange -aey it activates the volume non-exclusively on all nodes!
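A minimal sketch of the scenario (the names node01/node02, vg0 and lv0 are only illustrative, not taken from an actual reproducer):

  # on node01, running the updated lvm2 tools while old clvmd instances
  # are still running elsewhere in the cluster:
  lvchange -aey vg0/lv0    # request *exclusive* activation

  # on node02, an old clvmd misinterprets the request and activates the LV
  # locally as well; the 5th lv_attr character shows it as active:
  lvs -o lv_name,vg_name,lv_attr vg0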

I am not sure how to properly solve this protocol incompatibility, but clvmd should at least be restarted on rpm update for other reasons as well (new bug fixes, using the new lvm code, etc.).

This patch needs to be added to the spec file:
--- a/lvm2.spec
+++ b/lvm2.spec
@@ -287,6 +287,10 @@ Extensions to LVM2 to support clusters.
 %post cluster
 /sbin/chkconfig --add clvmd
 
+if [ "$1" -gt "1" ] ; then
+       /usr/sbin/clvmd -S >/dev/null 2>&1 || :
+fi
+
 %preun cluster

Note that we do not run "service restart" here, to avoid volume deactivation if the start fails. "clvmd -S" does exactly what is needed (nothing, if clvmd is not running).
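For clarity, the scriptlet condition relies on the standard rpm convention that %post receives the number of installed package instances as $1 (1 on first install, greater than 1 on upgrade). A standalone sketch of the same logic:

  #!/bin/sh
  # $1 comes from rpm: 1 = fresh install, >1 = upgrade
  if [ "$1" -gt 1 ]; then
      # Ask an already running clvmd to restart itself; if clvmd is not
      # running this does nothing, and "|| :" keeps the scriptlet from
      # failing the rpm transaction.
      /usr/sbin/clvmd -S >/dev/null 2>&1 || :
  fi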

Comment 1 Milan Broz 2012-03-29 10:27:38 UTC
In Fedora now as well.

Comment 4 Peter Rajnoha 2012-04-12 13:27:46 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
No documentation needed.

Comment 5 Nenad Peric 2012-04-30 10:30:17 UTC
Tested with:

lvm2-2.02.95-5.el6.x86_64

while upgrading lvm2 packages to

lvm2-2.02.95-6.el6.x86_64

------------------------------------------------

Created two VGs with one LV each.
One LV was activated normally, the other exclusively on one node (node01).

(05:02:25) [root@node01:~]$ lvs
  LV       VG        Attr     LSize    Pool Origin Data%  Move Log        Copy%  Convert
  lv_root  VolGroup  -wi-ao--    8.52g                                                  
  lv_swap  VolGroup  -wi-ao-- 1008.00m                                                  
  my_lv    exclusive mwi-a-m-  500.00m                         my_lv_mlog 100.00        
  other_lv global_lv -wi-a---  504.00m                   


(05:00:13) [root@node02:~]$ lvs
  LV       VG        Attr     LSize    Pool Origin Data%  Move Log        Copy%  Convert
  lv_root  VolGroup  -wi-ao--    8.52g                                                  
  lv_swap  VolGroup  -wi-ao-- 1008.00m                                                  
  my_lv    exclusive mwi---m-  500.00m                         my_lv_mlog               
  other_lv global_lv -wi-a---  504.00m                              



The logical volume exclusive/my_lv was activated exclusively only on node01.

All the lvm packages on all nodes were updated.
After the update, clvmd had been restarted, as can be seen from the ps output:

(05:30:44) [root@node02:~]$ ps -ef | grep clvm
root      3440     1  0 05:24 ?        00:00:00 clvmd -I cman


The LV stayed exclusively locked only on node01:

(05:31:25) [root@node01:~]$ lvs
  LV       VG        Attr     LSize    Pool Origin Data%  Move Log        Copy%  Convert
  lv_root  VolGroup  -wi-ao--    8.52g                                                  
  lv_swap  VolGroup  -wi-ao-- 1008.00m                                                  
  my_lv    exclusive mwi-a-m-  500.00m                         my_lv_mlog 100.00        
  other_lv global_lv -wi-a---  504.00m  


(05:30:49) [root@node02:~]$ lvs
  LV       VG        Attr     LSize    Pool Origin Data%  Move Log        Copy%  Convert
  lv_root  VolGroup  -wi-ao--    8.52g                                                  
  lv_swap  VolGroup  -wi-ao-- 1008.00m                                                  
  my_lv    exclusive mwi---m-  500.00m                         my_lv_mlog               
  other_lv global_lv -wi-a---  504.00m        



The only difference is that clvmd was running with different arguments after restart:

clvmd -T30 (a default startup argument) before the update,

clvmd -I cman after the installation of the packages.

The second one just selects the internal cluster locking, which should be the same as locking_type = 3 in lvm.conf.

(correct me if I am wrong :) )
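A quick way to check both on a node (a sketch; assumes the default lvm.conf location):

  ps -o pid,args -C clvmd                                 # startup arguments of the running clvmd
  grep -E '^[[:space:]]*locking_type' /etc/lvm/lvm.conf   # 3 = built-in clustered locking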

Comment 6 Peter Rajnoha 2012-05-02 11:29:34 UTC
(In reply to comment #5)
> The only difference is that clvmd was running with different arguments after
> restart:
> 
> clvmd -T30 (a default startup argument) before the update,
> 

This one's not actually needed during restart since the cluster is already up and running (and quorate), so no need to wait here.

> clvmd -I cman after the installation of the packages. 

It uses the same cluster manager as the previous instance (so if, for example, "-I singlenode" was used, the restart would use it as well when replacing the existing instance).
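The restart can also be observed manually; a sketch (the exact arguments shown depend on how the running instance was started):

  ps -o pid,args -C clvmd     # note the PID and arguments of the current daemon
  /usr/sbin/clvmd -S          # request an in-place restart (no-op if clvmd is not running)
  ps -o pid,args -C clvmd     # a new PID, started with the same cluster interface (-I ...)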

Comment 7 Peter Rajnoha 2012-05-02 11:50:25 UTC
(In reply to comment #6)
> (In reply to comment #5)
> > The only difference is that clvmd was running with different arguments after
> > restart:
> > 
> > clvmd -T30 (a default startup argument) before the update,
> > 

(the protocol only ensures that the clvmd RESTART request is propagated to the other nodes, but it does not actually wait for the response, so a timeout would be useless here anyway...)

Comment 9 errata-xmlrpc 2012-06-20 15:03:22 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2012-0962.html