| Summary: | lvcreate in 6 node cluster no longer getting propagated to other nodes | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 5 | Reporter: | Debbie Johnson <dejohnso> |
| Component: | lvm2-cluster | Assignee: | Milan Broz <mbroz> |
| Status: | CLOSED DUPLICATE | QA Contact: | Cluster QE <mspqa-list> |
| Severity: | high | Docs Contact: | |
| Priority: | high | | |
| Version: | 5.6 | CC: | agk, ccaulfie, dwysocha, heinzm, jbrassow, mbroz, ndoane, prajnoha, prockai, pvrabec, thornber, zkabelac |
| Target Milestone: | rc | Target Release: | --- |
| Hardware: | Unspecified | OS: | Unspecified |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2011-03-09 22:35:02 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Attachments: | clvmd debug logs (attachment 483007); dmsetup info -c and pvscans -vvv (attachment 483039) | | |
Description (Debbie Johnson, 2011-03-08 20:23:14 UTC)

Created attachment 483007: clvmd debug logs
Some of the latest testing, and what was performed during the logs I uploaded:

Created a new lvol (LV_CLVMD_TEST) on CVG_101 at 10:32 am on cglnxhv03. Issued clvmd -R at 12:30 am (got sidetracked). Stopped the debug clvmd 20 minutes later. I don't see that lvol on any other node. You will see a few lvscans in the logs. Currently only cglnxhv03 displays the new lvol; the other 5 nodes do not.

[root@cglnxhv11 ~]# ssh n09 lvscan|grep CLVMD
[root@cglnxhv11 ~]# ssh n07 lvscan|grep CLVMD
[root@cglnxhv11 ~]# ssh n05 lvscan|grep CLVMD
[root@cglnxhv11 ~]# ssh n03 lvscan|grep CLVMD
  ACTIVE            '/dev/CVG_101/LV_CLVMD_TEST' [2.00 GB] inherit
[root@cglnxhv11 ~]# ssh n01 lvscan|grep CLVMD
[root@cglnxhv11 ~]#

I used this to create the lvol:

[root@cglnxhv03 ~]# lvcreate -L 2G -n LV_CLVMD_TEST CVG_101
  Logical volume "LV_CLVMD_TEST" created

Later followed by:

[root@cglnxhv03 tmp]# clvmd -R

This was only run on cglnxhv03.

Removing the .cache file did not help; a new one was generated.

[root@cglnxhv11 cache]# lvscan|grep CLVMD
[root@cglnxhv11 cache]# rm .cache
[root@cglnxhv11 cache]# lvscan|grep CLVMD
[root@cglnxhv11 cache]# ls -la
total 32
drwx------ 2 root root  4096 Mar  8 08:58 .
drwx------ 5 root root  4096 Feb  3 16:30 ..
-rw------- 1 root root 15037 Mar  8 08:58 .cache
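As a quick way to repeat the per-node check above in a single pass, here is a minimal sketch, assuming the n01/n03/n05/n07/n09 ssh aliases from the transcript (the sixth node being the one running the check) and passwordless root ssh between nodes; the loop and messages are illustrative, not part of the original report:

```bash
#!/bin/bash
# Sketch: check whether a clustered LV is visible on every node after lvcreate.
# Assumes the n01/n03/n05/n07/n09 ssh aliases used in the transcript above
# (the sixth node is the one running the script) and passwordless root ssh.
LV_NAME="${1:-LV_CLVMD_TEST}"

echo "== $(uname -n) (local) =="
lvscan | grep "$LV_NAME" || echo "  $LV_NAME not visible locally"

for node in n01 n03 n05 n07 n09; do
    echo "== $node =="
    ssh "$node" lvscan | grep "$LV_NAME" || echo "  $LV_NAME not visible on $node"
done
```

Run from any node after an lvcreate; any node that does not list the LV exhibits the propagation failure described above.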
Created attachment 483039: dmsetup info -c and pvscans -vvv
What was done before the problem started: prior to this issue, one of the activities was removing 3 multipathed LUNs from the cluster. They would have been named WARR_05FC_MP, WARR_060B_MP, and WARR_061A_MP. I know one was part of an existing volume group that had to be reduced; the other two were in isolated volume groups that were removed entirely. The LVM work was completed first (with frequent clvmd -R runs), followed by the removal of the multipath definitions from multipath.conf; multipath was flushed, the SCSI devices were deleted, multipath was restarted, and the LUNs were unzoned.

Milan,

These dumps have nothing to do with the clvmd logs. The clvmd logs are from a:

# lvcreate -L 2G -n LV_CLVMD_TEST CVG_101

I will ask Robert to ask the customer to get the lvmdumps now if you wish. Please let me know what you need to go along with the clvmd -d logs.

Deb

Milan,

Thanks. Will do. Thanks so much for getting to this so quickly. So the action plan is:

1) Install lvm2-2.02.74-5.el5_6.1 and lvm2-cluster-2.02.74-3.el5_6.1 on each cluster node.

2) Start with a clean configuration on all nodes and a fresh boot. Then please do the following (see the consolidated sketch at the end of this report):
   a) "killall clvmd" on each node
   b) script /tmp/clvmd-$(uname -n).out
   c) clvmd -d
   d) reproduce the issue
   e) Ctrl+C the clvmd
   f) exit the script session
   g) attach the script output
   h) clvmd -- to start it back up, not in debug mode

3) If every node was propagated with the new LV, let us know that the problem is resolved. If every node was not propagated, then send us the clvmd logs from all nodes and an lvmdump from each node, collected after this test.

Does this action plan cover what we need to have done?

Deb

Yes. I think Dave and Chrissie can help here as well to find the source of the problem.

Milan,

This BZ can be closed. The problem was fixed by upgrading the packages. Thank you so much for your help.

From the customer: "After updating these 2 packages and rebooting each node, we are no longer experiencing this issue. I verified that lvcreate, lvremove, and lvrename results are now seen by all other nodes as expected. I used several different volume groups, and issued commands from different nodes to test."

Then it was almost for sure bug #673981 ... Please reopen if this appears again.

*** This bug has been marked as a duplicate of bug 673981 ***
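For convenience, here is a minimal sketch of the debug-capture steps a) through h) from the action plan above. It is meant to be typed interactively on each node rather than run as a batch script, because `script` opens a recording subshell and the reproduction step is manual; the only commands are those named in the plan and the lvcreate example quoted earlier in this report:

```bash
# Debug-capture sequence per steps a) through h) of the action plan above.
# Type these interactively on each node; do not run as a single batch script,
# because `script` opens a recording subshell and step d) is manual.

# a) stop any running clvmd
killall clvmd

# b) start recording terminal output to a per-node file
script /tmp/clvmd-$(uname -n).out

# c) run clvmd in the foreground with debug output enabled
clvmd -d

# d) reproduce the issue from another terminal, e.g.:
#    lvcreate -L 2G -n LV_CLVMD_TEST CVG_101
# e) stop the debug clvmd with Ctrl+C
# f) leave the recording session started by `script`
exit

# g) attach /tmp/clvmd-<hostname>.out to the bug
# h) restart clvmd normally (not in debug mode)
clvmd
```

If the new LV still fails to propagate after this, the plan above also asks for an lvmdump from each node, collected after the test.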