Bug 428475 - HA LVM service fails to relocate when I/O is running
Status: CLOSED ERRATA
Product: Red Hat Cluster Suite
Classification: Red Hat
Component: rgmanager
Version: 4
Hardware: All Linux
Priority: medium
Severity: medium
Assigned To: Lon Hohberger
QA Contact: Cluster QE
Depends On: 428448
Reported: 2008-01-11 16:32 EST by Corey Marthaler
Modified: 2009-04-16 16:22 EDT (History)
Fixed In Version: RHBA-2008-0791
Doc Type: Bug Fix
Last Closed: 2008-07-25 15:15:50 EDT
Description Corey Marthaler 2008-01-11 16:32:14 EST
+++ This bug was initially created as a clone of Bug #428448 +++

Description of problem:
I attempted to relocate my lvm service with I/O running, and it failed.

[root@hayes-02 cluster]# clusvcadm -r halvm -m hayes-03
Trying to relocate service:halvm to hayes-03...Failure

This is a 3-node cluster with 4 LVs/filesystems in the service and I/O running
to all filesystems.

HAYES-01:
Jan 11 05:36:27 hayes-01 clurgmgrd[20500]: <notice> Stopping service service:halvm
Jan 11 05:36:29 hayes-01 clurgmgrd: [20500]: <notice> Forcefully unmounting /mnt/fs4
Jan 11 05:36:29 hayes-01 clurgmgrd: [20500]: <warning> killing process 27558
(root xdoio /mnt/fs4)
Jan 11 05:36:41 hayes-01 clurgmgrd: [20500]: <notice> Forcefully unmounting /mnt/fs3
Jan 11 05:36:41 hayes-01 clurgmgrd: [20500]: <warning> killing process 27556
(root xdoio /mnt/fs3)
Jan 11 05:36:53 hayes-01 clurgmgrd: [20500]: <notice> Forcefully unmounting /mnt/fs2
Jan 11 05:36:53 hayes-01 clurgmgrd: [20500]: <warning> killing process 27557
(root xdoio /mnt/fs2)
Jan 11 05:37:04 hayes-01 clurgmgrd: [20500]: <notice> Forcefully unmounting /mnt/fs1
Jan 11 05:37:04 hayes-01 clurgmgrd: [20500]: <warning> killing process 27555
(root xdoio /mnt/fs1)
Jan 11 05:37:15 hayes-01 clurgmgrd: [20500]: <err> initrd image is newer than
lvm.conf [GOOD]
Jan 11 05:37:16 hayes-01 clurgmgrd[20500]: <notice> Service service:halvm is stopped
Jan 11 05:37:19 hayes-01 clurgmgrd[20500]: <err> #58: Failed opening connection
to member #2
Jan 11 05:37:19 hayes-01 clurgmgrd[20500]: <warning> #70: Failed to relocate
service:halvm; restarting locally
Jan 11 05:37:19 hayes-01 clurgmgrd[20500]: <notice> Recovering failed service
service:halvm
Jan 11 05:37:19 hayes-01 clurgmgrd: [20500]: <err> initrd image is newer than
lvm.conf [GOOD]
Jan 11 05:37:20 hayes-01 kernel: kjournald starting.  Commit interval 5 seconds
Jan 11 05:37:20 hayes-01 kernel: EXT3 FS on dm-2, internal journal
Jan 11 05:37:20 hayes-01 kernel: EXT3-fs: mounted filesystem with ordered data mode.
Jan 11 05:37:20 hayes-01 kernel: kjournald starting.  Commit interval 5 seconds
Jan 11 05:37:20 hayes-01 kernel: EXT3 FS on dm-3, internal journal
Jan 11 05:37:20 hayes-01 kernel: EXT3-fs: mounted filesystem with ordered data mode.
Jan 11 05:37:20 hayes-01 kernel: kjournald starting.  Commit interval 5 seconds
Jan 11 05:37:20 hayes-01 kernel: EXT3 FS on dm-4, internal journal
Jan 11 05:37:20 hayes-01 kernel: EXT3-fs: mounted filesystem with ordered data mode.
Jan 11 05:37:21 hayes-01 kernel: kjournald starting.  Commit interval 5 seconds
Jan 11 05:37:21 hayes-01 kernel: EXT3 FS on dm-5, internal journal
Jan 11 05:37:21 hayes-01 kernel: EXT3-fs: mounted filesystem with ordered data mode.
Jan 11 05:37:21 hayes-01 clurgmgrd[20500]: <notice> Service service:halvm started
Jan 11 05:37:23 hayes-01 clurgmgrd: [20500]: <notice> Getting status


HAYES-02:
Jan 11 05:37:05 hayes-02 clurgmgrd[1633]: <err> #37: Error receiving header from
1 sz=0 CTX 0x2aaaac000cf0


HAYES-03:
Jan 11 05:39:16 hayes-03 clurgmgrd[4180]: <notice> Starting stopped service
service:halvm
Jan 11 05:39:16 hayes-03 clurgmgrd: [4180]: <err> initrd image is newer than
lvm.conf [GOOD]
Jan 11 05:39:17 hayes-03 clurgmgrd: [4180]: <err> Failed to add ownership tag to
HAYES
Jan 11 05:39:17 hayes-03 clurgmgrd: [4180]: <err> Failed to activate volume
group, HAYES
Jan 11 05:39:17 hayes-03 clurgmgrd: [4180]: <notice> Attempting cleanup of HAYES
Jan 11 05:39:17 hayes-03 clurgmgrd: [4180]: <err> Failed to make HAYES consistent
Jan 11 05:39:17 hayes-03 clurgmgrd[4180]: <notice> start on lvm "lvm" returned 1
(generic error)
Jan 11 05:39:17 hayes-03 clurgmgrd[4180]: <warning> #68: Failed to start
service:halvm; return value: 1
Jan 11 05:39:17 hayes-03 clurgmgrd[4180]: <notice> Stopping service service:halvm
Jan 11 05:39:17 hayes-03 clurgmgrd: [4180]: <err> stop: Could not match
/dev/HAYES/ha4 with a real device
Jan 11 05:39:17 hayes-03 clurgmgrd[4180]: <notice> stop on fs "fs4" returned 2
(invalid argument(s))
Jan 11 05:39:17 hayes-03 clurgmgrd: [4180]: <err> stop: Could not match
/dev/HAYES/ha3 with a real device
Jan 11 05:39:17 hayes-03 clurgmgrd[4180]: <notice> stop on fs "fs3" returned 2
(invalid argument(s))
Jan 11 05:39:17 hayes-03 clurgmgrd: [4180]: <err> stop: Could not match
/dev/HAYES/ha2 with a real device
Jan 11 05:39:17 hayes-03 clurgmgrd[4180]: <notice> stop on fs "fs2" returned 2
(invalid argument(s))
Jan 11 05:39:17 hayes-03 clurgmgrd: [4180]: <err> stop: Could not match
/dev/HAYES/ha1 with a real device
Jan 11 05:39:17 hayes-03 clurgmgrd[4180]: <notice> stop on fs "fs1" returned 2
(invalid argument(s))
Jan 11 05:39:17 hayes-03 clurgmgrd: [4180]: <err> initrd image is newer than
lvm.conf [GOOD]
Jan 11 05:39:17 hayes-03 clurgmgrd[4180]: <notice> Service service:halvm is
recovering
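
For anyone triaging excerpts like the ones above, here is a rough Python sketch that pulls the clurgmgrd entries out of a pasted syslog fragment and keeps only the <err> and <warning> ones. It is not part of rgmanager. The function names are made up for illustration, the regex only covers the two prefix styles visible in this report ("clurgmgrd[PID]:" and "clurgmgrd: [PID]:"), and treating messages that end in "[GOOD]" as benign is an assumption based on how that lvm.conf check reads here.

```python
import re

# Syslog timestamp as it appears in the excerpts, e.g. "Jan 11 05:37:19".
TS = r"\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2}"

# Covers both prefix styles seen in this report:
#   "clurgmgrd[20500]: <notice> ..." and "clurgmgrd: [20500]: <err> ..."
ENTRY_RE = re.compile(
    r"^(?P<ts>" + TS + r")\s+(?P<host>\S+)\s+clurgmgrd(?::\s*)?\[(?P<pid>\d+)\]:\s+"
    r"<(?P<level>\w+)>\s+(?P<msg>.*)$"
)
# Any line that starts with a timestamp begins a new syslog record.
NEW_RECORD_RE = re.compile(r"^" + TS + r"\s")


def parse_rgmanager_log(text):
    """Return a list of dicts (ts, host, pid, level, msg) for clurgmgrd entries.

    Wrapped continuation lines (as pasted into the report) are joined onto
    the message of the entry they follow.
    """
    entries = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            current = None
            continue
        match = ENTRY_RE.match(line)
        if match:
            current = match.groupdict()
            entries.append(current)
        elif NEW_RECORD_RE.match(line):
            current = None  # record from another source, e.g. kernel
        elif current is not None:
            current["msg"] += " " + line
    return entries


def failures(entries):
    """Keep <err>/<warning> entries, skipping '[GOOD]' checks (assumed benign)."""
    return [
        e for e in entries
        if e["level"] in ("err", "warning") and not e["msg"].endswith("[GOOD]")
    ]
```

Running `failures(parse_rgmanager_log(text))` over each node's excerpt gives a short per-node list of the messages that matter, which makes it easier to line up the timestamps across hayes-01, hayes-02 and hayes-03.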


After the failure, the service remains running on the initial node:
[root@hayes-01 cluster]# clustat
Cluster Status for HAYES @ Fri Jan 11 05:55:46 2008
Member Status: Quorate

 Member Name                      ID   Status
 ------ ----                      ---- ------
 hayes-01                             1 Online, Local, rgmanager
 hayes-02                             2 Online, rgmanager
 hayes-03                             3 Online, rgmanager

 Service Name            Owner (Last)            State
 ------- ----            ----- ------            -----
 service:halvm           hayes-01                started
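
A check like the one above can be scripted when verifying a relocation. The following is a minimal Python sketch, assuming the plain-text clustat layout shown in this report (a "Service Name" header row, a dashed separator, then one row per service with the name first, the owner second and the state last). The helper names are hypothetical, and rows where clustat prints the owner differently (for example in parentheses for a stopped service) are not handled.

```python
def parse_clustat_services(output):
    """Return {service_name: (owner, state)} from clustat text output.

    Assumes the table layout shown in this report.
    """
    services = {}
    in_table = False
    for line in output.splitlines():
        fields = line.split()
        if fields[:2] == ["Service", "Name"]:
            in_table = True  # service rows follow this header
            continue
        # Skip everything before the service table, blank lines,
        # and the dashed separator row.
        if not in_table or not fields or set(line.strip()) <= {"-", " "}:
            continue
        if len(fields) >= 3:
            services[fields[0]] = (fields[1], fields[-1])
    return services


def relocated(output, service, target):
    """True if `service` is reported as started on `target`."""
    owner, state = parse_clustat_services(output).get(service, (None, None))
    return owner == target and state == "started"
```

For the output in this report, `relocated(output, "service:halvm", "hayes-03")` evaluates to False because the owner column still shows hayes-01.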


[root@hayes-01 cluster]# cman_tool nodes
Node  Sts   Inc   Joined               Name
   1   M    388   2008-01-11 05:04:58  hayes-01
   2   M    392   2008-01-11 05:04:59  hayes-02
   3   M    404   2008-01-11 05:19:03  hayes-03



Version-Release number of selected component (if applicable):
2.6.18-62.el5
rgmanager-2.0.32-4.el5
Comment 1 Chris Feist 2008-03-05 16:09:20 EST
Setting 4.7 flag as this has been checked into the RHEL4 branch.
Comment 4 Corey Marthaler 2008-06-09 16:15:25 EDT
Marking this verified.
Comment 6 errata-xmlrpc 2008-07-25 15:15:50 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2008-0791.html
