Bug 234318 - Removing a mirror disk from the mirror gives a failed mirror
Status: CLOSED CURRENTRELEASE
Product: Red Hat Cluster Suite
Classification: Red Hat
Component: cmirror
Version: 4
Platform: x86_64 Linux
Priority: medium   Severity: medium
Assigned To: Jonathan Earl Brassow
QA Contact: Cluster QE
Blocks: CMirrorBetaTracker
Reported: 2007-03-28 09:41 EDT by Frank Weyns
Modified: 2013-11-17 20:03 EST
CC: 10 users
Fixed In Version: Beta1
Doc Type: Bug Fix
Last Closed: 2007-04-12 06:03:14 EDT

Attachments
Scripts we used and logfiles (320.00 KB, application/x-tar) - 2007-03-30 09:29 EDT, Frank Weyns
Strange warnings from locking fallback code. (8.13 KB, text/plain) - 2007-04-03 16:43 EDT, Jonathan Earl Brassow
List of changes since initial beta (2.26 KB, text/plain) - 2007-04-10 14:48 EDT, Jonathan Earl Brassow
Description Frank Weyns 2007-03-28 09:41:29 EDT
Description of problem: Removing a mirror disk from the mirror gives a failed mirror

This happened when one disk of a mirror was removed ...

dm-cmirror: Error while listening for server response: -110
dm-cmirror: Failed to receive election results from server:
(t4mVFc14,-110)

device-mapper: recovery failed on region 126
device-mapper: Unable to read from primary mirror during recovery
device-mapper: recovery failed on region 127
device-mapper: Unable to read from primary mirror during recovery
device-mapper: recovery failed on region 128
device-mapper: Unable to read from primary mirror during recovery

result: crashed service and fail over to another node ...

Hardware used:
SUN V40z with RHEL4 Update 5 beta

kernel-smp-2.6.9-50.EL

cman-kernel-smp-2.6.9-49.2
cman-kernheaders-2.6.9-49.2
cman-1.0.17-0

dlm-1.0.1-2
dlm-kernel-smp-2.6.9-46.14
dlm-kernheaders-2.6.9-46.14

GFS-kernheaders-2.6.9-71.0
GFS-6.1.13-0
GFS-kernel-smp-2.6.9-71.0

ccs-1.0.10-0

cmirror-kernel-smp-2.6.9-25.0
cmirror-1.0.1-1

lvm2-2.02.21-4.el4
lvm2-cluster-2.02.21-3.el4
Comment 1 Corey Marthaler 2007-03-28 11:43:01 EDT
How exactly was the mirror disk removed?  

There isn't a "detach and use" ability yet in LVM mirrors. So if you pulled a
mirror leg out from under LVM, that should result in the mirror failing that
leg and down-converting to a mirror with one fewer leg, or to a linear volume
if there were only 2 legs to begin with.
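For reference, the manual equivalent of that down-conversion can be sketched as follows; the VG/LV names come from this report, and the command is only printed (not executed) so the sketch is safe to run anywhere:

```shell
#!/bin/sh
# Hypothetical sketch: build the lvconvert call that drops a 2-legged
# mirror to linear (the same conversion LVM should perform automatically
# when it fails a leg). Names are taken from this bug report.
downconvert_cmd() {
    vg=$1; lv=$2; legs=$3
    # -m0 = linear, -m1 = two-legged mirror, and so on
    printf 'lvconvert -m%s %s/%s\n' "$legs" "$vg" "$lv"
}

downconvert_cmd vgmqm qmgrs_QTTEST 0
```

Run the printed command by hand only after confirming with 'lvs -a -o +devices' which leg actually failed.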
Comment 2 Frank Weyns 2007-03-28 12:03:19 EDT
From what I understand, they disconnected the disk at the SAN switch
(to simulate a SAN problem). The disk at the other SAN was fully working.

But it did not result in a one-legged/linear volume; it resulted in the above problems.
Comment 3 Jonathan Earl Brassow 2007-03-29 13:36:27 EDT
new -> assigned
Comment 4 Frank Weyns 2007-03-30 09:29:28 EDT
Created attachment 151276 [details]
Scripts we used and logfiles
Comment 5 Frank Weyns 2007-03-30 09:43:51 EDT
After we tried again to reproduce the problem, the system continued normally.
But we suspect multipath. Is this possible?
Comment 6 Corey Marthaler 2007-04-03 11:01:55 EDT
Moving this to the cmirror component so that it's tracked more easily.
Comment 7 Jonathan Earl Brassow 2007-04-03 16:27:00 EDT
Looking over the attachments...

create_vgmqm.sh:
================
Serious problems are being created because of the 'vgchange -c [ny] vgmqm'
operations.  They should not be in the script.  CLVM is not converting single
node LVs to clustered LVs when the '-c y' operation is done.  Bug #235123 filed
to address this issue.
Comment 8 Jonathan Earl Brassow 2007-04-03 16:29:59 EDT
remove_vgmqm.sh
===============
Again, don't do the 'vgchange -c n vgmqm'.  By doing this you are randomizing
the results of your operations... you never have the same picture across the
cluster.
Comment 9 Jonathan Earl Brassow 2007-04-03 16:34:00 EDT
Ben, there are some multipath segfaults happening in the messages file (in the
attached tarball from comment #4).... You see anything there?
Comment 10 Jonathan Earl Brassow 2007-04-03 16:43:14 EDT
Created attachment 151620 [details]
Strange warnings from locking fallback code.

I cut this text from messages.txt.

Take a look at the volume group names it is failing to activate.
WARNING:
Falling
back
to
local
file-based
locking.
Volume
.... you get the idea.
Comment 11 Jonathan Earl Brassow 2007-04-03 16:48:08 EDT
I think that in the process of doing LVM operations outside of the scope of the
cluster (see vgchange -c n), things got pretty screwed up.
Comment 12 Jonathan Earl Brassow 2007-04-03 17:06:05 EDT
It may be useful to see /etc/multipath.conf and /etc/lvm/lvm.conf.

If you could, please also rerun your tests after taking out the 'vgchange -c
...' commands from your scripts.

I also noticed ext3 as the test file system... that suggests that HA LVM might
be a better fit for the customer than a full-blown cluster mirror.
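Since comment #10 shows LVM falling back to local file-based locking, one concrete thing to verify in the requested /etc/lvm/lvm.conf is the locking mode. A minimal check follows; the setting name and the value 3 (clustered locking via clvmd) are assumptions based on standard LVM2 configuration, not something stated in this report:

```shell
#!/bin/sh
# Minimal sketch: extract locking_type from an lvm.conf-style file.
# Clustered LVM needs locking_type = 3 (clvmd/DLM); a fallback to
# file-based locking suggests this is wrong or clvmd is not running.
check_locking_type() {
    awk -F= '/^[[:space:]]*locking_type[[:space:]]*=/ {
                 gsub(/[[:space:]]/, "", $2); print $2; found = 1 }
             END { if (!found) print "unset" }' "$1"
}

# Demo against a sample fragment; in real use point it at /etc/lvm/lvm.conf.
printf 'global {\n    locking_type = 3\n}\n' > /tmp/lvm.conf.sample
check_locking_type /tmp/lvm.conf.sample
```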
Comment 13 Jonathan Earl Brassow 2007-04-03 17:17:55 EDT
The important thing to remember is that:

'vgchange -c ...' does not make objects cluster-aware; it changes the mode that
LVM operates in.  You should never have to issue that command.  (Unless you are
in a cluster and you want to create a local volume group with disk attached to
only one machine.)

So, if you are in a cluster and you do 'vgchange -cn <vg>', you just deactivated
cluster mode.  Follow that with lvcreate/lvconvert operations, and you have made
the cluster inconsistent (because LVM is not operating in cluster mode during
those ops)...
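To make the intent concrete, the recommended workflow can be sketched like this; the device paths and size are assumptions based on this report, and the commands are printed rather than executed:

```shell
#!/bin/sh
# Hypothetical sketch: create the VG as clustered from the start (while
# clvmd is running) instead of toggling 'vgchange -c' afterwards.
clustered_vg_cmds() {
    vg=$1; shift
    printf 'vgcreate -c y %s %s\n' "$vg" "$*"   # clustered at creation
    printf 'lvcreate -L 200M -n mqu_QTTEST %s\n' "$vg"
}

clustered_vg_cmds vgmqm /dev/emcpowera /dev/emcpowerd
```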
Comment 14 Frank Weyns 2007-04-04 02:00:25 EDT
We use "vgchange -c n" because otherwise "lvcreate -L 200 -n mqu_QTTEST vgmqm
$DISK1" failed ... I will check the error again.
Comment 15 Jonathan Earl Brassow 2007-04-04 10:01:44 EDT
Yes, I would be interested to see the errors....  changing the cluster setting
is not a valid workaround for that problem.
Comment 16 Frank Weyns 2007-04-04 10:35:26 EDT
After starting with a new lvm.conf, my customer managed to create the LVs without
the "vgchange -c n" :-)  but ....

With mirroring set up again, it went wrong again:

device-mapper: Primary mirror device has failed while mirror is out of sync.
device-mapper: Unable to choose alternative primary device
GFS: fsid=alpha_cluster:mqm_qmgrs_QTTEST.0: fatal: I/O error
GFS: fsid=alpha_cluster:mqm_qmgrs_QTTEST.0:   block = 31840
GFS: fsid=alpha_cluster:mqm_qmgrs_QTTEST.0:   function = gfs_logbh_wait
GFS: fsid=alpha_cluster:mqm_qmgrs_QTTEST.0:   file =
/builddir/build/BUILD/gfs-kernel-2.6.9-71/smp/src/gfs/dio.c, line = 923
GFS: fsid=alpha_cluster:mqm_qmgrs_QTTEST.0:   time = 1175694685
GFS: fsid=alpha_cluster:mqm_qmgrs_QTTEST.0: about to withdraw from the cluster
GFS: fsid=alpha_cluster:mqm_qmgrs_QTTEST.0: waiting for outstanding I/O
GFS: fsid=alpha_cluster:mqm_qmgrs_QTTEST.0: telling LM to withdraw
lock_dlm: withdraw abandoned memory
GFS: fsid=alpha_cluster:mqm_qmgrs_QTTEST.0: withdrawn
GFS: fsid=alpha_cluster:mqm_log_QTTEST.0: fatal: I/O error
GFS: fsid=alpha_cluster:mqm_log_QTTEST.0:   block = 30662
GFS: fsid=alpha_cluster:mqm_log_QTTEST.0:   function = gfs_logbh_wait
GFS: fsid=alpha_cluster:mqm_log_QTTEST.0:   file =
/builddir/build/BUILD/gfs-kernel-2.6.9-71/smp/src/gfs/dio.c, line = 923
GFS: fsid=alpha_cluster:mqm_log_QTTEST.0:   time = 1175694686
GFS: fsid=alpha_cluster:mqm_log_QTTEST.0: about to withdraw from the cluster
GFS: fsid=alpha_cluster:mqm_log_QTTEST.0: waiting for outstanding I/O
GFS: fsid=alpha_cluster:mqm_log_QTTEST.0: telling LM to withdraw
lock_dlm: withdraw abandoned memory
GFS: fsid=alpha_cluster:mqm_log_QTTEST.0: withdrawn

On the system where the devices were mounted, a df command hangs. This locks the
devices.

[root@bsxe01000 bin]# df
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/mapper/vgroot-lvroot
                       4128448   1150760   2767976  30% /
/dev/md0                194366     32374    151957  18% /boot
none                   8179680         0   8179680   0% /dev/shm
/dev/mapper/vgroot-lvhome
                       1548144     35484   1434020   3% /home
/dev/mapper/vgroot-lvtmp
                       2064208     42076   1917276   3% /tmp
/dev/mapper/vgroot-lvvar
                       4128448    266420   3652316   7% /var
/dev/mapper/vglocal-rhel4_u5
                      50412228   5390928  42460484  12% /rhel4u5
/dev/mapper/vgmqm-mqu_QTTEST
                        139088        20    139068   1% /MQHA/mqu/qmgrs/QTTEST
df: `/MQHA/mqm/qmgrs/QTTEST': Input/output error
df: `/MQHA/mqm/log/QTTEST': Input/output error

When, via another session, I want to set up the mirroring again, I get the
following errors:

[root@bsxe01000 ~]# lvconvert -m1 /dev/vgmqm/qmgrs_QTTEST
  Logical volume qmgrs_QTTEST already has 1 mirror(s).
[root@bsxe01000 ~]# lvconvert -m0 /dev/vgmqm/qmgrs_QTTEST
  Error locking on node bsxe01000: Volume group for uuid not found:
JlYv49BDDec9EjZZ4opqpInD8sbHMD4q7onhPS2n3FoGi5GYN92FdKhzM0tPWWx8
  Failed to lock qmgrs_QTTEST
[root@bsxe01000 ~]# lvconvert -m1 /dev/vgmqm/qmgrs_QTTEST
  Logical volume qmgrs_QTTEST already has 1 mirror(s).

The devices are locked.

...
Comment 17 Jonathan Earl Brassow 2007-04-04 11:25:14 EDT
This is a known issue, and is partially expected.

Corey, do you remember the bug # for mirrors not converting on reads?

The problem is that when you create a mirror, it tries to sync the devices.  The
only good device at this point is the primary.  You are failing the primary, so
all of your "good" devices are gone.  The only course of action for the mirror
is to return -EIO when GFS tries to read from it.  This causes GFS to withdraw -
causing your df's to fail.  You can also certainly expect lvm errors, because a
device is missing.

1) When you create a mirror, use the --nosync flag.  This will skip the initial
sync process (which is valid because you haven't written anything yet).  That
will allow you to pull the primary device immediately.  Your other course of
action would be to wait until the mirror is in-sync.

2) If you find yourself in this position and can't do lvm operations, do a
'vgreduce --removemissing vgmqm'
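Both remedies together, as a sketch; the VG/LV names come from this report, the 300M size is an assumption, and the commands are printed rather than executed so the sketch is safe to run anywhere:

```shell
#!/bin/sh
# Hypothetical sketch of the two suggestions above:
#  (1) create the mirror with --nosync so there is no initial-resync
#      window in which the primary is the only good leg;
#  (2) if lvm commands are stuck after a leg failure, drop missing PVs.
recovery_cmds() {
    vg=$1; lv=$2
    printf 'lvcreate -m1 --nosync -L 300M -n %s %s\n' "$lv" "$vg"
    printf 'vgreduce --removemissing %s\n' "$vg"
}

recovery_cmds vgmqm qmgrs_QTTEST
```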

Comment 18 Corey Marthaler 2007-04-04 11:34:33 EDT
The bz for reads not causing a down convert is:
232711: RFE reads should trigger a down conversion of a failed cmirror

Another bz that can reliably cause a deadlock scenario after a leg failure (lvm
cmds and df hang) is:
234613 primary leg failure of corelog cmirror can cause down conversion to deadlock

Are you using a disk log or core log?
Comment 19 Frank Weyns 2007-04-04 11:41:49 EDT
disk log.
Comment 20 Jonathan Earl Brassow 2007-04-04 12:12:50 EDT
Please also note that there have been several key bug fixes in cluster mirroring
since the beta package.  The new package is built and being tested, but I don't
think it will be available until Apr 6, 2007.
Comment 21 Frank Weyns 2007-04-05 12:10:52 EDT
After all the good news, back to work ... (removing a disk now works) but...

In the messages file I see the following:
      Apr  5 12:51:20 bsxe01000 kernel: dm-cmirror: unable to notify server of completed resync work

While the lvs command is telling me it's still at 98%:

  log_QTTEST              vgmqm   mwi-ao 300.00M  log_QTTEST_mlog    98.67  log_QTTEST_mimage_0(0),log_QTTEST_mimage_1(0)
  [log_QTTEST_mimage_0]   vgmqm   iwi-ao 300.00M                            /dev/emcpowera(50)
  [log_QTTEST_mimage_1]   vgmqm   iwi-ao 300.00M                            /dev/emcpowerb(1)
  [log_QTTEST_mlog]       vgmqm   lwi-ao   4.00M                            /dev/emcpowerd(50)
  mqu_QTTEST              vgmqm   mwi-ao 200.00M  mqu_QTTEST_mlog    98.00  mqu_QTTEST_mimage_0(0),mqu_QTTEST_mimage_1(0)
  [mqu_QTTEST_mimage_0]   vgmqm   iwi-ao 200.00M                            /dev/emcpowera(0)
  [mqu_QTTEST_mimage_1]   vgmqm   iwi-ao 200.00M                            /dev/emcpowerd(0)
  [mqu_QTTEST_mlog]       vgmqm   lwi-ao   4.00M                            /dev/emcpowerb(0)
  qmgrs_QTTEST            vgmqm   mwi-ao 300.00M  qmgrs_QTTEST_mlog   2.67  qmgrs_QTTEST_mimage_0(0),qmgrs_QTTEST_mimage_1(0)
  [qmgrs_QTTEST_mimage_0] vgmqm   iwi-ao 300.00M                            /dev/emcpowera(125)
  [qmgrs_QTTEST_mimage_1] vgmqm   iwi-ao 300.00M                            /dev/emcpowerd(51)
  [qmgrs_QTTEST_mlog]     vgmqm   lwi-ao   4.00M                            /dev/emcpowerb(76)
 
Comment 22 Corey Marthaler 2007-04-05 12:24:19 EDT
Jon,

Could comment #21 be BZ 235252/217438? Seems like a similar "stuck copy percent"
after a failure scenario and an inability to communicate with the server.
Comment 23 Jonathan Earl Brassow 2007-04-05 13:21:57 EDT
It might be possible.  The client should reattempt the notify.  If it fails over
and over again, we might hit that.

It'd be nice to have the customer on the latest code though.
Comment 24 Frank Weyns 2007-04-05 13:51:34 EDT
Show me the latest code ;-), the customer will upgrade on Wednesday.
Comment 25 Jonathan Earl Brassow 2007-04-05 15:23:46 EDT
Wednesday Apr 11, 2007.  Ok, I'll keep that in mind.
Comment 26 Jonathan Earl Brassow 2007-04-10 14:48:53 EDT
Created attachment 152170 [details]
List of changes since initial beta

QA testing on latest packages began 4/10/07
