Description of problem:
Removing a mirror disk from the mirror gives a failed mirror. This happened when one disk of a mirror was removed.
...
dm-cmirror: Error while listening for server response: -110
dm-cmirror: Failed to receive election results from server: (t4mVFc14,-110)
device-mapper: recovery failed on region 126
device-mapper: Unable to read from primary mirror during recovery
device-mapper: recovery failed on region 127
device-mapper: Unable to read from primary mirror during recovery
device-mapper: recovery failed on region 128
device-mapper: Unable to read from primary mirror during recovery

Result: crashed service and failover to another node
...
Hardware used: SUN V40z with RHEL4 Update 5 beta
kernel-smp-2.6.9-50.EL
cman-kernel-smp-2.6.9-49.2
cman-kernheaders-2.6.9-49.2
cman-1.0.17-0
dlm-1.0.1-2
dlm-kernel-smp-2.6.9-46.14
dlm-kernheaders-2.6.9-46.14
GFS-kernheaders-2.6.9-71.0
GFS-6.1.13-0
GFS-kernel-smp-2.6.9-71.0
ccs-1.0.10-0
cmirror-kernel-smp-2.6.9-25.0
cmirror-1.0.1-1
lvm2-2.02.21-4.el4
lvm2-cluster-2.02.21-3.el4
How exactly was the mirror disk removed? There isn't a "detach and use" ability yet in LVM mirrors. So if you took a mirror leg out from under LVM's knowledge, the mirror should fail that leg and down-convert to a mirror with one less leg, or to a linear volume if there were only two legs to begin with.
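For contrast, a hedged sketch of removing a leg while LVM is aware of it (the volume and device names here are borrowed from later comments, purely for illustration; this is not what the reporter did):

# Down-convert a two-leg mirror to linear, naming the PV whose image
# should be dropped, then remove the now-unused PV from the VG:
lvconvert -m0 vgmqm/mqu_QTTEST /dev/emcpowera
vgreduce vgmqm /dev/emcpowera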
From what I understand, they disconnected the disk at the SAN switch (to simulate a SAN problem). The disk at the other SAN was fully working. But it did not result in a one-legged/linear volume; it resulted in the problems above.
new -> assigned
Created attachment 151276 [details]
Scripts we used and logfiles
After we tried again to reproduce the problem, the system continued to run normally. But we suspect multipath. Is this possible?
Moving this to the cmirror component so that it's easier to track.
Looking over the attachments...

create_vgmqm.sh:
================
Serious problems are being created by the 'vgchange -c [ny] vgmqm' operations. They should not be in the script. CLVM is not converting single-node LVs to clustered LVs when the '-c y' operation is done. Bug #235123 has been filed to address this issue.
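A hedged sketch of what create_vgmqm.sh could do instead (assuming clvmd is running on all nodes; the device names are taken from later comments and may not match the script exactly):

# With lvm2-cluster/clvmd active, a VG created on shared storage is
# clustered by default -- no 'vgchange -c' toggling is needed.
pvcreate /dev/emcpowera /dev/emcpowerb /dev/emcpowerd
vgcreate vgmqm /dev/emcpowera /dev/emcpowerb /dev/emcpowerd
lvcreate -m1 -L 200M -n mqu_QTTEST vgmqm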
remove_vgmqm.sh
===============
Again, don't do the 'vgchange -c n vgmqm'. By doing this you are randomizing the results of your operations; you never have the same picture across the cluster.
Ben, there are some multipath segfaults happening in the messages file (in the attached tarball from comment #4)... Do you see anything there?
Created attachment 151620 [details]
Strange warnings from locking fallback code

I cut this text from messages.txt. Take a look at the volume group names it is failing to activate:

WARNING: Falling back to local file-based locking. Volume ...

... you get the idea.
I think that in the process of doing LVM operations outside of the scope of the cluster (see 'vgchange -c n'), things got pretty screwed up.
It may be useful to see /etc/multipath.conf and /etc/lvm/lvm.conf. If you could, please also rerun your tests after taking the 'vgchange -c ...' commands out of your scripts. I also noticed ext3 as the test file system... which suggests that HA LVM might be a better fit for the customer than a full-blown cluster mirror.
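For reference, a hedged sketch of the lvm.conf locking setting that matters here (a sketch only; the exact values in the customer's file are what we need to see):

# /etc/lvm/lvm.conf, global section
locking_type = 3    # built-in cluster-wide locking through clvmd
# (some setups instead use locking_type = 2 with locking_library set
#  to a cluster locking library; locking_type = 1 is plain local
#  file-based locking, which is what the "Falling back to local
#  file-based locking" warning in comment #11 drops down to)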
The important thing to remember is that 'vgchange -c ...' does not make objects cluster-aware; it changes the mode that LVM operates in. You should never have to issue that command (unless you are in a cluster and you want to create a local volume group with disks attached to only one machine). So, if you are in a cluster and you do 'vgchange -c n <vg>', you have just deactivated cluster mode. Follow that with lvcreate/lvconvert operations, and you have made the cluster inconsistent (because LVM is not operating in cluster mode during those ops)...
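To illustrate that one legitimate exception, a hedged sketch (the VG name and device are hypothetical):

# A VG on a disk that only this node can see may be created
# non-clustered up front, so other nodes never try to lock it:
vgcreate -c n vglocal /dev/sdb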
we use "vgchange -c n" because otherwise "lvcreate -L 200 -n mqu_QTTEST vgmqm $DISK1" failed ... I will check the error again.
Yes, I would be interested to see the errors... Changing the cluster setting is not a valid workaround for that problem.
After starting with a new lvm.conf, my customer managed to create LVs without the "vgchange -c n" :-) But...

With the mirroring set up again, it went wrong again:

device-mapper: Primary mirror device has failed while mirror is out of sync.
device-mapper: Unable to choose alternative primary device
GFS: fsid=alpha_cluster:mqm_qmgrs_QTTEST.0: fatal: I/O error
GFS: fsid=alpha_cluster:mqm_qmgrs_QTTEST.0: block = 31840
GFS: fsid=alpha_cluster:mqm_qmgrs_QTTEST.0: function = gfs_logbh_wait
GFS: fsid=alpha_cluster:mqm_qmgrs_QTTEST.0: file = /builddir/build/BUILD/gfs-kernel-2.6.9-71/smp/src/gfs/dio.c, line = 923
GFS: fsid=alpha_cluster:mqm_qmgrs_QTTEST.0: time = 1175694685
GFS: fsid=alpha_cluster:mqm_qmgrs_QTTEST.0: about to withdraw from the cluster
GFS: fsid=alpha_cluster:mqm_qmgrs_QTTEST.0: waiting for outstanding I/O
GFS: fsid=alpha_cluster:mqm_qmgrs_QTTEST.0: telling LM to withdraw
lock_dlm: withdraw abandoned memory
GFS: fsid=alpha_cluster:mqm_qmgrs_QTTEST.0: withdrawn
GFS: fsid=alpha_cluster:mqm_log_QTTEST.0: fatal: I/O error
GFS: fsid=alpha_cluster:mqm_log_QTTEST.0: block = 30662
GFS: fsid=alpha_cluster:mqm_log_QTTEST.0: function = gfs_logbh_wait
GFS: fsid=alpha_cluster:mqm_log_QTTEST.0: file = /builddir/build/BUILD/gfs-kernel-2.6.9-71/smp/src/gfs/dio.c, line = 923
GFS: fsid=alpha_cluster:mqm_log_QTTEST.0: time = 1175694686
GFS: fsid=alpha_cluster:mqm_log_QTTEST.0: about to withdraw from the cluster
GFS: fsid=alpha_cluster:mqm_log_QTTEST.0: waiting for outstanding I/O
GFS: fsid=alpha_cluster:mqm_log_QTTEST.0: telling LM to withdraw
lock_dlm: withdraw abandoned memory
GFS: fsid=alpha_cluster:mqm_log_QTTEST.0: withdrawn

On the system where the devices were mounted, a df command hangs. This locks the devices.

[root@bsxe01000 bin]# df
Filesystem                    1K-blocks     Used Available Use% Mounted on
/dev/mapper/vgroot-lvroot       4128448  1150760   2767976  30% /
/dev/md0                         194366    32374    151957  18% /boot
none                            8179680        0   8179680   0% /dev/shm
/dev/mapper/vgroot-lvhome       1548144    35484   1434020   3% /home
/dev/mapper/vgroot-lvtmp        2064208    42076   1917276   3% /tmp
/dev/mapper/vgroot-lvvar        4128448   266420   3652316   7% /var
/dev/mapper/vglocal-rhel4_u5   50412228  5390928  42460484  12% /rhel4u5
/dev/mapper/vgmqm-mqu_QTTEST     139088       20    139068   1% /MQHA/mqu/qmgrs/QTTEST
df: `/MQHA/mqm/qmgrs/QTTEST': Input/output error
df: `/MQHA/mqm/log/QTTEST': Input/output error

When, via another session, I try to set up the mirroring again, I get the following errors:

[root@bsxe01000 ~]# lvconvert -m1 /dev/vgmqm/qmgrs_QTTEST
  Logical volume qmgrs_QTTEST already has 1 mirror(s).
[root@bsxe01000 ~]# lvconvert -m0 /dev/vgmqm/qmgrs_QTTEST
  Error locking on node bsxe01000: Volume group for uuid not found: JlYv49BDDec9EjZZ4opqpInD8sbHMD4q7onhPS2n3FoGi5GYN92FdKhzM0tPWWx8
  Failed to lock qmgrs_QTTEST
[root@bsxe01000 ~]# lvconvert -m1 /dev/vgmqm/qmgrs_QTTEST
  Logical volume qmgrs_QTTEST already has 1 mirror(s).

The devices are locked.
...
This is a known issue, and is partially expected. Corey, do you remember the bug # for mirrors not converting on reads?

The problem is that when you create a mirror, it tries to sync the devices. The only good device at this point is the primary. You are failing the primary, so all of your "good" devices are gone. The only course of action for the mirror is to return -EIO when GFS tries to read from it. This causes GFS to withdraw, which causes your df's to fail. You can also certainly expect LVM errors, because a device is missing.

1) When you create a mirror, use the --nosync flag. This will skip the initial sync process (which is valid because you haven't written anything yet). That will allow you to pull the primary device immediately. Your other course of action would be to wait until the mirror is in sync.

2) If you find yourself in this position and can't do LVM operations, do a 'vgreduce --removemissing vgmqm' (a sketch of both options follows).
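A hedged sketch of both suggestions (the volume name and size are borrowed from earlier comments):

# 1) Skip the initial resync at creation time, so there is no window
#    where the primary is the only good device:
lvcreate -m1 --nosync -L 200M -n mqu_QTTEST vgmqm
# ...or watch the Copy% column until the mirror is fully in sync
#    before pulling a leg:
lvs vgmqm
# 2) If a leg has already vanished and LVM commands are failing,
#    drop the missing device from the VG:
vgreduce --removemissing vgmqm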
The bz for reads not causing a down convert is:

  232711: RFE: reads should trigger a down conversion of a failed cmirror

Another bz that can reliably cause a deadlock scenario after a leg failure (lvm cmds and df hang) is:

  234613: primary leg failure of corelog cmirror can cause down conversion to deadlock

Are you using a disk log or a core log?
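For reference, a hedged note on telling the two apart (a sketch; the log type is chosen at creation time, and --corelog is the flag I recall from this lvm2 vintage):

# The default mirror creation uses an on-disk log; a small <name>_mlog
# sub-LV shows up in 'lvs' output:
lvcreate -m1 -L 300M -n log_QTTEST vgmqm
# A core (in-memory) log is requested explicitly and has no _mlog LV:
lvcreate -m1 --corelog -L 300M -n log_QTTEST vgmqm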
disk log.
Please also note that there have been several key bug fixes in cluster mirroring since the beta package. The new package is built and being tested, but I don't think it will be available until Apr 6, 2007.
After all the good news, back to work... (removing a disk now works) but...

In the messages file I see the following:

Apr  5 12:51:20 bsxe01000 kernel: dm-cmirror: unable to notify server of completed resync work

while the lvs command is telling me it's still at 98%:

  log_QTTEST              vgmqm mwi-ao 300.00M log_QTTEST_mlog    98.67 log_QTTEST_mimage_0(0),log_QTTEST_mimage_1(0)
  [log_QTTEST_mimage_0]   vgmqm iwi-ao 300.00M                          /dev/emcpowera(50)
  [log_QTTEST_mimage_1]   vgmqm iwi-ao 300.00M                          /dev/emcpowerb(1)
  [log_QTTEST_mlog]       vgmqm lwi-ao   4.00M                          /dev/emcpowerd(50)
  mqu_QTTEST              vgmqm mwi-ao 200.00M mqu_QTTEST_mlog    98.00 mqu_QTTEST_mimage_0(0),mqu_QTTEST_mimage_1(0)
  [mqu_QTTEST_mimage_0]   vgmqm iwi-ao 200.00M                          /dev/emcpowera(0)
  [mqu_QTTEST_mimage_1]   vgmqm iwi-ao 200.00M                          /dev/emcpowerd(0)
  [mqu_QTTEST_mlog]       vgmqm lwi-ao   4.00M                          /dev/emcpowerb(0)
  qmgrs_QTTEST            vgmqm mwi-ao 300.00M qmgrs_QTTEST_mlog   2.67 qmgrs_QTTEST_mimage_0(0),qmgrs_QTTEST_mimage_1(0)
  [qmgrs_QTTEST_mimage_0] vgmqm iwi-ao 300.00M                          /dev/emcpowera(125)
  [qmgrs_QTTEST_mimage_1] vgmqm iwi-ao 300.00M                          /dev/emcpowerd(51)
  [qmgrs_QTTEST_mlog]     vgmqm lwi-ao   4.00M                          /dev/emcpowerb(76)
Jon, could comment #21 be BZ 235252/217438? It seems like a similar "stuck copy percent" after a failure scenario and an inability to communicate with the server.
It might be possible. The client should reattempt the notify; if it fails over and over again, we might hit that. It'd be nice to have the customer on the latest code, though.
Show me the latest code ;-) The customer will upgrade on Wednesday.
Wednesday Apr 11, 2007. Ok, I'll keep that in mind.
Created attachment 152170 [details]
List of changes since the initial beta

QA testing on the latest packages began 4/10/07.