Bug 383291

Summary: RHEL5 cmirror tracker: I/O to cmirror causes havoc with mirror resyncing

Product: Red Hat Enterprise Linux 5
Component: cmirror
Version: 5.0
Reporter: Corey Marthaler <cmarthal>
Assignee: Jonathan Earl Brassow <jbrassow>
QA Contact: Cluster QE <mspqa-list>
Status: CLOSED ERRATA
Severity: high
Priority: high
Keywords: TestBlocker
Target Milestone: rc
Target Release: ---
Hardware: All
OS: Linux
Doc Type: Bug Fix
Last Closed: 2009-01-20 16:25:30 EST
Bug Blocks: 430797

Description Corey Marthaler 2007-11-14 15:05:31 EST
Description of problem:
I created two cmirrors, let them sync to 100%, put GFS on top of each, and then
started I/O. Shortly afterwards, the I/O appeared hung, the mirrors were no
longer in sync (stuck at 0.00%), and a flood of messages appeared on the console.

[...]
Nov 13 17:56:11 taft-01 kernel: device-mapper: dm-log-clustered: Timed out
waiting for cluster log server [DM_CLOG_GET_RESYNC_WORK]
Nov 13 17:56:11 taft-01 kernel: device-mapper: dm-log-clustered: Retrying
request [DM_CLOG_GET_RESYNC_WORK]
Nov 13 17:56:16 taft-01 kernel: device-mapper: dm-log-clustered: Timed out
waiting for cluster log server [DM_CLOG_GET_RESYNC_WORK]
Nov 13 17:56:16 taft-01 kernel: device-mapper: dm-log-clustered: Retrying
request [DM_CLOG_GET_RESYNC_WORK]
Nov 13 17:56:21 taft-01 kernel: device-mapper: dm-log-clustered: Timed out
waiting for cluster log server [DM_CLOG_GET_RESYNC_WORK]
Nov 13 17:56:21 taft-01 kernel: device-mapper: dm-log-clustered: Retrying
request [DM_CLOG_GET_RESYNC_WORK]
Nov 13 17:56:21 taft-01 kernel: device-mapper: dm-log-clustered: Unable to send
cluster log request [DM_CLOG_GET_RESYNC_WORK] to server: -3
Nov 13 17:56:21 taft-01 kernel: device-mapper: dm-log-clustered: Unable to send
cluster log request [DM_CLOG_IS_REMOTE_RECOVERING] to server: -3
Nov 13 17:56:21 taft-01 kernel: device-mapper: dm-log-clustered: Unable to send
cluster log request [DM_CLOG_GET_RESYNC_WORK] to server: -3
Nov 13 17:56:21 taft-01 kernel: device-mapper: dm-log-clustered: Unable to send
cluster log request [DM_CLOG_IS_REMOTE_RECOVERING] to server: -3
Nov 13 17:56:21 taft-01 kernel: device-mapper: dm-log-clustered: Unable to send
cluster log request [DM_CLOG_GET_RESYNC_WORK] to server: -3
Nov 13 17:56:21 taft-01 kernel: device-mapper: dm-log-clustered: Unable to send
cluster log request [DM_CLOG_IS_REMOTE_RECOVERING] to server: -3
[...]
 

[root@taft-01 ~]# lvs -a -o +devices
  LV                 VG         Attr   LSize  Origin Snap%  Move Log          Copy%  Devices
  LogVol00           VolGroup00 -wi-ao 66.19G                                        /dev/sda2(0)
  LogVol01           VolGroup00 -wi-ao  1.94G                                        /dev/sda2(2118)
  mirrorA            taft       mwi-ao 10.00G                    mirrorA_mlog   0.00 mirrorA_mimage_0(0),mirrorA_mimage_1(0)
  [mirrorA_mimage_0] taft       iwi-ao 10.00G                                        /dev/sdb1(0)
  [mirrorA_mimage_1] taft       iwi-ao 10.00G                                        /dev/sdb2(0)
  [mirrorA_mlog]     taft       lwi-ao  4.00M                                        /dev/sdb5(0)
  mirrorB            taft       mwi-ao 10.00G                    mirrorB_mlog   0.00 mirrorB_mimage_0(0),mirrorB_mimage_1(0)
  [mirrorB_mimage_0] taft       iwi-ao 10.00G                                        /dev/sdb3(0)
  [mirrorB_mimage_1] taft       iwi-ao 10.00G                                        /dev/sdb4(0)
  [mirrorB_mlog]     taft       lwi-ao  4.00M                                        /dev/sdb2(2560)


[root@taft-01 ~]# dmsetup ls
taft-mirrorB    (253, 9)
taft-mirrorA    (253, 5)
taft-mirrorA_mlog       (253, 2)
taft-mirrorB_mimage_1   (253, 8)
VolGroup00-LogVol01     (253, 1)
taft-mirrorB_mimage_0   (253, 7)
taft-mirrorB_mlog       (253, 6)
VolGroup00-LogVol00     (253, 0)
taft-mirrorA_mimage_1   (253, 4)
taft-mirrorA_mimage_0   (253, 3)
[root@taft-01 ~]# dmsetup info
Name:              taft-mirrorB
State:             ACTIVE
Tables present:    LIVE
Open count:        1
Event number:      1
Major, minor:      253, 9
Number of targets: 1
UUID: LVM-D8b632IRI0s3LJSV10wC4PFU5fXMa3Gs0MTyjCQZWJTIHSy0qxvnwX3qHsKZV2Qt

Name:              taft-mirrorA
State:             ACTIVE
Tables present:    LIVE
Open count:        1
Event number:      0
Major, minor:      253, 5
Number of targets: 1
UUID: LVM-D8b632IRI0s3LJSV10wC4PFU5fXMa3GsblVSQOlcbuAFlTguLBJ361TTXqfmjrfJ

Name:              taft-mirrorA_mlog
State:             ACTIVE
Tables present:    LIVE
Open count:        1
Event number:      0
Major, minor:      253, 2
Number of targets: 1
UUID: LVM-D8b632IRI0s3LJSV10wC4PFU5fXMa3GsLIZEltJ3S3B4UxcyCpuxiaNvj4z4Jldq

Name:              taft-mirrorB_mimage_1
State:             ACTIVE
Tables present:    LIVE
Open count:        1
Event number:      0
Major, minor:      253, 8
Number of targets: 1
UUID: LVM-D8b632IRI0s3LJSV10wC4PFU5fXMa3GsgXHzwXdn8zcHdK2kA9mlJ1v6fcv6Th8I

Name:              VolGroup00-LogVol01
State:             ACTIVE
Tables present:    LIVE
Open count:        1
Event number:      0
Major, minor:      253, 1
Number of targets: 1
UUID: LVM-tksJHxpzslH7ylcpESCmUbfC3pi1aI2IUfqVRnIZNcJFqPBR0t5IMf5sXnAJRfxl

Name:              taft-mirrorB_mimage_0
State:             ACTIVE
Tables present:    LIVE
Open count:        1
Event number:      0
Major, minor:      253, 7
Number of targets: 1
UUID: LVM-D8b632IRI0s3LJSV10wC4PFU5fXMa3Gs2gddIlotw0lxiDjwrRXnZVBgGuu9XEk4

Name:              taft-mirrorB_mlog
State:             ACTIVE
Tables present:    LIVE
Open count:        1
Event number:      0
Major, minor:      253, 6
Number of targets: 1
UUID: LVM-D8b632IRI0s3LJSV10wC4PFU5fXMa3Gs6XyckFh8oE6p25slt3opeJSuu8mM7Dp6

Name:              VolGroup00-LogVol00
State:             ACTIVE
Tables present:    LIVE
Open count:        1
Event number:      0
Major, minor:      253, 0
Number of targets: 1
UUID: LVM-tksJHxpzslH7ylcpESCmUbfC3pi1aI2IwBiL7dfdiW1ilAnZLqNXZofdR00tQenl

Name:              taft-mirrorA_mimage_1
State:             ACTIVE
Tables present:    LIVE
Open count:        1
Event number:      0
Major, minor:      253, 4
Number of targets: 1
UUID: LVM-D8b632IRI0s3LJSV10wC4PFU5fXMa3GsFEhzjgkoUGYKYcvbs5TCIa3EJksIRg9X

Name:              taft-mirrorA_mimage_0
State:             ACTIVE
Tables present:    LIVE
Open count:        1
Event number:      0
Major, minor:      253, 3
Number of targets: 1
UUID: LVM-D8b632IRI0s3LJSV10wC4PFU5fXMa3GsDvta7w8SuvjfKrVBEoCl4cRmEFpYLlnh



Version-Release number of selected component (if applicable):
2.6.18-53.el5
cmirror-1.1.0-7
Comment 1 Jonathan Earl Brassow 2007-11-14 15:40:32 EST
Yes, I've seen this.  It's likely an AIS issue.

AIS gets flooded, then leaves the CPG while holding a lock, causing the cluster
log daemon to hang when trying to acquire it again.  The daemon can't respond to
the kernel, and the kernel continues to complain that the server isn't listening.
Comment 2 Jonathan Earl Brassow 2008-01-18 12:39:28 EST
I don't see this anymore, but I need to put a higher I/O load on my systems to
stress the flow control...
Comment 3 Jonathan Earl Brassow 2008-01-22 11:36:03 EST
ok, I'm willing to mark this modified now.
Comment 4 Corey Marthaler 2008-01-25 09:55:25 EST
There are still issues when running I/O to cmirrors. I started up a moderate I/O
load, consisting of xiogen/xdoio processes from all nodes to each of two
cmirrors with gfs on top. I watched it run for a while (15-20 minutes or so)
and didn't notice any problems, but when I came in this morning, I noticed that
the filesystems are either hung, or so slow that nothing is responding. I see
similar messages (to those in comment #0) scrolling on the console as well.

[...]
Jan 25 02:43:15 hayes-01 clogd[3607]: Failed to get clog_tfr from free_queue
Jan 25 02:43:15 hayes-01 clogd[3607]: Bad callback on local/4
Jan 25 02:43:15 hayes-01 clogd[3607]: Failed to get clog_tfr from free_queue
Jan 25 02:43:15 hayes-01 clogd[3607]: Bad callback on local/4
Jan 25 02:43:15 hayes-01 clogd[3607]: Failed to get clog_tfr from free_queue
Jan 25 02:43:15 hayes-01 clogd[3607]: Bad callback on local/4
[...]
device-mapper: dm-log-clustered: Unable to send cluster log request
[DM_CLOG_CTR] to server: -3
device-mapper: dm-log-clustered: Attempting to contact cluster log server...
device-mapper: dm-log-clustered: Unable to send cluster log request
[DM_CLOG_CTR] to server: -3
device-mapper: dm-log-clustered: Attempting to contact cluster log server...
device-mapper: dm-log-clustered: Unable to send cluster log request
[DM_CLOG_CTR] to server: -3
device-mapper: dm-log-clustered: Attempting to contact cluster log server...
device-mapper: dm-log-clustered: Unable to send cluster log request
[DM_CLOG_CTR] to server: -3
device-mapper: dm-log-clustered: Attempting to contact cluster log server...
device-mapper: dm-log-clustered: Unable to send cluster log request
[DM_CLOG_CTR] to server: -3


Processes appear stuck in device wait:
[root@hayes-03 ~]# ps -elf | grep doio
0 D root      3396  3364  0  78   0 -  5170 sync_b 08:48 pts/0    00:00:00 xdoio -vD
0 D root      3398  3364  0  78   0 -  5171 glock_ 08:48 pts/0    00:00:00 xdoio -vD
1 S root      3400  3396  0  77   0 -  5168 184466 08:48 pts/0    00:00:00 xdoio -vD
1 S root      3402  3398  0  78   0 -  5168 184466 08:48 pts/0    00:00:00 xdoio -vD
0 S root      3748  9279  0  78   0 - 15288 pipe_w 08:53 pts/1    00:00:00 grep doio
4 S root      9173  3364  4  75   0 -  5186 184466 Jan24 pts/0    00:43:29 xdoio -vD
4 S root      9175  3364 19  75   0 -  5184 184466 Jan24 pts/0    03:03:03 xdoio -vD
5 D root      9178  9173 17  78   0 -  5624 -      Jan24 pts/0    02:47:27 xdoio -vD
5 D root      9179  9175 75  78   0 -  5623 glock_ Jan24 pts/0    11:54:10 xdoio -vD
4 S root      9971  9805  4  75   0 -  5181 184466 Jan24 pts/2    00:42:23 xdoio -vD
4 S root      9973  9805 19  75   0 -  5187 184466 Jan24 pts/2    03:00:10 xdoio -vD
5 D root      9976  9971 16  78   0 -  5623 -      Jan24 pts/2    02:36:28 xdoio -vD
5 D root      9977  9973 74  85   0 -  5624 -      Jan24 pts/2    11:43:57 xdoio -vD

I'll dump the process stacks and attach them.
Comment 5 Corey Marthaler 2008-01-28 09:55:03 EST
Reproduced this again over the weekend.
Comment 6 Jonathan Earl Brassow 2008-02-04 16:36:21 EST
One cause of this has been found and fixed.  (A failure to send a cluster
request would leave the memory in the cluster queue without ever sending it.
If this happened several times, you could run out of memory and be stuck with
the case in comment #4.)  Need to continue testing to see if all issues have
been found, but will mark MODIFIED.
Comment 7 Corey Marthaler 2008-02-05 15:14:07 EST
Just a quick note that the latest build does seem to have an impact on this
issue. On two different clusters, I ran I/O over night to a single cmirror (10
processes from each node) and neither cluster had any issues the next morning.
I'll now attempt to bump up the I/O load as well as the number of cmirrors being
written to.

cmirror-1.1.9-1.el5/kmod-cmirror-0.1.5-2.el5
Comment 8 Corey Marthaler 2008-02-05 15:41:25 EST
Another quick note: the messages in this bug can still appear during the service
recovery issue (bz 430296) while I/O is running.
Comment 10 Corey Marthaler 2008-02-07 10:30:55 EST
I was able to reproduce this issue (on two different clusters) when bumping up
the I/O load to 3 gfs/mirrors and each node writing (10 processes) to each.
Comment 11 Jonathan Earl Brassow 2008-02-08 16:14:44 EST
What kind of failure messages did you see this last time?
Comment 12 Corey Marthaler 2008-02-11 12:08:26 EST
Although I was unable to reproduce a deadlock during I/O, I did see the
following messages scrolling while running over the weekend.

[...]
Feb 11 10:12:12 grant-03 clogd[3692]: Bad callback on local/4
Feb 11 10:12:12 grant-03 kernel: device-mapper: dm-log-clustered: Stray request
returned: <NULL>, 0
Feb 11 10:12:27 grant-03 kernel: device-mapper: dm-log-clustered: Request timed
out on DM_CLOG_IN_SYNC:6353015 - retrying
Feb 11 10:18:31 grant-03 clogd[3692]: cpg_mcast_joined error: 9
Feb 11 10:18:31 grant-03 clogd[3692]: cluster_send failed at: local.c:212
(do_local_work)
Feb 11 10:18:31 grant-03 clogd[3692]: [] Unable to send (null) to cluster:
Invalid exchange
Feb 11 10:18:31 grant-03 clogd[3692]: Bad callback on local/4
Feb 11 10:18:31 grant-03 kernel: device-mapper: dm-log-clustered: Stray request
returned: <NULL>, 0
Feb 11 10:18:46 grant-03 kernel: device-mapper: dm-log-clustered: Request timed
out on DM_CLOG_IN_SYNC:6363398 - retrying
Feb 11 10:27:21 grant-03 clogd[3692]: cpg_mcast_joined error: 9
Feb 11 10:27:21 grant-03 clogd[3692]: cluster_send failed at: local.c:212
(do_local_work)
Feb 11 10:27:21 grant-03 clogd[3692]: [] Unable to send (null) to cluster:
Invalid exchange
Feb 11 10:27:21 grant-03 kernel: device-mapper: dm-log-clustered: Stray request
returned: <NULL>, 0
Feb 11 10:27:21 grant-03 clogd[3692]: Bad callback on local/4
Feb 11 10:27:36 grant-03 kernel: device-mapper: dm-log-clustered: Request timed
out on DM_CLOG_IN_SYNC:6376971 - retrying
Feb 11 10:38:02 grant-03 clogd[3692]: cpg_mcast_joined error: 9
Feb 11 10:38:02 grant-03 clogd[3692]: cluster_send failed at: local.c:212
(do_local_work)
Feb 11 10:38:02 grant-03 clogd[3692]: [] Unable to send (null) to cluster:
Invalid exchange
[...]

Comment 14 Jonathan Earl Brassow 2008-02-13 16:26:05 EST
The 2/13/08 code update should speed up (and help out) I/O + mirror resyncing.

Comment 15 Corey Marthaler 2008-02-15 15:29:40 EST
I have now been running cmirror I/O for 3 days straight (4 mirrors * 4 nodes * 15
processes) and have been unable to reproduce the deadlock. However, just like in
comment #12 (and outlined in bz 432109), the "scary to a customer" messages
scroll over and over on the console while the I/O runs. Also, even though these
mirrors started out as 100% synced, only one remains that way.

  mirror1  taft       Mwi-ao 50.00G                    mirror1_mlog  99.99
  mirror2  taft       Mwi-ao 50.00G                    mirror2_mlog 100.00
  mirror3  taft       Mwi-ao 50.00G                    mirror3_mlog  99.99
  mirror4  taft       Mwi-ao 50.00G                    mirror4_mlog  99.99

So there is still an issue with regard to I/O and cmirrors; whether or not it's
already documented in other bugs is the question.
Comment 16 Corey Marthaler 2008-02-18 15:26:45 EST
Putting this back in ASSIGNED to see what Jon thinks about comment #15.
Comment 17 Jonathan Earl Brassow 2008-04-16 16:08:37 EDT
should be fixed by this commit:

commit 2200d92f9ebc30fca8f4107929fc4707b57bcebd
Author: Jonathan Brassow <jbrassow@redhat.com>
Date:   Mon Mar 24 16:09:52 2008 -0500

Comment 19 Corey Marthaler 2008-05-20 10:45:31 EDT
Still seeing the issues in comment #15. 

After starting an I/O load (w/ xdoio) on fully synced mirrors and running
overnight, the load appears to be running noticeably slower and the sync counts
are off.

[root@taft-03 ~]# lvs
  LV       VG         Attr   LSize  Origin Snap%  Move Log           Copy%  Convert
  LogVol00 VolGroup00 -wi-ao 66.19G
  LogVol01 VolGroup00 -wi-ao  1.94G
  cmirror1 taft       mwi-ao 15.00G                    cmirror1_mlog  99.97
  cmirror2 taft       mwi-ao 15.00G                    cmirror2_mlog  99.97
  cmirror3 taft       mwi-ao 15.00G                    cmirror3_mlog 100.00
Comment 20 Jonathan Earl Brassow 2008-07-24 15:11:33 EDT
Much has changed since 2008-05-20 and I do not see this.  I'm putting this into
MODIFIED state... at least to prompt a retest and possible validation.
Comment 22 Corey Marthaler 2008-07-28 14:37:14 EDT
FYI - I ran an I/O load to 2 two-legged mirrors over the weekend and didn't see
any issues. I'm going to bump up the load a bit as well as add another mirror
before marking this issue verified.
Comment 23 Corey Marthaler 2008-07-29 09:59:41 EDT
Ran I/O overnight without issues on four mirrors. One was even a 3-way mirror.
Marking this verified (will reopen if issues are seen again).
Comment 25 errata-xmlrpc 2009-01-20 16:25:30 EST
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHEA-2009-0158.html