Bug 429596 - RHEL5 cmirror tracker: error during CPG processing
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: cmirror
5.2
All Linux
Priority: low  Severity: low
Target Milestone: rc
Assigned To: Jonathan Earl Brassow
QA Contact: Cluster QE
Depends On:
Blocks: 430797
Reported: 2008-01-21 16:11 EST by Corey Marthaler
Modified: 2010-04-27 11:01 EDT
CC: 5 users

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2010-04-27 11:01:52 EDT
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments: None
Description Corey Marthaler 2008-01-21 16:11:13 EST
Description of problem:
I'm seeing the following error during most cmirror config operations (the
operations themselves do, however, appear to succeed each time).

Jan 21 15:05:30 hayes-03 clogd[15545]: No match for cluster response:
DM_CLOG_RESUME/LVM-EMxcC1b2TId4QmumVcb9mOePcYhQ0i2ETB9YfbQRUN6WMkmOcQUNonQQFxxefw3g
Jan 21 15:05:30 hayes-03 clogd[15545]: Error while processing CPG message


Version-Release number of selected component (if applicable):
cmirror-1.1.5-4.el5

How reproducible:
Often
Comment 1 RHEL Product and Program Management 2008-02-04 10:17:55 EST
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.
Comment 2 Jonathan Earl Brassow 2008-02-04 12:08:50 EST
I've never seen this... could you please confirm?
Comment 3 Jonathan Earl Brassow 2008-02-04 14:26:23 EST
Just repro'ed this, but with a different request type:

clogd[2102]: No match for cluster response:
DM_CLOG_GET_SYNC_COUNT/LVM-TZAP1flLyyxiVgHEe7YcVp4h7gFmhUAG3Y785UkameqNDAOFc21IOg86TaM1WjMJ
Comment 4 Corey Marthaler 2008-02-08 15:05:02 EST
Reproduced the CPG error messages during log device failure testing. In this
case, the result was a deadlocked (not just really slow) mirror.

cmirror-1.1.13-1.el5/kmod-cmirror-0.1.6-1.el5.


Feb  8 11:40:57 taft-04 kernel: sd 1:0:0:7: rejecting I/O to offline device
Feb  8 11:40:57 taft-04 lvm[7071]: No longer monitoring mirror device
helter_skelter-syncd_log_3legs_1 for events
Feb  8 11:41:12 taft-04 kernel: device-mapper: dm-log-clustered: Request timed
out on DM_CLOG_GET_RESYNC_WORK:1068855 - retrying
Feb  8 11:41:12 taft-04 clogd[6767]: rw_log:  write failure: Input/output error
Feb  8 11:41:12 taft-04 clogd[6767]: Error writing to disk log
Feb  8 11:41:12 taft-04 kernel: sd 1:0:0:7: rejecting I/O to offline device
Feb  8 11:41:12 taft-04 kernel: device-mapper: dm-log-clustered: Server error
while processing request [DM_CLOG_FLUSH]: -5
Feb  8 11:41:12 taft-04 lvm[7071]: Monitoring mirror device
helter_skelter-syncd_log_3legs_1 for events
Feb  8 11:41:13 taft-04 clogd[6767]: cpg_mcast_joined error: 9
Feb  8 11:41:13 taft-04 clogd[6767]: cluster_send failed at: local.c:212
(do_local_work)
Feb  8 11:41:13 taft-04 clogd[6767]: [] Unable to send (null) to cluster:
Invalid exchange
Feb  8 11:41:13 taft-04 clogd[6767]: Bad callback on local/4
Feb  8 11:41:13 taft-04 kernel: device-mapper: dm-log-clustered: Stray request
returned: <NULL>, 0
Feb  8 11:41:14 taft-04 lvm[7071]: No longer monitoring mirror device
helter_skelter-syncd_log_3legs_2 for events
Feb  8 11:41:28 taft-04 kernel: device-mapper: dm-log-clustered: Request timed
out on DM_CLOG_IN_SYNC:1077480 - retrying
Feb  8 11:41:28 taft-04 kernel: device-mapper: dm-log-clustered: Server error
while processing request [DM_CLOG_FLUSH]: -5
Feb  8 11:41:28 taft-04 kernel: device-mapper: dm-log-clustered: Server error
while processing request [DM_CLOG_FLUSH]: -5
Feb  8 11:41:28 taft-04 clogd[6767]: [QbkqfJyQ] No match for cluster response:
DM_CLOG_IS_REMOTE_RECOVERING:1083771
Feb  8 11:41:28 taft-04 clogd[6767]: Current list:
Feb  8 11:41:28 taft-04 clogd[6767]:    [none]
Feb  8 11:41:28 taft-04 clogd[6767]: [QbkqfJyQ] Error while processing CPG
message, DM_CLOG_IS_REMOTE_RECOVERING: Invalid argument
Feb  8 11:41:28 taft-04 clogd[6767]: [QbkqfJyQ]    Response  : YES
Feb  8 11:41:28 taft-04 clogd[6767]: [QbkqfJyQ]    Originator: 4
Feb  8 11:41:28 taft-04 clogd[6767]: [QbkqfJyQ]    Responder : 1
Feb  8 11:41:28 taft-04 clogd[6767]: HISTORY::
Feb  8 11:41:28 taft-04 clogd[6767]: 0:10) SEQ#=1083769, UUID=2RP77UEX,
TYPE=DM_CLOG_FLUSH, ORIG=4, RESP=YES, RSPR=1
Feb  8 11:41:28 taft-04 clogd[6767]: 1:11) SEQ#=777788, UUID=QbkqfJyQ,
TYPE=DM_CLOG_IS_REMOTE_RECOVERING, ORIG=1, RESP=NO
Feb  8 11:41:28 taft-04 clogd[6767]: 2:12) SEQ#=1083770, UUID=QbkqfJyQ,
TYPE=DM_CLOG_IS_REMOTE_RECOVERING, ORIG=4, RESP=YES, RSPR=1
Feb  8 11:41:28 taft-04 clogd[6767]: 3:13) SEQ#=758142, UUID=QbkqfJyQ,
TYPE=DM_CLOG_IS_REMOTE_RECOVERING, ORIG=2, RESP=NO
Feb  8 11:41:28 taft-04 clogd[6767]: 4:14) SEQ#=777789, UUID=QbkqfJyQ,
TYPE=DM_CLOG_IS_REMOTE_RECOVERING, ORIG=1, RESP=NO
Feb  8 11:41:28 taft-04 clogd[6767]: 5:15) SEQ#=758143, UUID=QbkqfJyQ,
TYPE=DM_CLOG_IS_REMOTE_RECOVERING, ORIG=2, RESP=NO
Feb  8 11:41:28 taft-04 clogd[6767]: 6:16) SEQ#=777790, UUID=QbkqfJyQ,
TYPE=DM_CLOG_IS_REMOTE_RECOVERING, ORIG=1, RESP=NO
Feb  8 11:41:28 taft-04 clogd[6767]: 7:17) SEQ#=758144, UUID=QbkqfJyQ,
TYPE=DM_CLOG_IS_REMOTE_RECOVERING, ORIG=2, RESP=NO
Feb  8 11:41:28 taft-04 clogd[6767]: 8:18) SEQ#=777791, UUID=QbkqfJyQ,
TYPE=DM_CLOG_IS_REMOTE_RECOVERING, ORIG=1, RESP=NO
Feb  8 11:41:28 taft-04 clogd[6767]: 9:19) SEQ#=758145, UUID=QbkqfJyQ,
TYPE=DM_CLOG_IS_REMOTE_RECOVERING, ORIG=2, RESP=NO
Feb  8 11:41:28 taft-04 clogd[6767]: 10:0) SEQ#=777792, UUID=QbkqfJyQ,
TYPE=DM_CLOG_IS_REMOTE_RECOVERING, ORIG=1, RESP=NO
Feb  8 11:41:28 taft-04 clogd[6767]: 11:1) SEQ#=758146, UUID=QbkqfJyQ,
TYPE=DM_CLOG_IS_REMOTE_RECOVERING, ORIG=2, RESP=NO
Feb  8 11:41:28 taft-04 clogd[6767]: 12:2) SEQ#=777793, UUID=QbkqfJyQ,
TYPE=DM_CLOG_IS_REMOTE_RECOVERING, ORIG=1, RESP=NO
Feb  8 11:41:28 taft-04 clogd[6767]: 13:3) SEQ#=758147, UUID=QbkqfJyQ,
TYPE=DM_CLOG_IS_REMOTE_RECOVERING, ORIG=2, RESP=NO
Feb  8 11:41:28 taft-04 clogd[6767]: 14:4) SEQ#=777794, UUID=QbkqfJyQ,
TYPE=DM_CLOG_IS_REMOTE_RECOVERING, ORIG=1, RESP=NO
Feb  8 11:41:28 taft-04 clogd[6767]: 15:5) SEQ#=758148, UUID=QbkqfJyQ,
TYPE=DM_CLOG_IS_REMOTE_RECOVERING, ORIG=2, RESP=NO
Feb  8 11:41:28 taft-04 clogd[6767]: 16:6) SEQ#=777795, UUID=QbkqfJyQ,
TYPE=DM_CLOG_IS_REMOTE_RECOVERING, ORIG=1, RESP=NO
Feb  8 11:41:28 taft-04 clogd[6767]: 17:7) SEQ#=758149, UUID=QbkqfJyQ,
TYPE=DM_CLOG_IS_REMOTE_RECOVERING, ORIG=2, RESP=NO
Feb  8 11:41:28 taft-04 clogd[6767]: 18:8) SEQ#=1083771, UUID=QbkqfJyQ,
TYPE=DM_CLOG_IS_REMOTE_RECOVERING, ORIG=4, RESP=NO
Feb  8 11:41:28 taft-04 clogd[6767]: 19:9) SEQ#=777796, UUID=QbkqfJyQ,
TYPE=DM_CLOG_IS_REMOTE_RECOVERING, ORIG=1, RESP=NO
Feb  8 11:41:43 taft-04 kernel: device-mapper: dm-log-clustered: Request timed
out on DM_CLOG_IS_REMOTE_RECOVERING:1083771 - retrying
Feb  8 11:41:44 taft-04 lvm[7071]: Monitoring mirror device
helter_skelter-syncd_log_3legs_2 for events
Feb  8 11:42:24 taft-04 clogd[6767]: cpg_mcast_joined error: 9
Feb  8 11:42:24 taft-04 clogd[6767]: cluster_send failed at: local.c:212
(do_local_work)
Feb  8 11:42:24 taft-04 clogd[6767]: [] Unable to send (null) to cluster:
Invalid exchange
Feb  8 11:42:24 taft-04 clogd[6767]: Bad callback on local/4
Feb  8 11:42:24 taft-04 kernel: device-mapper: dm-log-clustered: Stray request
returned: <NULL>, 0
Feb  8 11:42:39 taft-04 kernel: device-mapper: dm-log-clustered: Request timed
out on DM_CLOG_IS_REMOTE_RECOVERING:1102107 - retrying
[...]
Comment 5 Corey Marthaler 2008-02-08 16:14:01 EST
After talking this over with Jon, comment #4 is a different bug. So, moving this
back to NEEDINFO and filing a separate bz for the new issue.
Comment 6 Kiersten (Kerri) Anderson 2008-02-11 15:52:45 EST
Cluster mirror support did not stabilize in time to make the rhel 5.2 beta
compose so the packages were pulled from the release.  Clearing the release
flags since at this point, we need to monitor all of the cluster mirror defects
for resolution. 
Comment 7 Jonathan Earl Brassow 2008-02-11 16:36:25 EST
This issue is fixed... the results in comment #4 should be a separate bug.
Comment 8 Corey Marthaler 2008-02-11 17:25:49 EST
FYI - the bz filed for comment #4 is 432109.
Comment 9 Corey Marthaler 2008-02-14 11:15:08 EST
Have not seen these messages on any of the latest builds, marking verified.
Comment 11 Alasdair Kergon 2010-04-27 11:01:52 EDT
Assuming this VERIFIED fix got released.  Closing.
Reopen if it's not yet resolved.
