Bug 222502
Summary: | failures with dmevent cause cmirror issues | | |
---|---|---|---|
Product: | [Retired] Red Hat Cluster Suite | Reporter: | Corey Marthaler <cmarthal> |
Component: | cmirror | Assignee: | Jonathan Earl Brassow <jbrassow> |
Status: | CLOSED NOTABUG | QA Contact: | Cluster QE <mspqa-list> |
Severity: | high | Docs Contact: | |
Priority: | high | | |
Version: | 4 | CC: | agk, dwysocha, jbrassow, mbroz, prockai |
Target Milestone: | --- | Keywords: | Reopened, TestBlocker |
Target Release: | --- | | |
Hardware: | All | | |
OS: | Linux | | |
Whiteboard: | | | |
Fixed In Version: | | Doc Type: | Bug Fix |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2007-03-02 13:36:21 UTC | Type: | --- |
Description
Corey Marthaler
2007-01-12 23:56:01 UTC
I tried that again with gfs on top of the 2 cmirrors and had I/O going to them from all nodes in the cluster. After the primary leg failure, I again saw this issue and all the I/O to those mirrors ended up deadlocking.

This bug needs to be on the blocker list for cmirrors.

Devel ACK for blocker-beta for cluster 4.5

Was the cluster mirror in-sync? Did you get any core dumps? (There have been changes made to dmeventd - check to ensure that it is running.)

The mirrors have always been in-sync before attempting the failure case, I didn't see any core dumps, and I have no reason to believe that dmeventd wasn't running.

It appears that dmeventd is running right up until a write is attempted to the failed mirror, at which time it stops for some reason.

where "for some reason" == seg fault

try:
sysctl kernel.core_pattern=/tmp/core
when you start up. Then your core files will appear as /tmp/core.*

I am also seeing cmirror creation issues now due to what also appears to be dmeventd failures. Changing title of this bug and bumping priority.

[root@link-08 ~]# lvcreate -m 1 -L 1G -n mirror vg /dev/sda1:0-2000 /dev/sdb1:0-2000 /dev/sdh1:0-100
Error locking on node link-07: vg-mirror: event registration failed: libdevmapper-event-lvm2mirror.so LVM-RgEPrNphjR3yUmyDl8At2ccc3MBgeEYiq09j2Q60abIsnd8liJdela6Z05Sotbss 65280 0
Error locking on node link-04: vg-mirror: event registration failed: libdevmapper-event-lvm2mirror.so LVM-RgEPrNphjR3yUmyDl8At2ccc3MBgeEYiq09j2Q60abIsnd8liJdela6Z05Sotbss 65280 0
Error locking on node link-02: vg-mirror: event registration failed: libdevmapper-event-lvm2mirror.so LVM-RgEPrNphjR3yUmyDl8At2ccc3MBgeEYiq09j2Q60abIsnd8liJdela6Z05Sotbss 65280 0
Error locking on node link-08: vg-mirror: event registration failed: libdevmapper-event-lvm2mirror.so LVM-RgEPrNphjR3yUmyDl8At2ccc3MBgeEYiq09j2Q60abIsnd8liJdela6Z05Sotbss 65280 0
Failed to activate new LV.

Just a note that QA is still seeing this issue and that this is blocking any extensive cmirror testing.
dmeventd:
[...]
select(6, [5], NULL, NULL, {1, 0}) = 0 (Timeout)
select(6, [5], NULL, NULL, {1, 0}) = 0 (Timeout)
select(6, [5], NULL, NULL, {1, 0}) = 0 (Timeout)
select(6, [5], NULL, NULL, {1, 0}) = 0 (Timeout)
select(6, [5], NULL, NULL, {1, 0}) = ? ERESTARTNOHAND (To be restarted)
PANIC: attached pid 3107 exited
Process 3107 detached

lvm2-cluster-2.02.20-1.el4
lvm2-2.02.20-1.el4
device-mapper-1.02.16-1.el4
2.6.9-43.ELsmp

The locking model in lvm.conf changed. Woulda been nice to know 10 days ago. :)
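
Since the resolution came down to the locking configuration in lvm.conf, a quick way to compare that setting across nodes might look like the sketch below. The thread does not say which setting changed; this assumes the `locking_type` entry in the `global` section is the relevant one. The function name `lvm_locking_type` is a hypothetical helper, not part of LVM2, and the default path /etc/lvm/lvm.conf is also an assumption.

```shell
#!/bin/sh
# Hypothetical helper (not part of LVM2): print the numeric value of the
# active locking_type line in an lvm.conf-style file.
# Assumption: the file lives at /etc/lvm/lvm.conf unless a path is given.
lvm_locking_type() {
    conf="${1:-/etc/lvm/lvm.conf}"
    # Lines starting with '#' do not match because the pattern is anchored
    # to optional leading whitespace followed directly by "locking_type".
    # If the setting appears more than once, keep the last occurrence.
    sed -n 's/^[[:space:]]*locking_type[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*/\1/p' "$conf" | tail -n 1
}
```

Running `lvm_locking_type` on each cluster node and comparing the printed values is one way to spot a node whose configuration was not updated after a package upgrade.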