Bug 539670 - clvmd segfaults when attempting basic operations
Summary: clvmd segfaults when attempting basic operations
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: lvm2-cluster
Version: 5.4
Hardware: All
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: rc
Assignee: Milan Broz
QA Contact: Cluster QE
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2009-11-20 19:53 UTC by Corey Marthaler
Modified: 2013-03-01 04:07 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2010-03-30 09:02:10 UTC
Target Upstream Version:
Embargoed:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHBA-2010:0299 | 0 | normal | SHIPPED_LIVE | lvm2-cluster bug fix and enhancement update | 2010-03-29 14:26:30 UTC

Description Corey Marthaler 2009-11-20 19:53:59 UTC
Description of problem:
I hit this while running mirror_sanity on the hayes cluster (hayes-0[123]). When the mirror create was attempted, two of the nodes segfaulted.

SCENARIO - [create_mirror_on_1Kextent_vg]
Create a mirror on a VG with an extent size of only 1K
Recreating PVs/VG with smaller (1K) extent size
hayes-01: vgcreate -s 1K mirror_sanity /dev/etherd/e1.1p1 /dev/etherd/e1.1p2 /dev/etherd/e1.1p3 /dev/etherd/e1.1p4 /dev/etherd/e1.1p5
  clvmd not running on node hayes-03
  clvmd not running on node hayes-02
  Unable to drop cached metadata for VG mirror_sanity.
  Failed to create mirror log.
  clvmd not running on node hayes-03
  clvmd not running on node hayes-02
  Unable to drop cached metadata for VG mirror_sanity.
  Manual intervention may be required to remove abandoned LV(s) before retrying.

Nov 20 12:03:01 hayes-01 qarshd[4993]: Running cmdline: vgcreate -s 1K mirror_sanity /dev/etherd/e1.1p1 /dev/etherd/e1.1p2 /dev/etherd/e1.1p3 /dev/etherd/e1.1p4 /dev/etherd/e1.1p5
Nov 20 12:05:05 hayes-01 xinetd[2811]: EXIT: qarsh status=0 pid=4993 duration=124(sec)
Nov 20 12:05:05 hayes-01 xinetd[2811]: START: qarsh pid=5023 from=10.15.80.47
Nov 20 12:05:05 hayes-01 qarshd[5023]: Talking to peer 10.15.80.47:59154
Nov 20 12:05:05 hayes-01 qarshd[5023]: Running cmdline: lvcreate -m 1 -n mirror_on_1Kextent_vg -L 20M mirror_sanity

Nov 20 12:02:48 hayes-02 kernel: clvmd[3519]: segfault at 0000000000000000 rip 0000000000000000 rsp 00000000415e28c8 error 14

Nov 20 12:04:53 hayes-03 kernel: clvmd[3515]: segfault at 0000000000000000 rip 0000000000000000 rsp 0000000042f548c8 error 14
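
The two segfault lines above share a fixed layout, so they can be decoded mechanically. What follows is a minimal Python sketch and not part of the original report. The names `SEGFAULT_RE`, `ERROR_BITS`, and `decode_segfault` are hypothetical. It also assumes that this kernel prints the `error` field in hexadecimal, which I believe is the case for 2.6.18-era x86_64 kernels but have not confirmed here. Under that assumption, `error 14` is 0x14: a user-mode instruction fetch from a non-present page, which would be consistent with `rip` being 0 (a call through a NULL pointer).

```python
import re

# Layout of the kernel segfault message as it appears in the logs above.
SEGFAULT_RE = re.compile(
    r"(?P<host>\S+) kernel: (?P<comm>\S+)\[(?P<pid>\d+)\]: segfault at "
    r"(?P<addr>[0-9a-f]+) rip (?P<rip>[0-9a-f]+) rsp (?P<rsp>[0-9a-f]+) "
    r"error (?P<error>[0-9a-f]+)"
)

# x86 page-fault error-code bits (architectural definition).
# Bit 0 clear means the page was not present.
ERROR_BITS = {
    0x01: "protection violation",
    0x02: "write access",
    0x04: "user mode",
    0x08: "reserved bit set",
    0x10: "instruction fetch",
}


def decode_segfault(line):
    """Return the fields of one segfault log line, or None if it does not match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    # ASSUMPTION: the error field is hexadecimal, so "14" means 0x14.
    error = int(m.group("error"), 16)
    return {
        "host": m.group("host"),
        "process": m.group("comm"),
        "pid": int(m.group("pid")),
        "fault_address": int(m.group("addr"), 16),
        "rip": int(m.group("rip"), 16),
        "rsp": int(m.group("rsp"), 16),
        "error_code": error,
        "flags": [name for bit, name in sorted(ERROR_BITS.items()) if error & bit],
    }


if __name__ == "__main__":
    sample = (
        "Nov 20 12:02:48 hayes-02 kernel: clvmd[3519]: segfault at "
        "0000000000000000 rip 0000000000000000 rsp 00000000415e28c8 error 14"
    )
    print(decode_segfault(sample))
```

A line that is cut off before the `rsp` and `error` fields does not match the pattern, so `decode_segfault` returns `None` for it rather than guessing the missing values.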


Version-Release number of selected component (if applicable):
[root@hayes-01 ~]# /usr/tests/sts-rhel5.4/lvm2/bin/lvm_rpms 
2.6.18-162.el5

lvm2-2.02.55-1.el5    BUILT: Fri Nov 20 07:48:44 CST 2009
lvm2-cluster-2.02.55-1.el5    BUILT: Fri Nov 20 07:54:30 CST 2009
device-mapper-1.02.39-1.el5    BUILT: Wed Nov 11 12:31:44 CST 2009
cmirror-1.1.39-2.el5    BUILT: Mon Jul 27 15:39:05 CDT 2009
kmod-cmirror-0.1.22-1.el5    BUILT: Mon Jul 27 15:28:46 CDT 2009

Comment 1 Corey Marthaler 2009-11-20 20:13:46 UTC
Looks like there were a bunch of the following messages in previous test cases before the segfaults.

"Internal error: _memlock_count has dropped below 0."

I'll attempt to get a core for this issue.

Comment 2 Corey Marthaler 2009-11-20 20:25:57 UTC
This is worse than just mirror creation.

[root@hayes-03 ~]# vgcreate mirror_sanity /dev/etherd/e1.1p*
Nov 20 14:22:57 hayes-02 kernel: clvmd[3308]: segfault at 0000000000000000 rip 00000000004

Comment 3 Corey Marthaler 2009-11-20 20:44:18 UTC
core files are located in /home/msp/cmarthal/core.35*

Comment 4 Milan Broz 2009-11-21 08:23:28 UTC
The 2.02.55 clvmd build has a fatal bug. We have already found the problem, and it should be fixed in the next build.

Comment 5 Milan Broz 2009-11-24 19:23:04 UTC
Fixed in lvm2-cluster-2_02_56-1_el5.

Comment 8 Corey Marthaler 2010-01-28 19:40:10 UTC
Fix verified in lvm2-2.02.56-6.el5/lvm2-cluster-2.02.56-6.el5.
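
To check whether a node's lvm2-cluster package is at or past the fixed build, the version strings in this report can be compared numerically. This is a rough Python sketch under stated assumptions, not the real RPM comparison algorithm: it only looks at the numeric parts of the version and the leading number of the release, and it ignores the dist tag. `FIRST_FIXED`, `numeric_key`, and `has_fix` are hypothetical names, and `2.02.56-1.el5` is my dotted rendering of the build tag given in Comment 5.

```python
import re

# ASSUMPTION: dotted form of the build tag from Comment 5 (lvm2-cluster-2_02_56-1_el5).
FIRST_FIXED = "2.02.56-1.el5"


def numeric_key(version_release):
    """Split 'VERSION-RELEASE' into a comparable (version tuple, release number) pair."""
    version, _, release = version_release.partition("-")
    ver = tuple(int(n) for n in re.findall(r"\d+", version))
    # Only the leading number of the release is used; the dist tag is ignored.
    rel_match = re.match(r"\d+", release)
    rel = int(rel_match.group()) if rel_match else 0
    return ver, rel


def has_fix(installed, fixed=FIRST_FIXED):
    """True if 'installed' is at or after 'fixed' under this simplified ordering."""
    return numeric_key(installed) >= numeric_key(fixed)


if __name__ == "__main__":
    for build in ("2.02.55-1.el5", "2.02.56-6.el5"):
        print(build, has_fix(build))
```

On a real system, the package manager's own version comparison is the safer choice. The sketch only orders the handful of builds mentioned in this report.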

Comment 10 errata-xmlrpc 2010-03-30 09:02:10 UTC
An advisory has been issued which should help with the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2010-0299.html

