Bug 149542 - clvmd segfault on vgchange -an
Status: CLOSED WORKSFORME
Product: Red Hat Cluster Suite
Classification: Red Hat
Component: lvm2-cluster
Version: 4
Hardware: All Linux
Priority: medium, Severity: medium
Assigned To: Christine Caulfield
QA Contact: Cluster QE
Reported: 2005-02-23 16:14 EST by Derek Anderson
Modified: 2010-01-11 23:03 EST

Doc Type: Bug Fix
Last Closed: 2006-02-01 11:46:51 EST
Attachments: None

Description Derek Anderson 2005-02-23 16:14:49 EST
Description of problem:
I'll need to try to narrow this down, but I'm logging everything here
before I forget something.

Setup: 4-node cluster.  link-10, link-11, and link-12 are x86; link-08
is x86_64 Opteron.

I created a pool device and a GFS file system from another node
attached to this cluster, then unmounted the file system and
unassembled the pool.  On the 6.1 cluster (from link-12) I ran
vgconvert -M2 to make it an LVM2 volume and used gfs_tool df to update
the lock type to lock_dlm.  I mounted it and ran some traffic
(placemaker) during a meeting (~45 mins).  After that I unmounted it
and ran the new gfs_fsck on it twice.  I then ran "vgchange -an" on
link-12.  Two other existing LVM2 volumes, /dev/VG1/LV1 and
/dev/VG2/LV2, were managed by the cluster at the time.

On link-12:
[root@link-12 /]# vgchange -an
  0 logical volume(s) in volume group "VG2" now active
  EOF reading CLVMD
  1 logical volume(s) in volume group "POOLIO1" now active
  Error writing data to clvmd: Broken pipe
  Error writing data to clvmd: Broken pipe
  Can't lock VG1: skipping
[root@link-12 /]# echo $?
0
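
Note that vgchange exited with status 0 even though it lost its connection to clvmd, so a script wrapping this command cannot rely on the exit code alone. A minimal sketch of scanning the command output for the failure messages seen in this transcript follows. The helper name clvmd_failures and the marker list are hypothetical: the markers are copied from the output above, not from any lvm2 documentation, and are not an exhaustive list.

```python
# Hypothetical helper: flag clvmd-related failures in vgchange output.
# The marker strings are copied from the transcript in this report;
# they are assumptions, not a complete list of lvm2 error messages.
FAILURE_MARKERS = (
    "EOF reading CLVMD",
    "Error writing data to clvmd",
    "Can't lock",
)


def clvmd_failures(output):
    """Return the stripped output lines that contain a known failure marker."""
    return [
        line.strip()
        for line in output.splitlines()
        if any(marker in line for marker in FAILURE_MARKERS)
    ]


if __name__ == "__main__":
    sample = (
        '  0 logical volume(s) in volume group "VG2" now active\n'
        "  EOF reading CLVMD\n"
        "  Can't lock VG1: skipping\n"
    )
    for failure in clvmd_failures(sample):
        print(failure)
```

A caller could treat a non-empty result as a failure regardless of the reported exit status.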

Got this message in link-08 /var/log/messages:
Feb 23 15:04:08 link-08 udev[16927]: removing device node '/dev/dm-3'
Feb 23 15:04:08 link-08 kernel: clvmd[2287]: segfault at
000000335952d700 rip 00000032593684bd rsp 00000000413fc360 error 4
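
The trailing "error 4" in the kernel message is the x86 page-fault error code, so the line itself already says what kind of access faulted. A minimal sketch of decoding it, assuming the log line has been rejoined onto one line and follows the format shown above; the function name decode_segfault is hypothetical, and only the three low error-code bits are interpreted:

```python
import re

# Matches the x86_64 message format seen in this report:
#   name[pid]: segfault at ADDR rip ADDR rsp ADDR error N
SEGFAULT_RE = re.compile(
    r"(?P<name>[^\s\[]+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"rip (?P<rip>[0-9a-f]+) rsp (?P<rsp>[0-9a-f]+) error (?P<err>[0-9a-f]+)"
)


def decode_segfault(line):
    """Parse a kernel segfault line and decode the low page-fault error bits.

    Returns None if the line does not match the expected format.
    """
    match = SEGFAULT_RE.search(line)
    if match is None:
        return None
    err = int(match.group("err"), 16)
    return {
        "name": match.group("name"),
        "pid": int(match.group("pid")),
        "addr": int(match.group("addr"), 16),
        "rip": int(match.group("rip"), 16),
        # Bit 0: 0 = page not present, 1 = protection violation.
        "cause": "protection violation" if err & 1 else "page not present",
        # Bit 1: 0 = read access, 1 = write access.
        "access": "write" if err & 2 else "read",
        # Bit 2: 0 = kernel mode, 1 = user mode.
        "mode": "user" if err & 4 else "kernel",
    }
```

For the line above, error code 4 decodes to a user-mode read of a page that was not present, which is consistent with clvmd dereferencing an invalid pointer.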

Now clvmd is dead on link-08, but the cluster thinks it is still
running:
[root@link-08 tmp]# clu_state
Node  Votes Exp Sts  Name
   8    1    4   M   link-08-pvt
  10    1    4   M   link-10-pvt
  11    1    4   M   link-11-pvt
  12    1    4   M   link-12-pvt
Protocol version: 5.0.1
Config version: 234
Cluster name: MILTON
Cluster ID: 4812
Cluster Member: Yes
Membership state: Cluster-Member
Nodes: 4
Expected_votes: 4
Total_votes: 4
Quorum: 3
Active subsystems: 3
Node name: link-08-pvt
Node addresses: 10.1.1.158

Service          Name                              GID LID State     Code
Fence Domain:    "default"                           1   2 run       -
[8 11 12 10]

DLM Lock Space:  "clvmd"                             2   3 run       -
[8 11 12 10]
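
Since the "clvmd" DLM lock space above still lists node 8 in the run state after the daemon crashed, lock space membership is not a reliable liveness check. A minimal sketch of checking for the process directly by scanning /proc follows. The function name process_running and its proc_root parameter are hypothetical; the parameter exists only so the check can be pointed at a test directory.

```python
import os


def process_running(name, proc_root="/proc"):
    """Return True if any process under proc_root has the given command name."""
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directories are process entries
        try:
            with open(os.path.join(proc_root, entry, "stat")) as handle:
                stat = handle.read()
        except OSError:
            continue  # the process exited while we were scanning
        # The command name is the parenthesised second field of the stat line.
        start, end = stat.find("("), stat.rfind(")")
        if start != -1 and end > start and stat[start + 1:end] == name:
            return True
    return False


if __name__ == "__main__":
    print("clvmd running:", process_running("clvmd"))
```

Run on link-08 after the crash, a check like this would be expected to report that no clvmd process exists even though the lock space is still listed.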

Version-Release number of selected component (if applicable):
[root@link-08 tmp]# clvmd -V
Cluster LVM daemon version: 2.01.04 (2005-02-09)
Protocol version:           0.2.1

How reproducible:
I'll try to boil down the steps necessary to reproduce this.

Comment 1 Christine Caulfield 2005-02-24 11:17:37 EST
If clvmd is built with debug, then the output of clvmd -d would be
very helpful.

It would also be useful to eliminate (or implicate) the pool
conversion.
Comment 2 Kiersten (Kerri) Anderson 2005-03-02 15:55:54 EST
Blocker bug - add it to the list
Comment 3 Christine Caulfield 2005-03-08 08:36:15 EST
Marking this as NEEDINFO as there's nothing I can do with it in its
present state.
Comment 4 Kiersten (Kerri) Anderson 2005-03-22 14:06:17 EST
Removing from Blocker list until it is recreated.