149542 – clvmd segfault on vgchange -an

Keywords:
Status:	CLOSED WORKSFORME
Alias:	None
Product:	Red Hat Cluster Suite
Classification:	Retired
Component:	lvm2-cluster
Sub Component:
Version:	4
Hardware:	All
OS:	Linux
Priority:	medium
Severity:	medium
Target Milestone:	---
Assignee:	Christine Caulfield
QA Contact:	Cluster QE
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2005-02-23 21:14 UTC by Derek Anderson
Modified:	2010-01-12 04:03 UTC
CC List:	1 user
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2006-02-01 16:46:51 UTC
Embargoed:

Description Derek Anderson 2005-02-23 21:14:49 UTC

Description of problem:
I'll need to try to narrow this down, but I'm logging everything here
before I forget something.

Setup: 4 node cluster.  link-10,link-11,link-12 are x86.  link-08 is
x86_64 Opteron.

Created a pool device and GFS from another node attached to this
cluster.  Unmounted and unassembled the pool.  On the 6.1 cluster (from
link-12) ran vgconvert -M2 to make it an lvm2 volume and used gfs_tool
df to update the lock type to lock_dlm.  Mounted it and ran some
traffic (placemaker) during a meeting (~45 mins).  After that I
unmounted it and ran the new gfs_fsck on it twice.  I then ran "vgchange
-an" on link-12.  I had two other existing lvm2 volumes, /dev/VG1/LV1
and /dev/VG2/LV2, managed by the cluster.

On link-12:
[root@link-12 /]# vgchange -an
  0 logical volume(s) in volume group "VG2" now active
  EOF reading CLVMD
  1 logical volume(s) in volume group "POOLIO1" now active
  Error writing data to clvmd: Broken pipe
  Error writing data to clvmd: Broken pipe
  Can't lock VG1: skipping
[root@link-12 /]# echo $?
0
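
Since vgchange exited with status 0 here even though it printed clvmd
errors, a script driving it cannot rely on the exit code alone.  A
minimal sketch of a stricter check follows.  The marker strings are
copied from the transcript above; the function names and the idea of
scanning the captured output are my own assumptions, not anything lvm2
provides.

```python
# Substrings taken from the vgchange transcript in this report.
ERROR_MARKERS = (
    "EOF reading CLVMD",
    "Error writing data to clvmd",
    "Can't lock",
)


def find_error_lines(output):
    """Return the output lines that contain one of the known error markers."""
    return [
        line.strip()
        for line in output.splitlines()
        if any(marker in line for marker in ERROR_MARKERS)
    ]


def really_succeeded(returncode, output):
    """Treat the command as successful only if it exited 0 and printed no errors."""
    return returncode == 0 and not find_error_lines(output)
```

The captured text would be the combined stdout and stderr of the
command, since I did not record which stream the messages went to.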

Got this message in link-08 /var/log/messages:
Feb 23 15:04:08 link-08 udev[16927]: removing device node '/dev/dm-3'
Feb 23 15:04:08 link-08 kernel: clvmd[2287]: segfault at
000000335952d700 rip 00000032593684bd rsp 00000000413fc360 error 4
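
For anyone triaging similar reports, the kernel line above can be
picked apart mechanically.  This is a rough sketch that assumes the
message layout shown here ("segfault at ADDR rip RIP rsp RSP error N")
and reads the error value as the x86 page-fault error code, where bit 0
means the page was present, bit 1 means a write, and bit 2 means user
mode.  The function names are hypothetical.

```python
import re

# Layout assumed from the log line in this report; other kernel
# versions print "ip"/"sp" instead and may print the error code in hex.
SEGFAULT_RE = re.compile(
    r"(?P<comm>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"rip (?P<rip>[0-9a-f]+) rsp (?P<rsp>[0-9a-f]+) error (?P<err>\d+)"
)


def decode_page_fault_error(code):
    """Decode the low three bits of an x86 page-fault error code."""
    return {
        "page_present": bool(code & 1),  # 0 means the page was not mapped
        "write_access": bool(code & 2),  # 0 means the access was a read
        "user_mode": bool(code & 4),     # 1 means the fault came from user space
    }


def parse_segfault(line):
    """Parse a (possibly line-wrapped) segfault message, or return None."""
    match = SEGFAULT_RE.search(" ".join(line.split()))
    if match is None:
        return None
    error = int(match.group("err"))
    return {
        "comm": match.group("comm"),
        "pid": int(match.group("pid")),
        "addr": int(match.group("addr"), 16),
        "rip": int(match.group("rip"), 16),
        "rsp": int(match.group("rsp"), 16),
        "error": error,
        "flags": decode_page_fault_error(error),
    }
```

Read this way, "error 4" is a user-mode read of an address with no
page mapped, which by itself does not say which code path in clvmd
made the access.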

Now the clvmd is dead on link-08, but the cluster thinks it is still
running:
[root@link-08 tmp]# clu_state
Node  Votes Exp Sts  Name
   8    1    4   M   link-08-pvt
  10    1    4   M   link-10-pvt
  11    1    4   M   link-11-pvt
  12    1    4   M   link-12-pvt
Protocol version: 5.0.1
Config version: 234
Cluster name: MILTON
Cluster ID: 4812
Cluster Member: Yes
Membership state: Cluster-Member
Nodes: 4
Expected_votes: 4
Total_votes: 4
Quorum: 3
Active subsystems: 3
Node name: link-08-pvt
Node addresses: 10.1.1.158

Service          Name                              GID LID State     Code
Fence Domain:    "default"                           1   2 run       -
[8 11 12 10]

DLM Lock Space:  "clvmd"                             2   3 run       -
[8 11 12 10]
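
Because the membership listing still shows node 8 in the "clvmd" lock
space after the daemon died, the process table is the more direct
check.  A minimal sketch, assuming a Linux /proc layout where the
second field of /proc/PID/stat is the command name in parentheses; the
function name and the proc_root parameter are hypothetical.

```python
import os


def pids_by_name(name, proc_root="/proc"):
    """Return the sorted PIDs of processes whose command name equals `name`."""
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "meminfo"
        try:
            with open(os.path.join(proc_root, entry, "stat")) as stat_file:
                stat = stat_file.read()
        except OSError:
            continue  # the process exited while we were scanning
        # The command name sits between the first "(" and the last ")".
        comm = stat[stat.find("(") + 1:stat.rfind(")")]
        if comm == name:
            pids.append(int(entry))
    return sorted(pids)
```

An empty result from pids_by_name("clvmd") on a node that the listing
above still shows in the lock space would match the state described
here.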

Version-Release number of selected component (if applicable):
[root@link-08 tmp]# clvmd -V
Cluster LVM daemon version: 2.01.04 (2005-02-09)
Protocol version:           0.2.1

How reproducible:
I'll try to boil down the steps necessary to reproduce this.

Comment 1 Christine Caulfield 2005-02-24 16:17:37 UTC

If clvmd is built with debug then output of clvmd -d would be very
helpful.

It would also be useful to eliminate (or implicate) the pool
conversion too.

Comment 2 Kiersten (Kerri) Anderson 2005-03-02 20:55:54 UTC

Blocker bug - add it to the list

Comment 3 Christine Caulfield 2005-03-08 13:36:15 UTC

Marking this as NEEDINFO as there's nothing I can do with it in its
present state.

Comment 4 Kiersten (Kerri) Anderson 2005-03-22 19:06:17 UTC

Removing from Blocker list until it is recreated.
