Bug 236432
Summary: Failures to multiple cmirrors on multiple VGs cause problems
Product: [Retired] Red Hat Cluster Suite
Component: lvm2-cluster
Version: 4
Hardware: All
OS: Linux
Status: CLOSED CURRENTRELEASE
Severity: medium
Priority: medium
Reporter: Corey Marthaler <cmarthal>
Assignee: Christine Caulfield <ccaulfie>
QA Contact: Cluster QE <mspqa-list>
CC: agk, dwysocha, jbrassow, mbroz, prockai
Target Milestone: ---
Target Release: ---
Fixed In Version: 2.02.21-7
Doc Type: Bug Fix
Last Closed: 2008-02-28 13:30:10 UTC
Description
Corey Marthaler
2007-04-13 20:57:48 UTC
clvmd dying

Patrick, here is how I set this up. I created 3 VGs, each with at least 3 PVs, so that I can have a cmirror created out of each VG. I also partitioned one of the devices (in this case /dev/sdh) into three pieces, so that there is a piece in each VG. When I fail /dev/sdh, all three cmirrors, in all three VGs, experience a failure at the same time.

[root@link-08 ~]# pvscan
  PV /dev/sdh3   VG vg3          lvm2 [402.62 GB / 392.62 GB free]
  PV /dev/sde1   VG vg3          lvm2 [135.66 GB / 125.66 GB free]
  PV /dev/sdf1   VG vg3          lvm2 [135.66 GB / 135.65 GB free]
  PV /dev/sdh2   VG vg2          lvm2 [402.62 GB / 392.62 GB free]
  PV /dev/sdc1   VG vg2          lvm2 [135.66 GB / 125.66 GB free]
  PV /dev/sdd1   VG vg2          lvm2 [135.66 GB / 135.65 GB free]
  PV /dev/hda2   VG VolGroup00   lvm2 [74.41 GB / 96.00 MB free]
  PV /dev/sdh1   VG vg1          lvm2 [402.62 GB / 392.62 GB free]
  PV /dev/sda1   VG vg1          lvm2 [135.66 GB / 125.66 GB free]
  PV /dev/sdb1   VG vg1          lvm2 [135.66 GB / 135.65 GB free]
  PV /dev/sdg1                   lvm2 [135.66 GB]
  Total: 11 [2.18 TB] / in use: 10 [2.05 TB] / in no VG: 1 [135.66 GB]

[root@link-08 ~]# vgs -a -o +devices
  VG         #PV #LV #SN Attr   VSize   VFree   Devices
  VolGroup00   1   2   0 wz--n-  74.41G  96.00M /dev/hda2(0)
  VolGroup00   1   2   0 wz--n-  74.41G  96.00M /dev/hda2(2316)
  vg1          3   4   0 wz--nc 673.93G 653.93G mirror1_mimage_0(0),mirror1_mimage_1(0)
  vg1          3   4   0 wz--nc 673.93G 653.93G /dev/sdb1(0)
  vg1          3   4   0 wz--nc 673.93G 653.93G /dev/sdh1(0)
  vg1          3   4   0 wz--nc 673.93G 653.93G /dev/sda1(0)
  vg2          3   4   0 wz--nc 673.93G 653.93G mirror2_mimage_0(0),mirror2_mimage_1(0)
  vg2          3   4   0 wz--nc 673.93G 653.93G /dev/sdd1(0)
  vg2          3   4   0 wz--nc 673.93G 653.93G /dev/sdh2(0)
  vg2          3   4   0 wz--nc 673.93G 653.93G /dev/sdc1(0)
  vg3          3   4   0 wz--nc 673.93G 653.93G mirror3_mimage_0(0),mirror3_mimage_1(0)
  vg3          3   4   0 wz--nc 673.93G 653.93G /dev/sdf1(0)
  vg3          3   4   0 wz--nc 673.93G 653.93G /dev/sdh3(0)
  vg3          3   4   0 wz--nc 673.93G 653.93G /dev/sde1(0)

[root@link-08 ~]# lvs -a -o +devices
  LV                 VG         Attr   LSize  Origin Snap% Move Log          Copy%  Devices
  LogVol00           VolGroup00 -wi-ao 72.38G                                       /dev/hda2(0)
  LogVol01           VolGroup00 -wi-ao  1.94G                                       /dev/hda2(2316)
  mirror1            vg1        Mwi-ao 10.00G                   mirror1_mlog 100.00 mirror1_mimage_0(0),mirror1_mimage_1(0)
  [mirror1_mimage_0] vg1        iwi-ao 10.00G                                       /dev/sdh1(0)
  [mirror1_mimage_1] vg1        iwi-ao 10.00G                                       /dev/sda1(0)
  [mirror1_mlog]     vg1        lwi-ao  4.00M                                       /dev/sdb1(0)
  mirror2            vg2        Mwi-ao 10.00G                   mirror2_mlog 100.00 mirror2_mimage_0(0),mirror2_mimage_1(0)
  [mirror2_mimage_0] vg2        iwi-ao 10.00G                                       /dev/sdh2(0)
  [mirror2_mimage_1] vg2        iwi-ao 10.00G                                       /dev/sdc1(0)
  [mirror2_mlog]     vg2        lwi-ao  4.00M                                       /dev/sdd1(0)
  mirror3            vg3        Mwi-ao 10.00G                   mirror3_mlog 100.00 mirror3_mimage_0(0),mirror3_mimage_1(0)
  [mirror3_mimage_0] vg3        iwi-ao 10.00G                                       /dev/sdh3(0)
  [mirror3_mimage_1] vg3        iwi-ao 10.00G                                       /dev/sde1(0)
  [mirror3_mlog]     vg3        lwi-ao  4.00M                                       /dev/sdf1(0)

I created a gfs filesystem on each mirror and mounted each on all three nodes, then started I/O to each gfs before the failure.

/dev/mapper/vg1-mirror1 on /mnt/1 type gfs (rw)
/dev/mapper/vg2-mirror2 on /mnt/2 type gfs (rw)
/dev/mapper/vg3-mirror3 on /mnt/3 type gfs (rw)

The I/O that I used was xdoio/xiogen, but I'm sure any will work. Now I fail /dev/sdh. The automatic lvconvert to linear usually works, but when I bring the device back online and convert the linears back to mirrors, it can hit either this bug or bug 236289, though not every time. It seems to hit this bug faster if you use 3-legged (-m 2) mirrors. There appears to be some discrepancy between the devices that are registered and those that are recovered (according to my log files).
Sometimes I get all three mirrors registered but only two recovered - for me this is the most common situation. Other times it seems that only two mirrors are registered but three are recovered... that can't be right (though it might explain a crash!), but I've not seen that since I put more logging in (hah). What gives you the impression there is a discrepancy (what's in the log file)?

I believe that I hit this bug this morning. I had three 3-way mirrors (2 corelog and 1 disklog), all on different VGs, and after failing a leg of each of the three mirrors, clvmd died on one node only and caused everything else to lock up.

Apr 20 11:20:42 link-07 lvm[6305]: Locking tfUW1A1zzWkxldikE6dQLZop1Se6pcdUH1T0jzc26AlGUkpq2TYiSolF8xP12loT at 0x9e
Apr 20 11:20:42 link-07 kernel: dm-cmirror: Removing 8xP12loT (1)
Apr 20 11:20:42 link-07 kernel: dm-cmirror: 2 mark requests remain at cluster log deactivation
Apr 20 11:20:42 link-07 kernel: dm-cmirror: 0 region user structures freed
Apr 20 11:20:43 link-07 lvm[6305]: dm info LVM-tfUW1A1zzWkxldikE6dQLZop1Se6pcdUH1T0jzc26AlGUkpq2TYiSolF8xP12loT OF [16384]
Apr 20 11:20:43 link-07 lvm[6305]: Monitoring mirror device V1-cm1 for events
Apr 20 11:20:43 link-07 lvm[6305]: Another thread is handling an event. Waiting...
Apr 20 11:20:43 link-07 kernel: SCSI error : <2 0 1 1> return code = 0x10000
Apr 20 11:20:43 link-07 kernel: end_request: I/O error, dev sdh, sector 855140367
Apr 20 11:20:43 link-07 kernel: dm-cmirror: LOG INFO:
Apr 20 11:20:43 link-07 kernel: dm-cmirror:   uuid: LVM-tfUW1A1zzWkxldikE6dQLZop1Se6pcdUH1T0jzc26AlGUkpq2TYiSolF8xP12loT
Apr 20 11:20:43 link-07 kernel: dm-cmirror:   uuid_ref    : 2
Apr 20 11:20:43 link-07 kernel: dm-cmirror:   region_count: 20480
Apr 20 11:20:43 link-07 kernel: dm-cmirror:   sync_count  : 20480
Apr 20 11:20:43 link-07 kernel: dm-cmirror:   sync_search : 20481
Apr 20 11:20:43 link-07 kernel: dm-cmirror:   in_sync     : YES
Apr 20 11:20:43 link-07 kernel: dm-cmirror:   server_id   : 4
Apr 20 11:20:43 link-07 kernel: dm-cmirror:   server_valid: YES
Apr 20 11:20:43 link-07 kernel: dm-cmirror: Creating t2VhLaPk (2)
Apr 20 11:20:43 link-07 kernel: dm-cmirror: server_id=dead, server_valid=0, t2VhLaPk
Apr 20 11:20:43 link-07 kernel: dm-cmirror: trigger = LRT_GET_SYNC_COUNT
Apr 20 11:20:43 link-07 kernel: dm-cmirror: Performing flush to work around bug 235040
Apr 20 11:20:43 link-07 kernel: dm-cmirror: Log flush complete
Apr 20 11:20:43 link-07 kernel: clvmd[2730]: segfault at 0000000000000010 rip 00000000004255f2 rsp 0000000041e01da0 error 4
Apr 20 11:20:43 link-07 lvm[6305]: EOF reading CLVMD
Apr 20 11:20:43 link-07 lvm[6305]: Problem reactivating cm1

I'm hitting this, but finding it hard to gather information. From messages:

Apr 20 14:47:35 neo-06 lvm[3515]: LV percent: 100.000000
Apr 20 14:47:35 neo-06 lvm[3515]: Reducing mirror set from 3 to 2 image(s).
...
Apr 20 14:47:36 neo-06 lvm[3515]: Pre-Committing vg2 metadata (6) to /dev/sdf2 header at 4096
Apr 20 14:47:36 neo-06 lvm[3515]: Locking pONgvBECf3UsRzsBpSrG6ihA7z7f8dL4oojjpoWuyUWnZeRffmvyyVTcXnY1uE7S at 0x9c
Apr 20 14:47:36 neo-06 lvm[3515]: Error writing data to clvmd: Broken pipe
Apr 20 14:47:36 neo-06 lvm[3515]: Failed to lock lv2

From clvmd -d:

CLVMD[b7580bb0]: Apr 20 14:47:35 do_lock_lv: resource 'iKN02g00wrkWjskQfw29eWflbN7aM9aBw68oGB4UmPwGZe394dnIj2QvFGd3sZQE', cmd = 0x1e, flags = 84
CLVMD[b7580bb0]: Apr 20 14:47:35 do_resume_lv, lock not already held
CLVMD[b7580bb0]: Apr 20 14:47:35 Command return is 0
CLVMD[b7580bb0]: Apr 20 14:47:35 LVM thread waiting for work
*** last message ***

If the timestamps are correct, clvmd went away 1 second before the broken pipe. There is no indication of *why* it went away.

On each machine, failure events from the kernel will trigger dmeventd to start vgreduce. Each machine processes the events serially, and each vgreduce on a volume group happens serially (due to VG locking). However, if you are really unlucky, you could have each machine doing a vgreduce on a different volume group at the same time. That doesn't necessarily point to any bugs... just something to be aware of.

From device-mapper/dmeventd/dmeventd.c:

/*
 * Check against filter.
 *
 * If there's current events delivered from _event_wait() AND
 * the device got registered for those events AND
 * those events haven't been processed yet, call
 * the DSO's process_event() handler.
 */

This definitely isn't happening correctly... but shouldn't hurt us (I don't think).

Created attachment 153217 [details]
Backtrace of failed clvmd process
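For context on the thread-collision theory running through these comments: the symptom pattern (a pointer that is normally valid suddenly containing garbage while clvmd segfaults) is characteristic of two threads calling into code that was written for a single caller at a time. Below is a minimal, standalone C illustration of that failure class; it is not clvmd code and every name in it is hypothetical.

/* Minimal illustration of the suspected failure class: two threads call a
 * helper that frees and replaces shared state without locking, so one
 * thread can read memory the other has just freed. */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static char *shared_state;               /* stand-in for library-internal data */

/* Not thread-safe: mimics a lookup that rebuilds cached state on each call. */
static const char *lookup(const char *name)
{
        free(shared_state);
        shared_state = strdup(name);     /* another thread may still be reading */
        return shared_state;
}

static void *worker(void *arg)
{
        for (int i = 0; i < 100000; i++) {
                const char *s = lookup((const char *)arg);
                /* With two workers this can dereference freed memory and crash,
                 * much like clvmd does under concurrent mirror failure events. */
                if (!strlen(s))
                        fprintf(stderr, "corrupted result\n");
        }
        return NULL;
}

int main(void)
{
        pthread_t t1, t2;

        pthread_create(&t1, NULL, worker, "vg1");
        pthread_create(&t2, NULL, worker, "vg2");
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        return 0;
}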
While doing a LCK_LV_RESUME, the contents of the vg are bad:

(gdb) frame 4
#4  0x08073140 in _read_vg (fid=0x9d18a50, cft=0x9d8f3a8) at format_text/import_vsn1.c:756
756             log_error("Couldn't find all physical volumes for volume "
(gdb) p vg
$1 = (struct volume_group *) 0x9d1c238
(gdb) p *vg
$2 = {cmd = 0xdeaddead, fid = 0xdeaddead, seqno = 3735936685, id = {uuid = "�"},
  name = 0xdeaddead <Address 0xdeaddead out of bounds>,
  system_id = 0xdeaddead <Address 0xdeaddead out of bounds>, status = 3735936685,
  alloc = 3735936685, extent_size = 3735936685, extent_count = 3735936685,
  free_count = 3735936685, max_lv = 3735936685, max_pv = 3735936685,
  pv_count = 3735936685, pvs = {n = 0xdeaddead, p = 0xdeaddead},
  lv_count = 3735936685, snapshot_count = 3735936685,
  lvs = {n = 0xdeaddead, p = 0xdeaddead}, tags = {n = 0xdeaddead, p = 0xdeaddead}}

An overlapping thread poisoning our memory?

It's unlikely to be threads overlapping. clvmd runs all LVM operations in a single thread.

Created attachment 153283 [details]
clvmd dies again... here's the backtrace.
(gdb) list
687             return dev->fd;
688     }
689
690     const char *dev_name(const struct device *dev)
691     {
692             return (dev) ? list_item(dev->aliases.n, struct str_list)->str :
693                            "unknown device";
694     }
(gdb) p dev
$1 = (const struct device *) 0xdeaddead

I'm saving these core files, so anytime you want to look at them...

Able to reproduce with 2-way mirrors.

Created attachment 153299 [details]
evidence of threads colliding?
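A note on reading the gdb output above: every pointer is 0xdeaddead and every integer is 3735936685, which is 0xdeaddead in decimal. That uniform pattern strongly suggests the structure was freed and stamped by a poison-on-free debugging scheme, then dereferenced again, rather than being randomly corrupted. A sketch of the technique is below; the helper name is hypothetical and this is not LVM's actual allocator code.

/* Poison-on-free sketch: stamp freed memory with a recognizable word so a
 * later use-after-free shows up in a debugger as 0xdeaddead everywhere,
 * much like the volume_group dump above. Hypothetical helper, not LVM code. */
#include <stdint.h>
#include <stdlib.h>

#define POISON 0xdeaddeadU              /* 3735936685 in decimal */

static void poison_free(void *p, size_t size)
{
        uint32_t *w = p;
        size_t n = size / sizeof(*w);

        /* Fill the allocation (whole words only) with the poison pattern. */
        for (size_t i = 0; i < n; i++)
                w[i] = POISON;

        free(p);
}

If a stale pointer is dereferenced after its memory has been released through something like this, the debugger prints exactly the poison values seen above - good evidence of one thread releasing metadata that another thread is still using.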
Created attachment 153335 [details]
Add locking around lvm calls.
Yep. Dead right. This innocent-looking call in post_lock_lv() is the culprit:
if (!lv_info_by_lvid(cmd, resource, &lvi, 0))
Short-term I think putting a mutex around that should fix it. Longer-term I'd
like to find a non-locking fix.
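A minimal sketch of what that short-term fix could look like, assuming a pthread mutex shared with clvmd's LVM worker thread. Only the lv_info_by_lvid() call above is taken from this bug; the mutex name, wrapper function, and return handling are illustrative, not the committed patch.

#include <pthread.h>

/* struct cmd_context, struct lvinfo and lv_info_by_lvid() come from the
 * LVM2 tree; everything else in this fragment is an illustrative sketch. */
static pthread_mutex_t lvm_thread_mutex = PTHREAD_MUTEX_INITIALIZER;

static int query_lv_state(struct cmd_context *cmd, const char *resource)
{
        struct lvinfo lvi;
        int found;

        /* Serialize with the LVM worker thread so the metadata structures it
         * owns cannot be freed or rewritten underneath this call. */
        pthread_mutex_lock(&lvm_thread_mutex);
        found = lv_info_by_lvid(cmd, resource, &lvi, 0);
        pthread_mutex_unlock(&lvm_thread_mutex);

        return found;           /* non-zero if the LV information was read */
}

For this to help, the LVM worker thread would have to take the same mutex around its own operations; the fix that was eventually checked in (lvm-functions.c revision 1.29, noted at the end of this bug) took the mutex approach rather than making the underlying code re-entrant.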
This patch fixes things for me, though I'm seeing the lockup bug in 236289
rather than the segfault.
Mutex fix checked in:

Checking in daemons/clvmd/lvm-functions.c;
/cvs/lvm2/LVM2/daemons/clvmd/lvm-functions.c,v  <--  lvm-functions.c
new revision: 1.29; previous revision: 1.28
done

Marking this bug verified; it hasn't been seen in over six months since the fix was added.