Bug 1412843 - vgs segfault after re-enabling failed raid10 images when lvmetad is not running
Summary: vgs segfault after re-enabling failed raid10 images when lvmetad is not running
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: lvm2
Version: 6.9
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Heinz Mauelshagen
QA Contact: cluster-qe@redhat.com
URL:
Whiteboard:
Depends On:
Blocks: 1434054
 
Reported: 2017-01-12 23:32 UTC by Corey Marthaler
Modified: 2017-12-06 10:58 UTC
CC List: 8 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1434054
Environment:
Last Closed: 2017-12-06 10:58:26 UTC
Target Upstream Version:
Embargoed:



Description Corey Marthaler 2017-01-12 23:32:44 UTC
Description of problem:
This test case is being used again to verify bug 1025322. The segfault does not appear to happen when the case is run with lvmetad running. I downgraded to the 6.8 lvm rpms (lvm2-2.02.143-7) and saw the segfault there as well, so this does not appear to be a regression. Other raid10 image failure scenarios do not appear to hit it; it seems to be specific to this "kill three in-sync raid10 images" case.


Core was generated by `vgs'.
Program terminated with signal 11, Segmentation fault.
#0  0x00007f8a17aede37 in __strncpy_sse2 () from /lib64/libc.so.6
(gdb) bt
#0  0x00007f8a17aede37 in __strncpy_sse2 () from /lib64/libc.so.6
#1  0x00007f8a18d3dada in lvmcache_info_from_pvid (pvid=<value optimized out>, valid_only=0) at /usr/include/bits/string3.h:121
#2  0x00007f8a18d8990d in _check_or_repair_pv_ext (cmd=<value optimized out>, vgname=<value optimized out>, vgid=0x7f8a00000000 <Address 0x7f8a00000000 out of bounds>, warn_flags=423090032,
    consistent=<value optimized out>, precommitted=0) at metadata/metadata.c:3752
#3  _vg_read (cmd=<value optimized out>, vgname=<value optimized out>, vgid=0x7f8a00000000 <Address 0x7f8a00000000 out of bounds>, warn_flags=423090032, consistent=<value optimized out>,
    precommitted=0) at metadata/metadata.c:4308
#4  0x00007f8a18d8d0a8 in vg_read_internal (cmd=<value optimized out>, vgname=0x7f8a1936fd28 "black_bird", vgid=<value optimized out>, warn_flags=1, consistent=0x7ffdd7370198)
    at metadata/metadata.c:4461
#5  0x00007f8a18d8d9ad in _recover_vg (cmd=0x7f8a1932a110, vg_name=0x7f8a1936fd28 "black_bird", vgid=0x7f8a1936fd00 "ro7f88KddlxD0DTXVdckkq2isNePid1j", read_flags=262144,
    lockd_state=<value optimized out>) at metadata/metadata.c:5189
#6  _vg_lock_and_read (cmd=0x7f8a1932a110, vg_name=0x7f8a1936fd28 "black_bird", vgid=0x7f8a1936fd00 "ro7f88KddlxD0DTXVdckkq2isNePid1j", read_flags=262144, lockd_state=<value optimized out>)
    at metadata/metadata.c:5499
#7  vg_read (cmd=0x7f8a1932a110, vg_name=0x7f8a1936fd28 "black_bird", vgid=0x7f8a1936fd00 "ro7f88KddlxD0DTXVdckkq2isNePid1j", read_flags=262144, lockd_state=<value optimized out>)
    at metadata/metadata.c:5585
#8  0x00007f8a18d274ce in _process_vgnameid_list (cmd=0x7f8a1932a110, argc=<value optimized out>, argv=<value optimized out>, one_vgname=<value optimized out>, read_flags=3610706576,
    handle=0x7ffdd7370410, process_single_vg=0x7f8a18d22590 <_vgs_single>) at toollib.c:1967
#9  process_each_vg (cmd=0x7f8a1932a110, argc=<value optimized out>, argv=<value optimized out>, one_vgname=<value optimized out>, read_flags=3610706576, handle=0x7ffdd7370410,
    process_single_vg=0x7f8a18d22590 <_vgs_single>) at toollib.c:2281
#10 0x00007f8a18d216d3 in _report (cmd=0x7f8a1932a110, argc=0, argv=0x7ffdd7370770, report_type=VGS) at reporter.c:920
#11 0x00007f8a18d13559 in lvm_run_command (cmd=0x7f8a1932a110, argc=0, argv=0x7ffdd7370770) at lvmcmdline.c:1655
#12 0x00007f8a18d177e9 in lvm2_main (argc=1, argv=0x7ffdd7370768) at lvmcmdline.c:2121
#13 0x00007f8a17a7ed1d in __libc_start_main () from /lib64/libc.so.6
#14 0x00007f8a18cfc269 in _start ()




================================================================================
Iteration 0.1 started at Thu Jan 12 17:00:06 CST 2017
================================================================================
Scenario kill_three_synced_raid10_3legs: Kill three legs (none of which share the same stripe leg) of synced 3 leg raid10 volume(s)

********* RAID hash info for this scenario *********
* names:              synced_three_raid10_3legs_1
* sync:               1
* type:               raid10
* -m |-i value:       3
* leg devices:        /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1 /dev/sdh1 /dev/sdf1
* spanned legs:       0
* manual repair:      0
* failpv(s):          /dev/sdb1 /dev/sdd1 /dev/sdh1
* additional snap:    /dev/sdc1
* failnode(s):        host-081
* lvmetad:            0
* raid fault policy:  allocate
******************************************************

Creating raids(s) on host-081...
host-081: lvcreate --type raid10 -i 3 -n synced_three_raid10_3legs_1 -L 500M black_bird /dev/sdb1:0-2400 /dev/sdc1:0-2400 /dev/sdd1:0-2400 /dev/sde1:0-2400 /dev/sdh1:0-2400 /dev/sdf1:0-2400

Current mirror/raid device structure(s):
  LV                                     Attr       LSize   Cpy%Sync Devices
   synced_three_raid10_3legs_1            rwi-a-r--- 504.00m 0.00     synced_three_raid10_3legs_1_rimage_0(0),synced_three_raid10_3legs_1_rimage_1(0),synced_three_raid10_3legs_1_rimage_2(0),synced_three_raid10_3legs_1_rimage_3(0),synced_three_raid10_3legs_1_rimage_4(0),synced_three_raid10_3legs_1_rimage_5(0)
   [synced_three_raid10_3legs_1_rimage_0] Iwi-aor--- 168.00m          /dev/sdb1(1)
   [synced_three_raid10_3legs_1_rimage_1] Iwi-aor--- 168.00m          /dev/sdc1(1)
   [synced_three_raid10_3legs_1_rimage_2] Iwi-aor--- 168.00m          /dev/sdd1(1)
   [synced_three_raid10_3legs_1_rimage_3] Iwi-aor--- 168.00m          /dev/sde1(1)
   [synced_three_raid10_3legs_1_rimage_4] Iwi-aor--- 168.00m          /dev/sdh1(1)
   [synced_three_raid10_3legs_1_rimage_5] Iwi-aor--- 168.00m          /dev/sdf1(1)
   [synced_three_raid10_3legs_1_rmeta_0]  ewi-aor---   4.00m          /dev/sdb1(0)
   [synced_three_raid10_3legs_1_rmeta_1]  ewi-aor---   4.00m          /dev/sdc1(0)
   [synced_three_raid10_3legs_1_rmeta_2]  ewi-aor---   4.00m          /dev/sdd1(0)
   [synced_three_raid10_3legs_1_rmeta_3]  ewi-aor---   4.00m          /dev/sde1(0)
   [synced_three_raid10_3legs_1_rmeta_4]  ewi-aor---   4.00m          /dev/sdh1(0)
   [synced_three_raid10_3legs_1_rmeta_5]  ewi-aor---   4.00m          /dev/sdf1(0)

* NOTE: not enough available devices for allocation fault polices to fully work *
(well technically, since we have 1, some allocation should work)

Waiting until all mirror|raid volumes become fully syncd...
   1/1 mirror(s) are fully synced: ( 100.00% )

Creating ext on top of mirror(s) on host-081...
mke2fs 1.41.12 (17-May-2010)
Mounting mirrored ext filesystems on host-081...

PV=/dev/sdd1
        synced_three_raid10_3legs_1_rimage_2: 1.P
        synced_three_raid10_3legs_1_rmeta_2: 1.P
PV=/dev/sdh1
        synced_three_raid10_3legs_1_rimage_4: 1.P
        synced_three_raid10_3legs_1_rmeta_4: 1.P
PV=/dev/sdb1
        synced_three_raid10_3legs_1_rimage_0: 1.P
        synced_three_raid10_3legs_1_rmeta_0: 1.P

Creating a snapshot volume of each of the raids
Writing verification files (checkit) to mirror(s) on...
        ---- host-081 ----

<start name="host-081_synced_three_raid10_3legs_1"  pid="9966" time="Thu Jan 12 17:00:48 2017" type="cmd" />
Sleeping 15 seconds to get some outsanding I/O locks before the failure 
Verifying files (checkit) on mirror(s) on...
        ---- host-081 ----

Disabling device sdb on host-081
Disabling device sdd on host-081
Disabling device sdh on host-081

Getting recovery check start time from /var/log/messages: Jan 12 17:01
Attempting I/O to cause mirror down conversion(s) on host-081
dd if=/dev/zero of=/mnt/synced_three_raid10_3legs_1/ddfile count=10 bs=4M
10+0 records in
10+0 records out
41943040 bytes (42 MB) copied, 0.311566 s, 135 MB/s

Verifying current sanity of lvm after the failure

Current mirror/raid device structure(s):
  Couldn't find device with uuid RNNCKN-Jrty-rA0j-xBfP-PcWX-9o5S-DMej09.
  Couldn't find device with uuid GGAkpd-uJGq-dfyR-kul4-NdEQ-TkQv-OUyf7r.
  Couldn't find device with uuid ELYzqw-yZzR-ggjZ-uIvv-oQJt-YVLk-MKy39E.
  LV                                     Attr       LSize   Cpy%Sync Devices
   bb_snap1                               swi-a-s--- 252.00m          /dev/sdc1(43)
   synced_three_raid10_3legs_1            owi-aor-p- 504.00m 100.00   synced_three_raid10_3legs_1_rimage_0(0),synced_three_raid10_3legs_1_rimage_1(0),synced_three_raid10_3legs_1_rimage_2(0),synced_three_raid10_3legs_1_rimage_3(0),synced_three_raid10_3legs_1_rimage_4(0),synced_three_raid10_3legs_1_rimage_5(0)
   [synced_three_raid10_3legs_1_rimage_0] iwi-a-r-p- 168.00m          unknown device(1)
   [synced_three_raid10_3legs_1_rimage_1] iwi-aor--- 168.00m          /dev/sdc1(1)
   [synced_three_raid10_3legs_1_rimage_2] iwi-a-r-p- 168.00m          unknown device(1)
   [synced_three_raid10_3legs_1_rimage_3] iwi-aor--- 168.00m          /dev/sde1(1)
   [synced_three_raid10_3legs_1_rimage_4] iwi-aor--- 168.00m          /dev/sda1(1)
   [synced_three_raid10_3legs_1_rimage_5] iwi-aor--- 168.00m          /dev/sdf1(1)
   [synced_three_raid10_3legs_1_rmeta_0]  ewi-a-r-p-   4.00m          unknown device(0)
   [synced_three_raid10_3legs_1_rmeta_1]  ewi-aor---   4.00m          /dev/sdc1(0)
   [synced_three_raid10_3legs_1_rmeta_2]  ewi-a-r-p-   4.00m          unknown device(0)
   [synced_three_raid10_3legs_1_rmeta_3]  ewi-aor---   4.00m          /dev/sde1(0)
   [synced_three_raid10_3legs_1_rmeta_4]  ewi-aor---   4.00m          /dev/sda1(0)
   [synced_three_raid10_3legs_1_rmeta_5]  ewi-aor---   4.00m          /dev/sdf1(0)


Verifying FAILED device /dev/sdb1 is *NOT* in the volume(s)
Verifying FAILED device /dev/sdd1 is *NOT* in the volume(s)
Verifying FAILED device /dev/sdh1 is *NOT* in the volume(s)
Verifying IMAGE device /dev/sdc1 *IS* in the volume(s)
Verifying IMAGE device /dev/sde1 *IS* in the volume(s)
Verifying IMAGE device /dev/sdf1 *IS* in the volume(s)
Verify the rimage/rmeta dm devices remain after the failures

Checking EXISTENCE and STATE of synced_three_raid10_3legs_1_rimage_2 on: host-081 
Checking EXISTENCE and STATE of synced_three_raid10_3legs_1_rmeta_2 on: host-081 
Checking EXISTENCE and STATE of synced_three_raid10_3legs_1_rimage_4 on: host-081 
Checking EXISTENCE and STATE of synced_three_raid10_3legs_1_rmeta_4 on: host-081 
Checking EXISTENCE and STATE of synced_three_raid10_3legs_1_rimage_0 on: host-081 
Checking EXISTENCE and STATE of synced_three_raid10_3legs_1_rmeta_0 on: host-081 

Verify the raid image order is what's expected based on raid fault policy
EXPECTED LEG ORDER: unknown /dev/sdc1 unknown /dev/sde1 unknown /dev/sdf1
ACTUAL LEG ORDER: unknown /dev/sdc1 unknown /dev/sde1 /dev/sda1 /dev/sdf1
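
The expected/actual comparison above can be sketched in a few lines of Python. This is only an illustration, not the test harness's own check: the function name `leg_order_mismatches` is hypothetical, and the two strings are copied from the log lines above.

```python
def leg_order_mismatches(expected, actual):
    """Return (index, expected_leg, actual_leg) for every position that differs."""
    exp_legs = expected.split()
    act_legs = actual.split()
    if len(exp_legs) != len(act_legs):
        raise ValueError("leg lists differ in length")
    return [(i, e, a) for i, (e, a) in enumerate(zip(exp_legs, act_legs)) if e != a]


# Leg orders as printed by the scenario output above.
expected = "unknown /dev/sdc1 unknown /dev/sde1 unknown /dev/sdf1"
actual = "unknown /dev/sdc1 unknown /dev/sde1 /dev/sda1 /dev/sdf1"

for index, exp_leg, act_leg in leg_order_mismatches(expected, actual):
    print(f"leg {index}: expected {exp_leg}, got {act_leg}")
# prints: leg 4: expected unknown, got /dev/sda1
```

Only leg 4 differs: the image originally on /dev/sdh1 now sits on /dev/sda1, presumably because the "allocate" fault policy replaced it with the one spare device.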

Verifying files (checkit) on mirror(s) on...
        ---- host-081 ----

Enabling device sdb on host-081 
        Running vgs to make LVM update metadata version if possible (will restore a-m PVs)
  Couldn't find device with uuid RNNCKN-Jrty-rA0j-xBfP-PcWX-9o5S-DMej09.
  Couldn't find device with uuid ELYzqw-yZzR-ggjZ-uIvv-oQJt-YVLk-MKy39E.

Enabling device sdd on host-081 
        Running vgs to make LVM update metadata version if possible (will restore a-m PVs)
  Couldn't find device with uuid ELYzqw-yZzR-ggjZ-uIvv-oQJt-YVLk-MKy39E.
  WARNING: Inconsistent metadata found for VG black_bird - updating to use version 7
  Missing device /dev/sdd1 reappeared, updating metadata for VG black_bird to version 7.
  Device still marked missing because of allocated data on it, remove volumes and consider vgreduce --removemissing.
  Missing device /dev/sdb1 reappeared, updating metadata for VG black_bird to version 7.
  Device still marked missing because of allocated data on it, remove volumes and consider vgreduce --removemissing.
  Missing device unknown device reappeared, updating metadata for VG black_bird to version 7.

Simple vgs after device enable failed after brining sdd online

[root@host-081 ~]# lvs -a -o +devices
  Couldn't find device with uuid ELYzqw-yZzR-ggjZ-uIvv-oQJt-YVLk-MKy39E.
  LV                                     VG         Attr       LSize   Origin                      Data% Cpy%Sync Devices
  bb_snap1                               black_bird swi-a-s--- 252.00m synced_three_raid10_3legs_1 28.26          /dev/sdc1(43)
  synced_three_raid10_3legs_1            black_bird owi-aor-p- 504.00m                                   100.00   synced_three_raid10_3legs_1_rimage_0(0),synced_three_raid10_3legs_1_rimage_1(0),synced_three_raid10_3legs_1_rimage_2(0),synced_three_raid10_3legs_1_rimage_3(0),synced_three_raid10_3legs_1_rimage_4(0),synced_three_raid10_3legs_1_rimage_5(0)
  [synced_three_raid10_3legs_1_rimage_0] black_bird iwi-a-r-p- 168.00m                                            /dev/sdb1(1)
  [synced_three_raid10_3legs_1_rimage_1] black_bird iwi-aor--- 168.00m                                            /dev/sdc1(1)
  [synced_three_raid10_3legs_1_rimage_2] black_bird iwi-a-r-p- 168.00m                                            /dev/sdd1(1)
  [synced_three_raid10_3legs_1_rimage_3] black_bird iwi-aor--- 168.00m                                            /dev/sde1(1)
  [synced_three_raid10_3legs_1_rimage_4] black_bird iwi-aor--- 168.00m                                            /dev/sda1(1)
  [synced_three_raid10_3legs_1_rimage_5] black_bird iwi-aor--- 168.00m                                            /dev/sdf1(1)
  [synced_three_raid10_3legs_1_rmeta_0]  black_bird ewi-a-r-p-   4.00m                                            /dev/sdb1(0)
  [synced_three_raid10_3legs_1_rmeta_1]  black_bird ewi-aor---   4.00m                                            /dev/sdc1(0)
  [synced_three_raid10_3legs_1_rmeta_2]  black_bird ewi-a-r-p-   4.00m                                            /dev/sdd1(0)
  [synced_three_raid10_3legs_1_rmeta_3]  black_bird ewi-aor---   4.00m                                            /dev/sde1(0)
  [synced_three_raid10_3legs_1_rmeta_4]  black_bird ewi-aor---   4.00m                                            /dev/sda1(0)
  [synced_three_raid10_3legs_1_rmeta_5]  black_bird ewi-aor---   4.00m                                            /dev/sdf1(0)
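
To pick out which LVs are still flagged partial in output like the above, a small Python sketch can check the ninth character of the Attr column, which lvs(8) documents as the volume health field ("p" for partial). The helper name `partial_lvs` is hypothetical, and the sample lines are abbreviated from the listing above (the long Devices column of the top-level LV is omitted).

```python
def partial_lvs(lvs_lines):
    """Return names of LVs whose lv_attr health character (9th) is 'p'."""
    found = []
    for line in lvs_lines:
        fields = line.split()
        # Skip warnings and the header: a data row has LV, VG, and a
        # 10-character Attr field as its first three columns.
        if len(fields) < 3 or len(fields[2]) != 10:
            continue
        name, attr = fields[0], fields[2]
        if attr[8] == "p":
            found.append(name.strip("[]"))
    return found


# Abbreviated rows from the `lvs -a -o +devices` output above.
sample = [
    "  Couldn't find device with uuid ELYzqw-yZzR-ggjZ-uIvv-oQJt-YVLk-MKy39E.",
    "  LV VG Attr LSize Origin Data% Cpy%Sync Devices",
    "  bb_snap1 black_bird swi-a-s--- 252.00m synced_three_raid10_3legs_1 28.26 /dev/sdc1(43)",
    "  [synced_three_raid10_3legs_1_rimage_0] black_bird iwi-a-r-p- 168.00m /dev/sdb1(1)",
    "  [synced_three_raid10_3legs_1_rimage_1] black_bird iwi-aor--- 168.00m /dev/sdc1(1)",
    "  [synced_three_raid10_3legs_1_rimage_2] black_bird iwi-a-r-p- 168.00m /dev/sdd1(1)",
]

print(partial_lvs(sample))
# prints: ['synced_three_raid10_3legs_1_rimage_0', 'synced_three_raid10_3legs_1_rimage_2']
```

In the full listing, rimage_0, rimage_2 and their rmeta counterparts keep the partial flag even though /dev/sdb1 and /dev/sdd1 are shown as their devices again.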


Version-Release number of selected component (if applicable):
2.6.32-682.el6.x86_64

lvm2-2.02.143-12.el6    BUILT: Wed Jan 11 09:35:04 CST 2017
lvm2-libs-2.02.143-12.el6    BUILT: Wed Jan 11 09:35:04 CST 2017
lvm2-cluster-2.02.143-12.el6    BUILT: Wed Jan 11 09:35:04 CST 2017
udev-147-2.73.el6_8.2    BUILT: Tue Aug 30 08:17:19 CDT 2016
device-mapper-1.02.117-12.el6    BUILT: Wed Jan 11 09:35:04 CST 2017
device-mapper-libs-1.02.117-12.el6    BUILT: Wed Jan 11 09:35:04 CST 2017
device-mapper-event-1.02.117-12.el6    BUILT: Wed Jan 11 09:35:04 CST 2017
device-mapper-event-libs-1.02.117-12.el6    BUILT: Wed Jan 11 09:35:04 CST 2017
device-mapper-persistent-data-0.6.2-0.1.rc7.el6    BUILT: Tue Mar 22 08:58:09 CDT 2016


How reproducible:
Every time, as long as lvmetad isn't running.

Comment 4 Jonathan Earl Brassow 2017-10-03 22:38:12 UTC
Fixed in RHEL 7.4; we should be able to fix it here as well.

Comment 5 Jan Kurik 2017-12-06 10:58:26 UTC
Red Hat Enterprise Linux 6 is in the Production 3 Phase. During the Production 3 Phase, Critical impact Security Advisories (RHSAs) and selected Urgent Priority Bug Fix Advisories (RHBAs) may be released as they become available.

The official life cycle policy can be reviewed here:

http://redhat.com/rhel/lifecycle

This issue does not meet the inclusion criteria for the Production 3 Phase and will be marked as CLOSED/WONTFIX. If this remains a critical requirement, please contact Red Hat Customer Support to request a re-evaluation of the issue, citing a clear business justification. Note that a strong business justification will be required for re-evaluation. Red Hat Customer Support can be contacted via the Red Hat Customer Portal at the following URL:

https://access.redhat.com/

