Bug 1633167 - lvm tries to deactivate sub-LVs of a raid1 LV in a tagged VG when the VG is deactivated
Summary: lvm tries to deactivate sub-LVs of a raid1 LV in a tagged VG when the VG is deactivated
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: lvm2
Version: 7.5
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: rc
Target Release: ---
Assignee: Heinz Mauelshagen
QA Contact: cluster-qe@redhat.com
URL:
Whiteboard:
Depends On:
Blocks: 1577173
 
Reported: 2018-09-26 10:40 UTC by nikhil kshirsagar
Modified: 2021-09-03 12:55 UTC
CC List: 15 users

Fixed In Version: lvm2-2.02.184-1.el7
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-08-06 13:10:41 UTC
Target Upstream Version:
Embargoed:


Attachments
test results (130.39 KB, application/zip)
2019-07-02 09:14 UTC, Roman Bednář


Links
Red Hat Knowledge Base (Solution) 3609201 (last updated 2019-09-02 04:10:00 UTC)
Red Hat Product Errata RHBA-2019:2253 (last updated 2019-08-06 13:11:12 UTC)

Description nikhil kshirsagar 2018-09-26 10:40:53 UTC
Description of problem:
When a raid1 LV is in a tagged VG and we try to deactivate the VG, lvm tries to deactivate the underlying sub-LVs (_rmeta_0 and _rmeta_1) while the raid1 LV is still active, causing failures in cluster resource agents and scripts.
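
A quick way to spot a VG left in this state (a hedged illustration only, using the VG name from this report): list all LVs including internal ones and check whether the raid sub-LVs are shown without the usual [] brackets, as the _rmeta_ LVs are in the lvs output below.

# lvs -a -o lv_name,lv_attr,lv_tags murex_raid_vg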

Version-Release number of selected component (if applicable):
lvm2-2.02.177-4.el7.x86_64


Additional info:

[nkshirsa@foobar sosreport-em50008.02175265-20180914100002]$ cat sos_commands/lvm2/vgs_-v_-o_vg_mda_count_vg_mda_free_vg_mda_size_vg_mda_used_count_vg_tags_--config_global_locking_type_0 
    Reloading config files
    Wiping internal VG cache
  WARNING: Locking disabled. Be careful! This could corrupt your metadata.
  VG              Attr   Ext   #PV #LV #SN VSize   VFree   VG UUID                                VProfile #VMda VMdaFree  VMdaSize  #VMdaUse VG Tags  
  lsv30013_rootvg wz--n- 4.00m   1  18   0 <35.51g   2.50g 1ayAcQ-PhDw-jK93-eZvV-BGqT-krAo-ED7XOm              1   501.50k  1020.00k        1          
  murex_raid_vg   wz--n- 4.00m   2   3   0 339.99g 139.98g ptuQYT-7cnI-vspJ-zGWl-D3B9-K3b0-CPLKbt              2   505.50k  1020.00k        2 pacemaker
    Reloading config files
    Wiping internal VG cache
[nkshirsa@foobar sosreport-em50008.02175265-20180914100002]$ cat sos_commands/lvm2/lvs_-a_-o_lv_tags_devices_--config_global_locking_type_0 
  WARNING: Locking disabled. Be careful! This could corrupt your metadata.
  LV                        VG              Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert LV Tags Devices                                              
  lv_home                   lsv30013_rootvg -wi-ao----  32.00m                                                             /dev/mapper/mpatha2(6458)                            
  lv_opt                    lsv30013_rootvg -wi-ao----   1.00g                                                             /dev/mapper/mpatha2(6202)                            
  lv_opt_controlm           lsv30013_rootvg -wi-ao----   1.00g                                                             /dev/mapper/mpatha2(5946)                            
  lv_opt_manage             lsv30013_rootvg -wi-ao----  32.00m                                                             /dev/mapper/mpatha2(5938)                            
  lv_opt_managesoft         lsv30013_rootvg -wi-ao---- 200.00m                                                             /dev/mapper/mpatha2(5888)                            
  lv_opt_microsoft          lsv30013_rootvg -wi-ao---- 256.00m                                                             /dev/mapper/mpatha2(5760)                            
  lv_opt_tivoli_cit         lsv30013_rootvg -wi-ao---- 256.00m                                                             /dev/mapper/mpatha2(5824)                            
  lv_root                   lsv30013_rootvg -wi-ao---- 768.00m                                                             /dev/mapper/mpatha2(6466)                            
  lv_swap                   lsv30013_rootvg -wi-ao----   2.00g                                                             /dev/mapper/mpatha2(6658)                            
  lv_tmp                    lsv30013_rootvg -wi-ao----   5.00g                                                             /dev/mapper/mpatha2(4480)                            
  lv_usr                    lsv30013_rootvg -wi-ao----   9.00g                                                             /dev/mapper/mpatha2(2176)                            
  lv_usr_local              lsv30013_rootvg -wi-ao---- 256.00m                                                             /dev/mapper/mpatha2(2112)                            
  lv_var                    lsv30013_rootvg -wi-ao----   3.00g                                                             /dev/mapper/mpatha2(1344)                            
  lv_var_crash              lsv30013_rootvg -wi-ao----   7.00g                                                             /dev/mapper/mpatha2(832)                             
  lv_var_crash              lsv30013_rootvg -wi-ao----   7.00g                                                             /dev/mapper/mpatha2(7170)                            
  lv_var_log                lsv30013_rootvg -wi-ao----   2.00g                                                             /dev/mapper/mpatha2(288)                             
  lv_var_log_audit          lsv30013_rootvg -wi-ao----   1.00g                                                             /dev/mapper/mpatha2(32)                              
  lv_var_opt_besclient      lsv30013_rootvg -wi-ao---- 128.00m                                                             /dev/mapper/mpatha2(800)                             
  lv_var_opt_managesoft     lsv30013_rootvg -wi-ao---- 128.00m                                                             /dev/mapper/mpatha2(0)                               
  lv_murex_raid1            murex_raid_vg   rwi-a-r--- 100.00g                                    100.00                   lv_murex_raid1_rimage_0(0),lv_murex_raid1_rimage_1(0)
  [lv_murex_raid1_rimage_0] murex_raid_vg   iwi-aor--- 100.00g                                                             /dev/mapper/mpathb(1)                                
  [lv_murex_raid1_rimage_1] murex_raid_vg   iwi-aor--- 100.00g                                                             /dev/mapper/mpathc(1)                                
  lv_murex_raid1_rmeta_0    murex_raid_vg   ewi-aor---   4.00m                                                             /dev/mapper/mpathb(0)                                
  lv_murex_raid1_rmeta_1    murex_raid_vg   ewi-aor---   4.00m                                                             /dev/mapper/mpathc(0)                                
[nkshirsa@foobar sosreport-em50008.02175265-20180914100002]$ 





The script the customer ran was:


# pcs resource disable murex_grp 
# vgchange -anvvv murex_raid_vg 
# vgchange --deltag pacemaker murex_raid_vg 
# vgchange --addtag pacemaker murex_raid_vg 
# vgchange -ayvvv murex_raid_vg --config 'activation { volume_list = [ "@pacemaker" ]}' 
# vgchange -anvvv murex_raid_vg


[root@lsv30013 em50008]# vgchange -an -vvv murex_raid_vg
        Parsing: vgchange -an -vvv murex_raid_vg
        Recognised command vgchange_activate (id 116 / enum 98).
      devices/global_filter not found in config: defaulting to global_filter = [ "a|.*/|" ]
      Setting global/locking_type to 1


...


        Processing command: vgchange -an -vvv murex_raid_vg

    Deactivated 3 logical volumes in volume group murex_raid_vg
        Getting device info for murex_raid_vg-lv_murex_raid1 [LVM-ptuQYT7cnIvspJzGWlD3B9K3b0CPLKbtyX8RBceOdQpc1TzhNrPBGNgv5fWFDTGv].
        dm info  LVM-ptuQYT7cnIvspJzGWlD3B9K3b0CPLKbtyX8RBceOdQpc1TzhNrPBGNgv5fWFDTGv [ noopencount flush ]   [16384] (*1)
        Getting device info for murex_raid_vg-lv_murex_raid1_rmeta_0 [LVM-ptuQYT7cnIvspJzGWlD3B9K3b0CPLKbtI1aB5gfkZghsmJHKjcxOP7jqV6Kn7v75].
        dm info  LVM-ptuQYT7cnIvspJzGWlD3B9K3b0CPLKbtI1aB5gfkZghsmJHKjcxOP7jqV6Kn7v75 [ noopencount flush ]   [16384] (*1)
        Getting device info for murex_raid_vg-lv_murex_raid1_rmeta_1 [LVM-ptuQYT7cnIvspJzGWlD3B9K3b0CPLKbtczgD1345cAYx7nNAFGKDwntWENDxADy5].
        dm info  LVM-ptuQYT7cnIvspJzGWlD3B9K3b0CPLKbtczgD1345cAYx7nNAFGKDwntWENDxADy5 [ noopencount flush ]   [16384] (*1)
        Counted 0 active LVs in VG murex_raid_vg
  0 logical volume(s) in volume group "murex_raid_vg" now active


        Completed: vgchange -an -vvv murex_raid_vg
[root@lsv30013 em50008]#
[root@lsv30013 em50008]# vgchange --deltag pacemaker murex_raid_vg
  Volume group "murex_raid_vg" successfully changed
[root@lsv30013 em50008]# vgchange --addtag pacemaker murex_raid_vg
  Volume group "murex_raid_vg" successfully changed
[root@lsv30013 em50008]# vgchange -ayvvv murex_raid_vg --config 'activation { volume_list = [ "@pacemaker" ]}'

        Parsing: vgchange -ay -vvv murex_raid_vg --config 'activation { volume_list = [ "@pacemaker" ]}'
        Recognised command vgchange_activate (id 116 / enum 98).
    Reloading config files
        Syncing device names
    Wiping internal VG cache
        Freeing VG #orphans_lvm1 at 0x5646a23883f0.
        Freeing VG #orphans_pool at 0x5646a238cc10.
        Freeing VG #orphans_lvm2 at 0x5646a2391430.

...
...

        Processing command: vgchange -ay -vvv murex_raid_vg --config 'activation { volume_list = [ "@pacemaker" ]}'

...
...

        murex_raid_vg-lv_murex_raid1: Skipping NODE_ADD (253,27) 0:6 0660 [trust_udev]
        murex_raid_vg-lv_murex_raid1: Processing NODE_READ_AHEAD 8192 (flags=1)
        murex_raid_vg-lv_murex_raid1 (253:27): read ahead is 256
        murex_raid_vg-lv_murex_raid1 (253:27): Setting read ahead to 8192
    Activated 3 logical volumes in volume group murex_raid_vg
...

...


[root@lsv30013 em50008]# vgchange -an -vvv murex_raid_vg
        Parsing: vgchange -an -vvv murex_raid_vg

...
        Processing command: vgchange -an -vvv murex_raid_vg

...
...


        /dev/mapper/murex_raid_vg-lv_murex_raid1_rmeta_0: Aliased to /dev/dm-23 in device cache (preferred name) (253:23)
        /dev/murex_raid_vg/lv_murex_raid1_rmeta_0: Aliased to /dev/mapper/murex_raid_vg-lv_murex_raid1_rmeta_0 in device cache (preferred name) (253:23)
        /dev/dm-24: Added to device cache (253:24)
        /dev/mapper/murex_raid_vg-lv_murex_raid1_rimage_0: Aliased to /dev/dm-24 in device cache (preferred name) (253:24)
        /dev/dm-25: Added to device cache (253:25)
        /dev/mapper/murex_raid_vg-lv_murex_raid1_rmeta_1: Aliased to /dev/dm-25 in device cache (preferred name) (253:25)
        /dev/murex_raid_vg/lv_murex_raid1_rmeta_1: Aliased to /dev/mapper/murex_raid_vg-lv_murex_raid1_rmeta_1 in device cache (preferred name) (253:25)
        /dev/dm-26: Added to device cache (253:26)
        /dev/mapper/murex_raid_vg-lv_murex_raid1_rimage_1: Aliased to /dev/dm-26 in device cache (preferred name) (253:26)
        /dev/dm-27: Added to device cache (253:27)
        /dev/disk/by-id/dm-name-murex_raid_vg-lv_murex_raid1: Aliased to /dev/dm-27 in device cache (preferred name) (253:27)
        /dev/disk/by-id/dm-uuid-LVM-ptuQYT7cnIvspJzGWlD3B9K3b0CPLKbtyX8RBceOdQpc1TzhNrPBGNgv5fWFDTGv: Aliased to /dev/disk/by-id/dm-name-murex_raid_vg-lv_murex_raid1 in device cache (253:27)
        /dev/disk/by-uuid/7931b442-613f-41eb-aba4-e04db87ddac3: Aliased to /dev/disk/by-id/dm-name-murex_raid_vg-lv_murex_raid1 in device cache (253:27)
        /dev/mapper/murex_raid_vg-lv_murex_raid1: Aliased to /dev/disk/by-id/dm-name-murex_raid_vg-lv_murex_raid1 in device cache (preferred name) (253:27)
        /dev/murex_raid_vg/lv_murex_raid1: Aliased to /dev/mapper/murex_raid_vg-lv_murex_raid1 in device cache (preferred name) (253:27)
        /dev/dm-3: Added to device cache (253:3)



...
...

        /dev/mapper/mpathc: Found metadata at 69632 size 2711 with wrap 0 (in area at 4096 size 1044480) for murex_raid_vg (ptuQYT7cnIvspJzGWlD3B9K3b0CPLKbt)
        lvmcache has no info for vgname "murex_raid_vg" with VGID ptuQYT7cnIvspJzGWlD3B9K3b0CPLKbt.
        lvmcache has no info for vgname "murex_raid_vg".
        lvmcache /dev/mapper/mpathc: now in VG murex_raid_vg with 1 mda(s).
        lvmcache /dev/mapper/mpathc: VG murex_raid_vg: set VGID to ptuQYT7cnIvspJzGWlD3B9K3b0CPLKbt.
        lvmcache /dev/mapper/mpathc: VG murex_raid_vg: set creation host to lsv30013.linux.internalcorp.net.
        lvmcache /dev/mapper/mpathc: VG murex_raid_vg: stored metadata checksum 0x9537abd9 with size 2711.

...
..
       /dev/mapper/mpathb: Using cached metadata at 69632 size 2711 with wrap 0 (in area at 4096 size 1044480) for murex_raid_vg (ptuQYT7cnIvspJzGWlD3B9K3b0CPLKbt)
        lvmcache /dev/mapper/mpathb: now in VG murex_raid_vg (ptuQYT7cnIvspJzGWlD3B9K3b0CPLKbt) with 1 mda(s).
...
...

     Stack murex_raid_vg/lv_murex_raid1:0[0] on LV murex_raid_vg/lv_murex_raid1_rmeta_0:0.
      Adding murex_raid_vg/lv_murex_raid1:0 as an user of murex_raid_vg/lv_murex_raid1_rmeta_0.
      Stack murex_raid_vg/lv_murex_raid1:0[0] on LV murex_raid_vg/lv_murex_raid1_rimage_0:0.
      Adding murex_raid_vg/lv_murex_raid1:0 as an user of murex_raid_vg/lv_murex_raid1_rimage_0.
      Stack murex_raid_vg/lv_murex_raid1:0[1] on LV murex_raid_vg/lv_murex_raid1_rmeta_1:0.
      Adding murex_raid_vg/lv_murex_raid1:0 as an user of murex_raid_vg/lv_murex_raid1_rmeta_1.
      Stack murex_raid_vg/lv_murex_raid1:0[1] on LV murex_raid_vg/lv_murex_raid1_rimage_1:0.
      Adding murex_raid_vg/lv_murex_raid1:0 as an user of murex_raid_vg/lv_murex_raid1_rimage_1.
        Read murex_raid_vg metadata (16) from /dev/mapper/mpathc at 69632 size 2711
        Widening request for 512 bytes at 4096 to 4096 bytes at 4096 on /dev/mapper/mpathb (for VG metadata header)
        Read  /dev/mapper/mpathb:    4096 bytes (sync) at 4096 (for VG metadata header)
        Widening request for 130 bytes at 69632 to 4096 bytes at 69632 on /dev/mapper/mpathb (for VG metadata content)
        Read  /dev/mapper/mpathb:    4096 bytes (sync) at 69632 (for VG metadata content)
        Widening request for 2711 bytes at 69632 to 4096 bytes at 69632 on /dev/mapper/mpathb (for VG metadata content)
        Read  /dev/mapper/mpathb:    4096 bytes (sync) at 69632 (for VG metadata content)
        Skipped reading metadata from /dev/mapper/mpathb at 69632 size 2711 with matching checksum.
        lvmcache /dev/mapper/mpathb: VG murex_raid_vg: set system_id to .
        lvmcache: VG murex_raid_vg (ptuQYT-7cnI-vspJ-zGWl-D3B9-K3b0-CPLKbt) stored (2711 bytes).
      /dev/mapper/mpathb: using cached size 356515840 sectors
      /dev/mapper/mpathc: using cached size 356515840 sectors
        /dev/mapper/mpathb 0:      0      1: lv_murex_raid1_rmeta_0(0:0)
        /dev/mapper/mpathb 1:      1  25600: lv_murex_raid1_rimage_0(0:0)
        /dev/mapper/mpathb 2:  25601  17918: NULL(0:0)
        /dev/mapper/mpathc 0:      0      1: lv_murex_raid1_rmeta_1(0:0)
        /dev/mapper/mpathc 1:      1  25600: lv_murex_raid1_rimage_1(0:0)
        /dev/mapper/mpathc 2:  25601  17918: NULL(0:0)
      Processing VG murex_raid_vg ptuQYT-7cnI-vspJ-zGWl-D3B9-K3b0-CPLKbt
        Getting device info for murex_raid_vg-lv_murex_raid1 [LVM-ptuQYT7cnIvspJzGWlD3B9K3b0CPLKbtyX8RBceOdQpc1TzhNrPBGNgv5fWFDTGv].
        dm info  LVM-ptuQYT7cnIvspJzGWlD3B9K3b0CPLKbtyX8RBceOdQpc1TzhNrPBGNgv5fWFDTGv [ opencount flush ]   [16384] (*1)
        Getting device info for murex_raid_vg-lv_murex_raid1_rmeta_0 [LVM-ptuQYT7cnIvspJzGWlD3B9K3b0CPLKbtI1aB5gfkZghsmJHKjcxOP7jqV6Kn7v75].
        dm info  LVM-ptuQYT7cnIvspJzGWlD3B9K3b0CPLKbtI1aB5gfkZghsmJHKjcxOP7jqV6Kn7v75 [ opencount flush ]   [16384] (*1)
        Getting device info for murex_raid_vg-lv_murex_raid1_rmeta_1 [LVM-ptuQYT7cnIvspJzGWlD3B9K3b0CPLKbtczgD1345cAYx7nNAFGKDwntWENDxADy5].
        dm info  LVM-ptuQYT7cnIvspJzGWlD3B9K3b0CPLKbtczgD1345cAYx7nNAFGKDwntWENDxADy5 [ opencount flush ]   [16384] (*1)
        Counted 2 open LVs in VG murex_raid_vg.
        Getting device info for murex_raid_vg-lv_murex_raid1 [LVM-ptuQYT7cnIvspJzGWlD3B9K3b0CPLKbtyX8RBceOdQpc1TzhNrPBGNgv5fWFDTGv].
        dm info  LVM-ptuQYT7cnIvspJzGWlD3B9K3b0CPLKbtyX8RBceOdQpc1TzhNrPBGNgv5fWFDTGv [ opencount flush ]   [16384] (*1)
        Getting device info for murex_raid_vg-lv_murex_raid1_rmeta_0 [LVM-ptuQYT7cnIvspJzGWlD3B9K3b0CPLKbtI1aB5gfkZghsmJHKjcxOP7jqV6Kn7v75].
        dm info  LVM-ptuQYT7cnIvspJzGWlD3B9K3b0CPLKbtI1aB5gfkZghsmJHKjcxOP7jqV6Kn7v75 [ opencount flush ]   [16384] (*1)
  Logical volume murex_raid_vg/lv_murex_raid1_rmeta_0 is used by another device.
  Can't deactivate volume group "murex_raid_vg" with 2 open logical volume(s)
        Unlock: Memlock counters: prioritized:0 locked:0 critical:0 daemon:0 suspended:0
        Syncing device names
        lvmcache: VG murex_raid_vg wiped.
      Unlocking /run/lock/lvm/V_murex_raid_vg
        _undo_flock /run/lock/lvm/V_murex_raid_vg
        Closed /dev/mapper/mpathc
        Closed /dev/mapper/mpathb
        Freeing VG murex_raid_vg at 0x564dc9540580.
      Setting global/notify_dbus to 1
        Completed: vgchange -an -vvv murex_raid_vg

Comment 8 Zdenek Kabelac 2018-10-16 13:18:59 UTC
Looking over the BZ description, one issue stands out:
when using 'cluster' (clvmd), passing a --config option to an activation command (vgchange) is an unsupported combination, because the clvmd activation service runs autonomously from the command itself. Going to take a deeper analysis of the original issue.

Comment 10 Zdenek Kabelac 2018-10-17 21:43:48 UTC
So I'm still unclear what is going on - but how did it happen that the _rmeta_ LVs are visible??

I'm looking over the archive and it seems the LV itself was created on a different node?

The core trouble seems to be that the _rmeta_ LVs are present with this tag - this likely failed to be caught by the validation code, which let it be stored in the metadata (also note in 'lvs -a' the missing [] around the names).

So is there something special going on during lvcreate?

I've not yet been able to reproduce creation of a 'raid1' LV with visible _rmeta_ devices.

Comment 11 Zdenek Kabelac 2018-10-17 21:56:11 UTC
There is definitely a bug in the validation code - it is not checking that internal LVs are invisible - so during a run of the command:

vgchange -an 

all visible LVs are tried for deactivation, and if an _rmeta_ LV is visible and is used by the raid1 LV itself, the user will get an error about a problem with deactivation of an LV which is opened.

A workaround should be to 'vgcfgbackup' such a VG, remove the "VISIBLE" attribute from the _rmeta_ LVs, and vgcfgrestore it back.
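
A rough sketch of that workaround, assuming the VG name from this report and an arbitrary scratch file path; review the edited metadata carefully before restoring:

# vgcfgbackup -f /tmp/murex_raid_vg.meta murex_raid_vg
  ... edit /tmp/murex_raid_vg.meta: remove "VISIBLE" from the status = [...] list of each _rmeta_ LV ...
# vgcfgrestore -f /tmp/murex_raid_vg.meta murex_raid_vg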

Also, the user should always be able to deactivate the raid1 LV directly by name with the 'lvchange -an vg/lv' command.
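
For example, with the names from this report:

# lvchange -an murex_raid_vg/lv_murex_raid1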

lvm2 surely needs an extension of its metadata validation, so that an attempt to store a visible '_rmeta_' LV is caught before it hits metadata storage.

Comment 12 Zdenek Kabelac 2018-10-18 14:04:44 UTC
We need to see 'sos' reports from all nodes - please provide attachments from all clustered nodes.

The reported case is not reproducible with the shown steps so far.

Sos reports are needed for a better analysis of how the raid's _rmeta_ images were left visible in metadata. The single 'sos' report file provided just shows the raid LV already appearing invalid there.

Comment 22 Zdenek Kabelac 2018-10-23 11:21:06 UTC
So after verification: during RAID creation there really is a short moment when the raid LV is committed with the VISIBLE flag still set on its _rmeta_ LVs - those _rmeta_ LVs are then zeroed, and a new commit making them invisible follows shortly after.

However, if there is a crash just in this tiny time window, the raid LV is left in an unknown state - the user should likely remove such an LV and create it again.

This is seen as a creation sequence bug in raid LVs.

If the customer has a deadlocking issue, he should probably switch to using the older '--type mirror', where the ordering of the zeroing sequence is correct.
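
A hedged sketch of that cleanup, assuming the names and the 100 GiB size shown in the lvs output above (the exact lvcreate options depend on the original layout):

# lvremove murex_raid_vg/lv_murex_raid1
# lvcreate --type raid1 -m 1 -L 100G -n lv_murex_raid1 murex_raid_vg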

Comment 23 Heinz Mauelshagen 2018-10-24 14:49:19 UTC
Upstream commit 16ae968d24b4fe3264dc9b46063345ff2846957b avoids committing SubLVs for wiping, thus not causing them to turn remnant on crashes.

Comment 24 Heinz Mauelshagen 2018-10-24 16:07:28 UTC
(In reply to Zdenek Kabelac from comment #22)
> So after verification: during RAID creation there really is a short moment
> when the raid LV is committed with the VISIBLE flag still set on its
> _rmeta_ LVs - those _rmeta_ LVs are then zeroed, and a new commit making
> them invisible follows shortly after.
> 
> However, if there is a crash just in this tiny time window, the raid LV is
> left in an unknown state - the user should likely remove such an LV and
> create it again.
> 
> This is seen as a creation sequence bug in raid LVs.
> 
> If the customer has a deadlocking issue, he should probably switch to using
> the older '--type mirror', where the ordering of the zeroing sequence is correct.

Switching to the 'mirror' type is not mandatory with the patch from the previous comment.

Comment 25 Zdenek Kabelac 2018-10-25 08:42:00 UTC
The patch 16ae968d24b4fe3264dc9b46063345ff2846957b still seems to lack a fix for the commit of raid metadata with visible _rmeta_ LVs.

And the patch also introduces another potential problem, where the _rmeta_ LVs present in the DM table can be misidentified as other types of block devices.

While the very generic activation for wipe_fs was a bit 'annoying', we could be sure there was later zero chance of this device being identified as e.g. some other device/fs/mdraid member.

With clearing only the 1st sector, as the proposed patch does, we are leaving other signatures in place - so in case there is any issue with raid activation, such a device could possibly be misused. To stay 'reasonably' secure we probably need to erase at least 64K from the front and the end of the device - although there are weird filesystems like ZFS whose signature is stored (and later identified) in a far more complicated way.

It would probably be inefficient to clear the whole _rmeta_ device though - since in the case of a large extent size we might end up erasing a lot of space.

Another slight advantage of the previous 'activation' method was the option to introduce usage of the TRIM ioctl in a very simple way for such a device - although we already provide support for trimming 'removed' PV space, so we might consider generalizing this concept and using it also for 'allocation' of new LVs.
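
To make the two wiping approaches discussed above concrete, a hedged illustration on a placeholder device path (not from this report): the proposed patch's behaviour corresponds to clearing only the first sector, while wipe_lv-style wiping is closer to clearing every signature known to libblkid.

  dd if=/dev/zero of=/dev/VG/sub_lv_rmeta_0 bs=512 count=1      (first sector only)
  wipefs -a /dev/VG/sub_lv_rmeta_0                              (all known signatures)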

Comment 26 Heinz Mauelshagen 2018-10-25 11:02:52 UTC
(In reply to Zdenek Kabelac from comment #25)
> The patch 16ae968d24b4fe3264dc9b46063345ff2846957b still seems to lack a
> fix for the commit of raid metadata with visible _rmeta_ LVs.
> 
> And the patch also introduces another potential problem, where the _rmeta_
> LVs present in the DM table can be misidentified as other types of block devices.
> 

The patch does not newly introduce the potential problem you describe at all!

It keeps the semantics we had before (wiping one sector at the beginning).

So your statement points at an enhancement request: reducing potential bogus discoveries by e.g. libblkid even further than what we have always allowed for.

I can add an additional patch changing the given semantics after studying which signatures of what size we actually need to take into consideration, so that we can minimize the wiping overhead.

Comment 27 Heinz Mauelshagen 2018-10-25 12:35:34 UTC
(In reply to Heinz Mauelshagen from comment #26)
> (In reply to Zdenek Kabelac from comment #25)
> > The patch 16ae968d24b4fe3264dc9b46063345ff2846957b still seems to lack a
> > fix for the commit of raid metadata with visible _rmeta_ LVs.
> > 
> > And the patch is also introducing another potential problem, where the _rmeta_
> > LVs present in the DM table can be misidentified as other types of block devices.
> > 
> 
> The patch does not newly introduce the potential problem you describe at all!
> 

As per the discussion with Zdenek: not using wipe_lv actually loses us the wiping of all known signatures, so we need to keep using that API for the time being.

Reverting and coming up with a new patch...

Comment 29 Heinz Mauelshagen 2018-12-11 17:10:48 UTC
lvm2 upstream commit dd5716ddf258c4a44819fa90d3356833ccf767b4

Comment 31 Marian Csontos 2019-03-22 13:43:15 UTC
Stable branch commit 9b04851fc574ce9cffd30a51d2b750955239f316

Comment 33 Roman Bednář 2019-07-02 09:14:35 UTC
Created attachment 1586603 [details]
test results

Marking verified. Testing consisted of running the reproducer from the initial comment (100 times) and a raid sanity regression check for raid1 in a single-node environment.

See attachment for logs and reproducer script.
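
A minimal sketch of what such a loop might look like, assuming the command sequence from the initial comment (the actual reproducer script used is in the attachment):

for i in $(seq 1 100); do
    vgchange -an murex_raid_vg
    vgchange --deltag pacemaker murex_raid_vg
    vgchange --addtag pacemaker murex_raid_vg
    vgchange -ay murex_raid_vg --config 'activation { volume_list = [ "@pacemaker" ] }'
    vgchange -an murex_raid_vg
done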

Regression run: https://beaker.cluster-qe.lab.eng.brq.redhat.com/bkr/jobs/96718


lvm2-2.02.185-2.el7.x86_64

Comment 35 errata-xmlrpc 2019-08-06 13:10:41 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:2253

