Bug 1015514 - Check for DM_UDEV_DISABLE_OTHER_RULES_FLAG instead of DM_UDEV_DISABLE_DISK_RULES_FLAG in 65-md-incremental.rules
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: mdadm
Version: 6.5
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Assigned To: Jes Sorensen
QA Contact: XiaoNi
Reported: 2013-10-04 08:29 EDT by Peter Rajnoha
Modified: 2013-11-21 07:08 EST
CC List: 4 users
Fixed In Version: mdadm-3.2.6-7.el6
Doc Type: Bug Fix
Clones: 1015515, 1015521
Last Closed: 2013-11-21 07:08:09 EST
Type: Bug


Attachments
Check for DM_UDEV_DISABLE_OTHER_RULES_FLAG (655 bytes, patch)
2013-10-04 08:29 EDT, Peter Rajnoha

Description Peter Rajnoha 2013-10-04 08:29:51 EDT
Created attachment 807607 [details]
Check for DM_UDEV_DISABLE_OTHER_RULES_FLAG

DM_UDEV_DISABLE_DISK_RULES_FLAG is currently checked in /lib/udev/rules.d/65-md-incremental.rules to avoid scanning inappropriate device-mapper devices. However, this flag should be used only in 13-dm-disk.rules; it is meant for device-mapper's internal use, since those rules belong to device-mapper.

Any external rules should check DM_UDEV_DISABLE_OTHER_RULES_FLAG instead to coordinate properly with device-mapper devices. Please see the attached patch.
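
For illustration, the change essentially swaps the flag that 65-md-incremental.rules tests before skipping a device. This is only a sketch based on the rule lines quoted later in this bug (comment 6); the exact line and spacing in the shipped rules file may differ:

  # incorrect: this flag belongs to device-mapper's own 13-dm-disk.rules
  ENV{DM_UDEV_DISABLE_DISK_RULES_FLAG}=="1", GOTO="dm_change_end"

  # correct: external rules should coordinate via the OTHER_RULES flag
  ENV{DM_UDEV_DISABLE_OTHER_RULES_FLAG}=="1", GOTO="dm_change_end"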
Comment 4 XiaoNi 2013-10-21 03:39:41 EDT
Hi Jes

   I have checked, and the patch is already in mdadm-3.2.6-7.el6. But I have no good idea how to reproduce the issue; could you give me some suggestions for reproducing it? Thanks very much.
Comment 5 Jes Sorensen 2013-11-06 04:43:35 EST
Xiao,

This was requested by Peter for the DM tools; the flags are set on their side
and I don't deal with them directly on the mdadm side, so I think we need to
ask Peter for recommendations here.

Thanks,
Jes
Comment 6 Peter Rajnoha 2013-11-06 07:02:42 EST
Sorry for the delay and for not answering comment #3; I forgot to write back as I was busy with other things...

Well, for example, the following may be used to test the functionality:

---> STEP 1 - create PV/VG and 2 LVs

[root@rhel6-a ~]# pvcreate /dev/sda
  Physical volume "/dev/sda" successfully created

[root@rhel6-a ~]# vgcreate vg /dev/sda
  Volume group "vg" successfully created

[root@rhel6-a ~]# lvcreate -L512m vg
  Logical volume "lvol0" created

[root@rhel6-a ~]# lvcreate -L512m vg
  Logical volume "lvol1" created


---> STEP 2 - create an MD array on top of the LVs, e.g. a mirror. Use the 1.1 metadata format so we're sure the metadata is at the beginning of the LVs (wipefs will show that in the "offset" column as well).

[root@rhel6-a ~]# mdadm --create /dev/md0 --metadata="1.1" --level=1 --raid-devices=2 /dev/vg/lvol0 /dev/vg/lvol1
mdadm: array /dev/md0 started.

[root@rhel6-a ~]# lsblk /dev/sda
NAME              MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
sda                 8:0    0     4G  0 disk  
|-vg-lvol0 (dm-2) 253:2    0   512M  0 lvm   
| `-md0             9:0    0 511.7M  0 raid1 
`-vg-lvol1 (dm-3) 253:3    0   512M  0 lvm   
  `-md0             9:0    0 511.7M  0 raid1 



[root@rhel6-a ~]# wipefs /dev/vg/lvol0
offset               type
----------------------------------------------------------------
0x0                  linux_raid_member   [raid]
                     LABEL: rhel6-a:0
                     UUID:  81189229-3816-fda3-8bed-9f6e3a5d45e5

[root@rhel6-a ~]# wipefs /dev/vg/lvol1
offset               type
----------------------------------------------------------------
0x0                  linux_raid_member   [raid]
                     LABEL: rhel6-a:0
                     UUID:  81189229-3816-fda3-8bed-9f6e3a5d45e5


---> STEP 3 - stop the MD array and remove LVs

[root@rhel6-a ~]# mdadm -S /dev/md0
mdadm: stopped /dev/md0

[root@rhel6-a ~]# lvremove -ff vg/lvol0 vg/lvol1
  Logical volume "lvol0" successfully removed
  Logical volume "lvol1" successfully removed

[root@rhel6-a ~]# lsblk /dev/sda
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda    8:0    0   4G  0 disk 


---> STEP 4 - kill the running udevd daemon and run it in debug mode (so we can check exactly what it's doing)

[root@rhel6-a ~]# killall udevd 

[root@rhel6-a ~]# udevd --debug &> udev_log


---> STEP 5 - create the same LVs again - they will occupy exactly the same extents as the original LVs (the same offsets...)

[root@rhel6-a ~]# lvcreate -L512M vg
  Logical volume "lvol0" created

[root@rhel6-a ~]# lvcreate -L512M vg
  Logical volume "lvol1" created

[root@rhel6-a ~]# lsblk /dev/sda
NAME              MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda                 8:0    0    4G  0 disk 
|-vg-lvol0 (dm-2) 253:2    0  512M  0 lvm  
`-vg-lvol1 (dm-3) 253:3    0  512M  0 lvm 

---> STEP 6 - if everything works correctly, LVM should have a chance to wipe the MD array signature during "LV zeroing" (wiping the start of the LV so it's clean and ready for use). To prove this, udev should not trigger any scan: the MD array signature is cleared by LVM *before* udev has a chance to see it, so there's no interference between LVM and re-assembly of the old MD array from stale metadata that should be ignored.

[root@rhel6-a ~]# grep mdadm udev_log

[root@rhel6-a ~]# echo $?
1

---> no mdadm call found, which is what we want!
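
An additional, optional sanity check, not strictly part of the steps above (this assumes wipefs prints nothing when it finds no signature): wipefs on the freshly created LVs should no longer report the linux_raid_member signature, since LVM zeroed the start of each LV:

[root@rhel6-a ~]# wipefs /dev/vg/lvol0
[root@rhel6-a ~]# wipefs /dev/vg/lvol1

---> no signature reported, the stale MD metadata is gone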

====

Now, to prove that it *does* interfere if the DM_UDEV_DISABLE_* flags are not used correctly, you can do:

---> do STEP 3

---> edit /lib/udev/rules.d/13-dm-disk.rules and comment out this line like this:

  #ENV{DM_NOSCAN}=="1", GOTO="dm_watch"

---> also, in /lib/udev/rules.d/65-md-incremental.rules, put back the original DM_UDEV_DISABLE_DISK_RULES_FLAG check instead of the correct DM_UDEV_DISABLE_OTHER_RULES_FLAG check:

  ENV{DM_UDEV_DISABLE_OTHER_RULES_FLAG}=="1", GOTO="dm_change_end"

changed back to the original and incorrect:

  ENV{DM_UDEV_DISABLE_DISK_RULES_FLAG}=="1", GOTO="dm_change_end"

---> do STEP 4, STEP 5 and STEP 6

---> now, STEP 6 should show interference with mdadm:

[root@rhel6-a ~]# grep mdadm udev_log
1383737656.891739 [3119] udev_rules_apply_to_event: RUN '/sbin/mdadm -I $env{DEVNAME}' /lib/udev/rules.d/65-md-incremental.rules:28
1383737656.894028 [3119] util_run_program: '/sbin/mdadm -I /dev/dm-2' started
1383737656.907670 [3119] util_run_program: '/sbin/mdadm' (stderr) 'mdadm: /dev/dm-2 attached to /dev/md/0, not enough to start safely.'
1383737656.907798 [3119] util_run_program: '/sbin/mdadm -I /dev/dm-2' returned with exitcode 0
1383737656.920539 [3119] udev_rules_apply_to_event: RUN '/sbin/mdadm -I $env{DEVNAME}' /lib/udev/rules.d/65-md-incremental.rules:56
1383737656.921442 [3119] util_run_program: '/sbin/mdadm -I /dev/dm-2' started
1383737656.922217 [3119] util_run_program: '/sbin/mdadm' (stderr) 'mdadm: cannot open /dev/dm-2: Device or resource busy.'
1383737656.922303 [3119] util_run_program: '/sbin/mdadm -I /dev/dm-2' returned with exitcode 1
1383737658.030467 [3118] udev_rules_apply_to_event: RUN '/sbin/mdadm -I $env{DEVNAME}' /lib/udev/rules.d/65-md-incremental.rules:28
1383737658.031526 [3118] util_run_program: '/sbin/mdadm -I /dev/dm-3' started
1383737658.041573 [3118] util_run_program: '/sbin/mdadm' (stderr) 'mdadm: metadata mismatch between /dev/dm-3 and chosen array /dev/md/0'
1383737658.041728 [3118] util_run_program: '/sbin/mdadm -I /dev/dm-3' returned with exitcode 2
1383737658.046832 [3118] udev_rules_apply_to_event: RUN '/sbin/mdadm -I $env{DEVNAME}' /lib/udev/rules.d/65-md-incremental.rules:56
1383737658.047541 [3118] util_run_program: '/sbin/mdadm -I /dev/dm-3' started
1383737658.055847 [3118] util_run_program: '/sbin/mdadm' (stderr) 'mdadm: metadata mismatch between /dev/dm-3 and chosen array /dev/md/0'
1383737658.056052 [3118] util_run_program: '/sbin/mdadm -I /dev/dm-3' returned with exitcode 2

[root@rhel6-a ~]# echo $?
0

And this is wrong, because mdadm touches the LV before LVM has a chance to wipe its start and remove any stale metadata from other block subsystems or whatever else might be there. So that's the other way round - it also proves that before the patch, these things worked incorrectly.

Hope that helps; if anything is unclear, feel free to ask... Again, sorry for the delay.
Comment 7 Peter Rajnoha 2013-11-06 07:12:11 EST
(This is just a simple and somewhat artificial example; other tests might be quite complex, but this is the gist of the solution that should be tested - if it works here, it works in other, more complex situations.)
Comment 8 errata-xmlrpc 2013-11-21 07:08:09 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1643.html
