Created attachment 1924328 [details]
journalctl

Description of problem:
UDEV is trying to incrementally assemble MD devices even though this is disallowed in /etc/mdadm.conf.

Version-Release number of selected component (if applicable):
localhost ~]$ uname -a
Linux localhost.localdomain 4.18.0-425.3.1.el8.x86_64 #1 SMP Fri Sep 30 11:45:06 EDT 2022 x86_64 x86_64 x86_64 GNU/Linux
localhost ~]$ mdadm -V
mdadm - v4.2 - 2021-12-30 - 5

How reproducible:
I was able to reproduce the issue easily.

Steps to Reproduce:
1. Added SCSI devices.
2. Added the below to /etc/multipath.conf:
defaults {
        user_friendly_names yes
        find_multipaths no
}
3. Added the below to /etc/mdadm.conf:
AUTO +imsm -1.x -all
4. Verified that no arrays are assembled:
localhost ~]$ cat /proc/mdstat
Personalities :
unused devices: <none>

Actual results:
We are seeing the same errors the customer encountered in journalctl:

Nov 14 12:39:13 localhost.localdomain systemd-udevd[878]: Process '/sbin/mdadm -I /dev/dm-2' failed with exit code 1.
Nov 14 12:39:13 localhost.localdomain systemd-udevd[880]: Process '/sbin/mdadm -I /dev/dm-3' failed with exit code 1.

Expected results:
UDEV should not try to assemble the MD devices, as disallowed by the /etc/mdadm.conf file, and the error should not be seen.

Additional info:
Below is the layout of devices and the versions from the reproducer:

localhost ~]$ uname -a
Linux localhost.localdomain 4.18.0-425.3.1.el8.x86_64 #1 SMP Fri Sep 30 11:45:06 EDT 2022 x86_64 x86_64 x86_64 GNU/Linux
localhost ~]$ mdadm -V
mdadm - v4.2 - 2021-12-30 - 5
localhost ~]$ lsblk
NAME            MAJ:MIN RM  SIZE RO TYPE  MOUNTPOINT
sda               8:0    0  256M  0 disk
└─mpathb        253:3    0  256M  0 mpath
sdb               8:16   0  256M  0 disk
└─mpatha        253:2    0  256M  0 mpath
sr0              11:0    1 1024M  0 rom
vda             252:0    0   20G  0 disk
├─vda1          252:1    0    1G  0 part  /boot
└─vda2          252:2    0   19G  0 part
  ├─rhel-root   253:0    0   17G  0 lvm   /
  └─rhel-swap   253:1    0    2G  0 lvm   [SWAP]
localhost ~]$ rpm -qf /lib/udev/rules.d/64-md-raid-assembly.rules
mdadm-4.2-5.el8.x86_64

journalctl:
Nov 14 12:39:13 localhost.localdomain systemd-udevd[878]: Process '/sbin/mdadm -I /dev/dm-2' failed with exit code 1.
Nov 14 12:39:13 localhost.localdomain systemd-udevd[880]: Process '/sbin/mdadm -I /dev/dm-3' failed with exit code 1.

The customer was able to work around the issue with the below patch:

# diff /lib/udev/rules.d/65-md-incremental.rules /etc/udev/rules.d/65-md-incremental.rules
58a59,60
> KERNEL=="dm-*", SUBSYSTEM=="block", ACTION=="change", ENV{ID_FS_TYPE}=="linux_raid_member", \
> PROGRAM="/usr/bin/egrep -c ^AUTO.*-1\.x.*$ /etc/mdadm.conf", RESULT=="1", GOTO="dm_change_end"
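For completeness, a minimal sketch of how 1.x member signatures could be put on the multipath devices for this reproducer. These are hypothetical commands (the md0 name and the dm-2/dm-3 numbers assume the mpatha/mpathb layout from the lsblk output above); AUTO in mdadm.conf restricts assembly only, so creation still works:

mdadm --create /dev/md0 --level=1 --raid-devices=2 --metadata=1.2 \
      /dev/mapper/mpatha /dev/mapper/mpathb
mdadm --stop /dev/md0                  # stop so only incremental assembly could bring it back
udevadm trigger --action=change --sysname-match='dm-[23]'   # re-fire the "change" uevents
journalctl -b | grep 'mdadm -I'       # observe the exit-code-1 failures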
The debug logs for the issue are attached as journalctl.out
(In reply to Diana Negrete from comment #2)
> Hi Xiao. Could you please look at this bug when you're able? The customer
> is wanting to eliminate the error messages seen. Thanks for any help!

Hi Diana

This is the expected result. Both Incremental and Assemble return -1 when AUTO is used to deny a metadata type. RAID has existed for a long time and many customers use it, so we can't change this behavior: if we changed it to return 0, we might break customers whose scripts check for the -1 return value.

Thanks
Xiao
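To make the compatibility concern concrete, here is a hypothetical script of the kind Xiao describes, which relies on the non-zero exit status of mdadm -I and would silently change behavior if mdadm started returning 0 for AUTO-denied metadata:

#!/bin/sh
dev="$1"
if /sbin/mdadm -I "$dev"; then
    echo "$dev: added to an array"
else
    # Reached both on real failures and when AUTO denies the metadata type;
    # returning 0 for the denied case would break this branch.
    echo "$dev: not assembled, handle manually" >&2
fi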
If you specify --verbose when incrementally adding the member disk, you'll see the reason for the failure:

[root@storageqe-104 mdadm]# mdadm -I /dev/sdb -v
mdadm: /dev/sdb has metadata type 1.x for which auto-assembly is disabled
[root@storageqe-104 mdadm]# echo $?
1
(In reply to XiaoNi from comment #4)
> If you specify --verbose when incrementally adding the member disk, you'll
> see the reason for the failure:
>
> [root@storageqe-104 mdadm]# mdadm -I /dev/sdb -v
> mdadm: /dev/sdb has metadata type 1.x for which auto-assembly is disabled
> [root@storageqe-104 mdadm]# echo $?
> 1

Hi,

Well, no one stated that the issue was with mdadm itself. It is working as intended and should return that RC when called. However, UDEV is calling for incremental assembly on a timer, which in our opinion should not happen as it's disallowed, and thus it's bound to put an error in the logs. So why call a function that you know is going to fail?

Best Regards,
Petar.
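For context on what actually invokes mdadm here: the journal messages come from udevd executing a rule. A simplified illustration of the kind of rule involved is below (this is not the verbatim RHEL 65-md-incremental.rules text, only a sketch); note that it matches solely on the filesystem signature and has no knowledge of the AUTO policy in /etc/mdadm.conf:

SUBSYSTEM=="block", ACTION=="add|change", ENV{ID_FS_TYPE}=="linux_raid_member", \
  RUN+="/sbin/mdadm -I $devnode"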
Thanks for responding, Xiao. Would it be possible to change the log level on the message so that it only shows up when debugging? Or should the udev rules be changed as mentioned in comment 5? Thanks!
(In reply to Petar Ivanov from comment #5)
> (In reply to XiaoNi from comment #4)
> > If you specify --verbose when incrementally adding the member disk,
> > you'll see the reason for the failure:
> >
> > [root@storageqe-104 mdadm]# mdadm -I /dev/sdb -v
> > mdadm: /dev/sdb has metadata type 1.x for which auto-assembly is disabled
> > [root@storageqe-104 mdadm]# echo $?
> > 1
>
> Hi,
>
> Well, no one stated that the issue was with mdadm itself. It is working as
> intended and should return that RC when called. However, UDEV is calling
> for incremental

Hi Petar

What's "RC" here?

> assembly on a timer, which in our opinion should not happen as it's

The incremental function is called from 65-md-incremental.rules rather than by a timer, right? What is the timer here?

> disallowed, and thus it's bound to put an error in the logs. So why call a
> function that you know is going to fail?

We need answers to the questions above first; then we can look at this one.

Thanks
Xiao
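One way to confirm that the call comes from a udev rule rather than a timer is a dry run of rule processing for the affected device; udevadm test prints the programs a rule would run without executing them:

udevadm test "$(udevadm info --query=path --name=/dev/dm-2)" 2>&1 | grep mdadm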
(In reply to Diana Negrete from comment #6)
> Thanks for responding, Xiao. Would it be possible to change the log level
> on the message so that it only shows up when debugging? Or should the udev
> rules be changed as mentioned in comment 5? Thanks!

Hi Diana

The log is output by udev: mdadm returns -1, udev detects the error, and then it writes the log message. I don't know whether we can change the log level to suppress it.
(In reply to Diana Negrete from comment #17)
>
> CU replied to John's comment with the following:
>
> I need some time to prepare a detailed reply on this case (it's been more
> than 6 months). Don't close it!
>
> Having said that, I do have some points:
> 1) No one was talking about IMSM or DDF type arrays and I don't see how
> they would be impacted. As you can probably see in the workaround I have
> provided, the only arrays that would be impacted are the ones with
> ENV{ID_FS_TYPE}=="linux_raid_member", which I believe are of metadata type
> "0.9" and "1.x".
> IMSM would have a value of "isw_raid_member" and DDF respectively
> "ddf_raid_member". Neither of those types is subject to the "CHANGE" event
> delivered to UDEV (which is what causes the issue).
>
> 2) We're not saying UDEV and MDADM are not working as intended. We're only
> saying they are not working together, and since UDEV is calling MDADM it
> probably should respect the configuration provided in /etc/mdadm.conf.

Yes, I understand that; Petar said this in comment 5. mdadm.conf is used by mdadm, and mdadm is used by the udev rule. But I don't think it's a good idea to add a filter like the workaround patch to the udev rule.

> I would very much like to see the developers' arguments on how the IMSM
> arrays would be impacted, since there is no explanation in the bugzilla.

I want to say that different customers have different requests. The workaround patch is specific to one device type (dm) and to not assembling raid with 1.x superblocks. If another customer doesn't want auto-assembly on raw devices that have 1.x superblocks, do we need to add a similar filter to the udev rule? And maybe some customers will tell us they set the same rule in mdadm.conf but only want to disallow auto-assembly on raw devices, and still want the udev rule to work on dm devices. What should we do then?

Thanks
Xiao
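As a quick check of which branch of the rules a given device would take, the recorded filesystem signature can be queried from the udev database (the values below are the standard blkid types mentioned in this comment; /dev/dm-2 assumes the reproducer layout):

udevadm info --query=property --name=/dev/dm-2 | grep ID_FS_TYPE
# native 0.9/1.x metadata -> ID_FS_TYPE=linux_raid_member
# Intel IMSM container    -> ID_FS_TYPE=isw_raid_member
# SNIA DDF container      -> ID_FS_TYPE=ddf_raid_member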
Hi Xiao,

Sorry to barge in on the conversation, but I would like to say that the provided workaround was designed to state the obvious, i.e. that it's a configuration issue and that it can be done. Nobody is expecting you to merge it in; it's a crude thing that's not fit for general-purpose usage. Now, what are we to do - just ignore the errors we're getting in the logs?

Best Regards,
Peter.
Hi Peter

Sorry, from my side I have no better method. mdadm and udev are each doing what they should do. To resolve this, mdadm would have to stop returning an error, but that may break other customers' use cases. Do you have any ideas?

Thanks
Xiao
Hi Xiao,

Yes, as stated: don't call mdadm when incremental assembly is not allowed for the specific array type, regardless of whether the device is dm or raw. Now, what you're telling me is that the issue we're having is not important enough to even try to resolve - fair enough.

Best Regards,
Peter.
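A sketch of this proposal, generalized from the customer's workaround by dropping the KERNEL=="dm-*" restriction so raw devices are covered too. This is a hypothetical rule, not shipped anywhere; the GOTO label is assumed and would have to match a label that actually exists in the installed rules file:

SUBSYSTEM=="block", ACTION=="add|change", ENV{ID_FS_TYPE}=="linux_raid_member", \
  PROGRAM="/usr/bin/egrep -c ^AUTO.*-1\.x.*$ /etc/mdadm.conf", RESULT=="1", \
  GOTO="md_inc_end"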