Bug 1310264
Summary: | [RHEL7] raid1 udev/mdadm timeouts at boot with very large mdadm devices | |
---|---|---|---
Product: | Red Hat Enterprise Linux 7 | Reporter: | John Pittman <jpittman>
Component: | kernel | Assignee: | XiaoNi <xni>
kernel sub component: | Multiple Devices (MD) | QA Contact: | ChanghuiZhong <czhong>
Status: | CLOSED WONTFIX | Docs Contact: |
Severity: | high | |
Priority: | unspecified | CC: | alex.wang, bubrown, cww, guazhang, harald, heinzm, jmagrini, jshortt, loberman, ncroxon, nweddle, rune, tcleveng, xni, yizhan
Version: | 7.2 | Keywords: | Reopened
Target Milestone: | rc | |
Target Release: | 7.3 | |
Hardware: | All | |
OS: | Linux | |
Whiteboard: | | |
Fixed In Version: | | Doc Type: | Bug Fix
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2020-06-25 03:28:34 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Bug Depends On: | | |
Bug Blocks: | 1298243, 1550088 | |
Description
John Pittman
2016-02-19 22:20:34 UTC
74 legs? Second, please provide some _reasonable_ log files. The files are > 300 MB and go back 5 days; please reduce them to something that can actually be analyzed. Last, you talk about failure: did the array synchronize prior to reboot, or is it still in the process of being synchronized? It sounds like systemd kills the script during startup because it takes too long. How long does it take if you assemble the array on the command line?

Jes

Thanks for looking into this, Jes; I will get the information you asked for ASAP. I also wanted to mention that the customer has a second environment with a 44-disk raid6 array showing the same symptoms. I tried a recreate with 45 disks but was not successful; I am assuming due to disk size. For clarity, /proc/mdstat from the 74-disk array after boot:

```
md1 : active raid6 sdbl1[28] sdbn1[30] sdbr1[34] sdbu1[37] sdbd1[20] sdce1[48] sdau1[10] sdax1[13] sdaz1[15] sdbj1[26] sdbq1[33] sdbg1[23] sdbs1[35] sdca1[44] sdby1[41] sdv1[69] sds1[66] sdbf1[22] sdaw1[12] sdcc1[46] sdcf1[49] sdbx1[40] sdbi1[25] sdaq1[6] sdbk1[27] sdar1[7] sdap1[5] sdcb1[45] sdbt1[36] sdav1[11] sday1[14] sdbv1[38] sdbw1[39] sdao1[4] sdy1[72] sdbe1[21] sdba1[17] sdcg1[50] sdat1[9] sdal1[1] sdcd1[47] sdas1[8] sdbz1[42] sdam1[2] sdw1[70] sdbm1[29] sdbp1[32] sdbh1[24] sdbc1[19] sdbo1[31] sdx1[71] sdu1[68] sdbb1[18] sdt1[67] sdan1[3] sdak1[0] sdr1[65] sdz1[73] sdq1[64] sdp1[63] sdk1[58] sdj1[57] sdi1[56] sdh1[55] sdg1[54] sde1[52] sdf1[53] sdd1[51] sdo1[62] sdm1[60] sdl1[59] sdn1[61] sdc1[43] sdb1[16]
      421948071936 blocks super 1.2 level 6, 512k chunk, algorithm 2 [74/74] [UUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUU]
      bitmap: 20/44 pages [80KB], 65536KB chunk
```

John

Hi Jes, I attached a smaller log file; only the messages from Feb 13th. The customer is working within their next maintenance window to grab the additional data you asked for. Will update as soon as I get the next set of data.
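As an aside, the `[74/74] [UUU…U]` field in /proc/mdstat carries one character per member: `U` for an in-sync device and `_` for a failed or missing one. A throwaway sketch of counting member health from that string (the sample is abbreviated, not the actual 74-slot array above):

```shell
# Count up ('U') vs failed ('_') member slots in an mdstat status string.
# The sample below is abbreviated; a real check would read /proc/mdstat.
status='[UUUU_UUU]'
up=$(printf '%s' "$status" | tr -cd 'U' | wc -c)
failed=$(printf '%s' "$status" | tr -cd '_' | wc -c)
echo "$up up, $failed failed"
```

A nonzero `failed` count would indicate a degraded array even before the resync percentage is considered.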
John

Hello, re-installed the OS; same test result as before. Test steps:

```
[root@storageqe-09 ~]# lsblk
NAME                        MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda                           8:0    0 931.5G  0 disk
├─sda1                        8:1    0     1G  0 part /boot
└─sda2                        8:2    0 930.5G  0 part
  ├─rhel_storageqe--09-root 253:0    0    50G  0 lvm  /
  ├─rhel_storageqe--09-swap 253:1    0   7.8G  0 lvm  [SWAP]
  └─rhel_storageqe--09-home 253:2    0 872.7G  0 lvm  /home
sdg                           8:96   0   3.7T  0 disk
├─sdg1                        8:97   0   1.8T  0 part
└─sdg2                        8:98   0   1.8T  0 part
sdh                           8:112  0   3.7T  0 disk
[root@storageqe-09 ~]# mdadm -CR /dev/md0 -l1 -n2 /dev/sdg[12] --assume-clean
mdadm: Note: this array has metadata at the start and
    may not be suitable as a boot device.  If you plan to
    store '/boot' on this device please ensure that
    your boot-loader understands md/v1.x metadata, or use
    --metadata=0.90
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md0 started.
[root@storageqe-09 ~]# cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sdg2[1] sdg1[0]
      1953376256 blocks super 1.2 [2/2] [UU]
      bitmap: 0/15 pages [0KB], 65536KB chunk
```

Create a filesystem and mount it:

```
mkfs.ext4 /dev/md0
mount /dev/md0 /mnt/bug
```

lsblk now shows:

```
sdg                           8:96   0   3.7T  0 disk
├─sdg1                        8:97   0   1.8T  0 part
│ └─md0                       9:0    0   1.8T  0 raid1 /mnt/bug
└─sdg2                        8:98   0   1.8T  0 part
  └─md0                       9:0    0   1.8T  0 raid1 /mnt/bug
```

Add the mount to /etc/fstab and reboot:

```
vim /etc/fstab
/dev/md0 /mnt/bug ext4 defaults 1 2
reboot
```

The boot fails:

```
[ TIME ] Timed out waiting for device dev-md0.device.
[DEPEND] Dependency failed for /mnt/bug.
[DEPEND] Dependency failed for Local File Systems.
[DEPEND] Dependency failed for Relabel all filesystems, if necessary.
[DEPEND] Dependency failed for Migrate local... structure to the new structure.
[DEPEND] Dependency failed for File System Check on /dev/md0.
```
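Not part of the report itself, but a common way to keep a slow-to-assemble array from forcing the whole boot into emergency mode is to mark the mount `nofail` and/or raise the per-device timeout. These are standard systemd mount options, shown here only as an illustrative fstab fragment, not something applied in this bug:

```
# /etc/fstab -- illustrative fragment (assumed options, not from this bug):
# nofail keeps local-fs.target from failing if the device never appears;
# x-systemd.device-timeout raises the default wait for dev-md0.device.
/dev/md0  /mnt/bug  ext4  defaults,nofail,x-systemd.device-timeout=300s  1 2
```

With `nofail`, a missing /dev/md0 would log an error but the system would still reach its normal target instead of dropping to the maintenance prompt.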
The system drops into emergency mode: "Welcome to emergency mode! ... Give root password for maintenance". After the failed boot:

```
├─sdg1        8:97   0   1.8T  0 part
│ └─md127     9:127  0   1.8T  0 raid1
└─sdg2        8:98   0   1.8T  0 part
  └─md127     9:127  0   1.8T  0 raid1
sdh           8:112  0   3.7T  0 disk
[root@localhost ~]# cat /proc/mdstat
Personalities : [raid1]
md127 : active (auto-read-only) raid1 sdg1[0] sdg2[1]
      1953376256 blocks super 1.2 [2/2] [UU]
      bitmap: 0/15 pages [0KB], 65536KB chunk
```

Awesome job, Guazhang; thank you. It's reproduced. Xiao, as you can see, this is now very easy to reproduce. Please can we now make progress. Perhaps Nigel Croxon can also help.

Regards
Laurence

I am able to reproduce this issue at will:

```
# mdadm --examine --scan --verbose
ARRAY /dev/md/0 level=raid1 metadata=1.2 num-devices=2 UUID=91d394e5:23b21c73:0715f5f2:8e9a5607 name=0
   devices=/dev/sdd,/dev/sdc
# cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sdd[1] sdc[0]
      1953383488 blocks super 1.2 [2/2] [UU]
      [======>..............]  resync = 31.3% (613192512/1953383488) finish=172.4min speed=129488K/sec
      bitmap: 11/15 pages [44KB], 65536KB chunk
```

Adding either the mdraid UUID or /dev/md0 to fstab results in the same boot failure:

```
#UUID=91d394e5-23b2-1c73-0715-f5f28e9a5607 /test xfs defaults 1 2
/dev/md0 /test xfs defaults 1 2
```

```
[ TIME ] Timed out waiting for device dev-md0.device.
[DEPEND] Dependency failed for /test.
[DEPEND] Dependency failed for Local File Systems.
[DEPEND] Dependency failed for File System Check on /dev/md0.
```

After further debugging, the root cause appears to be contention and/or a race condition between the dmraid and mdraid components. The key to the reproducer is that the raid devices need to be attached to an Intel RST controller.
```
00:1f.2 RAID bus controller: Intel Corporation C600/X79 series chipset SATA RAID Controller (rev 05)
pci-0000:00:11.4-ata-3.0 -> ../../sdc
pci-0000:00:11.4-ata-4.0 -> ../../sdd
# dmraid -b
/dev/sdd: 3907029168 total, "WD-WCC4M0267803"
/dev/sdc: 3907029168 total, "WD-WCC4M0262885"
```

By default both the dmraid and mdraid components are enabled; it seems dmraid fires first and activates the raid with its isw device name and UUID, which then results in the fstab and dependency failures. The solution is to disable the dmraid component by adding rd.dm=0 to GRUB_CMDLINE_LINUX in /etc/default/grub:

```
GRUB_CMDLINE_LINUX="crashkernel=auto rd.lvm.lv=vg_root/root rd.luks=0 rd.dm=0 rd.lvm.lv=vg_root/swap rhgb quiet"
```

Rebuild the grub2 configuration (BIOS and UEFI paths respectively) and reboot:

```
# grub2-mkconfig -o /boot/grub2/grub.cfg
# grub2-mkconfig -o /boot/efi/EFI/redhat/grub.cfg
```

The system will now reboot successfully with either mdadm's devname /dev/md0 or the UUID.

Hello, asking Jon and John as they have been more recently active in this case.

Regards
Laurence

Hi Xiao, from the initial description we see "Started Activation of DM RAID sets.", which makes it seem as if Jon's suspicion in comment 46 is correct. Also, regarding the rc.local edit: that was a workaround provided to the customer by Laurence, so it would not negate Jon's suspicion, if I am understanding correctly. What we thought was due to large, numerous devices may be due to the dmraid component. The customer has long been unresponsive. If you have a test package, we can likely test on Jon's system.

Regardless of the mdadm test package, I can no longer reproduce on the latest kernel, even using various versions of mdadm and systemd. The largest devices I have available are 2 TB each, which were used below.
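After applying the workaround, it is easy to verify from a running system whether the initramfs was actually booted with dmraid disabled. A minimal sketch (the sample command line is inlined for illustration; on a live system you would read /proc/cmdline instead):

```shell
# Check whether a kernel command line carries rd.dm=0 (dmraid disabled
# in the initramfs). Sample string inlined; live check would use:
#   cmdline=$(cat /proc/cmdline)
cmdline='crashkernel=auto rd.lvm.lv=vg_root/root rd.luks=0 rd.dm=0 rhgb quiet'
case " $cmdline " in
  *" rd.dm=0 "*) echo "dmraid disabled in initramfs" ;;
  *)             echo "dmraid enabled (default)" ;;
esac
```

The surrounding spaces in the `case` pattern avoid matching a substring of some other parameter.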
```
[root@storage-test ~]# uname -r
3.10.0-957.10.1.el7.x86_64
[root@storage-test ~]# rpm -qa | grep mdadm
mdadm-4.1-rc1_2.el7.x86_64
[root@storage-test ~]# lsscsi
[3:0:0:0]  disk  ATA  Samsung SSD 850   2B6Q  /dev/sda
[4:0:0:0]  disk  ATA  WDC WD20EZRX-19D  0A81  /dev/sdb
[5:0:0:0]  disk  ATA  WDC WD20EZRX-19D  0A81  /dev/sdc
[root@storage-test ~]# dmraid -b
/dev/sdc: 3907029168 total, "WD-WCC4M0375096"
/dev/sdb: 3907029168 total, "WD-WCC4M0258187"
/dev/sda: 500118192 total, "S251NXAH345920N"
[root@storage-test ~]# mdadm -C /dev/md0 --metadata=1.2 --level raid1 -n 2 /dev/sd[b-c]
mdadm: array /dev/md0 started.
[root@storage-test ~]# mkfs.xfs /dev/md0
[root@storage-test ~]# mount /dev/md0 /test
```

Tested mounting by dev name and UUID:

```
UUID=7debbf22-0c1d-450c-9569-920ab0357c22 /test xfs defaults 1 2
/dev/md0 /test xfs defaults 1 2
```

Kernel command lines tested (latest kernel, then the older one):

```
[root@storage-test ~]# cat /proc/cmdline
BOOT_IMAGE=/vmlinuz-3.10.0-957.10.1.el7.x86_64 root=/dev/mapper/vg_root-root ro crashkernel=auto rd.lvm.lv=vg_root/root rd.lvm.lv=vg_root/swap rhgb quiet LANG=en_US.UTF-8
[root@storage-test ~]# cat /proc/cmdline
BOOT_IMAGE=/vmlinuz-3.10.0-862.14.4.el7.x86_64 root=/dev/mapper/vg_root-root ro crashkernel=auto rd.lvm.lv=vg_root/root rd.lvm.lv=vg_root/swap rhgb quiet
```

Package versions were varied by rolling back the updates:

```
[root@storage-test ~]# yum history
Loaded plugins: langpacks, product-id, search-disabled-repos, subscription-manager
ID     | Command line             | Date and time    | Action(s)      | Altered
-------------------------------------------------------------------------------
    61 | update systemd           | 2019-04-10 09:46 | Update         |   10
    60 | update mdadm             | 2019-04-10 09:40 | Update         |    1
    59 | update kernel            | 2019-04-10 09:39 | E, I, U        |   14 EE
[root@storage-test ~]# yum history undo 61
[root@storage-test ~]# yum history undo 60
```

mdadm versions tested:

```
4.0-13.el7.x86_64      @anaconda/7.5
4.1-rc1_2.el7.x86_64   @rhel-7-server-rpms
```

The one thing that differs between the original reproducer and the latest attempt is that the following is now observed:

```
Apr 10 10:16:33 storage-test systemd: Stopped Timer to wait for more drives before activating degraded array md0..
```

which looks to be logged from the mdadm-last-resort unit. My raid is still resyncing, which is painfully slow, but I will continue testing once this completes and see if the behavior changes. I will also downgrade udev.

```
[root@storage-test ~]# cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sdc[1] sdb[0]
      1953382464 blocks super 1.2 [2/2] [UU]
      [====>................]  resync = 20.0% (391244352/1953382464) finish=201.8min speed=128992K/sec
      bitmap: 13/15 pages [52KB], 65536KB chunk
unused devices: <none>
```
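For monitoring a long resync like this one, the progress line can be pulled out of /proc/mdstat with plain text tools. A minimal sketch, assuming GNU grep (`-P`); the sample line is inlined here, whereas a live check would grep /proc/mdstat directly:

```shell
# Extract resync percentage and speed from an mdstat progress line.
# Sample inlined; on a live system: line=$(grep resync /proc/mdstat)
line='      [====>................]  resync = 20.0% (391244352/1953382464) finish=201.8min speed=128992K/sec'
pct=$(printf '%s\n' "$line" | grep -oP 'resync = \K[0-9.]+')
speed=$(printf '%s\n' "$line" | grep -oP 'speed=\K[0-9]+')
echo "resync ${pct}% done at ${speed} K/sec"
```

Polling this in a loop (or just watching `finish=`) gives a rough ETA without interrupting the resync.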