Bug 1931610
Summary: | iSCSI Linux RAID disk is always resyncing after reboot | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 8 | Reporter: | Jon Magrini <jmagrini> |
Component: | mdadm | Assignee: | Nigel Croxon <ncroxon> |
Status: | CLOSED NOTABUG | QA Contact: | Storage QE <storage-qe> |
Severity: | high | Docs Contact: | |
Priority: | high | ||
Version: | 8.3 | CC: | dledford, heinzm, jbrassow, jdonohue, mhoyer, ncroxon |
Target Milestone: | rc | ||
Target Release: | --- | ||
Hardware: | x86_64 | ||
OS: | Linux | ||
Last Closed: | 2021-07-19 19:13:38 UTC | Type: | Bug |
Description
Jon Magrini
2021-02-22 19:17:53 UTC
During normal shutdown, the system relies solely on the safe mode delay timer to mark the device in sync. The default minimum safe delay time-limit is 201 milliseconds. So if the filesystem is unmounted manually first, the timer will trigger and force md to update its metadata before a later reboot request can stop iscsi. But if left to systemd, the final writes from unmounting the filesystem leave a race window in which systemd can stop iscsi so soon after the unmount that the timer only triggers after the iscsi devices are no longer usable.

In the logs below, note that md reported it was unable to update its metadata a little over 200 ms after the filesystem was unmounted, by which point iscsi had already been stopped.

[ 126.523984] XFS (dm-4): Unmounting Filesystem
...
[ 126.535742] systemd[1]: Unmounted /data.
...
[ 126.550247] systemd[6813]: iscsi-shutdown.service: Executing: /usr/sbin/iscsiadm -m node --logoutall=all
[ 126.560603] systemd[1]: Got cgroup empty notification for: /system.slice/data.mount
[ 126.573028] sd 7:0:0:0: [sda] Synchronizing SCSI cache
[ 126.580379] sd 7:0:0:1: [sdb] Synchronizing SCSI cache
...
[ 126.761855] md: super_written gets error=10

This looks like the safe mode delay timer triggered, but too late for the I/Os that mark the disks in sync to succeed. Nothing in the systemd units looks to try and force it sooner.

We have specifically logged out of iSCSI sessions at shutdown (unless they're used for the root filesystem) because of storage arrays that don't like having resources tied up in iSCSI session and TCP connection state if we just drop the connection. I don't think that's going to change.

In general, it's probably best to manage RAID on the storage target side and expose the set as a single volume over iSCSI.

As a workaround, I suppose you could edit the iscsi-shutdown.service unit file to manually stop the md RAID set before logging out of the iSCSI sessions. This seems to work for me.
# systemctl edit --full iscsi-shutdown.service

(I could not get multiple ExecStop ordering right with override files, hence the --full)

Add an ExecStop line for mdadm before the existing iscsiadm line:

[Service]
...
ExecStop=-/usr/sbin/mdadm --stop --scan
ExecStop=-/usr/sbin/iscsiadm -m node --logoutall=all

This will write a modified service file to /etc/systemd/system/iscsi-shutdown.service

It's probably also possible to do this with a new service file that orders itself After iscsi-shutdown (a unit ordered After another is stopped before it on shutdown, with the real work being done in ExecStop).

Hello Jon,

Could you give this .service file a test?

/usr/lib/systemd/system/mdadm-clean-shutdown.service

[Unit]
Description=Wait for a update to clean the SB and bitmap before shutdown
DefaultDependencies=no
Requires=local-fs.target
Before=iscsi-shutdown.service
After=unmount.target

[Service]
Type=oneshot
ExecStart=BINDIR/mdadm --stop --scan

Jon, are the initiator and the target on the same machine?

(In reply to Nigel Croxon from comment #6)
> Jon, are the initiator and the target on the same machine?

The initiator and target are not the same system. The target is an HPE NAS appliance. I will try and test the unit file.

When I have the MD (/dev/md0) placed in /etc/fstab to auto mount, it hangs/stalls on boot with:

A start job is running for dev-md0.device (xxx min / yyy min).

It eventually times out and falls into emergency mode. If I edit /etc/fstab and remove the md0 reference, the boot continues. I think there is a power-up sequence issue.

Putting _netdev as an option in /etc/fstab resolved the boot issue from my comment above. What does your /etc/fstab look like?

fstab is as follows; adding x-systemd.after=iscsi.service addressed a few random shutdown issues.

---
/dev/storeasy/veeamrepo /srv xfs _netdev,x-systemd.after=iscsi.service 0 0

I'm still unable to reproduce the issue as you have reported.
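
For anyone else trying to reproduce this, here is a minimal sketch of a check that could be run after unmounting and before rebooting, to see whether md has already marked the array clean. It assumes the array is md0 and reads the md sysfs attributes array_state and safe_mode_delay. The function name md_clean_state and the SYSFS_ROOT variable are hypothetical, added only so the function can be pointed at a test directory instead of /sys.

```shell
#!/bin/sh
# Sketch only: report whether an md array is currently marked clean.
# SYSFS_ROOT is a hypothetical override for testing; it defaults to /sys.
md_clean_state() {
    md_dir="${SYSFS_ROOT:-/sys}/block/$1/md"

    # Without array_state there is nothing to report on.
    if [ ! -r "$md_dir/array_state" ]; then
        echo "unknown"
        return 1
    fi

    state=$(cat "$md_dir/array_state")
    # safe_mode_delay is expressed in seconds; print "?" if it cannot be read.
    delay=$(cat "$md_dir/safe_mode_delay" 2>/dev/null || echo "?")

    echo "$1: array_state=$state safe_mode_delay=${delay}s"

    # Succeed only when the array is marked clean.
    [ "$state" = "clean" ]
}
```

Running `md_clean_state md0` right after the unmount and again a second later should show whether the safe mode timer has had a chance to fire before anything logs out of iSCSI.
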
ok, maybe I spoke too soon

# dmesg |grep md0
[ 4.466948] systemd[1]: dev-md0.device: Dependency Before=network-online.target ignored (.device units cannot be delayed)
[ 4.467690] systemd[1]: dev-md0.device: Dependency Before=network.target ignored (.device units cannot be delayed)
[ 8.790559] md/raid1:md0: not clean -- starting background reconstruction
[ 8.791167] md/raid1:md0: active with 3 out of 3 mirrors
[ 8.791756] md0: detected capacity change from 0 to 103809024
[ 8.809653] md: resync of RAID array md0
[ 9.037098] EXT4-fs (md0): mounted filesystem with ordered data mode. Opts: (null)
[ 12.732727] md: md0: resync done.

I'm getting consistent results (clean on entry) when I don't have your addition to fstab:

[root@virt2 ~]# cat /etc/fstab
/dev/mapper/rhel-root / xfs defaults 0 0
/dev/mapper/rhel-swap none swap defaults 0 0
/dev/md0 /mdtest ext4 _netdev 0 0

[root@virt2 ~]# dmesg |grep md0
[ 4.533658] systemd[1]: dev-md0.device: Dependency Before=network-online.target ignored (.device units cannot be delayed)
[ 4.534402] systemd[1]: dev-md0.device: Dependency Before=network.target ignored (.device units cannot be delayed)
[ 9.099737] md/raid1:md0: active with 1 out of 3 mirrors
[ 9.100373] md0: detected capacity change from 0 to 103809024
[ 9.247297] EXT4-fs (md0): mounted filesystem with ordered data mode. Opts: (null)

[root@virt2 ~]# cat /usr/lib/systemd/system/mdadm-clean-shutdown.service
[Unit]
Description=Wait for a update to clean the SB and bitmap before shutdown
DefaultDependencies=no
Requires=local-fs.target
Before=iscsi-shutdown.service
After=unmount.target

[Service]
Type=oneshot
ExecStart=/sbin/mdadm --stop --scan

Nigel,

Your comment #14 is still utilizing the modified shutdown service unit, correct? I will retest and also ask the customer to remove the fstab entry, modify the mdadm-clean-shutdown.service, and provide results.

Thanks.

As there is a working solution to this problem, I am closing this bz.
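
To compare boots without reading the whole log each time, the dmesg checks above can be wrapped in a small helper. This is only a sketch: it matches the md/raid1 message strings exactly as they appear in the logs in this bug, so it should not be expected to work for other RAID levels, and the function name md_boot_status is hypothetical.

```shell
#!/bin/sh
# Sketch only: classify how an md RAID1 array came up, from kernel log
# text on stdin. Matches the raid1 messages quoted in this bug.
md_boot_status() {
    dev="$1"
    # Keep only the lines mentioning the device; no match is not an error.
    log=$(grep -F "$dev" || true)

    case "$log" in
        *"md/raid1:$dev: not clean -- starting background reconstruction"*)
            echo "resync" ;;
        *"md/raid1:$dev: active with"*)
            echo "no-resync" ;;
        *)
            echo "unknown" ;;
    esac
}
```

Usage would be `dmesg | md_boot_status md0` after each reboot; "resync" means the array was not marked clean at the previous shutdown.
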