Description of problem:

The engine expects devices to be available immediately after logging in to the portal; however, they might not be available this fast, and the calling operation will fail.

Version-Release number of selected component (if applicable):

How reproducible:
100%

Steps to Reproduce:
1. Move an iSCSI domain to maintenance
2. Make sure, on the vdsm host, that it is not connected - `iscsiadm -m session` should not show any connections
3. Detach the domain

Actual results:
The operation fails with:

Traceback (most recent call last):
  File "/usr/lib/python3.7/site-packages/vdsm/storage/task.py", line 881, in _run
    return fn(*args, **kargs)
  File "<decorator-gen-27>", line 2, in detachStorageDomain
  File "/usr/lib/python3.7/site-packages/vdsm/common/api.py", line 50, in method
    ret = func(*args, **kwargs)
  File "/usr/lib/python3.7/site-packages/vdsm/storage/hsm.py", line 914, in detachStorageDomain
    pool.detachSD(sdUUID)
  File "/usr/lib/python3.7/site-packages/vdsm/storage/securable.py", line 79, in wrapper
    return method(self, *args, **kwargs)
  File "/usr/lib/python3.7/site-packages/vdsm/storage/sp.py", line 1051, in detachSD
    dom = sdCache.produce(sdUUID)
  File "/usr/lib/python3.7/site-packages/vdsm/storage/sdc.py", line 115, in produce
    domain.getRealDomain()
  File "/usr/lib/python3.7/site-packages/vdsm/storage/sdc.py", line 51, in getRealDomain
    return self._cache._realProduce(self._sdUUID)
  File "/usr/lib/python3.7/site-packages/vdsm/storage/sdc.py", line 139, in _realProduce
    domain = self._findDomain(sdUUID)
  File "/usr/lib/python3.7/site-packages/vdsm/storage/sdc.py", line 156, in _findDomain
    return findMethod(sdUUID)
  File "/usr/lib/python3.7/site-packages/vdsm/storage/blockSD.py", line 1718, in findDomain
    return BlockStorageDomain(BlockStorageDomain.findDomainPath(sdUUID))
  File "/usr/lib/python3.7/site-packages/vdsm/storage/blockSD.py", line 1576, in findDomainPath
    raise se.StorageDomainDoesNotExist(sdUUID)
vdsm.storage.exception.StorageDomainDoesNotExist: Storage domain does not exist: ('c89ebab1-d683-4f43-9c72-cd1b365b0542',)

The VG is not yet seen by the host: the journal shows the device was added only a second after the vgs lookup [1]:

Feb 25 14:59:13 dhcp-0-177.tlv.redhat.com multipathd[693]: sda [8:0]: path added to devmap 360014054fb4b51f91f049bfa72d27933

Expected results:
The operation should not fail.

Additional info:

[1]
2020-02-25 14:59:12,835+0200 DEBUG (jsonrpc/5) [common.commands] /usr/bin/taskset --cpu-list 0-1 /usr/bin/sudo -n /usr/sbin/lvm vgs --config 'devices { preferred_names=["^/dev/mapper/"] ignore_suspended_devices=1 write_cache_state=0 disable_after_error_count=3 filter=["a|^/dev/mapper/360014054fb4b51f91f049bfa72d27933$|^/dev/mapper/36001405d9aaa9f2f71043499b1d4aba4$|", "r|.*|"] hints="none" } global { locking_type=1 prioritise_write_locks=1 wait_for_locks=1 use_lvmetad=0 } backup { retain_min=50 retain_days=0 }' --noheadings --units b --nosuffix --separator '|' --ignoreskippedcluster -o uuid,name,attr,size,free,extent_size,extent_count,free_count,tags,vg_mda_size,vg_mda_free,lv_count,pv_count,pv_name c89ebab1-d683-4f43-9c72-cd1b365b0542 (cwd None) (commands:153)
2020-02-25 14:59:12,888+0200 DEBUG (jsonrpc/5) [common.commands] FAILED: <err> = b' Configuration setting "global/event_activation" unknown.\n Configuration setting "devices/scan_lvs" unknown.\n Configuration setting "devices/hints" unknown.\n Configuration setting "devices/hints" unknown.\n Volume group "c89ebab1-d683-4f43-9c72-cd1b365b0542" not found\n Cannot process volume group c89ebab1-d683-4f43-9c72-cd1b365b0542\n Configuration setting "global/event_activation" unknown.\n Configuration setting "devices/scan_lvs" unknown.\n Configuration setting "devices/hints" unknown.\n'; <rc> = 5 (commands:185)

This was seen on Fedora 30 and EL8. Note that it only occurs when there is a single iSCSI domain.
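Since the device shows up only about a second after the failed lookup, the caller could tolerate this race by retrying the lookup for a bounded time instead of failing on the first miss. A minimal, generic sketch in Python (the helper name and its parameters are illustrative, not vdsm or engine API):

```python
import time


def retry_until(probe, timeout=10.0, interval=0.5):
    """Call probe() until it returns a truthy value or timeout expires.

    probe stands in for any check that may fail while the kernel is
    still assembling devices, e.g. a vgs lookup right after iSCSI login.
    Returns the probe result on success; raises TimeoutError otherwise.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = probe()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(
                "device did not appear within %.1f seconds" % timeout)
        time.sleep(interval)
```

A caller would wrap the failing lookup, e.g. `retry_until(lambda: find_vg(sd_uuid))` (where `find_vg` is a hypothetical stand-in for the vgs query); the trade-off is a bounded delay on the genuinely-missing-domain path versus a spurious failure on the slow-login path.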
I reproduced it with RHEL 8.2. vdsm.log:

1. Starting connectStorageServer:

2020-02-25 10:35:56,467-0500 INFO (jsonrpc/0) [vdsm.api] START connectStorageServer(domType=3, spUUID='00000000-0000-0000-0000-000000000000', conList=[{'password': '********', 'port': '3260', 'iqn': 'iqn.2003-01.org.sparse-local.1', 'connection': '192.168.122.1', 'ipv6_enabled': 'false', 'id': '857486a9-0b97-4095-af46-c6e5db5a3073', 'user': '', 'tpgt': '1'}], options=None) from=::ffff:192.168.122.10,49164, flow_id=a3010be9-9fc8-465f-a7cd-aa985fd71fb3, task_id=108c3679-fbbf-4638-8fe8-5dfee91efbbd (api:48)

2. Logging in to the target:

2020-02-25 10:35:56,664-0500 DEBUG (jsonrpc/0) [common.commands] /usr/bin/taskset --cpu-list 0-1 /usr/bin/sudo -n /sbin/iscsiadm -m node -T iqn.2003-01.org.sparse-local.1 -I default -p 192.168.122.1:3260,1 -l (cwd None) (commands:153)
2020-02-25 10:35:56,870-0500 DEBUG (jsonrpc/0) [common.commands] SUCCESS: <err> = b''; <rc> = 0 (commands:98)

3. Waiting for udev events:

2020-02-25 10:35:57,195-0500 DEBUG (jsonrpc/0) [common.commands] /usr/bin/taskset --cpu-list 0-1 /sbin/udevadm settle --timeout=5 (cwd None) (commands:153)
2020-02-25 10:35:57,260-0500 DEBUG (jsonrpc/0) [common.commands] SUCCESS: <err> = b''; <rc> = 0 (commands:98)

udevadm settle completed, so vdsm assumes that all devices were detected and added to the system.

4. Starting storage refresh:

2020-02-25 10:35:57,260-0500 INFO (jsonrpc/0) [storage.StorageDomainCache] Refreshing storage domain cache (resize=True) (sdc:80)

5. Waiting for udev events again (refreshing storage can discover new devices or changes in existing devices):

2020-02-25 10:35:57,564-0500 DEBUG (jsonrpc/0) [common.commands] /usr/bin/taskset --cpu-list 0-1 /sbin/udevadm settle --timeout=5 (cwd None) (commands:153)
2020-02-25 10:35:57,587-0500 DEBUG (jsonrpc/0) [common.commands] SUCCESS: <err> = b''; <rc> = 0 (commands:98)

6. We have a multipath device without SCSI devices - a leftover from the previously active domain:

2020-02-25 10:35:57,589-0500 WARN (jsonrpc/0) [storage.Multipath] Map '360014053b18095bd13c48158687153a5' has no slaves (multipath:120)

This shows that the kernel did not yet find the SCSI devices backing this multipath device.

7. Refreshing storage done:

2020-02-25 10:35:57,589-0500 INFO (jsonrpc/0) [storage.StorageDomainCache] Refreshing storage domain cache: 0.33 seconds (utils:390)

8. Looking for block storage domains:

2020-02-25 10:35:57,592-0500 DEBUG (jsonrpc/0) [common.commands] /usr/bin/taskset --cpu-list 0-1 /usr/bin/sudo -n /sbin/lvm vgs --config 'devices { preferred_names=["^/dev/mapper/"] ignore_suspended_devices=1 write_cache_state=0 disable_after_error_count=3 filter=["a|^/dev/mapper/360014053b18095bd13c48158687153a5$|", "r|.*|"] hints="none" } global { locking_type=1 prioritise_write_locks=1 wait_for_locks=1 use_lvmetad=0 } backup { retain_min=50 retain_days=0 }' --noheadings --units b --nosuffix --separator '|' --ignoreskippedcluster -o uuid,name,attr,size,free,extent_size,extent_count,free_count,tags,vg_mda_size,vg_mda_free,lv_count,pv_count,pv_name (cwd None) (commands:153)
2020-02-25 10:35:57,771-0500 DEBUG (jsonrpc/0) [common.commands] SUCCESS: <err> = b''; <rc> = 0 (commands:185)
2020-02-25 10:35:57,771-0500 DEBUG (jsonrpc/0) [storage.HSM] Found SD uuids: () (hsm:2384)

No domains were found, because the multipath device 360014053b18095bd13c48158687153a5 is not ready yet.

9. Vdsm knows about the domain 91630622-c645-4397-a9fe-9ddf26690500, since it was active a few minutes ago:

2020-02-25 10:35:57,772-0500 DEBUG (jsonrpc/0) [storage.HSM] knownSDs: {91630622-c645-4397-a9fe-9ddf26690500: vdsm.storage.blockSD.findDomain, 55255570-983a-4d82-907a-19b964abf7ed: vdsm.storage.nfsSD.findDomain} (hsm:2451)

10. Connecting to storage succeeded, but vdsm did not find any block storage domain:

2020-02-25 10:35:57,772-0500 INFO (jsonrpc/0) [vdsm.api] FINISH connectStorageServer return={'statuslist': [{'id': '857486a9-0b97-4095-af46-c6e5db5a3073', 'status': 0}]} from=::ffff:192.168.122.10,49164, flow_id=a3010be9-9fc8-465f-a7cd-aa985fd71fb3, task_id=108c3679-fbbf-4638-8fe8-5dfee91efbbd (api:54)
2020-02-25 10:35:57,773-0500 INFO (jsonrpc/0) [jsonrpc.JsonRpcServer] RPC call StoragePool.connectStorageServer succeeded in 1.30 seconds (__init__:312)

11. The engine tries to get storage domain info for 91630622-c645-4397-a9fe-9ddf26690500:

2020-02-25 10:35:57,788-0500 INFO (jsonrpc/2) [vdsm.api] START getStorageDomainInfo(sdUUID='91630622-c645-4397-a9fe-9ddf26690500', options=None) from=::ffff:192.168.122.10,49164, flow_id=a3010be9-9fc8-465f-a7cd-aa985fd71fb3, task_id=15f49771-7c4a-4a44-95ec-781b827c846b (api:48)

12. Checking the VG:

2020-02-25 10:35:57,789-0500 DEBUG (jsonrpc/2) [common.commands] /usr/bin/taskset --cpu-list 0-1 /usr/bin/sudo -n /sbin/lvm vgck --config 'devices { preferred_names=["^/dev/mapper/"] ignore_suspended_devices=1 write_cache_state=0 disable_after_error_count=3 filter=["a|^/dev/mapper/360014053b18095bd13c48158687153a5$|", "r|.*|"] hints="none" } global { locking_type=1 prioritise_write_locks=1 wait_for_locks=1 use_lvmetad=0 } backup { retain_min=50 retain_days=0 }' 91630622-c645-4397-a9fe-9ddf26690500 (cwd None) (commands:153)
2020-02-25 10:35:58,083-0500 DEBUG (jsonrpc/2) [common.commands] FAILED: <err> = b' Volume group "91630622-c645-4397-a9fe-9ddf26690500" not found.\n Cannot process volume group 91630622-c645-4397-a9fe-9ddf26690500\n'; <rc> = 5 (commands:185)

The command failed because the VG is not accessible yet.
2020-02-25 10:35:58,086-0500 INFO (jsonrpc/2) [vdsm.api] FINISH getStorageDomainInfo error=Domain is either partially accessible or entirely inaccessible: ('91630622-c645-4397-a9fe-9ddf26690500: [\' Volume group "91630622-c645-4397-a9fe-9ddf26690500" not found.\', \' Cannot process volume group 91630622-c645-4397-a9fe-9ddf26690500\']',) from=::ffff:192.168.122.10,49164, flow_id=a3010be9-9fc8-465f-a7cd-aa985fd71fb3, task_id=15f49771-7c4a-4a44-95ec-781b827c846b (api:52)
2020-02-25 10:35:58,086-0500 ERROR (jsonrpc/2) [storage.TaskManager.Task] (Task='15f49771-7c4a-4a44-95ec-781b827c846b') Unexpected error (task:874)
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/vdsm/storage/task.py", line 881, in _run
    return fn(*args, **kargs)
  File "<decorator-gen-129>", line 2, in getStorageDomainInfo
  File "/usr/lib/python3.6/site-packages/vdsm/common/api.py", line 50, in method
    ret = func(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/vdsm/storage/hsm.py", line 2752, in getStorageDomainInfo
    dom = self.validateSdUUID(sdUUID)
  File "/usr/lib/python3.6/site-packages/vdsm/storage/hsm.py", line 310, in validateSdUUID
    sdDom.validate()
  File "/usr/lib/python3.6/site-packages/vdsm/storage/blockSD.py", line 1182, in validate
    lvm.chkVG(self.sdUUID)
  File "/usr/lib/python3.6/site-packages/vdsm/storage/lvm.py", line 1247, in chkVG
    raise se.StorageDomainAccessError("%s: %s" % (vgName, err))
vdsm.storage.exception.StorageDomainAccessError: Domain is either partially accessible or entirely inaccessible: ('91630622-c645-4397-a9fe-9ddf26690500: [\' Volume group "91630622-c645-4397-a9fe-9ddf26690500" not found.\', \' Cannot process volume group 91630622-c645-4397-a9fe-9ddf26690500\']',)

Looking in journalctl we see:

# journalctl -n 1000 -o short-precise
...
Feb 25 10:35:56.744699 host4 sudo[7793]: pam_unix(sudo:session): session opened for user root by (uid=0) Feb 25 10:35:56.753902 host4 kernel: scsi host8: iSCSI Initiator over TCP/IP Feb 25 10:35:56.754157 host4 iscsid[1030]: iscsid: Could not set session9 priority. READ/WRITE throughout and latency could be affected. Feb 25 10:35:56.758676 host4 iscsid[1030]: iscsid: Connection9:0 to [target: iqn.2003-01.org.sparse-local.1, portal: 192.168.122.1,3260] through [iface: default] is operati> Feb 25 10:35:56.761338 host4 kernel: scsi 8:0:0:0: Direct-Access LIO-ORG 1-00 4.0 PQ: 0 ANSI: 5 Feb 25 10:35:56.761623 host4 kernel: scsi 8:0:0:0: alua: supports implicit and explicit TPGS Feb 25 10:35:56.761860 host4 kernel: scsi 8:0:0:0: alua: device naa.60014053b18095bd13c48158687153a5 port group 0 rel port 1 Feb 25 10:35:56.762971 host4 kernel: sd 8:0:0:0: Attached scsi generic sg2 type 0 Feb 25 10:35:56.767529 host4 kernel: sd 8:0:0:0: alua: port group 00 state A non-preferred supports TOlUSNA Feb 25 10:35:56.767753 host4 kernel: scsi 8:0:0:9: Direct-Access LIO-ORG 1-09 4.0 PQ: 0 ANSI: 5 Feb 25 10:35:56.767908 host4 kernel: sd 8:0:0:0: [sdl] 209715200 512-byte logical blocks: (107 GB/100 GiB) Feb 25 10:35:56.768035 host4 kernel: sd 8:0:0:0: [sdl] Write Protect is off Feb 25 10:35:56.768157 host4 kernel: sd 8:0:0:0: [sdl] Mode Sense: 43 00 00 08 Feb 25 10:35:56.768279 host4 kernel: scsi 8:0:0:9: alua: supports implicit and explicit TPGS Feb 25 10:35:56.768443 host4 kernel: scsi 8:0:0:9: alua: device naa.6001405d3fa4e536e3842b2b17647ff1 port group 0 rel port 1 Feb 25 10:35:56.768574 host4 kernel: sd 8:0:0:0: [sdl] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA Feb 25 10:35:56.768698 host4 kernel: sd 8:0:0:0: [sdl] Optimal transfer size 8388608 bytes Feb 25 10:35:56.772790 host4 kernel: sd 8:0:0:9: Attached scsi generic sg3 type 0 Feb 25 10:35:56.775382 host4 kernel: sd 8:0:0:9: alua: port group 00 state A non-preferred supports TOlUSNA Feb 25 
10:35:56.775557 host4 kernel: sd 8:0:0:9: [sdm] 209715200 512-byte logical blocks: (107 GB/100 GiB) Feb 25 10:35:56.775683 host4 kernel: sd 8:0:0:9: [sdm] Write Protect is off Feb 25 10:35:56.775819 host4 kernel: sd 8:0:0:9: [sdm] Mode Sense: 43 00 00 08 Feb 25 10:35:56.775948 host4 kernel: sd 8:0:0:9: [sdm] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA Feb 25 10:35:56.776072 host4 kernel: sd 8:0:0:9: [sdm] Optimal transfer size 8388608 bytes Feb 25 10:35:56.778994 host4 kernel: scsi 8:0:0:8: Direct-Access LIO-ORG 1-08 4.0 PQ: 0 ANSI: 5 Feb 25 10:35:56.781529 host4 kernel: scsi 8:0:0:8: alua: supports implicit and explicit TPGS Feb 25 10:35:56.781802 host4 kernel: scsi 8:0:0:8: alua: device naa.6001405dc0840c200954f2ba05597371 port group 0 rel port 1 Feb 25 10:35:56.782031 host4 kernel: sd 8:0:0:8: Attached scsi generic sg4 type 0 Feb 25 10:35:56.784497 host4 kernel: sd 8:0:0:0: [sdl] Attached SCSI disk Feb 25 10:35:56.788418 host4 kernel: sd 8:0:0:8: [sdn] 209715200 512-byte logical blocks: (107 GB/100 GiB) Feb 25 10:35:56.788674 host4 kernel: scsi 8:0:0:7: Direct-Access LIO-ORG 1-07 4.0 PQ: 0 ANSI: 5 Feb 25 10:35:56.788789 host4 kernel: sd 8:0:0:8: [sdn] Write Protect is off Feb 25 10:35:56.788912 host4 kernel: sd 8:0:0:8: [sdn] Mode Sense: 43 00 00 08 Feb 25 10:35:56.789033 host4 kernel: sd 8:0:0:8: alua: port group 00 state A non-preferred supports TOlUSNA Feb 25 10:35:56.789152 host4 kernel: sd 8:0:0:8: [sdn] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA Feb 25 10:35:56.791512 host4 kernel: sd 8:0:0:8: [sdn] Optimal transfer size 8388608 bytes Feb 25 10:35:56.793474 host4 kernel: scsi 8:0:0:7: alua: supports implicit and explicit TPGS Feb 25 10:35:56.793677 host4 kernel: scsi 8:0:0:7: alua: device naa.600140569f55ca7b11e410980ee795bc port group 0 rel port 1 Feb 25 10:35:56.793833 host4 kernel: sd 8:0:0:7: Attached scsi generic sg5 type 0 Feb 25 10:35:56.798585 host4 kernel: sd 8:0:0:9: [sdm] Attached SCSI disk 
Feb 25 10:35:56.798823 host4 kernel: scsi 8:0:0:6: Direct-Access LIO-ORG 1-06 4.0 PQ: 0 ANSI: 5 Feb 25 10:35:56.799050 host4 kernel: sd 8:0:0:7: alua: port group 00 state A non-preferred supports TOlUSNA Feb 25 10:35:56.799197 host4 kernel: scsi 8:0:0:6: alua: supports implicit and explicit TPGS Feb 25 10:35:56.799336 host4 kernel: scsi 8:0:0:6: alua: device naa.6001405183d33921b85475093bfeb1e2 port group 0 rel port 1 Feb 25 10:35:56.799518 host4 kernel: sd 8:0:0:7: [sdo] 209715200 512-byte logical blocks: (107 GB/100 GiB) Feb 25 10:35:56.799638 host4 kernel: sd 8:0:0:7: [sdo] Write Protect is off Feb 25 10:35:56.799787 host4 kernel: sd 8:0:0:7: [sdo] Mode Sense: 43 00 00 08 Feb 25 10:35:56.799936 host4 kernel: sd 8:0:0:6: Attached scsi generic sg6 type 0 Feb 25 10:35:56.800085 host4 kernel: sd 8:0:0:7: [sdo] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA Feb 25 10:35:56.801625 host4 kernel: sd 8:0:0:7: [sdo] Optimal transfer size 8388608 bytes Feb 25 10:35:56.806502 host4 kernel: scsi 8:0:0:5: Direct-Access LIO-ORG 1-05 4.0 PQ: 0 ANSI: 5 Feb 25 10:35:56.806599 host4 kernel: sd 8:0:0:6: alua: port group 00 state A non-preferred supports TOlUSNA Feb 25 10:35:56.806782 host4 kernel: sd 8:0:0:6: [sdp] 209715200 512-byte logical blocks: (107 GB/100 GiB) Feb 25 10:35:56.808551 host4 kernel: sd 8:0:0:6: [sdp] Write Protect is off Feb 25 10:35:56.808706 host4 kernel: sd 8:0:0:6: [sdp] Mode Sense: 43 00 00 08 Feb 25 10:35:56.810000 host4 kernel: sd 8:0:0:6: [sdp] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA Feb 25 10:35:56.810150 host4 kernel: scsi 8:0:0:5: alua: supports implicit and explicit TPGS Feb 25 10:35:56.810291 host4 kernel: scsi 8:0:0:5: alua: device naa.60014050441ed7bb5004603965c1510a port group 0 rel port 1 Feb 25 10:35:56.810424 host4 kernel: sd 8:0:0:6: [sdp] Optimal transfer size 8388608 bytes Feb 25 10:35:56.813717 host4 kernel: sd 8:0:0:5: Attached scsi generic sg7 type 0 Feb 25 10:35:56.818585 host4 
kernel: scsi 8:0:0:4: Direct-Access LIO-ORG 1-04 4.0 PQ: 0 ANSI: 5 Feb 25 10:35:56.818672 host4 kernel: sd 8:0:0:5: alua: port group 00 state A non-preferred supports TOlUSNA Feb 25 10:35:56.822515 host4 kernel: sd 8:0:0:8: [sdn] Attached SCSI disk Feb 25 10:35:56.822792 host4 kernel: scsi 8:0:0:4: alua: supports implicit and explicit TPGS Feb 25 10:35:56.822944 host4 kernel: scsi 8:0:0:4: alua: device naa.60014055eee40bf770c4bd9a7a120238 port group 0 rel port 1 Feb 25 10:35:56.824548 host4 kernel: sd 8:0:0:4: Attached scsi generic sg8 type 0 Feb 25 10:35:56.826423 host4 kernel: sd 8:0:0:5: [sdq] 209715200 512-byte logical blocks: (107 GB/100 GiB) Feb 25 10:35:56.828614 host4 kernel: sd 8:0:0:5: [sdq] Write Protect is off Feb 25 10:35:56.828869 host4 kernel: sd 8:0:0:5: [sdq] Mode Sense: 43 00 00 08 Feb 25 10:35:56.828994 host4 kernel: scsi 8:0:0:3: Direct-Access LIO-ORG 1-03 4.0 PQ: 0 ANSI: 5 Feb 25 10:35:56.830514 host4 kernel: sd 8:0:0:4: alua: port group 00 state A non-preferred supports TOlUSNA Feb 25 10:35:56.832072 host4 kernel: sd 8:0:0:5: [sdq] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA Feb 25 10:35:56.833754 host4 kernel: sd 8:0:0:5: [sdq] Optimal transfer size 8388608 bytes Feb 25 10:35:56.833967 host4 kernel: sd 8:0:0:4: [sdr] 209715200 512-byte logical blocks: (107 GB/100 GiB) Feb 25 10:35:56.834127 host4 kernel: scsi 8:0:0:3: alua: supports implicit and explicit TPGS Feb 25 10:35:56.834266 host4 kernel: scsi 8:0:0:3: alua: device naa.60014057ae80179b71e42fe884794e64 port group 0 rel port 1 Feb 25 10:35:56.837314 host4 kernel: sd 8:0:0:4: [sdr] Write Protect is off Feb 25 10:35:56.837477 host4 kernel: sd 8:0:0:4: [sdr] Mode Sense: 43 00 00 08 Feb 25 10:35:56.837591 host4 kernel: sd 8:0:0:3: Attached scsi generic sg9 type 0 Feb 25 10:35:56.842546 host4 kernel: sd 8:0:0:4: [sdr] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA Feb 25 10:35:56.848960 host4 kernel: sd 8:0:0:3: alua: port group 00 state A 
non-preferred supports TOlUSNA Feb 25 10:35:56.852648 host4 kernel: scsi 8:0:0:2: Direct-Access LIO-ORG 1-02 4.0 PQ: 0 ANSI: 5 Feb 25 10:35:56.852855 host4 kernel: sd 8:0:0:4: [sdr] Optimal transfer size 8388608 bytes Feb 25 10:35:56.852974 host4 kernel: sd 8:0:0:7: [sdo] Attached SCSI disk Feb 25 10:35:56.853089 host4 kernel: sd 8:0:0:3: [sds] 209715200 512-byte logical blocks: (107 GB/100 GiB) Feb 25 10:35:56.853201 host4 kernel: sd 8:0:0:3: [sds] Write Protect is off Feb 25 10:35:56.853310 host4 kernel: sd 8:0:0:3: [sds] Mode Sense: 43 00 00 08 Feb 25 10:35:56.853456 host4 kernel: scsi 8:0:0:2: alua: supports implicit and explicit TPGS Feb 25 10:35:56.853699 host4 kernel: scsi 8:0:0:2: alua: device naa.6001405271fe76b24b542bf858aaeef7 port group 0 rel port 1 Feb 25 10:35:56.853903 host4 kernel: sd 8:0:0:2: Attached scsi generic sg10 type 0 Feb 25 10:35:56.862773 host4 kernel: sd 8:0:0:3: [sds] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA Feb 25 10:35:56.863016 host4 kernel: sd 8:0:0:6: [sdp] Attached SCSI disk Feb 25 10:35:56.863219 host4 kernel: sd 8:0:0:3: [sds] Optimal transfer size 8388608 bytes Feb 25 10:35:56.863465 host4 kernel: scsi 8:0:0:1: Direct-Access LIO-ORG 1-01 4.0 PQ: 0 ANSI: 5 Feb 25 10:35:56.863719 host4 kernel: scsi 8:0:0:1: alua: supports implicit and explicit TPGS Feb 25 10:35:56.863968 host4 kernel: scsi 8:0:0:1: alua: device naa.60014051ce5179112ae4fb98e72d9ba9 port group 0 rel port 1 Feb 25 10:35:56.864147 host4 kernel: sd 8:0:0:2: alua: port group 00 state A non-preferred supports TOlUSNA Feb 25 10:35:56.864342 host4 kernel: sd 8:0:0:5: [sdq] Attached SCSI disk [iscsiadm login returned here] Feb 25 10:35:56.871830 host4 kernel: sd 8:0:0:1: [sdu] 209715200 512-byte logical blocks: (107 GB/100 GiB) Feb 25 10:35:56.872069 host4 kernel: sd 8:0:0:4: [sdr] Attached SCSI disk Feb 25 10:35:56.872284 host4 kernel: sd 8:0:0:1: [sdu] Write Protect is off Feb 25 10:35:56.872505 host4 kernel: sd 8:0:0:1: [sdu] Mode Sense: 43 
00 00 08 Feb 25 10:35:56.872723 host4 kernel: sd 8:0:0:1: [sdu] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA Feb 25 10:35:56.872953 host4 kernel: sd 8:0:0:1: [sdu] Optimal transfer size 8388608 bytes Feb 25 10:35:56.873152 host4 kernel: sd 8:0:0:1: Attached scsi generic sg11 type 0 Feb 25 10:35:56.873355 host4 kernel: sd 8:0:0:1: alua: port group 00 state A non-preferred supports TOlUSNA Feb 25 10:35:56.885149 host4 kernel: sd 8:0:0:2: [sdt] 209715200 512-byte logical blocks: (107 GB/100 GiB) Feb 25 10:35:56.857096 host4 rsyslogd[1258]: imjournal: journal files changed, reloading... [v8.1911.0-3.el8 try https://www.rsyslog.com/e/0 ] Feb 25 10:35:56.867609 host4 sudo[7793]: pam_unix(sudo:session): session closed for user root Feb 25 10:35:56.889977 host4 kernel: sd 8:0:0:2: [sdt] Write Protect is off Feb 25 10:35:56.890177 host4 kernel: sd 8:0:0:2: [sdt] Mode Sense: 43 00 00 08 Feb 25 10:35:56.908721 host4 kernel: sd 8:0:0:2: [sdt] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA Feb 25 10:35:56.912030 host4 systemd[1]: user-runtime-dir: Unit not needed anymore. Stopping. Feb 25 10:35:56.912064 host4 systemd[1]: Stopping User Manager for UID 0... Feb 25 10:35:56.913837 host4 systemd[7786]: Stopped target Default. Feb 25 10:35:56.915580 host4 kernel: sd 8:0:0:2: [sdt] Optimal transfer size 8388608 bytes Feb 25 10:35:56.913922 host4 systemd[7786]: Stopping D-Bus User Message Bus... Feb 25 10:35:56.924746 host4 systemd[7786]: Stopped D-Bus User Message Bus. Feb 25 10:35:56.926287 host4 systemd[7786]: Stopped target Basic System. Feb 25 10:35:56.927385 host4 systemd[7786]: Stopped target Sockets. Feb 25 10:35:56.927430 host4 systemd[7786]: Stopped target Timers. Feb 25 10:35:56.927440 host4 systemd[7786]: Stopped target Paths. Feb 25 10:35:56.927911 host4 systemd[7786]: Closed D-Bus User Message Bus Socket. Feb 25 10:35:56.927953 host4 systemd[7786]: Reached target Shutdown. 
Feb 25 10:35:56.929657 host4 systemd[7786]: Starting Exit the Session... Feb 25 10:35:56.931812 host4 kernel: sd 8:0:0:3: [sds] Attached SCSI disk Feb 25 10:35:56.932097 host4 kernel: sd 8:0:0:1: [sdu] Attached SCSI disk Feb 25 10:35:56.953212 host4 systemd[1]: user-runtime-dir: Unit not needed anymore. Stopping. Feb 25 10:35:56.966355 host4 systemd[1]: Stopped User Manager for UID 0. Feb 25 10:35:56.966389 host4 systemd[1]: user-runtime-dir: Unit not needed anymore. Stopping. Feb 25 10:35:56.966838 host4 systemd[1]: Removed slice User Slice of UID 0. Feb 25 10:35:56.967836 host4 systemd[1]: Stopping /run/user/0 mount wrapper... Feb 25 10:35:56.980485 host4 multipathd[829]: 360014053b18095bd13c48158687153a5 Last path deleted, disabling queueing Feb 25 10:35:56.982638 host4 multipathd[829]: 360014053b18095bd13c48158687153a5: map in use Feb 25 10:35:56.982638 host4 multipathd[829]: 360014053b18095bd13c48158687153a5: can't flush Feb 25 10:35:56.982638 host4 multipathd[829]: 360014053b18095bd13c48158687153a5: queueing disabled Feb 25 10:35:56.992646 host4 systemd[1]: Stopped /run/user/0 mount wrapper. 
Feb 25 10:35:57.015840 host4 kernel: sd 8:0:0:2: [sdt] Attached SCSI disk Feb 25 10:35:57.021947 host4 kernel: scsi 7:0:0:0: alua: Detached Feb 25 10:35:57.022065 host4 multipathd[829]: 360014053b18095bd13c48158687153a5: load table [0 209715200 multipath 0 1 alua 0 0] Feb 25 10:35:57.022065 host4 multipathd[829]: sdb [8:16]: path removed from map 360014053b18095bd13c48158687153a5 Feb 25 10:35:57.022065 host4 multipathd[829]: 36001405d3fa4e536e3842b2b17647ff1 Last path deleted, disabling queueing Feb 25 10:35:57.028457 host4 kernel: scsi 7:0:0:9: alua: Detached Feb 25 10:35:57.028566 host4 multipathd[829]: 36001405d3fa4e536e3842b2b17647ff1: map flushed Feb 25 10:35:57.028566 host4 multipathd[829]: 36001405d3fa4e536e3842b2b17647ff1: removed map after removing all paths Feb 25 10:35:57.028566 host4 multipathd[829]: 36001405dc0840c200954f2ba05597371 Last path deleted, disabling queueing Feb 25 10:35:57.030665 host4 systemd[1]: Created slice User Slice of UID 0. Feb 25 10:35:57.031518 host4 systemd[1]: Started /run/user/0 mount wrapper. Feb 25 10:35:57.032376 host4 systemd[1]: Starting User Manager for UID 0... Feb 25 10:35:57.053169 host4 multipathd[829]: 36001405dc0840c200954f2ba05597371: map flushed Feb 25 10:35:57.053169 host4 multipathd[829]: 36001405dc0840c200954f2ba05597371: removed map after removing all paths Feb 25 10:35:57.053169 host4 multipathd[829]: 3600140569f55ca7b11e410980ee795bc Last path deleted, disabling queueing Feb 25 10:35:57.053491 host4 kernel: scsi 7:0:0:8: alua: Detached Feb 25 10:35:57.056391 host4 systemd[1]: Started Session c292 of user root. 
Feb 25 10:35:57.080475 host4 kernel: scsi 7:0:0:7: alua: Detached Feb 25 10:35:57.080581 host4 multipathd[829]: 3600140569f55ca7b11e410980ee795bc: map flushed Feb 25 10:35:57.080581 host4 multipathd[829]: 3600140569f55ca7b11e410980ee795bc: removed map after removing all paths Feb 25 10:35:57.080581 host4 multipathd[829]: 36001405183d33921b85475093bfeb1e2 Last path deleted, disabling queueing Feb 25 10:35:57.086715 host4 kernel: scsi 7:0:0:6: alua: Detached Feb 25 10:35:57.086808 host4 multipathd[829]: 36001405183d33921b85475093bfeb1e2: map flushed Feb 25 10:35:57.086808 host4 multipathd[829]: 36001405183d33921b85475093bfeb1e2: removed map after removing all paths Feb 25 10:35:57.086808 host4 multipathd[829]: 360014050441ed7bb5004603965c1510a Last path deleted, disabling queueing Feb 25 10:35:57.108659 host4 systemd[7883]: pam_unix(systemd-user:session): session opened for user root by (uid=0) Feb 25 10:35:57.112485 host4 kernel: scsi 7:0:0:5: alua: Detached Feb 25 10:35:57.112574 host4 multipathd[829]: 360014050441ed7bb5004603965c1510a: map flushed Feb 25 10:35:57.112574 host4 multipathd[829]: 360014050441ed7bb5004603965c1510a: removed map after removing all paths Feb 25 10:35:57.112574 host4 multipathd[829]: 360014055eee40bf770c4bd9a7a120238 Last path deleted, disabling queueing Feb 25 10:35:57.131364 host4 multipathd[829]: 360014055eee40bf770c4bd9a7a120238: map flushed Feb 25 10:35:57.131364 host4 multipathd[829]: 360014055eee40bf770c4bd9a7a120238: removed map after removing all paths Feb 25 10:35:57.131364 host4 multipathd[829]: 360014057ae80179b71e42fe884794e64 Last path deleted, disabling queueing Feb 25 10:35:57.131507 host4 kernel: scsi 7:0:0:4: alua: Detached Feb 25 10:35:57.167567 host4 systemd[7883]: Reached target Timers. Feb 25 10:35:57.167586 host4 systemd[7883]: Reached target Paths. [udevadm settle started here] Feb 25 10:35:57.169356 host4 systemd[7883]: Starting D-Bus User Message Bus Socket. 
Feb 25 10:35:57.179509 host4 systemd[7883]: Listening on D-Bus User Message Bus Socket. Feb 25 10:35:57.179816 host4 systemd[7883]: Reached target Sockets. Feb 25 10:35:57.179829 host4 systemd[7883]: Reached target Basic System. Feb 25 10:35:57.179896 host4 systemd[7883]: Reached target Default. Feb 25 10:35:57.179909 host4 systemd[7883]: Startup finished in 62ms. Feb 25 10:35:57.181058 host4 systemd[1]: Started User Manager for UID 0. Feb 25 10:35:57.183557 host4 sudo[7843]: pam_unix(sudo:session): session opened for user root by (uid=0) Feb 25 10:35:57.189366 host4 kernel: scsi 7:0:0:3: alua: Detached Feb 25 10:35:57.189502 host4 multipathd[829]: 360014057ae80179b71e42fe884794e64: map flushed Feb 25 10:35:57.189502 host4 multipathd[829]: 360014057ae80179b71e42fe884794e64: removed map after removing all paths Feb 25 10:35:57.189502 host4 multipathd[829]: 36001405271fe76b24b542bf858aaeef7 Last path deleted, disabling queueing Feb 25 10:35:57.192910 host4 sudo[7843]: pam_unix(sudo:session): session closed for user root Feb 25 10:35:57.208689 host4 multipathd[829]: 36001405271fe76b24b542bf858aaeef7: map flushed Feb 25 10:35:57.208689 host4 multipathd[829]: 36001405271fe76b24b542bf858aaeef7: removed map after removing all paths Feb 25 10:35:57.208689 host4 multipathd[829]: 360014051ce5179112ae4fb98e72d9ba9 Last path deleted, disabling queueing Feb 25 10:35:57.210279 host4 kernel: scsi 7:0:0:2: alua: Detached Feb 25 10:35:57.220653 host4 multipathd[829]: 360014051ce5179112ae4fb98e72d9ba9: map flushed Feb 25 10:35:57.222326 host4 kernel: scsi 7:0:0:1: alua: Detached Feb 25 10:35:57.231893 host4 multipathd[829]: 360014051ce5179112ae4fb98e72d9ba9: removed map after removing all paths Feb 25 10:35:57.245716 host4 systemd[1]: user-runtime-dir: Unit not needed anymore. Stopping. Feb 25 10:35:57.245750 host4 systemd[1]: Stopping User Manager for UID 0... Feb 25 10:35:57.247339 host4 systemd[7883]: Stopped target Default. 
Feb 25 10:35:57.247355 host4 systemd[7883]: Stopped target Basic System. Feb 25 10:35:57.247364 host4 systemd[7883]: Stopped target Timers. Feb 25 10:35:57.247371 host4 systemd[7883]: Stopped target Paths. Feb 25 10:35:57.247377 host4 systemd[7883]: Stopped target Sockets. Feb 25 10:35:57.247847 host4 systemd[7883]: Closed D-Bus User Message Bus Socket. Feb 25 10:35:57.247868 host4 systemd[7883]: Reached target Shutdown. Feb 25 10:35:57.248670 host4 systemd[7883]: Starting Exit the Session... [udevadm settle returned here] Feb 25 10:35:57.265233 host4 systemd[1]: user-runtime-dir: Unit not needed anymore. Stopping. Feb 25 10:35:57.276159 host4 systemd[1]: Stopped User Manager for UID 0. Feb 25 10:35:57.276216 host4 systemd[1]: user-runtime-dir: Unit not needed anymore. Stopping. Feb 25 10:35:57.278843 host4 systemd[1]: Stopping /run/user/0 mount wrapper... Feb 25 10:35:57.279268 host4 systemd[1]: Removed slice User Slice of UID 0. Feb 25 10:35:57.299980 host4 systemd[1]: Stopped /run/user/0 mount wrapper. Feb 25 10:35:57.320605 host4 systemd[1]: Created slice User Slice of UID 0. Feb 25 10:35:57.321713 host4 systemd[1]: Started /run/user/0 mount wrapper. Feb 25 10:35:57.322763 host4 systemd[1]: Starting User Manager for UID 0... Feb 25 10:35:57.346081 host4 systemd[1]: Started Session c293 of user root. Feb 25 10:35:57.370464 host4 systemd[7964]: pam_unix(systemd-user:session): session opened for user root by (uid=0) Feb 25 10:35:57.420304 host4 systemd[7964]: Reached target Timers. Feb 25 10:35:57.424201 host4 systemd[7964]: Starting D-Bus User Message Bus Socket. Feb 25 10:35:57.424270 host4 systemd[7964]: Reached target Paths. Feb 25 10:35:57.441208 host4 systemd[7964]: Listening on D-Bus User Message Bus Socket. Feb 25 10:35:57.441304 host4 systemd[7964]: Reached target Sockets. Feb 25 10:35:57.441331 host4 systemd[7964]: Reached target Basic System. Feb 25 10:35:57.441521 host4 systemd[7964]: Reached target Default. 
Feb 25 10:35:57.441544 host4 systemd[7964]: Startup finished in 60ms. Feb 25 10:35:57.442334 host4 systemd[1]: Started User Manager for UID 0. Feb 25 10:35:57.450083 host4 sudo[7956]: pam_unix(sudo:session): session opened for user root by (uid=0) Feb 25 10:35:57.461238 host4 kernel: sd 8:0:0:0: alua: supports implicit and explicit TPGS Feb 25 10:35:57.463531 host4 kernel: sd 8:0:0:0: alua: device naa.60014053b18095bd13c48158687153a5 port group 0 rel port 1 Feb 25 10:35:57.464809 host4 kernel: sd 8:0:0:1: alua: supports implicit and explicit TPGS Feb 25 10:35:57.465449 host4 kernel: sd 8:0:0:1: alua: device naa.60014051ce5179112ae4fb98e72d9ba9 port group 0 rel port 1 Feb 25 10:35:57.466072 host4 kernel: sd 8:0:0:2: alua: supports implicit and explicit TPGS Feb 25 10:35:57.466607 host4 kernel: sd 8:0:0:2: alua: device naa.6001405271fe76b24b542bf858aaeef7 port group 0 rel port 1 Feb 25 10:35:57.467105 host4 kernel: sd 8:0:0:0: alua: port group 00 state A non-preferred supports TOlUSNA Feb 25 10:35:57.467589 host4 kernel: sd 8:0:0:3: alua: supports implicit and explicit TPGS Feb 25 10:35:57.468085 host4 kernel: sd 8:0:0:3: alua: device naa.60014057ae80179b71e42fe884794e64 port group 0 rel port 1 Feb 25 10:35:57.468586 host4 kernel: sd 8:0:0:4: alua: supports implicit and explicit TPGS Feb 25 10:35:57.469004 host4 kernel: sd 8:0:0:4: alua: device naa.60014055eee40bf770c4bd9a7a120238 port group 0 rel port 1 Feb 25 10:35:57.469984 host4 kernel: sd 8:0:0:5: alua: supports implicit and explicit TPGS Feb 25 10:35:57.470468 host4 kernel: sd 8:0:0:5: alua: device naa.60014050441ed7bb5004603965c1510a port group 0 rel port 1 Feb 25 10:35:57.471041 host4 kernel: sd 8:0:0:1: alua: port group 00 state A non-preferred supports TOlUSNA Feb 25 10:35:57.471490 host4 kernel: sd 8:0:0:6: alua: supports implicit and explicit TPGS Feb 25 10:35:57.471919 host4 kernel: sd 8:0:0:6: alua: device naa.6001405183d33921b85475093bfeb1e2 port group 0 rel port 1 Feb 25 10:35:57.472356 host4 kernel: 
sd 8:0:0:2: alua: port group 00 state A non-preferred supports TOlUSNA Feb 25 10:35:57.472809 host4 kernel: sd 8:0:0:7: alua: supports implicit and explicit TPGS Feb 25 10:35:57.473299 host4 kernel: sd 8:0:0:7: alua: device naa.600140569f55ca7b11e410980ee795bc port group 0 rel port 1 Feb 25 10:35:57.473791 host4 kernel: sd 8:0:0:3: alua: port group 00 state A non-preferred supports TOlUSNA Feb 25 10:35:57.474186 host4 kernel: sd 8:0:0:4: alua: port group 00 state A non-preferred supports TOlUSNA Feb 25 10:35:57.474707 host4 kernel: sd 8:0:0:8: alua: supports implicit and explicit TPGS Feb 25 10:35:57.475075 host4 kernel: sd 8:0:0:8: alua: device naa.6001405dc0840c200954f2ba05597371 port group 0 rel port 1 Feb 25 10:35:57.475488 host4 kernel: sd 8:0:0:9: alua: supports implicit and explicit TPGS Feb 25 10:35:57.476411 host4 kernel: sd 8:0:0:9: alua: device naa.6001405d3fa4e536e3842b2b17647ff1 port group 0 rel port 1 Feb 25 10:35:57.476574 host4 kernel: sd 8:0:0:5: alua: port group 00 state A non-preferred supports TOlUSNA Feb 25 10:35:57.480927 host4 kernel: sd 8:0:0:6: alua: port group 00 state A non-preferred supports TOlUSNA Feb 25 10:35:57.478572 host4 sudo[7956]: pam_unix(sudo:session): session closed for user root Feb 25 10:35:57.487437 host4 kernel: sd 8:0:0:7: alua: port group 00 state A non-preferred supports TOlUSNA Feb 25 10:35:57.487648 host4 kernel: sd 8:0:0:8: alua: port group 00 state A non-preferred supports TOlUSNA Feb 25 10:35:57.487842 host4 kernel: sd 8:0:0:9: alua: port group 00 state A non-preferred supports TOlUSNA Feb 25 10:35:57.501019 host4 systemd[1]: user-runtime-dir: Unit not needed anymore. Stopping. Feb 25 10:35:57.501040 host4 systemd[1]: Stopping User Manager for UID 0... Feb 25 10:35:57.502885 host4 systemd[7964]: Stopped target Default. Feb 25 10:35:57.502905 host4 systemd[7964]: Stopped target Basic System. Feb 25 10:35:57.502914 host4 systemd[7964]: Stopped target Paths. 
Feb 25 10:35:57.502923 host4 systemd[7964]: Stopped target Sockets. Feb 25 10:35:57.503243 host4 systemd[7964]: Closed D-Bus User Message Bus Socket. Feb 25 10:35:57.503265 host4 systemd[7964]: Reached target Shutdown. Feb 25 10:35:57.504243 host4 systemd[7964]: Starting Exit the Session... Feb 25 10:35:57.504267 host4 systemd[7964]: Stopped target Timers. Feb 25 10:35:57.518783 host4 systemd[1]: user-runtime-dir: Unit not needed anymore. Stopping. Feb 25 10:35:57.524670 host4 systemd[1]: Stopped User Manager for UID 0. Feb 25 10:35:57.524717 host4 systemd[1]: user-runtime-dir: Unit not needed anymore. Stopping. Feb 25 10:35:57.526307 host4 systemd[1]: Removed slice User Slice of UID 0. Feb 25 10:35:57.527140 host4 systemd[1]: Stopping /run/user/0 mount wrapper... Feb 25 10:35:57.553360 host4 systemd[1]: Stopped /run/user/0 mount wrapper. Feb 25 10:35:57.629344 host4 systemd[1]: Started /run/user/0 mount wrapper. Feb 25 10:35:57.629693 host4 systemd[1]: Created slice User Slice of UID 0. Feb 25 10:35:57.630391 host4 systemd[1]: Starting User Manager for UID 0... Feb 25 10:35:57.640320 host4 systemd[1]: Started Session c294 of user root. Feb 25 10:35:57.669948 host4 systemd[7990]: pam_unix(systemd-user:session): session opened for user root by (uid=0) Feb 25 10:35:57.709664 host4 systemd[7990]: Reached target Paths. Feb 25 10:35:57.709712 host4 systemd[7990]: Reached target Timers. Feb 25 10:35:57.712521 host4 systemd[7990]: Starting D-Bus User Message Bus Socket. Feb 25 10:35:57.720170 host4 systemd[7990]: Listening on D-Bus User Message Bus Socket. Feb 25 10:35:57.720255 host4 systemd[7990]: Reached target Sockets. Feb 25 10:35:57.720271 host4 systemd[7990]: Reached target Basic System. Feb 25 10:35:57.720365 host4 systemd[7990]: Reached target Default. Feb 25 10:35:57.720385 host4 systemd[7990]: Startup finished in 39ms. Feb 25 10:35:57.721144 host4 systemd[1]: Started User Manager for UID 0. 
Feb 25 10:35:57.724680 host4 sudo[7987]: pam_unix(sudo:session): session opened for user root by (uid=0) Feb 25 10:35:57.750554 host4 kernel: blk_update_request: I/O error, dev dm-2, sector 0 op 0x0:(READ) flags 0x0 phys_seg 32 prio class 0 Feb 25 10:35:57.768724 host4 sudo[7987]: pam_unix(sudo:session): session closed for user root Feb 25 10:35:57.819197 host4 systemd[1]: user-runtime-dir: Unit not needed anymore. Stopping. Feb 25 10:35:57.819226 host4 systemd[1]: Stopping User Manager for UID 0... Feb 25 10:35:57.825349 host4 systemd[7990]: Stopped target Default. Feb 25 10:35:57.825441 host4 systemd[7990]: Stopped target Basic System. Feb 25 10:35:57.825452 host4 systemd[7990]: Stopped target Paths. Feb 25 10:35:57.825461 host4 systemd[7990]: Stopped target Sockets. Feb 25 10:35:57.826057 host4 systemd[7990]: Closed D-Bus User Message Bus Socket. Feb 25 10:35:57.826083 host4 systemd[7990]: Reached target Shutdown. Feb 25 10:35:57.827585 host4 systemd[7990]: Starting Exit the Session... Feb 25 10:35:57.827660 host4 systemd[7990]: Stopped target Timers. Feb 25 10:35:57.844030 host4 systemd[1]: user: Killing process 8003 (systemctl) with signal SIGKILL. Feb 25 10:35:57.844182 host4 systemd[1]: user-runtime-dir: Unit not needed anymore. Stopping. Feb 25 10:35:57.851944 host4 systemd[1]: Stopped User Manager for UID 0. Feb 25 10:35:57.851989 host4 systemd[1]: user-runtime-dir: Unit not needed anymore. Stopping. Feb 25 10:35:57.858952 host4 systemd[1]: Stopping /run/user/0 mount wrapper... Feb 25 10:35:57.859287 host4 systemd[1]: Removed slice User Slice of UID 0. Feb 25 10:35:57.877237 host4 systemd[1]: Stopped /run/user/0 mount wrapper. Feb 25 10:35:57.892716 host4 systemd[1]: Started /run/user/0 mount wrapper. Feb 25 10:35:57.893203 host4 systemd[1]: Created slice User Slice of UID 0. Feb 25 10:35:57.894122 host4 systemd[1]: Starting User Manager for UID 0... Feb 25 10:35:57.930248 host4 systemd[1]: Started Session c295 of user root. 
Feb 25 10:35:57.951999 host4 systemd[8013]: pam_unix(systemd-user:session): session opened for user root by (uid=0) Feb 25 10:35:58.001008 host4 systemd[8013]: Reached target Paths. Feb 25 10:35:58.003483 host4 systemd[8013]: Starting D-Bus User Message Bus Socket. Feb 25 10:35:58.003532 host4 systemd[8013]: Reached target Timers. Feb 25 10:35:58.016299 host4 systemd[8013]: Listening on D-Bus User Message Bus Socket. Feb 25 10:35:58.016373 host4 systemd[8013]: Reached target Sockets. Feb 25 10:35:58.016388 host4 systemd[8013]: Reached target Basic System. Feb 25 10:35:58.016813 host4 systemd[8013]: Reached target Default. Feb 25 10:35:58.016829 host4 systemd[8013]: Startup finished in 52ms. Feb 25 10:35:58.017205 host4 systemd[1]: Started User Manager for UID 0. Feb 25 10:35:58.028047 host4 sudo[8002]: pam_unix(sudo:session): session opened for user root by (uid=0) Feb 25 10:35:58.059712 host4 kernel: blk_update_request: I/O error, dev dm-2, sector 0 op 0x0:(READ) flags 0x0 phys_seg 32 prio class 0 Feb 25 10:35:58.080856 host4 sudo[8002]: pam_unix(sudo:session): session closed for user root Feb 25 10:35:58.102838 host4 systemd[1]: user-runtime-dir: Unit not needed anymore. Stopping. Feb 25 10:35:58.102869 host4 systemd[1]: Stopping User Manager for UID 0... Feb 25 10:35:58.105537 host4 systemd[8013]: Stopped target Default. Feb 25 10:35:58.105564 host4 systemd[8013]: Stopped target Basic System. Feb 25 10:35:58.105573 host4 systemd[8013]: Stopped target Paths. Feb 25 10:35:58.105580 host4 systemd[8013]: Stopped target Timers. Feb 25 10:35:58.105588 host4 systemd[8013]: Stopped target Sockets. Feb 25 10:35:58.106128 host4 systemd[8013]: Closed D-Bus User Message Bus Socket. Feb 25 10:35:58.106152 host4 systemd[8013]: Reached target Shutdown. Feb 25 10:35:58.107273 host4 systemd[8013]: Starting Exit the Session... Feb 25 10:35:58.123186 host4 systemd[1]: user-runtime-dir: Unit not needed anymore. Stopping. 
Feb 25 10:35:58.126662 host4 systemd[1]: Stopped User Manager for UID 0. Feb 25 10:35:58.126722 host4 systemd[1]: user-runtime-dir: Unit not needed anymore. Stopping. Feb 25 10:35:58.128396 host4 systemd[1]: Stopping /run/user/0 mount wrapper... Feb 25 10:35:58.128872 host4 systemd[1]: Removed slice User Slice of UID 0. Feb 25 10:35:58.145481 host4 systemd[1]: Stopped /run/user/0 mount wrapper. Feb 25 10:35:58.265796 host4 multipathd[829]: 360014053b18095bd13c48158687153a5: load table [0 209715200 multipath 1 queue_if_no_path 1 alua 1 1 service-time 0 1 1 8:176 1] Feb 25 10:35:58.265796 host4 multipathd[829]: sdl [8:176]: path added to devmap 360014053b18095bd13c48158687153a5 Feb 25 10:35:58.297793 host4 multipathd[829]: 36001405dc0840c200954f2ba05597371: load table [0 209715200 multipath 1 queue_if_no_path 1 alua 1 1 service-time 0 1 1 8:208 1] Feb 25 10:35:58.297793 host4 multipathd[829]: sdn [8:208]: path added to devmap 36001405dc0840c200954f2ba05597371 Feb 25 10:35:58.297793 host4 multipathd[829]: dm-10: map alias mismatch: have "36001405dc0840c200954f2ba05597371", got "36001405d3fa4e536e3842b2b17647ff1") Feb 25 10:35:58.297793 host4 multipathd[829]: dm-11: devmap not registered, can't remove Feb 25 10:35:58.297793 host4 multipathd[829]: dm-12: devmap not registered, can't remove Feb 25 10:35:58.322001 host4 multipathd[829]: 36001405d3fa4e536e3842b2b17647ff1: load table [0 209715200 multipath 1 queue_if_no_path 1 alua 1 1 service-time 0 1 1 8:192 1] Feb 25 10:35:58.322001 host4 multipathd[829]: sdm [8:192]: path added to devmap 36001405d3fa4e536e3842b2b17647ff1 Feb 25 10:35:58.322862 host4 kernel: sd 8:0:0:8: alua: port group 00 state A non-preferred supports TOlUSNA Feb 25 10:35:58.341989 host4 multipathd[829]: 3600140569f55ca7b11e410980ee795bc: load table [0 209715200 multipath 1 queue_if_no_path 1 alua 1 1 service-time 0 1 1 8:224 1] Feb 25 10:35:58.341989 host4 multipathd[829]: sdo [8:224]: path added to devmap 3600140569f55ca7b11e410980ee795bc Feb 25 
10:35:58.341989 host4 multipathd[829]: dm-18: devmap not registered, can't remove Feb 25 10:35:58.341989 host4 multipathd[829]: dm-14: devmap not registered, can't remove Feb 25 10:35:58.353455 host4 kernel: sd 8:0:0:9: alua: port group 00 state A non-preferred supports TOlUSNA Feb 25 10:35:58.364161 host4 multipathd[829]: 36001405183d33921b85475093bfeb1e2: load table [0 209715200 multipath 1 queue_if_no_path 1 alua 1 1 service-time 0 1 1 8:240 1] Feb 25 10:35:58.364161 host4 multipathd[829]: sdp [8:240]: path added to devmap 36001405183d33921b85475093bfeb1e2 Feb 25 10:35:58.364161 host4 multipathd[829]: dm-13: map alias mismatch: have "36001405183d33921b85475093bfeb1e2", got "360014055eee40bf770c4bd9a7a120238") Feb 25 10:35:58.364161 host4 multipathd[829]: dm-15: devmap not registered, can't remove Feb 25 10:35:58.369526 host4 kernel: sd 8:0:0:7: alua: port group 00 state A non-preferred supports TOlUSNA Feb 25 10:35:58.400968 host4 kernel: sd 8:0:0:6: alua: port group 00 state A non-preferred supports TOlUSNA Feb 25 10:35:58.401291 host4 multipathd[829]: 360014050441ed7bb5004603965c1510a: load table [0 209715200 multipath 1 queue_if_no_path 1 alua 1 1 service-time 0 1 1 65:0 1] Feb 25 10:35:58.401291 host4 multipathd[829]: sdq [65:0]: path added to devmap 360014050441ed7bb5004603965c1510a Feb 25 10:35:58.426525 host4 multipathd[829]: 360014055eee40bf770c4bd9a7a120238: load table [0 209715200 multipath 1 queue_if_no_path 1 alua 1 1 service-time 0 1 1 65:16 1] Feb 25 10:35:58.426525 host4 multipathd[829]: sdr [65:16]: path added to devmap 360014055eee40bf770c4bd9a7a120238 Feb 25 10:35:58.438519 host4 kernel: sd 8:0:0:5: alua: port group 00 state A non-preferred supports TOlUSNA Feb 25 10:35:58.461660 host4 multipathd[829]: 360014057ae80179b71e42fe884794e64: load table [0 209715200 multipath 1 queue_if_no_path 1 alua 1 1 service-time 0 1 1 65:32 1] Feb 25 10:35:58.461660 host4 multipathd[829]: sds [65:32]: path added to devmap 360014057ae80179b71e42fe884794e64 Feb 
25 10:35:58.461660 host4 multipathd[829]: dm-17: devmap not registered, can't remove Feb 25 10:35:58.461660 host4 multipathd[829]: dm-16: map alias mismatch: have "360014057ae80179b71e42fe884794e64", got "360014051ce5179112ae4fb98e72d9ba9") Feb 25 10:35:58.471707 host4 kernel: sd 8:0:0:4: alua: port group 00 state A non-preferred supports TOlUSNA Feb 25 10:35:58.497441 host4 kernel: sd 8:0:0:3: alua: port group 00 state A non-preferred supports TOlUSNA Feb 25 10:35:58.503680 host4 multipathd[829]: 36001405271fe76b24b542bf858aaeef7: load table [0 209715200 multipath 1 queue_if_no_path 1 alua 1 1 service-time 0 1 1 65:48 1] Feb 25 10:35:58.503680 host4 multipathd[829]: sdt [65:48]: path added to devmap 36001405271fe76b24b542bf858aaeef7 Feb 25 10:35:58.537482 host4 kernel: sd 8:0:0:2: alua: port group 00 state A non-preferred supports TOlUSNA Feb 25 10:35:58.539667 host4 multipathd[829]: 360014051ce5179112ae4fb98e72d9ba9: load table [0 209715200 multipath 1 queue_if_no_path 1 alua 1 1 service-time 0 1 1 65:64 1] Feb 25 10:35:58.539667 host4 multipathd[829]: sdu [65:64]: path added to devmap 360014051ce5179112ae4fb98e72d9ba9 Feb 25 10:35:58.580486 host4 kernel: sd 8:0:0:1: alua: port group 00 state A non-preferred supports TOlUSNA

So it looks like multipath took about 2 seconds to discover the new SCSI devices and recover the map 360014057ae80179b71e42fe884794e64. Having a map without devices is related to bug 1544370. We have a patch in review, so this issue should be solved soon.

Vdsm waits on "udevadm settle" after connecting to the iSCSI server, but this wait does not help now, since updating the multipath device with the new devices takes a few seconds after udevadm settle returns. Vdsm has no way to wait for the expected multipath devices, since engine does not tell vdsm which devices should exist after connecting to the storage server.
As a quick fix, we can introduce a short wait after connecting to iSCSI storage, before we refresh storage, so the multipath devices are likely to be updated. This will add a delay to every connectStorageServer call with iSCSI storage.

To avoid this delay, engine needs to send the devices associated with every connection, so vdsm can wait until these devices become available before the call returns. We can introduce this change gradually: old vdsm will ignore an unexpected devices list in the connection info, while new vdsm can use the device list to wait until the storage is actually connected.
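The second approach could be sketched like this; the function and parameter names are hypothetical (this is not vdsm code), and the directory is parameterized only so the sketch is testable on its own:

```python
import os
import time


def wait_for_devices(wwids, timeout=10.0, interval=0.5,
                     devices_dir="/dev/mapper"):
    """Poll until every expected multipath device node exists, or time out.

    Returns the sorted list of WWIDs that never appeared (empty on success).
    """
    deadline = time.monotonic() + timeout
    missing = set(wwids)
    while missing and time.monotonic() < deadline:
        missing = {w for w in missing
                   if not os.path.exists(os.path.join(devices_dir, w))}
        if missing:
            time.sleep(interval)
    return sorted(missing)
```

In the gradual-rollout scheme above, the engine would pass the WWID list in the connection info and vdsm would call something like this before returning from connectStorageServer.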
Raising severity and marking as an automation blocker since it fails on OST CQ jobs. Proposing as a blocker for 4.4.0 beta.
Ben, can you recommend a way to wait until multipath devices are ready after connecting to a new iSCSI target?

It seems that newer kernels (Fedora 30, RHEL 8.2) changed the behavior, and now we need to wait more time until devices are ready after iSCSI login. You can see in comment 1 how the iscsiadm login completes early, before all SCSI devices were added, and how, after all the devices were added, it took about 1 second until the multipath map using these devices was ready.

We have 2 interesting cases:

1. Connecting to a target for the first time. We don't know which devices to expect. After connecting, oVirt lists the devices available on the host to learn which devices are available. If a device is not available because oVirt asked to list the devices while the kernel was busy adding slaves to the multipath map, oVirt will not list the device. The user can try again later. In this case I don't see how we can wait for completion, because we don't know anything about the expected results.

2. Connecting to a target used by an oVirt storage domain or direct LUN. oVirt knows which multipath devices are expected to appear on the system after the connection, since it already discovered these devices on another host. We can poll the system and wait until all the expected devices are ready (e.g. have at least one active path). If some devices are not ready we can time out.

It would be nicer if the kernel provided an event when it finished handling the devices related to a new iSCSI session. For example, since the kernel discovers the devices by sending SCSI commands to the server, it knows whether the last reply was received and whether all newly detected devices were handled.
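Case 2 could be approximated by checking, for each expected device, that its device-mapper node has at least one slave listed in sysfs, a rough stand-in for "at least one active path". The function name is illustrative, not vdsm code, and the roots are parameterized only to make the sketch testable:

```python
import os


def slaves_of(wwid, dev_root="/dev", sys_root="/sys/block"):
    """Return the kernel names of the path devices backing a multipath map.

    Resolves <dev_root>/mapper/<wwid> to its dm-N node and lists
    <sys_root>/dm-N/slaves; an empty result means no paths yet.
    """
    dm_node = os.path.realpath(os.path.join(dev_root, "mapper", wwid))
    slaves_dir = os.path.join(sys_root, os.path.basename(dm_node), "slaves")
    try:
        return sorted(os.listdir(slaves_dir))
    except FileNotFoundError:
        # Map (or its sysfs entry) does not exist yet.
        return []
```

A polling loop would then wait until `slaves_of(wwid)` is non-empty for every expected WWID, or give up after a timeout.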
Without knowing what devices will appear, I don't think there is any way for multipath to know that it has waited long enough. It will simply get uevents after udev has finished processing the devices, and process them as it gets them.

multipathd does have a command

# multipathd show status

which will tell you if it has pending uevents to handle. It would be nice if you could run "udevadm settle" to know that the uevents had completed, and then wait for "multipathd show status" to not be busy, but that won't really work. udevadm settle doesn't (and really, can't) wait until all processes listening for a udev-processed uevent have received it, so there is a window where udev could think it was caught up, and multipathd could think that it was caught up, but in reality there are uevents that multipathd hasn't received yet.

If you know the multipath devices you are waiting on, then you can make something work, since creating the multipath devices will cause uevents. If you wait for a change uevent for the multipath devices, you will know that they are ready to use.

If you could get notified that all new iSCSI devices have been detected, and you know what their device names are, then if you run

# multipathd show devices

and wait till those devices are either listed as "blacklisted" or "monitored", you should be able to know you have waited long enough. This currently requires polling. I might be able to come up with an easier method, but this method will always require knowing what devices you are looking for (by name or major:minor; that's all multipathd can guarantee it will know about them, if they are blacklisted). The benefit over the uevent method is that you don't need to know what multipath devices you are waiting for, just the path devices, which is at least in theory knowable as soon as the devices are created in the kernel.
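The polling described above could look like the sketch below. The exact output shape of `multipathd show devices` (one device per line, name first, state last) is an assumption here and should be verified against the installed multipath-tools version, so the parser is illustrative:

```python
import re
import subprocess
import time

# States that mean multipathd is done deciding about a path device,
# per the discussion above.
SETTLED_STATES = ("blacklisted", "monitored")


def parse_show_devices(output):
    """Map kernel device name -> final state word, e.g. {"sda": "monitored"}.

    Assumes lines like "sda devnode whitelisted, monitored" (an assumed
    format); header or malformed lines are ignored.
    """
    states = {}
    for line in output.splitlines():
        m = re.match(r"\s*(\S+)\s+.*?(\w+)\s*$", line)
        if m:
            states[m.group(1)] = m.group(2)
    return states


def wait_settled(devices, timeout=10.0, interval=0.5):
    """Poll `multipathd show devices` until every device is settled."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        out = subprocess.run(["multipathd", "show", "devices"],
                             capture_output=True, text=True).stdout
        states = parse_show_devices(out)
        if all(states.get(d) in SETTLED_STATES for d in devices):
            return True
        time.sleep(interval)
    return False
```

As noted above, this only helps once the caller already knows which path devices (sdX names) to wait for.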
(In reply to Ben Marzinski from comment #4)
Hi, we are looking into solving this using `multipathd show devices`. From my testing, waiting for blacklisted/monitored seems to resolve our issue. However, there is a concern that `multipathd show devices` will not show all devices immediately after logging into an iSCSI target; is that possible? Can we assume that multiple runs of `multipathd show devices` will not show a different number of devices (assuming nothing changes on the target)?
(In reply to Benny Zlotnik from comment #5)

This command will look through all the devices in /sys/block and give their state. The devices will say that they are unmonitored until multipathd has pulled them into its internal path list. There are a lot of caveats about relying on this, however:

1. Until a device appears in /sys/block, it won't appear on the list. So, if the kernel hasn't finished discovering devices on an iSCSI target, the list won't be complete.

2. A device will be listed as monitored as soon as it is added to the path list. This doesn't mean that a multipath device has been created on it, however. Usually, if a multipath device was supposed to be created on that path device, it will be by the time the device is listed as monitored. However, there are corner cases where a device can be added to the list but multipath will not have created the multipath device yet.

3. It is possible for a device to remain in the "devnode whitelisted, unmonitored" state forever. This can happen if the device is blacklisted by some method other than the "devnode" blacklist rules.

4. Just because a device is listed as monitored doesn't mean a multipath device will ever be created for it. For instance, if a device is not blacklisted but only has a single path, and find_multipaths is enabled, multipathd will continue to monitor that device, waiting for the chance that a second path will appear.

Issues 3 & 4 are only problems assuming that you don't know what path devices you are waiting for.
If you know which path devices should be multipathed, you only need to worry about issues 1 & 2. I have to admit, however, that I didn't think about issue 2 before. It could happen in a case where multipathd wasn't able to get all the necessary information from udev, for instance. Multipathd will keep monitoring the device, and periodically fire off a uevent to try and force udev to update the udev database. This means that for most of the time there will be no outstanding uevents, the path device will exist and will be monitored by multipathd, but no multipath device will exist. However, in this case, waiting for a bit probably won't help either.

If you want a solution that doesn't require knowing which paths you are waiting for, where all you want to do is know that multipathd has processed all the paths that have appeared, you might need to scrape the multipathd log messages for the "<device>: add path (uevent)" messages. These would tell you that multipathd handled the uevent for the device.
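A sketch of that log scraping, for illustration only. The message text is quoted from the comment above, and the journal unit name and the wisdom of relying on log text at all are assumptions; log formats are not a stable interface:

```python
import re
import subprocess

# Matches the "<device>: add path (uevent)" message quoted above.
ADD_PATH_RE = re.compile(r"(\S+): add path \(uevent\)")


def added_paths(log_text):
    """Extract device names from 'add path (uevent)' messages."""
    return set(ADD_PATH_RE.findall(log_text))


def added_paths_from_journal(since="-5min"):
    """Read recent multipathd journal entries and extract added paths.

    Assumes multipathd logs to the journal under the "multipathd" unit.
    """
    out = subprocess.run(
        ["journalctl", "-u", "multipathd", "--since", since, "-o", "cat"],
        capture_output=True, text=True).stdout
    return added_paths(out)
```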
(In reply to Ben Marzinski from comment #6) Looks like we don't have a reliable way to wait unless we know which devices (WWNs) we are waiting for. Scraping multipathd log messages is too fragile.

What if multipathd provided something like:

multipathd settle [timeout seconds]

That would work similarly to udevadm settle: wait until multipathd has processed all the block devices in the system, or the timeout has expired. If we also have a way to wait until all block devices were added to the system after iSCSI login, this would give us a complete solution.
The issue is that right now, multipathd doesn't keep track of the devices that it looked at but doesn't care about. Also, if a device that multipathd is monitoring isn't a multipath device, it knows why, but doesn't expose that to the user. I can probably start tracking the devices multipathd doesn't care about, and add a command that will wait on any events that multipathd expects to happen shortly, such as multipathd getting uevents for devices in /sys/block that it has never seen before or that are missing udev information. Do you want to open a feature request for this?
(In reply to Ben Marzinski from comment #8) Sure, I'll open RFE for this.
Remove the dependency on bug 1544370, as the issue can be solved without it.
Please provide clear verification steps for this BZ
(In reply to Ilan Zuckerman from comment #11) > Please provide clear verification steps for this BZ What is not clear about the steps in the bug description? Just one note: there should be only one iSCSI domain for the target.
After following the reproduction steps described in the BZ description (plus removing all other iSCSI domains in order to remain with only one), I encountered the following error in the vdsm log. Please give your opinion on this. However, the ERROR as specified in the BZ description wasn't seen.

2020-03-29 14:34:26,038+0300 ERROR (monitor/314c31a) [storage.Multipath] Could not resize device 360002ac0000000000000003500021f6b (multipath:111)
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/vdsm/storage/multipath.py", line 109, in resize_devices
    _resize_if_needed(guid)
  File "/usr/lib/python3.6/site-packages/vdsm/storage/multipath.py", line 117, in _resize_if_needed
    for slave in devicemapper.getSlaves(name)]
  File "/usr/lib/python3.6/site-packages/vdsm/storage/multipath.py", line 117, in <listcomp>
    for slave in devicemapper.getSlaves(name)]
  File "/usr/lib/python3.6/site-packages/vdsm/storage/multipath.py", line 187, in getDeviceSize
    bs, phyBs = getDeviceBlockSizes(devName)
  File "/usr/lib/python3.6/site-packages/vdsm/storage/multipath.py", line 179, in getDeviceBlockSizes
    QUEUE, "logical_block_size"))
  File "/usr/lib/python3.6/site-packages/vdsm/storage/multipath.py", line 198, in read_int
    with open(path, "r") as f:
FileNotFoundError: [Errno 2] No such file or directory: '/sys/block/sdf/queue/logical_block_size'
Created attachment 1674465 [details] engine and vdsm logs
(In reply to Ilan Zuckerman from comment #13) This does not look related; it's probably a leftover device.
Thanks Benny, moving this to VERIFIED in that case.
(In reply to Nir Soffer from comment #9) > (In reply to Ben Marzinski from comment #8) > Sure, I'll open RFE for this. Did an RFE bugzilla get opened for this?
This bugzilla is included in oVirt 4.4.0 release, published on May 20th 2020. Since the problem described in this bug report should be resolved in oVirt 4.4.0 release, it has been closed with a resolution of CURRENT RELEASE. If the solution does not work for you, please open a new bug report.
(In reply to Ben Marzinski from comment #17) > > Sure, I'll open RFE for this. > > Did an RFE bugzilla get opened for this? Not yet.