Created attachment 1630169 [details] engine and vdsm logs Description of problem: Upgrade engine with working 3 ISCSI storage domains('iscsi_0', 'iscsi_1', 'iscsi_2') from 4.3.7 to 4.4. Than add the first newly installed RHEL8.1 host to the data center all other(NFS,gluster) storage domains are up but ISCSI storage domain remains down with the following errors: Engine log: 2019-10-29 12:49:49,948+02 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetDeviceListVDSCommand] (EE-ManagedThreadFactory-engine-Thread-28) [73422e2c] Failed in 'GetDeviceListVDS' method 2019-10-29 12:49:49,959+02 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engine-Thread-28) [73422e2c] EVENT_ID: VDS_BROKER_COMMAND_FAILURE(10,802), VDSM host_mixed_1 command GetDeviceListVDS failed: Error block device action: () 2019-10-29 12:49:49,960+02 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.GetDeviceListVDSCommand] (EE-ManagedThreadFactory-engine-Thread-28) [73422e2c] Command 'org.ovirt.engine.core.vdsbroker.vdsbroker.GetDeviceListVDSCommand' return value ' LUNListReturn:{status='Status [code=600, message=Error block device action: ()]'} ' 2019-10-29 12:49:49,961+02 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.GetDeviceListVDSCommand] (EE-ManagedThreadFactory-engine-Thread-28) [73422e2c] HostName = host_mixed_1 2019-10-29 12:49:49,961+02 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetDeviceListVDSCommand] (EE-ManagedThreadFactory-engine-Thread-28) [73422e2c] Command 'GetDeviceListVDSCommand(HostName = host_mixed_1, GetDeviceListVDSCommandParameters:{hostId='d199123d-9772-4276-9d28-c83d3ee2d999', storageType='UNKNOWN', checkStatus='false', lunIds='null'})' execution failed: VDSGenericException: VDSErrorException: Failed to GetDeviceListVDS, error = Error block device action: (), code = 600 2019-10-29 12:49:49,961+02 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.GetDeviceListVDSCommand] (EE-ManagedThreadFactory-engine-Thread-28) [73422e2c] FINISH, GetDeviceListVDSCommand, return: , log id: 1d7be8db 2019-10-29 12:49:49,961+02 ERROR [org.ovirt.engine.core.bll.storage.pool.SyncStorageDomainsLunsCommand] (EE-ManagedThreadFactory-engine-Thread-28) [73422e2c] Command 'org.ovirt.engine.core.bll.storage.pool.SyncStorageDomainsLunsCommand' failed: EngineException: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSErrorException: VDSGenericException: VDSErrorException: Failed to GetDeviceListVDS, error = Error block device action: (), code = 600 (Failed with error BlockDeviceActionError and code 600) 2019-10-29 12:49:58,127+02 WARN [org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxy] (EE-ManagedThreadFactory-engine-Thread-30) [] domain '788f4bfb-380f-4b98-93c1-dfa967ea4f3a:iscsi_2' in problem 'STORAGE_ACCCESS_ERROR'. vds: 'host_mixed_1' 2019-10-29 12:49:58,128+02 WARN [org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxy] (EE-ManagedThreadFactory-engine-Thread-30) [] domain '45f2708d-4b6d-418d-aaf1-10bea1b94552:iscsi_0' in problem 'STORAGE_ACCCESS_ERROR'. vds: 'host_mixed_1' 2019-10-29 12:49:58,129+02 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxy] (EE-ManagedThreadFactory-engine-Thread-30) [] Host 'd199123d-9772-4276-9d28-c83d3ee2d999' has reported new storage access problem to the following domains '788f4bfb-380f-4b98-93c1-dfa967ea4f3a,45f2708d-4b6d-418d-aaf1-10bea1b94552' marking it for storage connections and pool metadata refresh (report id: 'd0139dc9-ea67-4866-ada7-4242311f8345') ' VDSM log(5 hours behind due to timezone bug on the host): 2019-10-29 06:49:40,142-0400 WARN (monitor/45f2708) [storage.LVM] Reloading VGs failed (vgs=['45f2708d-4b6d-418d-aaf1-10bea1b94552'] rc=5 out=[] err=[' Volume group "45f2708d-4b6d-418d-aaf1-10bea1b94552" not found', ' Cannot process volume group 45f2708d-4b6d-418d-aaf1-10bea1b94552']) (lvm:501) 2019-10-29 06:49:40,156-0400 WARN (monitor/788f4bf) [storage.LVM] Reloading VGs failed (vgs=['788f4bfb-380f-4b98-93c1-dfa967ea4f3a'] rc=5 out=[] err=[' Volume group "788f4bfb-380f-4b98-93c1-dfa967ea4f3a" not found', ' Cannot process volume group 788f4bfb-380f-4b98-93c1-dfa967ea4f3a']) (lvm:501) 2019-10-29 06:49:40,241-0400 ERROR (monitor/45f2708) [storage.Monitor] Setting up monitor for 45f2708d-4b6d-418d-aaf1-10bea1b94552 failed (monitor:332) Traceback (most recent call last): File "/usr/lib/python3.6/site-packages/vdsm/storage/monitor.py", line 329, in _setupLoop self._setupMonitor() File "/usr/lib/python3.6/site-packages/vdsm/storage/monitor.py", line 351, in _setupMonitor self._produceDomain() File "/usr/lib/python3.6/site-packages/vdsm/utils.py", line 153, in wrapper value = meth(self, *a, **kw) File "/usr/lib/python3.6/site-packages/vdsm/storage/monitor.py", line 369, in _produceDomain self.domain = sdCache.produce(self.sdUUID) File "/usr/lib/python3.6/site-packages/vdsm/storage/sdc.py", line 108, in produce domain.getRealDomain() File "/usr/lib/python3.6/site-packages/vdsm/storage/sdc.py", line 50, in getRealDomain return self._cache._realProduce(self._sdUUID) File "/usr/lib/python3.6/site-packages/vdsm/storage/sdc.py", line 132, in _realProduce domain = self._findDomain(sdUUID) File "/usr/lib/python3.6/site-packages/vdsm/storage/sdc.py", line 149, in _findDomain return findMethod(sdUUID) File "/usr/lib/python3.6/site-packages/vdsm/storage/sdc.py", line 174, in _findUnfetchedDomain raise se.StorageDomainDoesNotExist(sdUUID) vdsm.storage.exception.StorageDomainDoesNotExist: Storage domain does not exist: ('45f2708d-4b6d-418d-aaf1-10bea1b94552',) 2019-10-29 06:49:40,242-0400 INFO (monitor/45f2708) [storage.Monitor] Domain 45f2708d-4b6d-418d-aaf1-10bea1b94552 became INVALID (monitor:472) 2019-10-29 06:49:40,247-0400 ERROR (monitor/788f4bf) [storage.Monitor] Setting up monitor for 788f4bfb-380f-4b98-93c1-dfa967ea4f3a failed (monitor:332) Traceback (most recent call last): File "/usr/lib/python3.6/site-packages/vdsm/storage/monitor.py", line 329, in _setupLoop self._setupMonitor() File "/usr/lib/python3.6/site-packages/vdsm/storage/monitor.py", line 351, in _setupMonitor self._produceDomain() File "/usr/lib/python3.6/site-packages/vdsm/utils.py", line 153, in wrapper value = meth(self, *a, **kw) File "/usr/lib/python3.6/site-packages/vdsm/storage/monitor.py", line 369, in _produceDomain self.domain = sdCache.produce(self.sdUUID) File "/usr/lib/python3.6/site-packages/vdsm/storage/sdc.py", line 108, in produce domain.getRealDomain() File "/usr/lib/python3.6/site-packages/vdsm/storage/sdc.py", line 50, in getRealDomain return self._cache._realProduce(self._sdUUID) File "/usr/lib/python3.6/site-packages/vdsm/storage/sdc.py", line 132, in _realProduce domain = self._findDomain(sdUUID) File "/usr/lib/python3.6/site-packages/vdsm/storage/sdc.py", line 149, in _findDomain return findMethod(sdUUID) File "/usr/lib/python3.6/site-packages/vdsm/storage/sdc.py", line 174, in _findUnfetchedDomain raise se.StorageDomainDoesNotExist(sdUUID) vdsm.storage.exception.StorageDomainDoesNotExist: Storage domain does not exist: ('788f4bfb-380f-4b98-93c1-dfa967ea4f3a',) 2019-10-29 06:49:40,248-0400 INFO (monitor/788f4bf) [storage.Monitor] Domain 788f4bfb-380f-4b98-93c1-dfa967ea4f3a became INVALID (monitor:472) Version-Release number of selected component (if applicable): rhv-4.4.0-4 (http://bob.eng.lab.tlv.redhat.com/builds/4.4/rhv-4.4.0-4/) ovirt-engine-4.4.0-0.4.master.el7.noarch vdsm-4.40.0-127.gitc628cce.el8ev.x86_64 python3-libvirt-5.0.0-5.module+el8.0.1+3755+6782b0ed.x86_64 qemu-kvm-3.1.0-30.module+el8.0.1+3755+6782b0ed.x86_64 How reproducible: Upgraded one stand so far Steps to Reproduce: 1. ENV(engine + host) with 4.3.7 connected to ISCSI+NFS storage domain 2. Upgrade Engine(RHEL7.7) with new RHV4.4 3. As hosts are RHL8.1 and no upgrade is available between RHEL7.7 -> 8.1 you need to fresh install RHEL8.1 on the hosts 4. install host packages(VDSM) 5. Via webadmin add host to engine Actual results: ISCSI SD's does not come up , NFS and gluster is ok. Expected results: ISCSI SD's should come up. Additional info:
*** Bug 1766595 has been marked as a duplicate of this bug. ***
Ok , so the issue that the SD's did not go up and no SCSI attached devices were seen is due to the fact that the initiatorname.iscsi was deleted in OS re-provisioning. When I changed the initiatorname.iscsi to their old value + iscsid service restart + reconnect ISCSI target -> the pre upgrade ISCSI SD's went up. *** But still we have the creating new SD's issue from UI, I still see that we are stuck in 'new domain' window on loading luns stage.*** *** So we are still blocked creating a new SD via UI *** I am closing this bug as ISCSI SD's will go up if you configure initiatorname.iscsi correctly. I am reopening bug 1766595 as we still cannot create new SD's via REST/UI. Now iscsi attachment is working: [root@storage-ge4-vdsm1 ~]# iscsiadm -m session -P 3 iSCSI Transport Class version 2.0-870 version 6.2.0.877-0 Target: iqn.2000-05.com.3pardata:20210002ac021f6b (non-flash) Current Portal: 10.35.146.1:3260,21 Persistent Portal: 3par-iscsi-1.scl.lab.tlv.redhat.com:3260,21 ********** Interface: ********** Iface Name: default Iface Transport: tcp Iface Initiatorname: iqn.1994-05.com.redhat:storage-ge4-vdsm1 Iface IPaddress: 10.35.82.63 Iface HWaddress: default Iface Netdev: default SID: 6 iSCSI Connection State: LOGGED IN iSCSI Session State: LOGGED_IN Internal iscsid Session State: NO CHANGE ********* Timeouts: ********* Recovery Timeout: 10 Target Reset Timeout: 30 LUN Reset Timeout: 30 Abort Timeout: 15 ***** CHAP: ***** username: <empty> password: ******** username_in: <empty> password_in: ******** ************************ Negotiated iSCSI params: ************************ HeaderDigest: None DataDigest: None MaxRecvDataSegmentLength: 262144 MaxXmitDataSegmentLength: 65536 FirstBurstLength: 16384 MaxBurstLength: 262144 ImmediateData: Yes InitialR2T: Yes MaxOutstandingR2T: 1 ************************ Attached SCSI devices: ************************ Host Number: 3 State: running scsi3 Channel 00 Id 0 Lun: 0 Attached scsi disk sdb State: running scsi3 Channel 00 Id 0 Lun: 1 Attached scsi disk sdc State: running scsi3 Channel 00 Id 0 Lun: 2 Attached scsi disk sdd State: running scsi3 Channel 00 Id 0 Lun: 254 scsi3 Channel 00 Id 0 Lun: 3 Attached scsi disk sde State: running scsi3 Channel 00 Id 0 Lun: 4 Attached scsi disk sdf State: running scsi3 Channel 00 Id 0 Lun: 5 Attached scsi disk sdg State: running Target: iqn.2000-05.com.3pardata:21210002ac021f6b (non-flash) Current Portal: 10.35.146.2:3260,121 Persistent Portal: 10.35.146.2:3260,121 ********** Interface: ********** Iface Name: default Iface Transport: tcp Iface Initiatorname: iqn.1994-05.com.redhat:storage-ge4-vdsm1 Iface IPaddress: 10.35.82.63 Iface HWaddress: default Iface Netdev: default SID: 7 iSCSI Connection State: LOGGED IN iSCSI Session State: LOGGED_IN Internal iscsid Session State: NO CHANGE ********* Timeouts: ********* Recovery Timeout: 10 Target Reset Timeout: 30 LUN Reset Timeout: 30 Abort Timeout: 15 ***** CHAP: ***** username: <empty> password: ******** username_in: <empty> password_in: ******** ************************ Negotiated iSCSI params: ************************ HeaderDigest: None DataDigest: None MaxRecvDataSegmentLength: 262144 MaxXmitDataSegmentLength: 65536 FirstBurstLength: 16384 MaxBurstLength: 262144 ImmediateData: Yes InitialR2T: Yes MaxOutstandingR2T: 1 ************************ Attached SCSI devices: ************************ Host Number: 4 State: running scsi4 Channel 00 Id 0 Lun: 0 Attached scsi disk sdh State: running scsi4 Channel 00 Id 0 Lun: 1 Attached scsi disk sdi State: running scsi4 Channel 00 Id 0 Lun: 2 Attached scsi disk sdj State: running scsi4 Channel 00 Id 0 Lun: 254 scsi4 Channel 00 Id 0 Lun: 3 Attached scsi disk sdk State: running scsi4 Channel 00 Id 0 Lun: 4 Attached scsi disk sdl State: running scsi4 Channel 00 Id 0 Lun: 5 Attached scsi disk sdm State: running
See prior comment.