Created attachment 930938 [details]
Logs from the engine machine and the two nodes with problems

Description of problem:
After a reboot in my oVirt datacenter, two of the three storage domains no longer activate. I shut the machines down last Friday for maintenance in the datacenter over the weekend and booted them up again yesterday morning. Since then this big trouble started. I believe I performed the shutdown in the correct way.

I get this message in the vdsm log:

StorageDomainDoesNotExist: Storage domain does not exist: (u'261671d2-0813-48e4-b349-251c86201fef',)

Version-Release number of selected component (if applicable):
oVirt Node - 3.0.4 - 1.0.201401291204.el6
vdsm-4.14.8.1-0.el6

How reproducible:

Steps to Reproduce:
1. I only rebooted the machines.
2.
3.

Actual results:
The storage domains no longer activate.

Expected results:
The storage domains become active again.

Additional info:
I use a machine with Fedora 20 as the iSCSI target, running the tgtd daemon.
On the machine with Fedora 20 and the tgtd daemon:
---------------------------------------------------------------
[root@localhost ~]# vgs -a
  VG                                   #PV #LV #SN Attr   VSize   VFree
  261671d2-0813-48e4-b349-251c86201fef   1   8   0 wz--n- 689,12g 11,25g
  fedora                                 1   7   0 wz--n- 709,50g  4,00m
--------------------------------------------------------------
[root@localhost ~]# tgtadm --lld iscsi --op show --mode target
Target 1: iqn.2014-05-07.br.mp.mppb.ovirtstoragedell:target0
    System information:
        Driver: iscsi
        State: ready
    I_T nexus information:
        I_T nexus: 3
            Initiator: iqn.1994-05.com.redhat:b2b7b5bc8064 alias: dellr62001.mppb.mp.br
            Connection: 0
                IP Address: 10.0.1.131
        I_T nexus: 6
            Initiator: iqn.1994-05.com.redhat:d2d5fc389e7e alias: dellr620-02
            Connection: 0
                IP Address: 10.0.1.132
    LUN information:
        LUN: 0
            Type: controller
            SCSI ID: IET 00010000
            SCSI SN: beaf10
            Size: 0 MB, Block size: 1
            Online: Yes
            Removable media: No
            Prevent removal: No
            Readonly: No
            SWP: No
            Thin-provisioning: No
            Backing store type: null
            Backing store path: None
            Backing store flags:
    Account information:
    ACL information:
        10.0.1.0/24
----------------------------------------------------------------
[root@localhost ~]# pvs -a
  PV                                                                             VG                                   Fmt  Attr PSize   PFree
  /dev/261671d2-0813-48e4-b349-251c86201fef/0c2d0323-d937-4fa5-b014-8dc7fb576b67                                           ---        0      0
  /dev/261671d2-0813-48e4-b349-251c86201fef/f1374061-b139-4e26-aca5-8ded9a698d6a                                           ---        0      0
  /dev/261671d2-0813-48e4-b349-251c86201fef/ids                                                                            ---        0      0
  /dev/261671d2-0813-48e4-b349-251c86201fef/inbox                                                                          ---        0      0
  /dev/261671d2-0813-48e4-b349-251c86201fef/leases                                                                         ---        0      0
  /dev/261671d2-0813-48e4-b349-251c86201fef/master                                                                         ---        0      0
  /dev/261671d2-0813-48e4-b349-251c86201fef/metadata                                                                       ---        0      0
  /dev/261671d2-0813-48e4-b349-251c86201fef/outbox                                                                         ---        0      0
  /dev/fedora/home                                                                                                         ---        0      0
  /dev/fedora/opt_StorageiSCSI                                                   261671d2-0813-48e4-b349-251c86201fef lvm2 a--  689,12g 11,25g
  /dev/fedora/root                                                                                                         ---        0      0
  /dev/fedora/swap                                                                                                         ---        0      0
  /dev/fedora/tmp                                                                                                          ---        0      0
  /dev/fedora/usr                                                                                                          ---        0      0
  /dev/fedora/var                                                                                                          ---        0      0
  /dev/sda1                                                                                                                ---        0      0
  /dev/sda2                                                                      fedora                               lvm2 a--  709,50g  4,00m
------------------------------------------------------------------
[root@localhost ~]# getenforce
Permissive
[root@localhost ~]# lvdisplay
.
  --- Logical volume ---
  LV Path                /dev/fedora/opt_StorageiSCSI
  LV Name                opt_StorageiSCSI
  VG Name                fedora
  LV UUID                lkDhHX-mqNW-sA5n-iVJa-XGGN-7r9G-4BBFCV
  LV Write Access        read/write
  LV Creation host, time localhost.localdomain, 2014-05-06 09:31:13 -0300
  LV Status              available
  # open                 8
  LV Size                689,46 GiB
  Current LE             176502
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:6
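One detail worth noting in the tgtadm output above: target0 exposes only the controller LUN (LUN 0, backing store "None") and no data LUN, even though the backing LV /dev/fedora/opt_StorageiSCSI is available per lvdisplay. tgtd does not persist targets and LUNs configured at runtime with tgtadm across a daemon restart or reboot unless they are written to /etc/tgt/targets.conf. A sketch of what such a persistent definition might look like, assuming the exported LUN was backed by that LV (the IQN and ACL are copied from the output above; the backing-store path is my assumption, so verify against the original setup):

```
# /etc/tgt/targets.conf -- sketch only, not the reporter's actual config.
# IQN and initiator ACL taken from the tgtadm output above;
# the backing-store path is an assumed value (the exported LV).
<target iqn.2014-05-07.br.mp.mppb.ovirtstoragedell:target0>
    backing-store /dev/fedora/opt_StorageiSCSI
    initiator-address 10.0.1.0/24
</target>
```

After saving the file, `tgt-admin --update ALL` (or restarting tgtd) should re-create the target with its data LUN; `tgt-admin --dump` can also be used to capture a running configuration before a reboot.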
More information from Node 01 and Node 02.

The storage nodes with trouble are at IPs 10.0.1.4 and 10.0.1.2. The node at 10.0.1.5 is fine; after the reboot it initialized without problems.

################################################
Node 01:

[root@dellr62001 ~]# iscsiadm -m session
tcp: [4] 10.0.1.2:3260,1 iqn.2014-05-07.br.mp.mppb.ovirtstoragedell:target0
tcp: [5] 10.0.1.5:3260,1 iqn.2014-05-23.br.mp.mppb.ovirtstorageHP01:target0
tcp: [6] 10.0.1.4:3260,1 iqn.2014-05-19.br.mp.mppb.ovirtstoragedell:target0
------------------------------------
[root@dellr62001 ~]# iscsiadm -m node
10.0.1.4:3260,1 iqn.2014-05-19.br.mp.mppb.ovirtstoragedell:target0
10.0.1.2:3260,1 iqn.2014-05-07.br.mp.mppb.ovirtstoragedell:target0
10.0.1.5:3260,1 iqn.2014-05-23.br.mp.mppb.ovirtstorageHP01:target0
------------------------------------
[root@dellr62001 ~]# cat /etc/iscsi/initiatorname.iscsi
InitiatorName=iqn.1994-05.com.redhat:b2b7b5bc8064
------------------------------------
[root@dellr62001 ~]# vdsClient -s 0 getDeviceList
[{'GUID': '36848f690e893c700191f97ec0592859a', 'capacity': '299439751168', 'devtype': 'FCP', 'fwrev': '3.13', 'logicalblocksize': '512', 'pathlist': [], 'pathstatus': [{'lun': '0', 'physdev': 'sda', 'state': 'active', 'type': 'FCP'}], 'physicalblocksize': '512', 'productID': 'PERC H710', 'pvUUID': '', 'serial': 'SDELLPERC_H710', 'status': 'used', 'vendorID': 'DELL', 'vgUUID': ''},
 {'GUID': '1IET_00010001', 'capacity': '795936292864', 'devtype': 'iSCSI', 'fwrev': '0001', 'logicalblocksize': '512', 'pathlist': [{'connection': '10.0.1.5', 'initiatorname': 'default', 'iqn': 'iqn.2014-05-23.br.mp.mppb.ovirtstorageHP01:target0', 'port': '3260', 'portal': '1'}], 'pathstatus': [{'lun': '1', 'physdev': 'sdb', 'state': 'active', 'type': 'iSCSI'}], 'physicalblocksize': '512', 'productID': 'VIRTUAL-DISK', 'pvUUID': 'ZYBOcU-X1HK-FzYL-LiJS-FnYm-ZoaU-ij9r89', 'serial': 'SIET_VIRTUAL-DISK', 'status': 'used', 'vendorID': 'IET', 'vgUUID': 'hWzBW8-Tgfh-Mysm-b0xx-ItIQ-D5vi-fF3vvI'}]
#####################################################
Node 02:

[root@dellr620-02 ~]# iscsiadm -m session
tcp: [4] 10.0.1.2:3260,1 iqn.2014-05-07.br.mp.mppb.ovirtstoragedell:target0
tcp: [5] 10.0.1.5:3260,1 iqn.2014-05-23.br.mp.mppb.ovirtstorageHP01:target0
tcp: [6] 10.0.1.4:3260,1 iqn.2014-05-19.br.mp.mppb.ovirtstoragedell:target0
------------------------------------------
[root@dellr620-02 ~]# iscsiadm -m node
10.0.1.4:3260,1 iqn.2014-05-19.br.mp.mppb.ovirtstoragedell:target0
10.0.1.2:3260,1 iqn.2014-05-07.br.mp.mppb.ovirtstoragedell:target0
10.0.1.5:3260,1 iqn.2014-05-23.br.mp.mppb.ovirtstorageHP01:target0
------------------------------------------
[root@dellr620-02 ~]# cat /etc/iscsi/initiatorname.iscsi
InitiatorName=iqn.1994-05.com.redhat:d2d5fc389e7e
------------------------------------------
[root@dellr620-02 ~]# vdsClient -s 0 getDeviceList
[{'GUID': '36848f690e8956600191f9159093c627e', 'capacity': '299439751168', 'devtype': 'FCP', 'fwrev': '3.13', 'logicalblocksize': '512', 'pathlist': [], 'pathstatus': [{'lun': '0', 'physdev': 'sda', 'state': 'active', 'type': 'FCP'}], 'physicalblocksize': '512', 'productID': 'PERC H710', 'pvUUID': '', 'serial': 'SDELLPERC_H710', 'status': 'used', 'vendorID': 'DELL', 'vgUUID': ''},
 {'GUID': '1IET_00010001', 'capacity': '795936292864', 'devtype': 'iSCSI', 'fwrev': '0001', 'logicalblocksize': '512', 'pathlist': [{'connection': '10.0.1.5', 'initiatorname': 'default', 'iqn': 'iqn.2014-05-23.br.mp.mppb.ovirtstorageHP01:target0', 'port': '3260', 'portal': '1'}], 'pathstatus': [{'lun': '1', 'physdev': 'sdb', 'state': 'active', 'type': 'iSCSI'}], 'physicalblocksize': '512', 'productID': 'VIRTUAL-DISK', 'pvUUID': 'ZYBOcU-X1HK-FzYL-LiJS-FnYm-ZoaU-ij9r89', 'serial': 'SIET_VIRTUAL-DISK', 'status': 'used', 'vendorID': 'IET', 'vgUUID': 'hWzBW8-Tgfh-Mysm-b0xx-ItIQ-D5vi-fF3vvI'}]
One more piece of information: the machine hosting the storage domain that is still good runs CentOS 6, while the two with problems run Fedora 20, both using the tgtd daemon.
We don't know yet whether this is a bug in oVirt or in another component; it is too early to schedule this for any release.
After discussing with Allon, moving to next version.
Re-targeting to 3.5.3 since this bug has not been marked as a blocker for 3.5.2 and we have already released the 3.5.2 Release Candidate.
tgtd is not a "production" environment; retargeting to 3.6.0 until we have a clear RCA and understand exactly what needs to be done here.
Hi Fagner, does this still reproduce in your environment? I've been trying to reproduce it on my machine, although my VDSM version is vdsm-4.17.0, and it looks like my storage domain worked fine after a reboot.
Hello Maor,

Sorry, I can't reproduce it anymore; my environment was migrated to VMware.
I am waiting for some machines to set up another oVirt environment. This time it will be a test environment.
This is an automated message.
This Bugzilla report has been opened on a version which is not maintained anymore.
Please check if this bug is still relevant in oVirt 3.5.4.
If it's not relevant anymore, please close it (you may use the EOL or CURRENT RELEASE resolution).
If it's an RFE, please update the version to 4.0 if still relevant.
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days