Bug 1281363
Summary: | Failed to start vm on RHEV-H after upgrade | ||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Product: | Red Hat Enterprise Virtualization Manager | Reporter: | Artyom <alukiano> | ||||||||||||||||||
Component: | ovirt-node | Assignee: | Fabian Deutsch <fdeutsch> | ||||||||||||||||||
Status: | CLOSED CURRENTRELEASE | QA Contact: | Ying Cui <ycui> | ||||||||||||||||||
Severity: | urgent | Docs Contact: | |||||||||||||||||||
Priority: | urgent | ||||||||||||||||||||
Version: | 3.5.6 | CC: | alukiano, bugs, cshao, cwu, dfediuck, ecohen, fdeutsch, gklein, huiwa, huzhao, leiwang, lsurette, mavital, mgoldboi, pstehlik, sbonazzo, stirabos, virt-bugs, yaniwang, ycui, yeylon | ||||||||||||||||||
Target Milestone: | ovirt-3.6.1 | Keywords: | Regression, ZStream | ||||||||||||||||||
Target Release: | 3.6.0 | ||||||||||||||||||||
Hardware: | x86_64 | ||||||||||||||||||||
OS: | Linux | ||||||||||||||||||||
Whiteboard: | node | ||||||||||||||||||||
Fixed In Version: | ovirt-node-3.6.0-0.22.20151126git72acbb2.el7 | Doc Type: | Bug Fix | ||||||||||||||||||
Doc Text: | Story Points: | --- | |||||||||||||||||||
Clone Of: | |||||||||||||||||||||
: | 1282820 (view as bug list) | Environment: | |||||||||||||||||||
Last Closed: | 2015-11-27 10:07:12 UTC | Type: | Bug | ||||||||||||||||||
Regression: | --- | Mount Type: | --- | ||||||||||||||||||
Documentation: | --- | CRM: | |||||||||||||||||||
Verified Versions: | Category: | --- | |||||||||||||||||||
oVirt Team: | Node | RHEL 7.3 requirements from Atomic Host: | |||||||||||||||||||
Cloudforms Team: | --- | Target Upstream Version: | |||||||||||||||||||
Embargoed: | |||||||||||||||||||||
Bug Depends On: | |||||||||||||||||||||
Bug Blocks: | 1282820 | ||||||||||||||||||||
Attachments: |
|
Comment 1
Fabian Deutsch
2015-11-12 11:57:26 UTC
For me it happen always, I also want to specify that host pass upgrade from Red Hat Enterprise Virtualization Hypervisor release 6.7 (20151028.0.el6ev) to Red Hat Enterprise Virtualization Hypervisor release 6.7 (20151029.0.el6ev), so maybe it reason why it failed, because I found that we encountered the same problem in past http://comments.gmane.org/gmane.comp.emulators.ovirt.user/27482 Please attach a complete sosreport and the output of "ps eZ" vdsm-4.16.28-1.el6ev.x86_64 should have been 4.16.29. Please check the RHEV-H content. moving downstream being this downstream only bug Created attachment 1093267 [details]
sos report
Just take in account we now have problem with VNX storage, and it will be many errors about it in logs.
Created attachment 1093268 [details]
output of ps eZ
Sorry, ps -eZ please See in attachments Created attachment 1093276 [details]
output of ps -eZ
libvirtd was started in the wrong context: system_u:system_r:ovirt_t:s0 14682 ? 00:00:04 libvirtd It should be in system_u:system_r:virtd_t:s0-s0:c0.c1023 or alike. I actually wonder if this is the root cause, because that message appears somewhere in the middle, at the end there are several other errors: Nov 12 10:39:31 localhost vdsm vds ERROR failed to retrieve Hosted Engine HA info#012Traceback (most recent call last):#012 File "/usr/share/vdsm/API.py", line 1707, in _getHaInfo#012 File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/client/client.py", line 100, in get_all_stats#012 File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/client/client.py", line 177, in _configure_broker_conn#012 File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 176, in set_storage_domain#012RequestError: Failed to set storage domain FilesystemBackend, options {'dom_type': 'nfs3', 'sd_uuid': '09035925-5377-4024-919e-880eee080fa0'}: Request failed: <class 'ovirt_hosted_engine_ha.lib.storage_backends.BackendFailureException'> Nov 12 10:39:31 localhost ovirt-ha-broker ovirt_hosted_engine_ha.broker.listener.ConnectionHandler ERROR Error handling request, data: 'set-storage-domain FilesystemBackend dom_type=nfs3 sd_uuid=09035925-5377-4024-919e-880eee080fa0'#012Traceback (most recent call last):#012 File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/listener.py", line 166, in handle#012 File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/listener.py", line 299, in _dispatch#012 File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py", line 65, in set_storage_domain#012 File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/lib/storage_backends.py", line 450, in connect#012 File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/lib/storage_backends.py", line 107, in get_domain_path#012BackendFailureException: path to storage domain 09035925-5377-4024-919e-880eee080fa0 not found in /rhev/data-center/mnt Nov 12 10:39:31 localhost vdsm vds ERROR failed to retrieve Hosted Engine HA info#012Traceback (most recent call last):#012 File "/usr/share/vdsm/API.py", line 1707, in _getHaInfo#012 File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/client/client.py", line 100, in get_all_stats#012 File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/client/client.py", line 177, in _configure_broker_conn#012 File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 176, in set_storage_domain#012RequestError: Failed to set storage domain FilesystemBackend, options {'dom_type': 'nfs3', 'sd_uuid': '09035925-5377-4024-919e-880eee080fa0'}: Request failed: <class 'ovirt_hosted_engine_ha.lib.storage_backends.BackendFailureException'> Nov 12 10:39:41 localhost ovirt-ha-broker ovirt_hosted_engine_ha.broker.listener.ConnectionHandler ERROR Error handling request, data: 'set-storage-domain FilesystemBackend dom_type=nfs3 sd_uuid=09035925-5377-4024-919e-880eee080fa0'#012Traceback (most recent call last):#012 File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/listener.py", line 166, in handle#012 File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/listener.py", line 299, in _dispatch#012 File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py", line 65, in set_storage_domain#012 File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/lib/storage_backends.py", line 450, in connect#012 File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/lib/storage_backends.py", line 107, in get_domain_path#012BackendFailureException: path to storage domain 09035925-5377-4024-919e-880eee080fa0 not found in /rhev/data-center/mnt Nov 12 10:39:41 localhost vdsm vds ERROR failed to retrieve Hosted Engine HA info#012Traceback (most recent call last):#012 File "/usr/share/vdsm/API.py", line 1707, in _getHaInfo#012 File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/client/client.py", line 100, in get_all_stats#012 File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/client/client.py", line 177, in _configure_broker_conn#012 File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 176, in set_storage_domain#012RequestError: Failed to set storage domain FilesystemBackend, options {'dom_type': 'nfs3', 'sd_uuid': '09035925-5377-4024-919e-880eee080fa0'}: Request failed: <class 'ovirt_hosted_engine_ha.lib.storage_backends.BackendFailureException'> Nov 12 10:39:51 localhost ovirt-ha-broker ovirt_hosted_engine_ha.broker.listener.ConnectionHandler ERROR Error handling request, data: 'set-storage-domain FilesystemBackend dom_type=nfs3 sd_uuid=09035925-5377-4024-919e-880eee080fa0'#012Traceback (most recent call last):#012 File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/listener.py", line 166, in handle#012 File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/listener.py", line 299, in _dispatch#012 File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py", line 65, in set_storage_domain#012 File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/lib/storage_backends.py", line 450, in connect#012 File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/lib/storage_backends.py", line 107, in get_domain_path#012BackendFailureException: path to storage domain 09035925-5377-4024-919e-880eee080fa0 not found in /rhev/data-center/mnt Nov 12 10:39:51 localhost vdsm vds ERROR failed to retrieve Hosted Engine HA info#012Traceback (most recent call last):#012 File "/usr/share/vdsm/API.py", line 1707, in _getHaInfo#012 File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/client/client.py", line 100, in get_all_stats#012 File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/client/client.py", line 177, in _configure_broker_conn#012 File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 176, in set_storage_domain#012RequestError: Failed to set storage domain FilesystemBackend, options {'dom_type': 'nfs3', 'sd_uuid': '09035925-5377-4024-919e-880eee080fa0'}: Request failed: <class 'ovirt_hosted_engine_ha.lib.storage_backends.BackendFailureException'> Here it seams that the storage wasn't available: Nov 12 10:39:41 localhost ovirt-ha-broker ovirt_hosted_engine_ha.broker.listener.ConnectionHandler ERROR Error handling request, data: 'set-storage-domain FilesystemBackend dom_type=nfs3 sd_uuid=09035925-5377-4024-919e-880eee080fa0'#012Traceback (most recent call last):#012 File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/listener.py", line 166, in handle#012 File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/listener.py", line 299, in _dispatch#012 File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py", line 65, in set_storage_domain#012 File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/lib/storage_backends.py", line 450, in connect#012 File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/lib/storage_backends.py", line 107, in get_domain_path#012BackendFailureException: path to storage domain 09035925-5377-4024-919e-880eee080fa0 not found in /rhev/data-center/mnt Thanks. Artyom, can you please retry to reproduce this bug on other hardware and ensure that the storage is reachable? I reproduced this bug after upgrade to hypervisor6-6.7-20151029.0 or hypervisor6-6.7-20151112.0, but not 100% reproduce. I will give more info soon because testing is in process, I need to try more scenarios and different hardware. (In reply to Fabian Deutsch from comment #16) > Thanks. > > Artyom, can you please retry to reproduce this bug on other hardware and > ensure that the storage is reachable? Let me instead on this needinfo. Yes, storage is reachable, Host and Storage is active. Created attachment 1093584 [details]
sosreport_iSCSStorage_rhevh67_20151112.0
Created attachment 1093585 [details]
freshlog_varlog_rhevh67_20151112.0
Created attachment 1093586 [details]
screenshot_VM_Failed_Start
Created attachment 1093588 [details]
engine.log
Created attachment 1093589 [details]
ps -eZ output
comment 20 - 24 for fresh logs info, for further investigation. The cause is probably (as noted above) that libvirtd is running in the wrong context: [root@dhcp-test-9-205 ~]# ps -eZ | grep libvi system_u:system_r:ovirt_t:s0 14740 ? 00:00:00 libvirtd The correct context would be like: [root@dhcp-test-9-205 ~]# ps -eZ | grep libvirtd system_u:system_r:virtd_t:s0-s0:c0.c1023 14708 ? 00:00:00 libvirtd I modified the x.x.x.205 host and the VMs now come up again. The attached patch fixes this in the code. This bug only happened on RHEV-H el6, no need clone to zstream. so I verified this bug on rhevh 6.7 for 3.5.6 # cat /etc/redhat-release Red Hat Enterprise Virtualization Hypervisor release 6.7 (20151117.0.el6ev) # rpm -qa ovirt-node ovirt-node-3.2.3-28.el6.noarch Test servers: Local Storage - 2 times, PASS iSCSI Storage - 2 times, PASS Test Steps: 1. Install RHEV-H 6.7-20151015.1 the last released version 2. Register to RHEVM 3.5.6.2-0.1.el6ev successful 3. Connect storage 4. Create VM, VM can start successful. 5. Shutdown VM 6. Make Host to maintenance 7. Upgrade RHEVH 6.7 20151117.0 via RHEVM 8. Upgrade successful. 9. Host and Storage is active yet. 10. Start VM which created in step 4 - PASS. 11. Create new VM, and start VM - PASS. I will change the bug status once the bug is ON_QA. This bug only happened on RHEV-H 6.7 el6, so we can not verify it on RHEV-H 7.2 for 3.6.0 build. This issue was verified on RHEV-H 6.7 for 3.5.6 build yet. Suggest to close this bug. Thanks Closing this bug according to comment 32. Thanks Ying. This bug is valid bug when bug is reported, but just for 3.5.z, So updated the status to current release instead of notabug. |