Bug 1576108

Summary: ovirt driver doesn't reboot running VMs
Product: Red Hat OpenStack
Reporter: mlammon
Component: openstack-ironic-staging-drivers
Assignee: Derek Higgins <derekh>
Status: CLOSED WORKSFORME
QA Contact: mlammon
Severity: medium
Priority: medium
Version: 13.0 (Queens)
CC: bfournie, dtantsur, ietingof, mlammon, sasha
Target Milestone: ---
Keywords: Triaged, ZStream
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard: DFG:HardProv
Last Closed: 2018-10-04 14:40:59 UTC
Type: Bug
Bug Depends On: 1576570    

Description mlammon 2018-05-08 21:15:49 UTC
When using the Ironic staging-ovirt driver, running "openstack baremetal node reboot <uuid>" against a powered-on node does not reboot it.

Env:
openstack-ironic-staging-drivers-0.9.0-4.el7ost.noarch


How reproducible:
100%

Steps:
Prerequisite: an RHV environment with an undercloud

1) Configure undercloud.conf and install the undercloud with the latest OSP 13 in the RHV environment

2) Configure /etc/ironic/ironic.conf
Add staging-ovirt to the enabled hardware types and interfaces (the per-node oVirt credentials are supplied in driver_info at enrollment):

enabled_hardware_types=redfish,ipmi,idrac,ilo,staging-ovirt
enabled_management_interfaces=redfish,ilo,idrac,ipmitool,fake,staging-ovirt
enabled_power_interfaces=redfish,ilo,idrac,ipmitool,fake,staging-ovirt
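
After restarting the ironic services, a quick way to confirm the driver was picked up (assuming the baremetal CLI is available) is:

openstack baremetal driver list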

3) Install the oVirt SDK and its build dependencies

sudo yum install -y gcc python-devel libxml2-devel libcurl-devel python-enum34
wget http://download.eng.bos.redhat.com/brewroot/packages/python-ovirt-engine-sdk4/4.2.6/1.el7ev/x86_64/python-ovirt-engine-sdk4-4.2.6-1.el7ev.x86_64.rpm
rpm -ihv python-ovirt-engine-sdk4-4.2.6-1.el7ev.x86_64.rpm
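
A quick sanity check that the SDK is importable:

python -c 'import ovirtsdk4'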

4) Import and introspect the nodes (see the instackenv.json sketch below)
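
A hedged instackenv.json sketch for staging-ovirt nodes: the pm_* field names follow the director conventions for this driver as best I recall, and all values (engine address, credentials, VM name, MAC) are illustrative:

{
  "nodes": [
    {
      "name": "ironic-0",
      "pm_type": "staging-ovirt",
      "pm_addr": "rhvm.example.com",
      "pm_user": "admin@internal",
      "pm_password": "secret",
      "pm_vm_name": "ironic-0",
      "mac": ["00:1a:4a:16:01:51"]
    }
  ]
}

Imported and introspected the usual way:

openstack overcloud node import instackenv.json
openstack overcloud node introspect --all-manageable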

5) With any node powered on, run "openstack baremetal node reboot <uuid>"

Actual results:
The node remains powered up

Expected results:
The node is expected to reboot (power off, then power on)
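
Note: "openstack baremetal node list" keeps reporting power_state "power on" for the whole cycle (target_power_state is only briefly set to "rebooting"), so the most direct check is watching the VM status in the RHV Administration Portal. From the CLI:

openstack baremetal node show <uuid> -f value -c power_state -c target_power_state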


Additional info:
2018-05-08 15:20:14.987 36396 DEBUG ironic.conductor.manager [req-9610c785-3e80-49b6-a20a-4070aadbd02e e0749763684d475aadb1def62c071429 acdd32457726432e9181ceadb42227fb - default default] RPC change_node_power_state called for node abde7a98-b259-4002-a900-6c628c4a8c38. The desired new state is rebooting. change_node_power_state /usr/lib/python2.7/site-packages/ironic/conductor/manager.py:243
2018-05-08 15:20:14.998 36396 DEBUG ironic.conductor.task_manager [req-9610c785-3e80-49b6-a20a-4070aadbd02e e0749763684d475aadb1def62c071429 acdd32457726432e9181ceadb42227fb - default default] Attempting to get exclusive lock on node abde7a98-b259-4002-a900-6c628c4a8c38 (for changing node power state) __init__ /usr/lib/python2.7/site-packages/ironic/conductor/task_manager.py:225
2018-05-08 15:20:15.012 36396 DEBUG ironic.conductor.task_manager [req-9610c785-3e80-49b6-a20a-4070aadbd02e e0749763684d475aadb1def62c071429 acdd32457726432e9181ceadb42227fb - default default] Node abde7a98-b259-4002-a900-6c628c4a8c38 successfully reserved for changing node power state (took 0.01 seconds) reserve_node /usr/lib/python2.7/site-packages/ironic/conductor/task_manager.py:273
2018-05-08 15:20:15.287 36396 INFO ironic.conductor.utils [req-9610c785-3e80-49b6-a20a-4070aadbd02e e0749763684d475aadb1def62c071429 acdd32457726432e9181ceadb42227fb - default default] Successfully set node abde7a98-b259-4002-a900-6c628c4a8c38 power state to power on by rebooting.
2018-05-08 15:20:15.292 36396 DEBUG ironic.conductor.task_manager [req-9610c785-3e80-49b6-a20a-4070aadbd02e e0749763684d475aadb1def62c071429 acdd32457726432e9181ceadb42227fb - default default] Successfully released exclusive lock for changing node power state on node abde7a98-b259-4002-a900-6c628c4a8c38 (lock was held 0.28 sec) release_resources /usr/lib/python2.7/site-packages/ironic/conductor/task_manager.py:352
2018-05-08 15:20:18.003 36396 DEBUG futurist.periodics [-] Submitting periodic callback 'ironic.conductor.manager.ConductorManager._send_sensor_data' _process_scheduled /usr/lib/python2.7/site-packages/futurist/periodics.py:639
2018-05-08 15:20:18.517 36396 DEBUG futurist.periodics [-] Submitting periodic callback 'ironic.drivers.modules.drac.raid.DracRAID._query_raid_config_job_status' _process_scheduled /usr/lib/python2.7/site-packages/futurist/periodics.py:639
2018-05-08 15:20:18.531 36396 DEBUG ironic.conductor.task_manager [req-0a3feb7e-c3cc-40c9-a5ea-c3cd9b271b97 - - - - -] Attempting to get shared lock on node abde7a98-b259-4002-a900-6c628c4a8c38 (for checking async raid configuration jobs) __init__ /usr/lib/python2.7/site-packages/ironic/conductor/task_manager.py:225
2018-05-08 15:20:18.545 36396 DEBUG ironic.conductor.task_manager [req-0a3feb7e-c3cc-40c9-a5ea-c3cd9b271b97 - - - - -] Successfully released shared lock for checking async raid configuration jobs on node abde7a98-b259-4002-a900-6c628c4a8c38 (lock was held 0.01 sec) release_resources /usr/lib/python2.7/site-packages/ironic/conductor/task_manager.py:352
2018-05-08 15:20:18.554 36396 DEBUG ironic.conductor.task_manager [req-0a3feb7e-c3cc-40c9-a5ea-c3cd9b271b97 - - - - -] Attempting to get shared lock on node 4bc48435-587c-41ab-819a-9d055d108cd6 (for checking async raid configuration jobs) __init__ /usr/lib/python2.7/site-packages/ironic/conductor/task_manager.py:225
2018-05-08 15:20:18.568 36396 DEBUG ironic.conductor.task_manager [req-0a3feb7e-c3cc-40c9-a5ea-c3cd9b271b97 - - - - -] Successfully released shared lock for checking async raid configuration jobs on node 4bc48435-587c-41ab-819a-9d055d108cd6 (lock was held 0.01 sec) release_resources /usr/lib/python2.7/site-packages/ironic/conductor/task_manager.py:352
2018-05-08 15:20:18.577 36396 DEBUG ironic.conductor.task_manager [req-0a3feb7e-c3cc-40c9-a5ea-c3cd9b271b97 - - - - -] Attempting to get shared lock on node 77cb3fe4-f7af-4ffe-831a-f81b10d90c13 (for checking async raid configuration jobs) __init__ /usr/lib/python2.7/site-packages/ironic/conductor/task_manager.py:225
2018-05-08 15:20:18.590 36396 DEBUG ironic.conductor.task_manager [req-0a3feb7e-c3cc-40c9-a5ea-c3cd9b271b97 - - - - -] Successfully released shared lock for checking async raid configuration jobs on node 77cb3fe4-f7af-4ffe-831a-f81b10d90c13 (lock was held 0.01 sec) release_resources /usr/lib/python2.7/site-packages/ironic/conductor/task_manager.py:352
2018-05-08 15:20:18.602 36396 DEBUG ironic.conductor.task_manager [req-0a3feb7e-c3cc-40c9-a5ea-c3cd9b271b97 - - - - -] Attempting to get shared lock on node 48da76b5-768e-4595-bb50-d45ecbd50fd4 (for checking async raid configuration jobs) __init__ /usr/lib/python2.7/site-packages/ironic/conductor/task_manager.py:225
2018-05-08 15:20:18.616 36396 DEBUG ironic.conductor.task_manager [req-0a3feb7e-c3cc-40c9-a5ea-c3cd9b271b97 - - - - -] Successfully released shared lock for checking async raid configuration jobs on node 48da76b5-768e-4595-bb50-d45ecbd50fd4 (lock was held 0.01 sec) release_resources /usr/lib/python2.7/site-packages/ironic/conductor/task_manager.py:352
2018-05-08 15:20:21.576 36396 DEBUG futurist.periodics [-] Submitting periodic callback 'ironic.conductor.manager.ConductorManager._sync_local_state' _process_scheduled /usr/lib/python2.7/site-packages/futurist/periodics.py:639
2018-05-08 15:20:21.705 36396 DEBUG futurist.periodics [-] Submitting periodic callback 'ironic.conductor.manager.ConductorManager._check_cleanwait_timeouts' _process_scheduled /usr/lib/python2.7/site-packages/futurist/periodics.py:639
2018-05-08 15:20:21.712 36396 DEBUG futurist.periodics [-] Submitting periodic callback 'ironic.conductor.manager.ConductorManager._check_deploy_timeouts' _process_scheduled /usr/lib/python2.7/site-packages/futurist/periodics.py:639
2018-05-08 15:20:21.715 36396 DEBUG futurist.periodics [-] Submitting periodic callback 'ironic.conductor.manager.ConductorManager._check_inspect_timeouts' _process_scheduled /usr/lib/python2.7/site-packages/futurist/periodics.py:639
2018-05-08 15:20:21.719 36396 DEBUG futurist.periodics [-] Submitting periodic callback 'ironic.conductor.manager.ConductorManager._check_orphan_nodes' _process_scheduled /usr/lib/python2.7/site-packages/futurist/periodics.py:639
2018-05-08 15:20:21.721 36396 DEBUG futurist.periodics [-] Submitting periodic callback 'ironic.conductor.manager.ConductorManager._check_rescuewait_timeouts' _process_scheduled /usr/lib/python2.7/site-packages/futurist/periodics.py:639
2018-05-08 15:20:21.726 36396 DEBUG futurist.periodics [-] Submitting periodic callback 'ironic.conductor.manager.ConductorManager._sync_power_states' _process_scheduled /usr/lib/python2.7/site-packages/futurist/periodics.py:639
2018-05-08 15:20:21.738 36396 DEBUG ironic.conductor.task_manager [req-0a3feb7e-c3cc-40c9-a5ea-c3cd9b271b97 - - - - -] Attempting to get shared lock on node abde7a98-b259-4002-a900-6c628c4a8c38 (for power state sync) __init__ /usr/lib/python2.7/site-packages/ironic/conductor/task_manager.py:225
2018-05-08 15:20:21.945 36396 DEBUG ironic.conductor.task_manager [req-0a3feb7e-c3cc-40c9-a5ea-c3cd9b271b97 - - - - -] Successfully released shared lock for power state sync on node abde7a98-b259-4002-a900-6c628c4a8c38 (lock was held 0.21 sec) release_resources /usr/lib/python2.7/site-packages/ironic/conductor/task_manager.py:352
2018-05-08 15:20:21.955 36396 DEBUG ironic.conductor.task_manager [req-0a3feb7e-c3cc-40c9-a5ea-c3cd9b271b97 - - - - -] Attempting to get shared lock on node 4bc48435-587c-41ab-819a-9d055d108cd6 (for power state sync) __init__ /usr/lib/python2.7/site-packages/ironic/conductor/task_manager.py:225
2018-05-08 15:20:22.093 36396 DEBUG ironic.conductor.task_manager [req-0a3feb7e-c3cc-40c9-a5ea-c3cd9b271b97 - - - - -] Successfully released shared lock for power state sync on node 4bc48435-587c-41ab-819a-9d055d108cd6 (lock was held 0.14 sec) release_resources /usr/lib/python2.7/site-packages/ironic/conductor/task_manager.py:352
2018-05-08 15:20:22.103 36396 DEBUG ironic.conductor.task_manager [req-0a3feb7e-c3cc-40c9-a5ea-c3cd9b271b97 - - - - -] Attempting to get shared lock on node 77cb3fe4-f7af-4ffe-831a-f81b10d90c13 (for power state sync) __init__ /usr/lib/python2.7/site-packages/ironic/conductor/task_manager.py:225
2018-05-08 15:20:22.238 36396 DEBUG ironic.conductor.task_manager [req-0a3feb7e-c3cc-40c9-a5ea-c3cd9b271b97 - - - - -] Successfully released shared lock for power state sync on node 77cb3fe4-f7af-4ffe-831a-f81b10d90c13 (lock was held 0.13 sec) release_resources /usr/lib/python2.7/site-packages/ironic/conductor/task_manager.py:352
2018-05-08 15:20:22.263 36396 DEBUG ironic.conductor.task_manager [req-0a3feb7e-c3cc-40c9-a5ea-c3cd9b271b97 - - - - -] Attempting to get shared lock on node 48da76b5-768e-4595-bb50-d45ecbd50fd4 (for power state sync) __init__ /usr/lib/python2.7/site-packages/ironic/conductor/task_manager.py:225
2018-05-08 15:20:22.405 36396 DEBUG ironic.conductor.task_manager [req-0a3feb7e-c3cc-40c9-a5ea-c3cd9b271b97 - - - - -] Successfully released shared lock for power state sync on node 48da76b5-768e-4595-bb50-d45ecbd50fd4 (lock was held 0.14 sec) release_resources /usr/lib/python2.7/site-packages/ironic/conductor/task_manager.py:352
2018-05-08 15:20:25.412 36396 DEBUG futurist.periodics [-] Submitting periodic callback 'ironic.drivers.modules.inspector.Inspector._periodic_check_result' _process_scheduled /usr/lib/python2.7/site-packages/futurist/periodics.py:639

Comment 1 Ilya Etingof 2018-06-27 11:28:20 UTC
Can this bug be a duplicate of BZ#1568442?

Comment 2 Bob Fournier 2018-06-27 13:18:19 UTC
> Can this bug be a duplicate of BZ#1568442?

I don't believe so as Mike was testing with the fix for that BZ when he came across this issue.

Comment 3 Dmitry Tantsur 2018-08-14 13:35:55 UTC
I have two suspects:
1. We don't understand how reboot() works.
2. We need to split reboot into off, then on, waiting for each result.

Without any oVirt expertise I cannot judge further.
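
For reference, a minimal sketch of suspect (2) using the python-ovirt-engine-sdk4 API, with reboot implemented as an explicit stop, wait-for-down, start, wait-for-up sequence. The engine URL, credentials, and VM name are illustrative:

import time

import ovirtsdk4 as sdk
import ovirtsdk4.types as types

# Illustrative connection details; substitute a real engine URL and credentials.
connection = sdk.Connection(
    url='https://rhvm.example.com/ovirt-engine/api',
    username='admin@internal',
    password='secret',
    insecure=True,  # or pass ca_file='/path/to/ca.pem'
)
vms_service = connection.system_service().vms_service()
vm = vms_service.list(search='name=ironic-0')[0]
vm_service = vms_service.vm_service(vm.id)

def wait_for_status(target, timeout=120):
    # Poll the engine until the VM reaches the target status.
    deadline = time.time() + timeout
    while time.time() < deadline:
        if vm_service.get().status == target:
            return
        time.sleep(2)
    raise RuntimeError('timed out waiting for %s' % target)

# Split the reboot into off, then on, waiting for each transition,
# instead of issuing a single fire-and-forget call.
vm_service.stop()
wait_for_status(types.VmStatus.DOWN)
vm_service.start()
wait_for_status(types.VmStatus.UP)
connection.close()

If the driver instead issues a single call and returns immediately, ironic would record "power on by rebooting" right away, as in the log above, without the guest necessarily cycling.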

Comment 4 Derek Higgins 2018-10-04 09:49:45 UTC
Using the same package versions mentioned above 

python-ovirt-engine-sdk4-4.2.6-1.el7ev.x86_64
openstack-ironic-staging-drivers-0.9.0-4.el7ost.noarch

oVirt version 4.2.6.4-1.el7

and following the steps described, the node successfully reboots (multiple attempts). The node remains in the "active" provision state during the reboot:

| UUID                                 | Name     | Instance UUID                        | Power State | Provisioning State | Maintenance |
| aae6b830-4bde-4e90-a98d-df07d3923824 | ironic-0 | 0969cbca-ad8c-42d9-86c3-4c77ca145bf6 | power on    | active             | False       |

@mlammon, if you were looking at the output of "openstack baremetal node list", it wouldn't have shown the node rebooting; was this the case? Or were you ssh'd onto the node?

Comment 5 Derek Higgins 2018-10-04 14:40:59 UTC
Closing as we have no current reproducer; will reopen if it's observed happening again.