Bug 1374995 - [RHV 4.0] Provision VM ends up with "Validating New Vm" endless retries
Summary: [RHV 4.0] Provision VM ends up with "Validating New Vm" endless retries
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat CloudForms Management Engine
Classification: Red Hat
Component: Providers
Version: 5.7.0
Hardware: Unspecified
OS: Unspecified
unspecified
urgent
Target Milestone: GA
: 5.7.0
Assignee: Piotr Kliczewski
QA Contact: Ilanit Stein
URL:
Whiteboard: rhev
Depends On: 1359139
Blocks: 1376528
TreeView+ depends on / blocked
 
Reported: 2016-09-11 15:03 UTC by Ilanit Stein
Modified: 2020-02-14 17:56 UTC (History)
5 users (show)

Fixed In Version: 5.7.0.2
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1376528 (view as bug list)
Environment:
Last Closed: 2017-01-11 19:55:49 UTC
Category: ---
Cloudforms Team: ---
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
evm.log (3.05 MB, application/x-gzip)
2016-09-11 15:20 UTC, Ilanit Stein
no flags Details
CFME_rhevm.log (279.12 KB, application/x-gzip)
2016-09-11 15:31 UTC, Ilanit Stein
no flags Details
RHV_4.0.4_engine.log (490.54 KB, application/x-gzip)
2016-09-11 15:35 UTC, Ilanit Stein
no flags Details
RHV_4.0.3_engine.log (142.10 KB, application/x-gzip)
2016-09-11 15:38 UTC, Ilanit Stein
no flags Details


Links
System ID Private Priority Status Summary Last Updated
Github ManageIQ manageiq pull 11285 0 None closed Vm provision handle ems_ref properly 2020-11-30 02:06:05 UTC

Description Ilanit Stein 2016-09-11 15:03:09 UTC
Description of problem:

Provision VM from template, from CF side ends up with VM created on RHV side,
but for this VM there is no request for start from CF side.
The Request seem remain 'Active', and has "Validating New Vm" endless retries.

Last Message: "[EVM] VM [<vm name>] Step [CheckProvisioned] Status [Creating VM] Message [Validating New Vm] Current Retry Number [47]" 

Version-Release number of selected component (if applicable):
CFME 5.7.0.0.20160906172503_12f65cb
RHV-4.0.4.2-0.1.el7ev

How reproducible:
Tried to provision VM 3 times, and got the same result.

Steps to Reproduce:
1.Provision VM from template on RHV-4.0.4

Actual results:
VM is created but the request seem to be stuck in "Validating New Vm" endless retries. 

Expected results:
VM should be created and started in a reasonable time of several minutes.

Additional info:

For RHEVM-3.6, from CF, Provision VM from a template, ends up successfully. A VM is created, and also started. Request's last message: [EVM] VM [test_prov_boot] Provisioned Successfully,

For RHV-4.0.3-0.1.el7ev, From CF, provision VM from template, ends up with ERROR:
[EVM] VM [test_prov] Step [CheckProvisioned] Status [[Ovirt::MissingResourceError]: Ovirt::MissingResourceError] Message [[Ovirt::MissingResourceError]: Ovirt::MissingResourceError]
VM from template is created, but CFME do not send a request to start it.

For the same template, from RHV UI, create VM from template, add nic, and start works fine.

Comment 2 Ilanit Stein 2016-09-11 15:20:14 UTC
Created attachment 1199885 [details]
evm.log

* RHEVM-3.6 successful provision:
Request ID 1000000000001
[EVM] VM [test_prov_boot] Provisioned Successfully
Created On Sun, 11 Sep 2016 03:51:18 +0000
Last Update Sun, 11 Sep 2016 03:54:15 +0000
Completed Sun, 11 Sep 2016 03:54:15 +0000

* RHV-4.0.3 
Request ID 1000000000002
[EVM] VM [test_prov] Step [CheckProvisioned] Status [[Ovirt::MissingResourceError]: Ovirt::MissingResourceError] Message [[Ovirt::MissingResourceError]: Ovirt::MissingResourceError]
Created On Sun, 11 Sep 2016 07:05:13 +0000
Last Update Sun, 11 Sep 2016 07:06:51 +0000
Completed Sun, 11 Sep 2016 07:05:52 +0000


* RHV-4.0.4
Request ID 1000000000003
Last Message [EVM] VM [istein] Step [CheckProvisioned] Status [State=<CheckProvisioned> running raised exception: <number of retries <101> exceeded maximum of <100>>] Message [Validating New Vm]
Created On Sun, 11 Sep 2016 09:09:17 +0000
Last Update Sun, 11 Sep 2016 11:13:51 +0000
Completed Sun, 11 Sep 2016 11:13:51 +0000

(istein  VM has been deleted manually, while there were retries like:
Last Message [EVM] VM [istein] Step [CheckProvisioned] Status [Creating VM] Message [Validating New Vm] Current Retry Number [83] ....)

VMs istein, istein, istein_1, istein_2 (Request IDs: 1000000000004, 1000000000005, 1000000000005) all failed the same.

Comment 3 Ilanit Stein 2016-09-11 15:31:24 UTC
Created attachment 1199886 [details]
CFME_rhevm.log

Comment 4 Ilanit Stein 2016-09-11 15:35:37 UTC
Created attachment 1199887 [details]
RHV_4.0.4_engine.log

Comment 5 Ilanit Stein 2016-09-11 15:38:31 UTC
Created attachment 1199888 [details]
RHV_4.0.3_engine.log

Comment 6 Ilanit Stein 2016-09-15 05:57:35 UTC
I see same problem with versions: 
RHV-4.0.3-0.1.el7ev
CFME-5.6.1.2.20160810181333_8ba817b
VM is created, provision fail on CFME side, and VM is not run. Running it manually works fine.

Comment 7 Piotr Kliczewski 2016-09-15 09:11:19 UTC
Here are errors in the logs:

I see two types of issues on manageiq sied:

1. User info fetch fails

evm.log

[----] E, [2016-09-11T10:21:29.382488 #13495:12d1148] ERROR -- : <RHEVM> Ovirt::Service#parse_error_response Return from URL: <https://jenkins-vm-09.scl.lab.tlv.redhat.com/ovirt-engine/api/users/4fa6316f-4f28-4ed1-b454-37183141f4eb> Data:
[----] E, [2016-09-11T10:21:29.383134 #13495:12d1148] ERROR -- : MIQ(MiqQueue#deliver) Message id: [1000000007193], Error: [Ovirt::MissingResourceError]
[----] E, [2016-09-11T10:21:29.383430 #13495:12d1148] ERROR -- : [Ovirt::MissingResourceError]: Ovirt::MissingResourceError  Method:[rescue in deliver]
[----] E, [2016-09-11T10:21:29.383524 #13495:12d1148] ERROR -- : /opt/rh/cfme-gemset/gems/ovirt-0.13.0/lib/ovirt/service.rb:311:in `parse_error_response'
/opt/rh/cfme-gemset/gems/ovirt-0.13.0/lib/ovirt/service.rb:282:in `block in resource_verb'
/opt/rh/cfme-gemset/gems/rest-client-2.0.0/lib/restclient/request.rb:858:in `process_result'
/opt/rh/cfme-gemset/gems/rest-client-2.0.0/lib/restclient/request.rb:776:in `block in transmit'
/opt/rh/rh-ruby23/root/usr/share/ruby/net/http.rb:853:in `start'
/opt/rh/cfme-gemset/gems/rest-client-2.0.0/lib/restclient/request.rb:766:in `transmit'
/opt/rh/cfme-gemset/gems/rest-client-2.0.0/lib/restclient/request.rb:215:in `execute'
/opt/rh/cfme-gemset/gems/rest-client-2.0.0/lib/restclient/request.rb:52:in `execute'
/opt/rh/cfme-gemset/gems/rest-client-2.0.0/lib/restclient/resource.rb:51:in `get'
/opt/rh/cfme-gemset/gems/ovirt-0.13.0/lib/ovirt/service.rb:277:in `resource_verb'
/opt/rh/cfme-gemset/gems/ovirt-0.13.0/lib/ovirt/service.rb:251:in `resource_get'
/opt/rh/cfme-gemset/gems/ovirt-0.13.0/lib/ovirt/base.rb:231:in `find_by_href'
/var/www/miq/vmdb/app/models/manageiq/providers/redhat/infra_manager/event_parser.rb:38:in `block in event_to_hash'

rhevm.log

[----] E, [2016-09-11T10:21:29.382612 #13495:12d1148] ERROR -- : Ovirt::Service#parse_error_response Return from URL: <https://jenkins-vm-09.scl.lab.tlv.redhat.com/ovirt-engine/api/users/4fa6316f-4f28-4ed1-b454-37183141f4eb> Data:


2, Vm fetch fails

evm.log

[----] E, [2016-09-11T10:28:32.544122 #39407:12d1148] ERROR -- : MIQ(ManageIQ::Providers::Redhat::InfraManager::Refresher#refresh) EMS: [vm09], id: [1000000000003] Refresh failed
[----] E, [2016-09-11T10:28:32.544314 #39407:12d1148] ERROR -- : [NoMethodError]: undefined method `detect' for nil:NilClass  Method:[rescue in block in refresh]
[----] E, [2016-09-11T10:28:32.544487 #39407:12d1148] ERROR -- : /var/www/miq/vmdb/app/models/manageiq/providers/redhat/infra_manager/refresh_parser.rb:223:in `block in host_inv_to_switch_hashes'
/var/www/miq/vmdb/app/models/manageiq/providers/redhat/infra_manager/refresh_parser.rb:220:in `each'
/var/www/miq/vmdb/app/models/manageiq/providers/redhat/infra_manager/refresh_parser.rb:220:in `host_inv_to_switch_hashes'
/var/www/miq/vmdb/app/models/manageiq/providers/redhat/infra_manager/refresh_parser.rb:103:in `block in host_inv_to_hashes'
/var/www/miq/vmdb/app/models/manageiq/providers/redhat/infra_manager/refresh_parser.rb:78:in `each'
/var/www/miq/vmdb/app/models/manageiq/providers/redhat/infra_manager/refresh_parser.rb:78:in `host_inv_to_hashes'
/var/www/miq/vmdb/app/models/manageiq/providers/redhat/infra_manager/refresh_parser.rb:12:in `ems_inv_to_hashes'


rhevm.log

[----] E, [2016-09-11T10:28:32.127919 #39407:bd7b56c] ERROR -- : Ovirt::Service#parse_error_response Return from URL: <https://jenkins-vm-09.scl.lab.tlv.redhat.com/ovirt-engine/api/vms/08b65709-173c-44a5-851d-e760abda521d> Data:


I see both issues occurring from time to time


In the engine log I see two types of failures:

2016-09-10 10:12:52,023 ERROR [org.ovirt.engine.core.sso.utils.SsoUtils] (default task-15) [] OAuthException invalid_scope: The requested scope '[ovirt-ext=revoke:revoke-all]' is invalid, unknown, malformed, or exceeds the scope granted by the resource owner.
2016-09-10 10:12:52,192 ERROR [org.ovirt.engine.core.sso.utils.SsoUtils] (default task-12) [] OAuthException invalid_scope: The requested scope '[ovirt-ext=revoke:revoke-all]' is invalid, unknown, malformed, or exceeds the scope granted by the resource owner.

and

2016-09-10 10:33:17,422 ERROR [org.ovirt.engine.api.restapi.util.LinkHelper] (default task-63) [] Can't find relative path for class "org.ovirt.engine.api.resource.VmDisksResource", will return null
2016-09-10 10:33:17,423 ERROR [org.ovirt.engine.api.restapi.util.LinkHelper] (default task-63) [] Can't find relative path for class "org.ovirt.engine.api.resource.VmDisksResource", will return null
2016-09-10 10:33:17,508 ERROR [org.ovirt.engine.api.restapi.util.LinkHelper] (default task-28) [] Can't find relative path for class "org.ovirt.engine.api.resource.VmDisksResource", will return null
2016-09-10 10:33:17,508 ERROR [org.ovirt.engine.api.restapi.util.LinkHelper] (default task-28) [] Can't find relative path for class "org.ovirt.engine.api.resource.VmDisksResource", will return null
2016-09-10 10:33:17,553 ERROR [org.ovirt.engine.api.restapi.util.LinkHelper] (default task-60) [] Can't find relative path for class "org.ovirt.engine.api.resource.VmDisksResource", will return null
2016-09-10 10:33:17,553 ERROR [org.ovirt.engine.api.restapi.util.LinkHelper] (default task-60) [] Can't find relative path for class "org.ovirt.engine.api.resource.VmDisksResource", will return null
2016-09-10 10:33:17,694 ERROR [org.ovirt.engine.api.restapi.util.LinkHelper] (default task-61) [] Can't find relative path for class "org.ovirt.engine.api.resource.TemplateDisksResource", will return null
2016-09-10 10:33:17,694 ERROR [org.ovirt.engine.api.restapi.util.LinkHelper] (default task-61) [] Can't find relative path for class "org.ovirt.engine.api.resource.TemplateDisksResource", will return null

Still investigating

Comment 8 Piotr Kliczewski 2016-09-15 12:37:35 UTC
This issue is related to BZ #1359139 and it was fixed in 4.0.4.

There is still issue with stating provisioned vm. When vm is stored in db using wrong url:

[----] I, [2016-09-15T08:14:35.945656 #1845:e85990]  INFO -- : MIQ(EmsRefresh.save_vms_inventory) EMS: [rhv-40-ipinto], id: [1000000000005] Updating Vm [istein] id: [1000000000069] location: [6b93cb29-4978-45e5-b6bd-e0b2276e0211.ovf] storage id: [1000000000037] uid_ems: [6b93cb29-4978-45e5-b6bd-e0b2276e0211] ems_ref: [/api/vms/6b93cb29-4978-45e5-b6bd-e0b2276e0211]


and when we look for specific vm in db we fail because we use different url:

[----] I, [2016-09-15T08:14:47.582551 #3143:e85990]  INFO -- : Q-task_id([miq_provision_1000000000003]) MIQ(ManageIQ::Providers::Redhat::InfraManager::Provision#poll_destination_in_vmdb) Unable to find Vm [istein] with ems_ref [/ovirt-engine/api/vms/6b93cb29-4978-45e5-b6bd-e0b2276e0211], will retry

Comment 10 Satoe Imaishi 2016-10-10 12:56:32 UTC
PR: https://github.com/ManageIQ/manageiq/pull/11285

Comment 11 Ilanit Stein 2016-10-13 07:34:43 UTC
Verified on CFME-5.7.0.3.20160927165516_075d0f3
and RHV-4.0.4.2-0.1.el7ev


Note You need to log in before you can comment on or make changes to this bug.