Bug 1637955 - Satellite fails to create VMs on RHV system based on a template.
Summary: Satellite fails to create VMs on RHV system based on a template.
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Satellite 6
Classification: Red Hat
Component: Compute Resources - RHEV
Version: 6.3.3
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: 6.5.0
Assignee: Ivan Necas
QA Contact: Lukáš Hellebrandt
URL:
Whiteboard:
Keywords: Regression, Triaged
Depends On:
Blocks:
Reported: 2018-10-10 11:54 UTC by Roman Hodain
Modified: 2019-05-20 10:01 UTC
Clone Of:
Clones: 1661044
Last Closed: 2019-05-14 12:38:12 UTC


Attachments
preallocate_disk patch (943 bytes, patch)
2018-10-18 13:47 UTC, Kenny Tordeurs


External Trackers
Tracker ID Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2019:1222 None None None 2019-05-14 12:38 UTC
Foreman Issue Tracker 25225 None None None 2018-10-17 19:53 UTC
Red Hat Knowledge Base (Solution) 3646291 None None None 2018-10-10 12:30 UTC

Description Roman Hodain 2018-10-10 11:54:08 UTC
Description of problem:
When Satellite creates a VM based on an RHV template with additional disks that reside on a different RHV storage domain than the template, the VM creation fails with an HTTP 404 error. This also happens for templates with no disks (configuration templates).

Version-Release number of selected component (if applicable):
6.3.4

How reproducible:
100%

Steps to Reproduce:
1. Create a template on RHV with no disks
2. Provision a VM based on this template from the Satellite with one disk

Actual results:
HTTP 404

Expected results:
VM is created

Additional info:
The issue is caused by the following or similar call:
https://RHV-M-FQDN/ovirt-engine/api/v3/vms

data:
<?xml version="1.0"?>
<vm>
  <name>TestVM</name>
  <description/>
  <template id="e3636a0e-68af-4478-a197-cb5da8e28f32"/>
  <quota id="5bb76c0e-025d-004e-0343-00000000012d"/>
  <cluster id="5bb76bf3-0189-0293-03a4-000000000392"/>
  <type>Server</type>
  <memory>4294967296</memory>
  <cpu>
    <topology cores="2" sockets="1"/>
  </cpu>
  <disks>
    <clone>true</clone>
    <disk id="">
      <storage_domains>
        <storage_domain id="84485ec1-873f-402b-83a9-ba9c2a0d83cb"/>
      </storage_domains>
      <format>raw</format>
      <sparse>false</sparse>
    </disk>
  </disks>
</vm>

The problem is that there are no disks at the time of the VM creation, so the disks cannot be attached. The disks should be attached later, after they are created: the VM is initially created without any disks, and the disks are attached afterwards.

I expect the problem to start in lib/client/vm_api.rb, in the method Client.process_vm_opts, where we set the clone flag for templates on different SDs. If clone is set to true, we generate the "clone" XML entities in lib/ovirt/vm.rb in the method VM.to_xml. The problem is that we do this not only for disks that are part of the template, but also for disks that were defined in Satellite. Those should not be included in the VM creation REST API call.
Keep in mind the special case where the template does not have any disks.
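
A minimal sketch of the separation described above, in plain Ruby. The helper names (split_volumes, disks_xml) and the hash layout are assumptions made for illustration; they are not the rbovirt or Foreman code. The idea is that only disks carrying a template disk id go into the VM creation request, and everything else is attached once the VM exists.

```ruby
# Hypothetical helper, not part of rbovirt: split the requested volumes into
# disks inherited from the template (they carry an id) and new disks defined
# only in Satellite (blank or missing id).
def split_volumes(volumes)
  volumes.partition { |v| !v[:id].to_s.empty? }
end

# Hypothetical helper: build the <disks> fragment of the VM creation request
# from template disks only. Returns an empty string when there is nothing to
# clone, which also covers a template without any disks.
def disks_xml(template_disks)
  return "" if template_disks.empty?

  entries = template_disks.map do |d|
    %(  <disk id="#{d[:id]}">\n) +
      %(    <storage_domains>\n) +
      %(      <storage_domain id="#{d[:storage_domain]}"/>\n) +
      %(    </storage_domains>\n) +
      %(  </disk>\n)
  end
  "<disks>\n  <clone>true</clone>\n#{entries.join}</disks>\n"
end

# The case from this report: a template with no disks plus one new disk.
volumes = [
  { id: "", storage_domain: "84485ec1-873f-402b-83a9-ba9c2a0d83cb" }
]
template_disks, new_disks = split_volumes(volumes)
fragment = disks_xml(template_disks)
puts fragment.empty? ? "no <disks> element sent" : fragment
puts "#{new_disks.size} disk(s) to attach after the VM exists"
```

With the request body from this report, the fragment would be empty, so the creation call would carry no disk with an empty id.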

Comment 2 Kenny Tordeurs 2018-10-10 12:36:59 UTC
This is a regression as this worked fine on Satellite 6.2.x and only fails after upgrade to Satellite 6.3.3

Comment 7 Shira Maximov 2018-10-17 11:09:07 UTC
(In reply to Roman Hodain from comment #0)


Roman, I tried to reproduce the bug with your instructions on both Satellite 6.3 and 6.4, with oVirt 4.1.

Unfortunately, I wasn't able to reproduce the bug on either setup.

If I understand correctly, the client tries to create the host with the `blank` template, and the blank template is not located on any storage domain.
So I believe that this error is unrelated to the disks.

So here are some questions:
0. Which oVirt version are you using?
1. Are you able to create a host with no disk at all (Blank template)?
2. The template parameter is missing from the XML you provided. Can you add it and try to perform a REST POST command with the XML you provided?

Thanks.

Comment 8 Jaroslav Spanko 2018-10-17 17:47:00 UTC
Hi Shira
Thanks a lot for the BJ session, hopefully I gave you all the information.
If you need anything, please let us know :)

Comment 9 Ivan Necas 2018-10-17 19:53:48 UTC
Created redmine issue https://projects.theforeman.org/issues/25225 from this bug

Comment 10 pm-sat@redhat.com 2018-10-17 20:06:53 UTC
Upstream bug assigned to inecas@redhat.com

Comment 11 pm-sat@redhat.com 2018-10-17 20:06:58 UTC
Upstream bug assigned to inecas@redhat.com

Comment 12 Ivan Necas 2018-10-17 20:28:25 UTC
I was able to reproduce the issue and here is a proposal to fix this issue: https://github.com/theforeman/foreman/pull/6152

For background, this issue seems to have been introduced by https://bugzilla.redhat.com/show_bug.cgi?id=1399102, where we fixed the 'preallocated' flag for templates with a disk (to clone them). Unfortunately, we didn't account for the source disk not being there.

The patch seemed to help during my tests with a template without disks, adding some disks.

Also, for a template with some disk, when not adding additional disks, everything seemed fine. There still seems to be an issue in the case where there is one existing disk and one new one, with a 409 conflict on the following API call:

RestClient.post "https://rhv.example.com/ovirt-engine/api/v3/vms/b24466bf-6104-45a3-9228-dd3b5124a87e/disks", "<disk>\n  <storage_domains>\n    <storage_domain                   id=\"b76b3d3b-59fe-4730-9f3e-4f1ef9b90da8\"/>\n  </storage_domains>\n  <size>1073741824</size>\n  <type>data</type>\n  <bootable>false</bootable>\n  <interface>virtio</interface>\n  <format>raw</format>\n        <sparse>false</sparse>\n  <quota id=\"5bb76c0e-025d-004e-0343-00000000012d\"/>\n</disk>", "Accept"=>"application/xml", "Accept-Encoding"=>"gzip, deflate", "Authorization"=>"Basic                                  YWRtaW5AaW50ZXJuYWw6UmVkSGF0MSE=", "Content-Length"=>"327", "Content-Type"=>"application/xml", "User-Agent"=>"rest-client/2.0.0 (linux-gnu x86_64) ruby/2.3.6p384", "Version"=>"3"
1199 # => 409 Conflict | application/xml 179 bytes

I'm trying to get more data about the actual error now.

Comment 13 Ivan Necas 2018-10-17 21:05:12 UTC
So the issue with multiple disks now seems to be purely caused by the fact that the vm is still locked after the allocation, so adding additional disk fails:

409 Conflict | application/xml "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>\n<fault>\n    <reason>Operation Failed</reason>\n    <detail>[Cannot add Virtual Disk: VM is locked. Please try again in a few minutes.]</detail>\n</fault>\n" 179 bytes

I will try adding some additional check to make sure that the vm is ready before starting with adding additional disks.
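
As a rough illustration of such a readiness check, here is a plain-Ruby polling loop with a timeout. The helper name wait_until_unlocked, the VmStillLocked error and the defaults are assumptions for this sketch; the block stands in for whatever reports the VM lock state.

```ruby
# Hypothetical error raised when the VM never leaves the locked state.
class VmStillLocked < StandardError; end

# Hypothetical helper: poll until the block reports that the VM is no longer
# locked, or raise once the timeout has passed. The block must return true
# while the VM is still locked. The sleeper is injectable so the loop can be
# exercised without real delays.
def wait_until_unlocked(timeout: 60, interval: 2, sleeper: ->(s) { sleep(s) })
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    return true unless yield
    if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
      raise VmStillLocked, "VM still locked after #{timeout}s"
    end
    sleeper.call(interval)
  end
end

# Usage with a stand-in for the real lock check: locked, locked, unlocked.
states = [true, true, false]
wait_until_unlocked(interval: 0) { states.shift }
puts "VM unlocked, safe to add disks"
```

Raising on timeout instead of waiting forever keeps a provisioning run from hanging if the VM never becomes usable.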

Comment 14 Ivan Necas 2018-10-17 21:26:38 UTC
I suspect that the 409 issues I've seen are related to the test environment rather than to this bug. If the same issue is seen at the customer, we might try:

git diff app/models/compute_resources/foreman/model/ovirt.rb
diff --git a/app/models/compute_resources/foreman/model/ovirt.rb b/app/models/compute_resources/foreman/model/ovirt.rb
index 6402277..a491fcd 100644
--- a/app/models/compute_resources/foreman/model/ovirt.rb
+++ b/app/models/compute_resources/foreman/model/ovirt.rb
@@ -219,6 +219,7 @@ module Foreman::Model
       vm = super({ :first_boot_dev => 'network', :quota => ovirt_quota }.merge(args))
 
       begin
+        vm.wait_for { !vm.locked? }
         create_interfaces(vm, args[:interfaces_attributes])
         create_volumes(vm, args[:volumes_attributes])
       rescue => e


On the reproducer environment, the VM eventually got lost and we saw an error of 'Failed to complete VM marta-amancio.sysmgmt.lan creation.', so the wait_for eventually ended up with a 404 and didn't help much.

Anyway, I recommend trying https://github.com/theforeman/foreman/pull/6152 for now.

Comment 20 Kenny Tordeurs 2018-10-18 13:47:27 UTC
This would be the patch:


# gendiff /usr/share/foreman/app/models/compute_resources/foreman/model/ .bkp
diff -up /usr/share/foreman/app/models/compute_resources/foreman/model/ovirt.rb.bkp /usr/share/foreman/app/models/compute_resources/foreman/model/ovirt.rb
--- /usr/share/foreman/app/models/compute_resources/foreman/model/ovirt.rb.bkp	2018-10-18 09:58:32.584674354 +0200
+++ /usr/share/foreman/app/models/compute_resources/foreman/model/ovirt.rb	2018-10-18 15:44:43.119310332 +0200
@@ -229,7 +229,7 @@ module Foreman::Model
     end
 
     def preallocate_disks(args)
-      change_allocation_volumes = args[:volumes_attributes].values.select{ |x| x[:preallocate] == '1' }
+      change_allocation_volumes = args[:volumes_attributes].values.select{ |x| x[:id].present? && x[:preallocate] == '1' }
       if args[:template].present? && change_allocation_volumes.present?
         disks = change_allocation_volumes.map do |volume|
           { :id => volume[:id], :sparse => 'false', :format => 'raw', :storagedomain => volume[:storage_domain] }


To apply the patch:
- Copy the patch to the Satellite server
- # patch -d /usr/share/foreman/app/models/compute_resources/foreman/model/ -p9 < preallocate_disk.patch

To revert the patch:
- # patch -d /usr/share/foreman/app/models/compute_resources/foreman/model/ -p9 < preallocate_disk.patch
- Answer y when patch detects the already applied patch and asks whether to assume -R
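
To see what the changed line in the patch above does, here is a small standalone sketch. The present? helper is a plain-Ruby stand-in for the ActiveSupport method so that it runs without Rails, and the volume ids and the method name volumes_to_preallocate are made up for illustration.

```ruby
# Plain-Ruby stand-in for ActiveSupport's present? (assumption: nil and
# blank strings count as "not present", which is all this sketch needs).
def present?(value)
  !(value.nil? || value.to_s.strip.empty?)
end

# Same selection as the patched line: only volumes that already have an id
# (i.e. come from the template) and are marked for preallocation.
def volumes_to_preallocate(volumes_attributes)
  volumes_attributes.values.select { |x| present?(x[:id]) && x[:preallocate] == '1' }
end

# Illustrative input; the ids are hypothetical.
volumes_attributes = {
  '0' => { :id => 'disk-from-template', :preallocate => '1' }, # template disk
  '1' => { :id => '',                   :preallocate => '1' }, # new disk defined in Satellite
  '2' => { :id => 'thin-template-disk', :preallocate => '0' }  # template disk kept thin
}

p volumes_to_preallocate(volumes_attributes).map { |v| v[:id] }
```

This prints ["disk-from-template"]. Without the id check, the new disk with the empty id would also be selected and would end up as <disk id=""> in the VM creation request.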

Comment 21 Kenny Tordeurs 2018-10-18 13:47 UTC
Created attachment 1495298 [details]
preallocate_disk patch

Comment 31 pm-sat@redhat.com 2018-10-21 10:06:42 UTC
Moving this bug to POST for triage into Satellite 6 since the upstream issue https://projects.theforeman.org/issues/25225 has been resolved.

Comment 32 Sigbjorn Lie 2018-12-20 21:22:38 UTC
The patch we received for this bug was a 1 line code fix. Why delay the target milestone until 6.5.0? 
Should be easy enough for both a z-stream patch for 6.3 and 6.4...?

Comment 33 Lukáš Hellebrandt 2019-01-10 12:14:57 UTC
Verified with Sat 6.5 snap 10 and RHEV 4.2.7.5-0.1.

Tried (template_without_disk, template_with_one_disk) * (additional_preallocated_disk, additional_thin_disk). All machines provisioned successfully.
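
For reference, the four combinations covered by the product above can be enumerated like this; the strings are just the labels from this comment, not identifiers from any test suite.

```ruby
# Labels taken from the verification comment; the variable names are illustrative.
templates   = %w[template_without_disk template_with_one_disk]
extra_disks = %w[additional_preallocated_disk additional_thin_disk]

# Cartesian product: every template variant with every additional disk type.
matrix = templates.product(extra_disks)
matrix.each { |template, disk| puts "#{template} + #{disk}" }
```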

Comment 36 errata-xmlrpc 2019-05-14 12:38:12 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2019:1222

Comment 38 Ivan Necas 2019-05-17 07:29:12 UTC
The fix for this bug has been delivered in 6.4.2, see the cloned BZ https://bugzilla.redhat.com/show_bug.cgi?id=1661044

