Bug 765846 - Submit VM job - doesn't work
Summary: Submit VM job - doesn't work
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: cumin
Version: 2.1
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: 2.1.1
Target Release: ---
Assignee: Chad Roberts
QA Contact: Stanislav Graf
URL:
Whiteboard:
Depends On:
Blocks: 765607
Reported: 2011-12-09 14:50 UTC by Stanislav Graf
Modified: 2012-08-15 10:05 UTC (History)

Fixed In Version: cumin-0.1.5180-1
Doc Type: Bug Fix
Doc Text:
Consequence: Submitting a VM job from within cumin appeared to succeed in cumin, but the VM failed to start. Cause: The job classad produced by cumin was slightly outdated and still used VMPARAM_Kvm_Disk instead of VMPARAM_vm_Disk. Fix: Cumin now builds the job classad with VMPARAM_vm_Disk. Result: VM jobs submitted from within cumin work again.
Clone Of:
Environment:
Last Closed: 2012-02-06 18:19:05 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Bugzilla | 765894 | 0 | low | CLOSED | Default VM submission method is KVM, but it is not documented | 2021-02-22 00:41:40 UTC
Red Hat Bugzilla | 782054 | 0 | high | CLOSED | VM without VNC console doesn't start | 2021-02-22 00:41:40 UTC
Red Hat Bugzilla | 782839 | 0 | low | CLOSED | Cumin should report changes in condor pool as INFO (instead ERROR) | 2021-02-22 00:41:40 UTC
Red Hat Bugzilla | 783398 | 0 | unspecified | CLOSED | Submitting of XEN jobs doesn't work in cumin | 2021-02-22 00:41:40 UTC
Red Hat Bugzilla | 848344 | 0 | low | CLOSED | Problem submitting jobs from cumin via Aviary when commands have no arguments | 2021-02-22 00:41:40 UTC
Red Hat Product Errata | RHSA-2012:0100 | 0 | normal | SHIPPED_LIVE | Moderate: MRG Grid security, bug fix, and enhancement update | 2012-02-06 23:15:47 UTC

Internal Links: 765894 782054 782839 783398 848344

Description Stanislav Graf 2011-12-09 14:50:37 UTC
Description of problem:
I tried to submit a VM (KVM) job from the cumin UI and it doesn't work.

The job went into the held state with reason:
HoldReasonError from slot1@server-name: VMGAHP_ERR_JOBCLASSAD_XEN_NO_DISK_PARAM

This is the log from cumin's web.log:
15786 2011-12-09 15:37:23,652 INFO Request GET /index.update?session=index.html%3Fframe%3Dmain.grid%3Bmain.m%3Dgrid%3Bmain.grid.id%3D5%3Bmain.grid.view.body.m%3Dpool_submissions;widget=main.tasks;widget=main.grid.view.body.pool_submissions.table
15786 2011-12-09 15:37:23,690 INFO Response 200 OK
15786 2011-12-09 15:37:23,691 DEBUG Response headers:
15786 2011-12-09 15:37:23,692 DEBUG   Content-Length            6753
15786 2011-12-09 15:37:23,692 DEBUG   Content-Type              text/xml
15786 2011-12-09 15:37:23,693 DEBUG   Cache-Control             no-cache
15786 2011-12-09 15:37:23,994 INFO Request POST /form.html?
15786 2011-12-09 15:37:24,005 DEBUG Validating cumin.grid.submission.VmJobSubmitForm('modes.VmJobSubmit')
15786 2011-12-09 15:37:24,015 DEBUG Starting cumin.grid.submission.VmJobSubmit
15786 2011-12-09 15:37:24,016 INFO Started cumin.grid.submission.VmJobSubmit
15786 2011-12-09 15:37:24,017 DEBUG Job ad:
15786 2011-12-09 15:37:24,017 DEBUG   !!descriptors                       {'RequestMemory': 'com.redhat.grid.Expression', 'Requirements': 'com.redhat.grid.Expression'}
15786 2011-12-09 15:37:24,018 DEBUG   Cmd                                 '/var/lib/libvirt/images/testvm.img'
15786 2011-12-09 15:37:24,018 DEBUG   DiskUsage                           0
15786 2011-12-09 15:37:24,019 DEBUG   Iwd                                 '/tmp'
15786 2011-12-09 15:37:24,019 DEBUG   JobUniverse                         13
15786 2011-12-09 15:37:24,020 DEBUG   JobVMCheckpoint                     False
15786 2011-12-09 15:37:24,021 DEBUG   JobVMMemory                         512
15786 2011-12-09 15:37:24,021 DEBUG   JobVMNetworking                     False
15786 2011-12-09 15:37:24,022 DEBUG   JobVMType                           'kvm'
15786 2011-12-09 15:37:24,022 DEBUG   JobVM_VCPUS                         1
15786 2011-12-09 15:37:24,023 DEBUG   Owner                               'cumin'
15786 2011-12-09 15:37:24,023 DEBUG   RequestDisk                         5242880
15786 2011-12-09 15:37:24,024 DEBUG   RequestMemory                       'ceiling(ifThenElse(JobVMMemory =!= undefined,JobVMMemory, ImageSize / 1024.000000))'
15786 2011-12-09 15:37:24,024 DEBUG   Requirements                        'VM_Type == "KVM" && Arch == "X86_64" && HasVM && VM_AvailNum > 0 && TotalDisk >= DiskUsage && TotalMemory >= 512 && VM_Memory >= 512'
15786 2011-12-09 15:37:24,025 DEBUG   ShouldTransferFiles                 'NEVER'
15786 2011-12-09 15:37:24,025 DEBUG   Submission                          'pokus01'
15786 2011-12-09 15:37:24,026 DEBUG   VMPARAM_Kvm_Disk                    '/var/lib/libvirt/images/testvm.img:vda:w'
15786 2011-12-09 15:37:24,595 DEBUG Exiting cumin.grid.submission.VmJobSubmit
15786 2011-12-09 15:37:24,596 INFO Exited cumin.grid.submission.VmJobSubmit
15786 2011-12-09 15:37:24,597 INFO Response 303 See Other
15786 2011-12-09 15:37:24,598 DEBUG Response headers:
15786 2011-12-09 15:37:24,598 DEBUG   Location                  index.html?frame=main.grid;main.m=grid;main.grid.id=5;main.grid.view.body.m=pool_submissions
15786 2011-12-09 15:37:24,663 DEBUG Method response for request 1323441368 received from Broker connected at: server-name:5672
15786 2011-12-09 15:37:24,664 DEBUG Response: OK (0) - {u'Id': 'server-name#13.0'}
15786 2011-12-09 15:37:24,664 DEBUG Ending cumin.grid.submission.VmJobSubmit
15786 2011-12-09 15:37:24,665 INFO Ended cumin.grid.submission.VmJobSubmit

Version-Release number of selected component (if applicable):
cumin-0.1.5098-2.el5.noarch

How reproducible:
100%

Steps to Reproduce:
1. Add the broker configuration to cumin.conf and start cumin
2. Try to submit a VM job using the VM image path provided by the server administrator
3. Check the job status/details
  
Actual results:
Job isn't running

Expected results:
Job is running

Additional info:

Comment 1 Luigi Toscano 2011-12-09 14:56:31 UTC
VMPARAM_Kvm_Disk was renamed to VMPARAM_vm_Disk at some point before 2.0.
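
To illustrate the rename, here is a minimal sketch in Python of mapping the legacy attribute name to the current one in a job ad held as a plain dictionary. The helper name modernize_job_ad and the mapping table are hypothetical and are not taken from cumin's code; the attribute names and the disk value come from the log above.

```python
# Hypothetical sketch, not cumin's actual implementation.
# Maps legacy job ad attribute names to their current names.
LEGACY_TO_CURRENT = {"VMPARAM_Kvm_Disk": "VMPARAM_vm_Disk"}


def modernize_job_ad(ad):
    """Return a copy of the job ad with legacy attribute names renamed."""
    return {LEGACY_TO_CURRENT.get(name, name): value for name, value in ad.items()}


# A trimmed-down version of the job ad shown in web.log.
old_ad = {
    "JobUniverse": 13,
    "JobVMType": "kvm",
    "VMPARAM_Kvm_Disk": "/var/lib/libvirt/images/testvm.img:vda:w",
}
new_ad = modernize_job_ad(old_ad)
```

The original dictionary is left untouched, so the old and new ads can be compared when debugging.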

Comment 2 Chad Roberts 2011-12-14 15:01:52 UTC
I think I need some more info to go on here.  I have not yet tried anything with a VM job.  Is there any special condor setup that I'll need to reproduce this properly?

Any chance that anyone has a file that I could run via condor_submit?

Where can I find the img file described in /var/lib/libvirt/images/testvm.img?  Better yet, what steps would I need to do to create an img of my own?

Based on Luigi's comment #1, I think this may wind up being a small fix, but it seems I will have some learning to do before I can fix it.

Thanks

Comment 3 Luigi Toscano 2011-12-14 16:20:20 UTC
(In reply to comment #2)
> I think I need some more info to go on here.  I have not yet tried anything
> with a VM job.  Is there any special condor setup that I'll need to reproduce
> this properly?

Basic configuration for virtualization support as documented here:
http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_MRG/2/html/Grid_User_Guide/chap-Grid_User_Guide-The_Virtual_Machine_Universe.html

> Any chance that anyone has a file that I could run via condor_submit?
Something like:
--------------------
Universe=vm
Log=log.$(cluster)
Executable=testvm
VM_TYPE=kvm
VM_MEMORY=512
VM_DISK=/var/lib/libvirt/images/testvm.job:vda:w
Queue
--------------------

> Where can I find the img file described in /var/lib/libvirt/images/testvm.img? 
> Better yet, what steps would I need to do to create an img of my own?
Any valid (raw or qcow2) image will work. 
virt-manager or simply virt-install can help to install a machine.
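
For anyone who wants to generate such a submit file for different images, a minimal Python sketch follows. The function render_vm_submit and its parameters are hypothetical names chosen for illustration; the submit keywords are the ones from the example above, and the disk is always attached writable (":w") as in that example.

```python
# Hypothetical helper for illustration; only the submit keywords come
# from the example submit file in this comment.
def render_vm_submit(image_path, vm_type="kvm", memory_mb=512,
                     device="vda", executable="testvm"):
    """Build the text of a VM universe submit description file."""
    lines = [
        "Universe=vm",
        "Log=log.$(cluster)",
        f"Executable={executable}",
        f"VM_TYPE={vm_type}",
        f"VM_MEMORY={memory_mb}",
        # Disk spec is file:device:permission, as in the example above.
        f"VM_DISK={image_path}:{device}:w",
        "Queue",
    ]
    return "\n".join(lines) + "\n"


submit_text = render_vm_submit("/var/lib/libvirt/images/testvm.img")
```

The resulting text can be written to a file and passed to condor_submit.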

Comment 7 Chad Roberts 2011-12-15 15:14:48 UTC
Fixed in revision 5178 on trunk.

Comment 8 Chad Roberts 2011-12-15 15:14:48 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Consequence:  Submitting a VM job from within cumin appeared to succeed in cumin, but the VM failed to start.

Cause:  The job classad produced by cumin was slightly outdated and still used VMPARAM_Kvm_Disk instead of VMPARAM_vm_Disk.

Fix:  Cumin now builds the job classad with VMPARAM_vm_Disk.

Result:  VM jobs submitted from within cumin work again.

Comment 10 Stanislav Graf 2012-01-17 21:25:24 UTC
I was able to test
cumin-0.1.5184-1.el6.noarch
condor-vm-gahp-7.6.5-0.11.el6.x86_64

Submitting a KVM guest job from cumin worked without extra parameters.

Still need to test on RHEL 5/6, i386/x86_64, with Xen and KVM where supported.

Comment 11 Stanislav Graf 2012-01-18 14:35:09 UTC
RHEL5 i386, cumin-0.1.5184-1.el5.noarch

KVM job OK 
notes: I filled in only the job description and the VM image location: /var/lib/libvirt/images/testvm.img

XEN job OK
notes:
job description and vm image location: /var/lib/xen/images/testvm.img
I probably hit Bug 765894, so I needed to add extra params:
Requirements = True
JobVMType = xen
VMPARAM_Xen_Kernel = included
VMPARAM_vm_Disk = /var/lib/xen/images/testvm.img:xvda:w
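
The extra params above are plain "Name = value" lines. A minimal Python sketch of reading such lines into a dictionary is given below; parse_extra_attrs is a hypothetical helper, not part of cumin, and it keeps every value as a string without attempting any ClassAd type handling.

```python
# Hypothetical helper for illustration; cumin's own parsing may differ.
def parse_extra_attrs(text):
    """Parse 'Name = value' lines into a dict of string values."""
    attrs = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the first '=' only, so values may contain '='.
        name, sep, value = line.partition("=")
        if not sep or not name.strip():
            raise ValueError("expected 'Name = value', got: %r" % line)
        attrs[name.strip()] = value.strip()
    return attrs


extra = """\
Requirements = True
JobVMType = xen
VMPARAM_Xen_Kernel = included
VMPARAM_vm_Disk = /var/lib/xen/images/testvm.img:xvda:w
"""
attrs = parse_extra_attrs(extra)
```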

Comment 15 Stanislav Graf 2012-01-19 11:46:41 UTC
RHEL5 x86_64, cumin-0.1.5184-1.el5.noarch
RHEL6 i386, cumin-0.1.5184-1.el6.noarch
RHEL6 x86_64, cumin-0.1.5184-1.el6.noarch

KVM job OK 
notes: I filled in only the job description and the VM image location:
/var/lib/libvirt/images/testvm.img

XEN job OK
notes:
job description and vm image location: /var/lib/xen/images/testvm.img
I probably hit Bug 765894, so I needed to add extra params:
Requirements = True
JobVMType = xen
VMPARAM_Xen_Kernel = included
VMPARAM_vm_Disk = /var/lib/xen/images/testvm.img:xvda:w

I also saw a lot of "errors" in cumin's web.log (Bug 782839).
They were apparently only informational messages from cumin about changes in the condor node. The web interface wasn't affected.

Comment 16 Stanislav Graf 2012-01-19 11:53:16 UTC
(In reply to comment #10)
> I was able to test
> cumin-0.1.5184-1.el6.noarch
> condor-vm-gahp-7.6.5-0.11.el6.x86_64
> 
> Submitting a KVM guest job from cumin worked without extra parameters.
> 
> Still need to test on RHEL 5/6, i386/x86_64, with Xen and KVM where supported.

During this test on my laptop I hit Bug 782054.

I did a proper verification (comment 11 and comment 15) and there was no such issue.

Comment 17 errata-xmlrpc 2012-02-06 18:19:05 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2012-0100.html

