Bug 1001152 - ovirt-hosted-engine setup fails to start engine VM because sanlock fails to acquire lock: No space left on device
Summary: ovirt-hosted-engine setup fails to start engine VM because sanlock fails to ...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-hosted-engine-setup
Version: 3.3.0
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: high
Target Milestone: ---
Target Release: 3.3.0
Assignee: Sandro Bonazzola
QA Contact: Leonid Natapov
URL:
Whiteboard: integration
Depends On:
Blocks:
 
Reported: 2013-08-26 14:51 UTC by Leonid Natapov
Modified: 2015-09-22 13:09 UTC
CC List: 12 users

Fixed In Version: ovirt-hosted-engine-setup-1.0.0-0.6.beta1.el6ev
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-01-21 16:52:35 UTC
oVirt Team: ---
Target Upstream Version:
Embargoed:


Attachments


Links
System ID: Red Hat Product Errata RHBA-2014:0083
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: new package: ovirt-hosted-engine-setup
Last Updated: 2014-01-21 21:42:22 UTC

Description Leonid Natapov 2013-08-26 14:51:53 UTC
Description of problem:

ovirt-hosted-engine setup fails to run the VM because sanlock fails to acquire a lock: No space left on device. When ovirt-hosted-engine setup creates the storage domain, it uses sanlock to lock the storage while the domain is being created and releases the lock right after that. But later on the VM tries to acquire a lock that doesn't exist anymore. The VM tries to do that because in the vm.conf file the protected key is set to TRUE.


Workaround:
Edit /etc/ovirt-hosted-engine/vm.conf and change protected from true to false.
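
For illustration, a minimal sketch of the change, assuming the lowercase key name used above (the rest of vm.conf is generated per deployment and will differ):

  before:  protected=true
  after:   protected=false

With protected=false the VM is started without requiring the sanlock lease that was already released after the storage domain was created.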

------------------------------------------------------------------------------
</lease>
        </devices>
        <os>
                <type arch="x86_64" machine="pc">hvm</type>
                <boot dev="network"/>
                <smbios mode="sysinfo"/>
        </os>
        <sysinfo type="smbios">
                <system>
                        <entry name="manufacturer">oVirt</entry>
                        <entry name="product">oVirt Node</entry>
                        <entry name="version">6Server-6.4.0.4.el6</entry>
                        <entry name="serial">4C4C4544-0032-4310-8051-C6C04F39354A</entry>
                        <entry name="uuid">436fbe14-1508-4887-be64-ce524ab93eb5</entry>
                </system>
        </sysinfo>
        <clock adjustment="0" offset="variable">
                <timer name="rtc" tickpolicy="catchup"/>
        </clock>
        <features>
                <acpi/>
        </features>
        <cpu match="exact">
                <model>qemu64</model>
                <feature name="svm" policy="disable"/>
        </cpu>
<on_poweroff>destroy</on_poweroff><on_reboot>destroy</on_reboot><on_crash>destroy</on_crash></domain>
Thread-44::DEBUG::2013-08-26 17:19:01,812::BindingXMLRPC::974::vds::(wrapper) client [127.0.0.1]::call vmSetTicket with ('436fbe14-1508-4887-be64-ce524ab93eb5', '9986iCpd', '10800', 'disconnect', {}) {}
Thread-44::ERROR::2013-08-26 17:19:01,812::BindingXMLRPC::993::vds::(wrapper) unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/BindingXMLRPC.py", line 979, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/BindingXMLRPC.py", line 240, in vmSetTicket
    return vm.setTicket(password, ttl, existingConnAction, params)
  File "/usr/share/vdsm/API.py", line 586, in setTicket
    return v.setTicket(password, ttl, existingConnAction, params)
  File "/usr/share/vdsm/vm.py", line 4122, in setTicket
    graphics = _domParseStr(self._dom.XMLDesc(0)).childNodes[0]. \
AttributeError: 'NoneType' object has no attribute 'XMLDesc'
Thread-42::DEBUG::2013-08-26 17:19:02,579::libvirtconnection::101::libvirtconnection::(wrapper) Unknown libvirterror: ecode: 38 edom: 42 level: 2 message: Failed to acquire lock: No space left on device
Thread-42::DEBUG::2013-08-26 17:19:02,579::vm::2039::vm.Vm::(_startUnderlyingVm) vmId=`436fbe14-1508-4887-be64-ce524ab93eb5`::_ongoingCreations released
Thread-42::ERROR::2013-08-26 17:19:02,579::vm::2065::vm.Vm::(_startUnderlyingVm) vmId=`436fbe14-1508-4887-be64-ce524ab93eb5`::The vm start process failed
Traceback (most recent call last):
  File "/usr/share/vdsm/vm.py", line 2025, in _startUnderlyingVm
    self._run()
  File "/usr/share/vdsm/vm.py", line 2920, in _run
    self._connection.createXML(domxml, flags),
  File "/usr/lib64/python2.6/site-packages/vdsm/libvirtconnection.py", line 76, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/libvirt.py", line 2645, in createXML
    if ret is None:raise libvirtError('virDomainCreateXML() failed', conn=self)
libvirtError: Failed to acquire lock: No space left on device
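
A note for anyone debugging the same failure: the "No space left on device" message here is sanlock's error surfaced through libvirt, not an actual out-of-disk condition on the host. Assuming the sanlock daemon is running on the host, its current state can be inspected with:

  sanlock client status

The output lists the lockspaces and resources the daemon currently holds; if nothing is listed for the hosted-engine storage domain, the lease the VM is configured to acquire (the <lease> device in the domain XML above) is no longer available, which matches the scenario described above.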

Comment 1 Oved Ourfali 2013-08-27 07:14:32 UTC
The right behaviour is probably to run the VM without sanlock protection in the hosted engine setup.
Once the setup is done, it shuts down the VM and lets the HA services run the engine.
At that point, the HA services should start the VM with sanlock protection.

Comment 2 Itamar Heim 2013-08-28 06:14:41 UTC
why? what if the hosted engine is an already installed engine, then you don't need to shut it down?

Comment 3 Oved Ourfali 2013-08-28 06:28:24 UTC
(In reply to Itamar Heim from comment #2)
> why? what if the hosted engine is an already installed engine, then you
> don't need to shut it down?

I also found this decision odd at first... basically the goal was to know that the HA service indeed kicks in and starts the hosted engine VM properly.
If the VM remains up at the end of the setup, then we have no indication whether the HA services are working properly.

Perhaps this is more relevant now, while we are developing the feature and until we get things stable, but I think there should be some way to make sure the HA service will take care of us in case of failure.

Doron - thoughts about that?

Comment 4 Itamar Heim 2013-08-29 06:17:46 UTC
I think I'm fine for now with "setup, then move to HA" to see that it works before we try anything more fancy.
But I still think we should do this with sanlock, so that an admin starting the HA service won't corrupt the VM, etc.

Comment 8 Leonid Natapov 2013-10-17 10:50:54 UTC
fixed.

Comment 9 Charlie 2013-11-28 01:19:54 UTC
This bug is currently attached to errata RHBA-2013:15257. If this change is not to be documented in the text for this errata please either remove it from the errata, set the requires_doc_text flag to 
minus (-), or leave a "Doc Text" value of "--no tech note required" if you do not have permission to alter the flag.

Otherwise to aid in the development of relevant and accurate release documentation, please fill out the "Doc Text" field above with these four (4) pieces of information:

* Cause: What actions or circumstances cause this bug to present.
* Consequence: What happens when the bug presents.
* Fix: What was done to fix the bug.
* Result: What now happens when the actions or circumstances above occur. (NB: this is not the same as 'the bug doesn't present anymore')

Once filled out, please set the "Doc Type" field to the appropriate value for the type of change made and submit your edits to the bug.

For further details on the Cause, Consequence, Fix, Result format please refer to:

https://bugzilla.redhat.com/page.cgi?id=fields.html#cf_release_notes 

Thanks in advance.

Comment 10 Sandro Bonazzola 2013-12-05 10:46:53 UTC
Hosted engine is a new package; it does not need errata text for specific bugs during its development.

Comment 11 errata-xmlrpc 2014-01-21 16:52:35 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2014-0083.html

