Bug 1810416 - VM with nfs disk fails to start after previewing ram snapshot
Summary: VM with nfs disk fails to start after previewing ram snapshot
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: BLL.Storage
Version: 4.4.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ovirt-4.4.2
Target Release: ---
Assignee: Milan Zamazal
QA Contact: Evelina Shames
URL:
Whiteboard:
Depends On: 1811728
Blocks:
Reported: 2020-03-05 08:21 UTC by Evelina Shames
Modified: 2020-09-18 07:13 UTC (History)

Fixed In Version: ovirt-engine-4.4.2.1
Clone Of:
Environment:
Last Closed: 2020-09-18 07:13:24 UTC
oVirt Team: Virt
Embargoed:
pm-rhel: ovirt-4.4+


Attachments
Logs (403.14 KB, application/zip)
2020-03-05 08:21 UTC, Evelina Shames
libvirt_logs (2.08 MB, application/zip)
2020-03-05 15:22 UTC, Evelina Shames
libvirt-logs2 (1.47 MB, application/zip)
2020-03-05 15:25 UTC, Evelina Shames
second time hit this issue vdsm log (5.93 MB, text/plain)
2020-03-08 16:16 UTC, Ilan Zuckerman


Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 108345 0 None MERGED spec: require libvirt with live snapshot fix 2020-08-10 16:18:23 UTC

Description Evelina Shames 2020-03-05 08:21:35 UTC
Created attachment 1667674 [details]
Logs

Description of problem:
A VM with an NFS disk fails to start after previewing or committing a RAM snapshot, with the following errors:

Engine:
2020-03-05 01:54:56,178+02 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ForkJoinPool-1-worker-15) [] EVENT_ID: VM_DOWN_ERROR(119), VM vm_TestCase5138_0501495714 is down with error. Exit message: Wake up from hibernation failed:(<Element 'disk' at 0x7ff6c8e0b228>, 'source').
2020-03-05 01:54:56,183+02 ERROR [org.ovirt.engine.core.vdsbroker.monitoring.VmsMonitoring] (ForkJoinPool-1-worker-15) [] Rerun VM '00d0568c-0f80-49fd-9c39-12b860c0f72f'. Called from VDS 'host_mixed_3'

VDSM:
2020-03-05 01:55:03,910+0200 ERROR (vm/00d0568c) [virt.vm] (vmId='00d0568c-0f80-49fd-9c39-12b860c0f72f') Operation failed (vm:4788)
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/vdsm/virt/vm.py", line 4777, in _updateVcpuTuneInfo
    self._vcpuTuneInfo = self._dom.schedulerParameters()
  File "/usr/lib/python3.6/site-packages/vdsm/virt/virdomain.py", line 101, in f
    ret = attr(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/vdsm/common/libvirtconnection.py", line 131, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/vdsm/common/function.py", line 94, in wrapper
    return func(inst, *args, **kwargs)
  File "/usr/lib64/python3.6/site-packages/libvirt.py", line 2247, in schedulerParameters
    if ret is None: raise libvirtError ('virDomainGetSchedulerParameters() failed', dom=self)
libvirt.libvirtError: Requested operation is not valid: cgroup CPU controller is not mounted

Version-Release number of selected component (if applicable):
ovirt-engine-4.4.0-0.24.master.el8ev.noarch
vdsm-4.40.5-1.el8ev.x86_64
libvirt-6.0.0-7.module+el8.2.0+5869+c23fe68b.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Create a VM with an NFS disk
2. Run the VM
3. Create a RAM snapshot s1
4. Power off the VM
5. Preview s1, then try to run the VM

Actual results:
The operation fails.
The engine then tries to run the VM on a different host, and sometimes the operation fails again.

Expected results:
Operation should succeed

Additional info:
Logs are attached

Comment 1 Evelina Shames 2020-03-05 15:22:55 UTC
Created attachment 1667803 [details]
libvirt_logs

Comment 2 Evelina Shames 2020-03-05 15:25:57 UTC
Created attachment 1667804 [details]
libvirt-logs2

Comment 3 Michal Skrivanek 2020-03-06 08:30:25 UTC
attach vdsm log, please

The _updateVcpuTuneInfo error looks unrelated and is already reported in bug 1810605.

Comment 4 Michal Skrivanek 2020-03-06 10:33:36 UTC
(In reply to Michal Skrivanek from comment #3)
> attach vdsm log, please

Sorry, I missed that; it's all there.

The actual error is:
2020-03-05 01:54:55,684+0200 ERROR (vm/00d0568c) [virt.vm] (vmId='00d0568c-0f80-49fd-9c39-12b860c0f72f') The vm start process failed (vm:840)
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/vdsm/virt/vmxml.py", line 74, in find_first
    return next(find_all(element, tag))
StopIteration

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/vdsm/virt/vm.py", line 770, in _startUnderlyingVm
    self._run()
  File "/usr/lib/python3.6/site-packages/vdsm/virt/vm.py", line 2512, in _run
    hooks.before_vm_start(self._buildDomainXML(), self._custom)
  File "/usr/lib/python3.6/site-packages/vdsm/virt/vm.py", line 2085, in _buildDomainXML
    self, dom, self._devices[hwclass.DISK])
  File "/usr/lib/python3.6/site-packages/vdsm/virt/domxml_preprocess.py", line 199, in update_disks_xml_from_objs
    dev_elem, disk_obj, vm.log, replace_attribs=True)
  File "/usr/lib/python3.6/site-packages/vdsm/virt/vmdevices/storagexml.py", line 288, in update_disk_element_from_object
    source = vmxml.find_first(disk_element, 'source')
  File "/usr/lib/python3.6/site-packages/vdsm/virt/vmxml.py", line 77, in find_first
    raise NotFound((element, tag,))
vdsm.virt.vmxml.NotFound: (<Element 'disk' at 0x7ff6c8e0b228>, 'source')
2020-03-05 01:54:55,684+0200 INFO  (vm/00d0568c) [virt.vm] (vmId='00d0568c-0f80-49fd-9c39-12b860c0f72f') Changed state to Down: (<Element 'disk' at 0x7ff6c8e0b228>, 'source') (code=1) (vm:1598)


It looks like the stored parameters (again) dropped the empty cdrom source. Moving to virt.
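
For readers following the traceback, here is a minimal sketch of the failure mode, not VDSM's actual code. It assumes that find_first behaves as the traceback suggests: it takes the first matching element and raises NotFound with an (element, tag) tuple when there is none. The NotFound class, the has_source helper, and both disk XML snippets are hypothetical stand-ins made up for illustration.

```python
import xml.etree.ElementTree as ET


class NotFound(Exception):
    """Simplified stand-in for vdsm.virt.vmxml.NotFound (assumption)."""


def find_first(element, tag):
    """Return the first element with the given tag, or raise NotFound.

    Mirrors the shape seen in the traceback: StopIteration from next()
    is turned into NotFound((element, tag)).
    """
    try:
        return next(element.iter(tag))
    except StopIteration:
        raise NotFound((element, tag))


# Hypothetical disk definitions, not taken from the attached logs.
CDROM_WITH_SOURCE = """
<disk type='file' device='cdrom'>
  <source file=''/>
  <target dev='sdc' bus='sata'/>
</disk>
"""

CDROM_WITHOUT_SOURCE = """
<disk type='file' device='cdrom'>
  <target dev='sdc' bus='sata'/>
</disk>
"""


def has_source(disk_xml):
    """Report whether a <disk> definition still carries a <source> child."""
    disk = ET.fromstring(disk_xml.strip())
    try:
        find_first(disk, "source")
    except NotFound:
        return False
    return True
```

Under these assumptions, a disk element whose empty source was dropped makes find_first raise, which matches the "(<Element 'disk' at ...>, 'source')" exit message reported by the engine.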

Comment 5 Ilan Zuckerman 2020-03-08 16:14:25 UTC
I hit the same error, except that I didn't create a snapshot.
RHEL 8 VM with 4 NFS disks:
1 - OS disk
2 - disk mounted and formatted as XFS
3 - unmounted disk (added through the engine)
4 - same as above
The VM is started.

rhv-release-4.4.0-23-001.noarch
vdsm-4.40.5-1.el8ev.x86_64

Attaching vdsm log

Comment 6 Ilan Zuckerman 2020-03-08 16:16:38 UTC
Created attachment 1668451 [details]
second time hit this issue vdsm log

Comment 7 Milan Zamazal 2020-03-11 09:35:17 UTC
It seems to be the same problem as in https://bugzilla.redhat.com/1811425, i.e. libvirt bug https://bugzilla.redhat.com/1811728. Indeed, I can reproduce the bug with libvirt 6.0.0-9, while preview of a snapshot created with libvirt 6.0.0-10 starts fine.
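
A rough way to check a host's libvirt build against the observation above, as a sketch only. It assumes package names shaped like the one in the Description (libvirt-6.0.0-7.module+...), treats 6.0.0-10 as the first good build purely on the basis of this comment, and uses a simplified numeric comparison rather than real RPM version ordering. The names FIXED_BUILD, libvirt_build, and has_snapshot_fix are hypothetical.

```python
import re

# Assumption from this comment: 6.0.0-9 reproduces the bug, while
# snapshots created with 6.0.0-10 preview fine.
FIXED_BUILD = (6, 0, 0, 10)

# Matches "libvirt-<major>.<minor>.<patch>-<release>" at the start of the name.
_NVR_RE = re.compile(r"^libvirt-(\d+)\.(\d+)\.(\d+)-(\d+)")


def libvirt_build(nvr):
    """Extract (major, minor, patch, release) from a libvirt package name."""
    match = _NVR_RE.match(nvr)
    if match is None:
        raise ValueError(f"unrecognized libvirt package name: {nvr!r}")
    return tuple(int(part) for part in match.groups())


def has_snapshot_fix(nvr):
    """Simplified check: is this build at or above the assumed fixed build?"""
    return libvirt_build(nvr) >= FIXED_BUILD
```

Note that, per the observation above, what matters is the libvirt build that created the snapshot, so snapshots taken with an older build may still fail to preview after an upgrade.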

Comment 8 Evelina Shames 2020-08-11 11:56:09 UTC
Verified on engine-4.4.2.1-0.15.el8ev.

Comment 9 Sandro Bonazzola 2020-09-18 07:13:24 UTC
This bug is included in the oVirt 4.4.2 release, published on September 17th, 2020.

Since the problem described in this bug report should be resolved in the oVirt 4.4.2 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.

