Bug 1853194 - VM with disk on iscsi on environment with SELinux enforced fails to start on host - Exit message: Wake up from hibernation failed:internal error: child reported (status=125): unable to set security context
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: BLL.Storage
Version: 4.4.1.5
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ovirt-4.4.3
Target Release: ---
Assignee: Liran Rotenberg
QA Contact: Evelina Shames
URL:
Whiteboard:
Duplicates: 1853192
Depends On: 1772838 1877675
Blocks:
 
Reported: 2020-07-02 06:58 UTC by Evelina Shames
Modified: 2020-11-11 06:45 UTC
CC List: 7 users

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2020-11-11 06:41:38 UTC
oVirt Team: Virt
Embargoed:
pm-rhel: ovirt-4.4+


Attachments
libvirt-log (2.23 MB, text/plain), 2020-07-07 16:51 UTC, Evelina Shames
qemu-log (10.79 KB, text/plain), 2020-07-07 16:53 UTC, Evelina Shames
vdsm+engine logs (1.90 MB, application/zip), 2020-08-11 05:53 UTC, Evelina Shames
full_logs (2.09 MB, application/x-xz), 2020-08-11 14:19 UTC, Liran Rotenberg

Description Evelina Shames 2020-07-02 06:58:34 UTC
Description of problem:
VM with a disk on iSCSI on a PPC environment fails to start on the host with the following errors:

Engine.log:
2020-07-02 09:18:50,521+03 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ForkJoinPool-1-worker-17) [14c64a7d] EVENT_ID: VM_DOWN_ERROR(119), VM vm_TestCase5138_0209130682 is down with error. Exit message: Wake up from hibernation failed:internal error: child reported (status=125): unable to set security context 'system_u:object_r:virt_content_t:s0' on '/rhev/data-center/mnt/blockSD/97370c99-3549-4686-8536-d067f82e5daf/images/1bc19190-6f96-4d7e-9db3-a25c87451aaa/f5dd8cc0-7491-4574-a066-00ffcc0382fb': No such file or directory.
2020-07-02 09:18:50,527+03 ERROR [org.ovirt.engine.core.vdsbroker.monitoring.VmsMonitoring] (ForkJoinPool-1-worker-17) [14c64a7d] Rerun VM 'ad70f664-cf00-4390-a8dc-ccb267dd283f'. Called from VDS 'host_mixed_2'
2020-07-02 09:18:50,537+03 WARN  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engine-Thread-17496) [14c64a7d] EVENT_ID: USER_INITIATED_RUN_VM_FAILED(151), Failed to run VM vm_TestCase5138_0209130682 on Host host_mixed_2.

VDSM.log:
2020-07-02 09:18:48,754+0300 ERROR (vm/ad70f664) [virt.vm] (vmId='ad70f664-cf00-4390-a8dc-ccb267dd283f') The vm start process failed (vm:871)
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/vdsm/virt/vm.py", line 801, in _startUnderlyingVm
    self._run()
  File "/usr/lib/python3.6/site-packages/vdsm/virt/vm.py", line 2570, in _run
    fname, srcDomXML, libvirt.VIR_DOMAIN_SAVE_PAUSED)
  File "/usr/lib/python3.6/site-packages/vdsm/common/libvirtconnection.py", line 131, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/vdsm/common/function.py", line 94, in wrapper
    return func(inst, *args, **kwargs)
  File "/usr/lib64/python3.6/site-packages/libvirt.py", line 4851, in restoreFlags
    if ret == -1: raise libvirtError ('virDomainRestoreFlags() failed', conn=self)
libvirt.libvirtError: internal error: child reported (status=125): unable to set security context 'system_u:object_r:virt_content_t:s0' on '/rhev/data-center/mnt/blockSD/97370c99-3549-4686-8536-d067f82e5daf/images/1bc19190-6f96-4d7e-9db3-a25c87451aaa/f5dd8cc0-7491-4574-a066-00ffcc0382fb': No such file or directory

We saw it in our automation. After the start fails on one host, the VM starts successfully on another host.
On a single-host environment, the VM fails to start on the first attempt.


Version-Release number of selected component (if applicable):
rhv-4.4.1-5:
vdsm-4.40.20-1.el8ev.ppc64le
ovirt-engine-4.4.1.5-0.17.el8ev.noarch

How reproducible:
100%

Steps to Reproduce:
1. Create VM with iSCSI disk
2. Run the VM
3. Create memory snapshot
4. Power off the VM
5. Preview snapshot
6. Run the VM (a scripted sketch of these steps is shown below)
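
This is not the original automation test, just a rough scripted sketch of the same flow, assuming the Python ovirt-engine-sdk (ovirtsdk4). The engine URL, credentials, VM name and the naive polling helper are placeholders, and method names may differ slightly between SDK versions:

import time
import ovirtsdk4 as sdk
import ovirtsdk4.types as types

# Placeholder connection details; adjust for the environment under test.
conn = sdk.Connection(url='https://engine.example.com/ovirt-engine/api',
                      username='admin@internal', password='secret',
                      insecure=True)
vms_service = conn.system_service().vms_service()
vm = vms_service.list(search='name=vm_with_iscsi_disk')[0]   # step 1 done in advance
vm_service = vms_service.vm_service(vm.id)

def wait_for(condition, timeout=600):
    # Naive polling helper; real tests should use proper waiters.
    deadline = time.time() + timeout
    while not condition():
        if time.time() > deadline:
            raise RuntimeError('timed out')
        time.sleep(5)

vm_service.start()                                              # step 2
wait_for(lambda: vm_service.get().status == types.VmStatus.UP)

snaps_service = vm_service.snapshots_service()
snap = snaps_service.add(types.Snapshot(description='mem-snap',
                                        persist_memorystate=True))   # step 3: memory snapshot
wait_for(lambda: snaps_service.snapshot_service(snap.id).get().snapshot_status
                 == types.SnapshotStatus.OK)

vm_service.stop()                                               # step 4
wait_for(lambda: vm_service.get().status == types.VmStatus.DOWN)

vm_service.preview_snapshot(snapshot=types.Snapshot(id=snap.id),
                            restore_memory=True)                # step 5
# In practice, wait until the snapshot preview completes before starting.
vm_service.start()                                              # step 6: the reported failure occurs here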

Actual results:
Operation fails

Expected results:
Operation should succeed.

Additional info:
Logs are attached.

Comment 1 Evelina Shames 2020-07-02 07:23:27 UTC
*** Bug 1853192 has been marked as a duplicate of this bug. ***

Comment 2 Tal Nisan 2020-07-06 13:25:45 UTC
Seems related to the fix for bug 1840609 and bug 1842894
Liran, can you please have a look?

Comment 3 Liran Rotenberg 2020-07-06 13:44:34 UTC
I don't think it's related to the above bugs. In those, we went back to the way it was with the older libvirt (RHV 4.3).
Did you start seeing it only in the last build?

Evelina, can you check the path /rhev/data-center/mnt/blockSD/97370c99-3549-4686-8536-d067f82e5daf/images/1bc19190-6f96-4d7e-9db3-a25c87451aaa/f5dd8cc0-7491-4574-a066-00ffcc0382fb? Does it exist? What is the ownership on it?
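
A minimal sketch of such a check on the host, assuming the libselinux Python bindings (python3-libselinux) are installed; it reports existence, symlink target, ownership and SELinux labels of the path from the error:

import os
import pwd
import grp
import selinux

path = ('/rhev/data-center/mnt/blockSD/97370c99-3549-4686-8536-d067f82e5daf'
        '/images/1bc19190-6f96-4d7e-9db3-a25c87451aaa'
        '/f5dd8cc0-7491-4574-a066-00ffcc0382fb')

st = os.lstat(path)                                   # raises FileNotFoundError if even the symlink is missing
print('exists (target):', os.path.exists(path))       # follows the symlink
print('symlink target: ', os.path.realpath(path))
print('owner:group:    ', pwd.getpwuid(st.st_uid).pw_name,
      grp.getgrgid(st.st_gid).gr_name)
print('enforcing:      ', selinux.security_getenforce())   # 1 = Enforcing
# getfilecon/lgetfilecon return (size, context); the second element is the label
print('label (target): ', selinux.getfilecon(path)[1])
print('label (link):   ', selinux.lgetfilecon(path)[1])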

Please also provide libvirt debug logs.

Comment 5 Evelina Shames 2020-07-07 16:42:19 UTC
(In reply to Liran Rotenberg from comment #3)
> I don't think it's related to the above bugs. In those, we went back to the
> way it was with the older libvirt (RHV 4.3).
> Did you start seeing it only in the last build?
> 
> Evelina, can you check the path
> /rhev/data-center/mnt/blockSD/97370c99-3549-4686-8536-d067f82e5daf/images/1bc19190-6f96-4d7e-9db3-a25c87451aaa/f5dd8cc0-7491-4574-a066-00ffcc0382fb?
> Does it exist? What is the ownership on it?
> 
> Please also provide libvirt debug logs.

lrwxrwxrwx. 1 vdsm kvm 78 Jul  2 09:17 f5dd8cc0-7491-4574-a066-00ffcc0382fb -> /dev/97370c99-3549-4686-8536-d067f82e5daf/f5dd8cc0-7491-4574-a066-00ffcc0382fb

Comment 6 Evelina Shames 2020-07-07 16:51:26 UTC
Created attachment 1700189 [details]
libvirt-log

Comment 7 Evelina Shames 2020-07-07 16:53:47 UTC
Created attachment 1700191 [details]
qemu-log

Comment 8 Lukas Svaty 2020-07-15 12:52:25 UTC
Please fill in the severity.

Comment 9 Arik 2020-07-23 08:42:15 UTC
Does it still happen with the latest version?

Comment 10 Evelina Shames 2020-07-23 08:53:18 UTC
(In reply to Arik from comment #9)
> Does it still happen with the latest version?

Yes, tried on engine-4.4.1.8-0.7.el8ev

Comment 11 Arik 2020-07-23 09:12:05 UTC
Can you please provide engine and vdsm logs?

Comment 12 Evelina Shames 2020-07-27 12:58:00 UTC
(In reply to Arik from comment #11)
> Can you please provide engine and vdsm logs?

I don't have an available PPC environment at the moment.
I'll add the relevant logs when I have one.

Comment 13 Evelina Shames 2020-08-11 05:53:53 UTC
Created attachment 1711036 [details]
vdsm+engine logs

ovirt-engine-4.4.1.10-0.1.el8ev.noarch
vdsm-4.40.22-1.el8ev.ppc64le

Comment 14 Liran Rotenberg 2020-08-11 14:19:41 UTC
Created attachment 1711087 [details]
full_logs

I managed to reproduce it on the PPC environment and collected the logs, including libvirt debug logs.
In the logs, the VM starts on host1 and fails.

libvirt-client-6.0.0-25.module+el8.2.1+7154+47ffd890.x86_64
qemu-kvm-4.2.0-29.module+el8.2.1+7297+a825794d.ppc64le
kernel-4.18.0-193.el8.ppc64le
selinux-policy-3.14.3-41.el8.noarch

After a close look, I didn't find anything related to the engine or VDSM.
However, SELinux on the hosts is set to Enforcing.

I tried on a regular environment (x86_64) with SELinux set to Enforcing, and it fails with the same error.

Comment 15 Liran Rotenberg 2020-08-11 14:37:02 UTC
Michal, can you please take a look? Or reassign it to someone more relevant?

Comment 16 Michal Privoznik 2020-08-19 08:15:24 UTC
(In reply to Liran Rotenberg from comment #15)
> Michal, can you please take a look? Or reassign it to someone more relevant?

The attached libvirtd.log is not really a debug log; it was produced with log level INFO instead of DEBUG. Can you please attach debug logs using the settings from https://wiki.libvirt.org/page/DebugLogs ?

But my rough guess is that the file doesn't exist when the domain is being resumed and only exists afterwards, when we check. E.g. is an NFS share mounted afterwards? I don't think that mounts are reflected into the domain's namespace. One way to check whether my theory is true is to disable namespaces (set namespaces = [] in qemu.conf).
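
A complementary way to probe the namespace theory (besides disabling namespaces in qemu.conf) is to compare the host's view of the restore path with the view inside the QEMU process's mount namespace via /proc/<pid>/root. A rough sketch, with a hypothetical PID, assuming the process lives long enough to inspect and that the check runs as root:

import os

qemu_pid = 12345   # hypothetical: PID of the qemu-kvm process for this VM
path = ('/rhev/data-center/mnt/blockSD/97370c99-3549-4686-8536-d067f82e5daf'
        '/images/1bc19190-6f96-4d7e-9db3-a25c87451aaa'
        '/f5dd8cc0-7491-4574-a066-00ffcc0382fb')

# /proc/<pid>/root resolves through the process's chroot and mount namespace,
# so a path that exists on the host but not under /proc/<pid>/root points at
# the namespace as the culprit.
print('visible on host:         ', os.path.exists(path))
print('visible inside namespace:', os.path.exists('/proc/%d/root%s' % (qemu_pid, path)))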

Comment 18 Michal Privoznik 2020-08-20 09:41:06 UTC
Ah, this looks like bug 1772838 then. The thing is, the restore path is a block device (see comment 5), and as such it is not created in the namespace. If we want this fixed in RHEL-AV-8.2*, then I guess we need to ask the PMs.

Comment 20 Liran Rotenberg 2020-09-10 14:06:18 UTC
We are now using RHEL 8.3 with RHV 4.4.3. Moving to ON_QA.

Comment 21 Evelina Shames 2020-10-13 11:37:08 UTC
Verified with TestCase5138 on PPC environment with rhv-4.4.3-8.

Comment 22 Sandro Bonazzola 2020-11-11 06:41:38 UTC
This bugzilla is included in the oVirt 4.4.3 release, published on November 10th 2020.

Since the problem described in this bug report should be resolved in oVirt 4.4.3 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.

