Bug 1719789 - dynamic_ownership enabled breaks file ownership after virtual machine migration and shutdown for disk images on Gluster SD when libgfapi is enabled
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: vdsm
Classification: oVirt
Component: General
Version: 4.30.13
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ovirt-4.3.6
Target Release: 4.30.29
Assignee: Michal Skrivanek
QA Contact: Beni Pelled
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2019-06-12 14:46 UTC by Daniel Milewski
Modified: 2022-10-07 12:34 UTC (History)
CC: 8 users

Fixed In Version: vdsm-4.30.29
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-09-26 19:42:47 UTC
oVirt Team: Virt
Embargoed:
michal.skrivanek: ovirt-4.3?


Attachments


Links
System ID Private Priority Status Summary Last Updated
Red Hat Bugzilla 1666795 0 urgent CLOSED VMs migrated to 4.3 are missing the appropriate virt XML for dynamic ownership, and are reset to root:root, preventing t... 2022-10-07 13:14:45 UTC
Red Hat Issue Tracker RHV-47968 0 None None None 2022-10-07 12:34:01 UTC
oVirt gerrit 101980 0 'None' MERGED gluster: disable dynamic ownership for gluster gfapi 2020-07-01 21:16:25 UTC
oVirt gerrit 102604 0 'None' MERGED gluster: disable dynamic ownership for gluster gfapi 2020-07-01 21:16:25 UTC

Description Daniel Milewski 2019-06-12 14:46:50 UTC
Description of problem:
dynamic_ownership, enabled in oVirt 4.3, changes file ownership for machines which are migrated and then shut down, so that they cannot be powered on again until the ownership is fixed manually. It happens only for images located on a Gluster storage domain when libgfapi is enabled. The ownership change is apparently done by libvirtd. After switching off dynamic_ownership in /etc/libvirt/qemu.conf on the oVirt hosts, correct ownership is maintained.

Version-Release number of selected component (if applicable):
oVirt 4.3.3 and VDSM 4.30.13

How reproducible:
Happens every time.

Steps to Reproduce:
1. Power on a virtual machine.
2. Migrate the virtual machine to a different host.
3. Shut down the virtual machine.
4. Try to power on the virtual machine again.

Actual results:
After migration, the disk image ownership is changed from vdsm:kvm to qemu:qemu. When the virtual machine shuts down, it is changed again from qemu:qemu to root:root, preventing vdsmd and libvirtd from accessing the disk image. The engine log says:
VM vm-1 is down with error. Exit message: Bad volume specification {'protocol': 'gluster', 'address': {'bus': '0', 'controller': '0', 'type': 'drive', 'target': '0', 'unit': '0'}, 'serial': 'cb868474-52fc-46a3-9a5c-c069ba1c0e02', 'index': 0, 'iface': 'scsi', 'apparentsize': '17179869184', 'specParams': {}, 'cache': 'none', 'imageID': 'cb868474-52fc-46a3-9a5c-c069ba1c0e02', 'truesize': '17179869184', 'type': 'disk', 'domainID': '9d33b830-6fc9-4190-a33a-19940a3a8589', 'reqsize': '0', 'format': 'raw', 'poolID': '2ccac895-215d-4883-a353-003d9ea272b1', 'device': 'disk', 'path': 'portal-shared/9d33b830-6fc9-4190-a33a-19940a3a8589/images/cb868474-52fc-46a3-9a5c-c069ba1c0e02/fa4ff8bf-89ef-4ceb-95a1-6d0985f1589f', 'propagateErrors': 'off', 'name': 'sda', 'bootOrder': '1', 'volumeID': 'fa4ff8bf-89ef-4ceb-95a1-6d0985f1589f', 'diskType': 'network', 'alias': 'ua-cb868474-52fc-46a3-9a5c-c069ba1c0e02', 'hosts': [{'name': 'backend-1', 'port': '0'}], 'discard': False}.

Expected results:
libvirtd either manages file ownership correctly or leaves it unchanged.

Additional info:
This looks similar to bug 1666795 and its duplicates/dependent bugs, but that one should already be fixed in 4.3.3.

Comment 1 Ryan Barry 2019-06-13 00:10:34 UTC
Gobinda, how is this not already reported on RHHI?

Is there a gluster replication setting getting in the way here?

Comment 2 Michal Skrivanek 2019-06-13 07:23:03 UTC
well, it was in bug 1687126

however the merged solution is to do this on incoming migration:
    for disk_type in (storage.DISK_TYPE.BLOCK, storage.DISK_TYPE.FILE,):
        xpath = "./devices//disk[@type='%s']//source" % (disk_type,)
        for element in tree.findall(xpath):
            storage.disable_dynamic_ownership(element)

...doesn't gluster use the NETWORK type?

Comment 3 Daniel Milewski 2019-06-13 09:37:53 UTC
Yes, I believe that Gluster disk images use the network type when libgfapi is enabled:

<disk type='network' device='disk' snapshot='no'>
  <driver name='qemu' type='raw' cache='none' error_policy='stop' io='native'/>
  <source protocol='gluster' name='portal-shared/9d33b830-6fc9-4190-a33a-19940a3a8589/images/cb868474-52fc-46a3-9a5c-c069ba1c0e02/fa4ff8bf-89ef-4ceb-95a1-6d0985f1589f'>
    <host name='backend-1' port='24007'/>
  </source>
  <backingStore/>
  <target dev='sda' bus='scsi'/>
  <serial>cb868474-52fc-46a3-9a5c-c069ba1c0e02</serial>
  <boot order='1'/>
  <alias name='ua-cb868474-52fc-46a3-9a5c-c069ba1c0e02'/>
  <address type='drive' controller='0' bus='0' target='0' unit='0'/>
</disk>

As far as I can see, it's the only disk type for which dynamic ownership is not disabled in lib/vdsm/virt/vmdevices/storage.py.
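The gap described above can be sketched as follows. This is a hedged, self-contained illustration (not the actual vdsm patch): it models extending the incoming-migration fix-up from comment 2 to also cover disks of type 'network' served over the gluster protocol, assuming `disable_dynamic_ownership` works by appending a `<seclabel>` override to the disk's `<source>` element, as vdsm's lib/vdsm/virt/vmdevices/storage.py does for BLOCK and FILE disks.

```python
import xml.etree.ElementTree as ET

# Trimmed-down domain XML, modeled on the libgfapi disk shown in comment 3.
DOMAIN_XML = """
<domain>
  <devices>
    <disk type='network' device='disk'>
      <source protocol='gluster' name='volname/path/to/image'>
        <host name='backend-1' port='24007'/>
      </source>
    </disk>
  </devices>
</domain>
"""

def disable_dynamic_ownership(source):
    # vdsm suppresses libvirt's dynamic ownership for one disk by adding a
    # per-device DAC seclabel override with relabel='no' to <source>.
    seclabel = ET.SubElement(source, 'seclabel')
    seclabel.set('model', 'dac')
    seclabel.set('type', 'none')
    seclabel.set('relabel', 'no')

tree = ET.fromstring(DOMAIN_XML)

# In addition to the BLOCK and FILE disk types handled in comment 2's
# snippet, also match network disks using the gluster protocol (libgfapi).
for source in tree.findall(
        "./devices//disk[@type='network']/source[@protocol='gluster']"):
    disable_dynamic_ownership(source)

print(ET.tostring(tree, encoding='unicode'))
```

With this fix-up applied, the gluster `<source>` element carries `<seclabel model='dac' relabel='no'/>`, so libvirt leaves the image's owner (vdsm:kvm) alone on VM start, migration, and shutdown.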

Comment 4 Ivan 2019-07-02 15:49:22 UTC
We have the same problem. 

oVirt 4.3.3 - 4.3.5rc2 and VDSM 4.30.20

Comment 5 Sahina Bose 2019-07-19 06:37:48 UTC
(In reply to Ryan Barry from comment #1)
> Gobinda, how is this not already reported on RHHI?
> 
> Is there a gluster replication setting getting in the way here?

In RHHI we do not use libgfapi, so the disk type is FILE, not NETWORK.

Comment 6 Sahina Bose 2019-07-19 06:44:02 UTC
Assigning to virt, as the NETWORK disk type would need to be handled similarly to bug 1666795?

Comment 7 Michal Skrivanek 2019-07-19 15:34:00 UTC
The problem is in the original vdsm code that adds the seclabel, which was obsoleted by https://gerrit.ovirt.org/#/c/98088/. VMs started before that change (i.e. VMs from 4.2) carry the wrong XML, and the original code is still needed to prevent libvirt from changing the ownership.
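For context on the seclabel mentioned here, the per-disk override vdsm writes into the domain XML looks roughly like this (a sketch based on the disk XML from comment 3; the exact attribute set may differ between versions). The `<seclabel>` child of `<source>` tells libvirt not to relabel or chown that particular image even when dynamic_ownership is enabled globally:

```xml
<disk type='network' device='disk'>
  <source protocol='gluster' name='portal-shared/9d33b830-6fc9-4190-a33a-19940a3a8589/images/cb868474-52fc-46a3-9a5c-c069ba1c0e02/fa4ff8bf-89ef-4ceb-95a1-6d0985f1589f'>
    <host name='backend-1' port='24007'/>
    <!-- per-device override: skip libvirt's DAC relabeling (chown) -->
    <seclabel model='dac' type='none' relabel='no'/>
  </source>
</disk>
```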

Comment 8 RHV bug bot 2019-08-15 14:05:16 UTC
INFO: Bug status wasn't changed from MODIFIED to ON_QA due to the following reason:

[Tag 'v4.30.27' doesn't contain patch 'https://gerrit.ovirt.org/102604']
gitweb: https://gerrit.ovirt.org/gitweb?p=vdsm.git;a=shortlog;h=refs/tags/v4.30.27

For more info please contact: infra

Comment 9 Beni Pelled 2019-09-04 10:35:42 UTC
Verified on RHV 4.3.6.3-0.1.el7 with vdsm-4.30.29-1.el7ev.x86_64 and libvirt-4.5.0-23.el7.x86_64

Verification steps:

1. Verify dynamic_ownership is enabled (/etc/libvirt/qemu.conf on the hosts contains dynamic_ownership=1)
2. Enable libgfapi with 'engine-config --set LibgfApiSupported=True' (choose version 4.3)
3. Restart the engine with 'systemctl restart ovirt-engine.service'
4. Create a VM with a disk located on a Gluster storage domain
5. Power on the VM
6. Migrate the VM to another host
7. Shut down the VM
8. Power on the VM again

Result:

- VM is up and running.
- The VM image file's owner remains vdsm:kvm from creation, through migration, and after shutdown.

Comment 10 Sandro Bonazzola 2019-09-26 19:42:47 UTC
This bugzilla is included in oVirt 4.3.6 release, published on September 26th 2019.

Since the problem described in this bug report should be
resolved in oVirt 4.3.6 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.

