Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1608931

Summary: Guest image ownership is changed to root:root after second round of migration with killing src qemu at Finish phase
Product: Red Hat Enterprise Linux Advanced Virtualization
Reporter: Fangge Jin <fjin>
Component: libvirt
Assignee: Virtualization Maintenance <virt-maint>
Status: CLOSED WONTFIX
QA Contact: Fangge Jin <fjin>
Severity: medium
Priority: medium
Version: 8.0
CC: berrange, dyuan, jdenemar, jsuchane, lizhu, lmen, xuzhang, yafu
Target Milestone: rc
Keywords: Triaged
Target Release: 8.1
Flags: pm-rhel: mirror+
Hardware: x86_64
OS: Linux
Last Closed: 2021-02-15 07:41:01 UTC
Type: Bug
Attachments: libvirtd log (flags: none)

Description Fangge Jin 2018-07-26 14:03:02 UTC
Created attachment 1470788
libvirtd log

Description of problem:
Guest disk image ownership is changed to root:root after a second round of migration in which the source QEMU process is killed during the Finish phase.

Version-Release number of selected component (if applicable):
libvirt-4.5.0-4.el7.x86_64
qemu-kvm-rhev-2.12.0-8.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Start a guest on the source host with the guest disk image located on NFS.

2. Attach gdb to libvirtd on the destination host and set a breakpoint on qemuMigrationDstFinish.

3. Migrate the guest to the target, with or without --p2p:
# virsh migrate rhel7-min qemu+ssh://$target/system --live --verbose

4. Wait until gdb hits the breakpoint.

5. Kill the QEMU process on the source host.

6. Run the "continue" command in gdb.

7. After the migration succeeds, migrate the guest back to the source host, with or without --p2p:
# virsh migrate rhel7-min qemu+ssh://10.66.5.190/system --live --verbose --migrateuri tcp://10.66.5.190

8. Repeat steps 3-6.

9. After the migration succeeds, check the guest image ownership:
# ll /nfs/RHEL-7.5-x86_64-latest.qcow2
-rw-r--r--. 1 root root 1403650048 Jul 26 09:46 RHEL-7.5-x86_64-latest.qcow2


Actual results:
As shown in step 9, the guest image ownership is changed to root:root.

Expected results:
Guest image ownership should be restored during migration.
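
The ownership check from step 9 can be made repeatable with a small helper. This is only a sketch: it assumes GNU coreutils stat, the function names are made up for illustration, and the expected owner has to be supplied by the tester. It compares numeric IDs, the same form libvirtd logs (for example '0:0' for root:root).

```shell
#!/bin/sh
# image_owner FILE: print the numeric "uid:gid" of FILE.
# Assumes GNU coreutils stat (the -c option).
image_owner() {
    stat -c '%u:%g' -- "$1"
}

# check_image_owner FILE EXPECTED: return 0 only if FILE is owned by
# EXPECTED (a "uid:gid" pair), 1 on a mismatch, 2 if stat fails.
check_image_owner() {
    actual=$(image_owner "$1") || return 2
    if [ "$actual" = "$2" ]; then
        echo "OK: $1 is owned by $actual"
    else
        echo "MISMATCH: $1 is owned by $actual, expected $2" >&2
        return 1
    fi
}
```

For example, record `image_owner /nfs/RHEL-7.5-x86_64-latest.qcow2` before step 1 and pass that value as EXPECTED to check_image_owner after step 9.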

Additional info:
In libvirtd.log, I see migrated=0, which is not correct:
2018-07-26 03:49:41.020+0000: 16709: debug : virSecurityDACRestoreAllLabel:1560 : Restoring security label on rhel7-min migrated=0
2018-07-26 03:49:41.020+0000: 16709: info : virSecurityDACRestoreFileLabelInternal:665 : Restoring DAC user and group on '/nfs/RHEL-7.5-x86_64-latest.qcow2'
2018-07-26 03:49:41.020+0000: 16709: info : virSecurityDACSetOwnershipInternal:567 : Setting DAC user and group on '/nfs/RHEL-7.5-x86_64-latest.qcow2' to '0:0'
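
To find such entries quickly in a large log, something like the following may help. It is a sketch that only assumes the debug message format shown above; the function name is made up, and the source line number after the function name is matched loosely because it differs between libvirt versions.

```shell
#!/bin/sh
# restores_without_migrated LOG: print every virSecurityDACRestoreAllLabel
# debug line in LOG that was logged with migrated=0.
restores_without_migrated() {
    grep -E 'virSecurityDACRestoreAllLabel:[0-9]+ : Restoring security label on .* migrated=0([^0-9]|$)' -- "$1"
}
```

Usage: `restores_without_migrated /var/log/libvirt/libvirtd.log` (the log path depends on the local libvirtd configuration).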

Comment 2 Jiri Denemark 2018-07-26 15:09:51 UTC
I'm not quite sure why this would happen only after the second migration, but
the problem is that the monitor EOF handler, which is called when you kill the
QEMU process on the source, does not care about the migration and just resets
everything. This is usually fine because a killed domain results in a failed
migration most of the time, but it doesn't work in this corner case, when the
QEMU process gets killed just after the migration actually finished (i.e., at
the point where libvirtd itself would kill the process).

We could perhaps somehow check the current phase of migration in the EOF
handler so that it can pass migrated=1 when appropriate.
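
As a toy illustration of that idea (not libvirt code; the phase names and the function name are hypothetical and only model the decision, not the real handler):

```shell
#!/bin/sh
# eof_migrated_flag PHASE: print the "migrated" value a migration-aware
# EOF handler might pass when restoring security labels.
# PHASE values are hypothetical placeholders, not libvirt identifiers.
eof_migrated_flag() {
    case "$1" in
        # Migration already finished: the destination is using the
        # image, so its ownership should not be reset (migrated=1).
        finished) echo 1 ;;
        # No migration, or one that has not finished yet: QEMU dying
        # means the domain is gone and labels are restored as usual.
        *) echo 0 ;;
    esac
}
```

With this model, killing QEMU during an unfinished migration keeps the current behavior, while killing it after the migration has finished would avoid the chown to 0:0 seen in the log.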

Comment 3 Daniel Berrangé 2018-07-26 15:11:19 UTC
It might perhaps be caused by the filesystem being different on each host, i.e., a local ext4 FS on one host that is then exported as NFS to the second host.

Comment 4 Fangge Jin 2018-07-26 23:35:25 UTC
(In reply to Daniel Berrange from comment #3)
> It might perhaps be caused if the filesystem is different on each host. ie a
> local ext4 FS on one host, and then exported as NFS to the second host.

Hi Daniel

I checked the test environment on both hosts; they are configured correctly:

Target:
# mount |grep nfs
10.66.4.124:/nfs on /nfs type nfs4 (rw,relatime,vers=4.1,rsize=1048576,wsize=1048576,namlen=255,soft,proto=tcp,timeo=10,retrans=2,sec=sys,clientaddr=10.73.131.69,local_lock=none,addr=10.66.4.124)


Source:
10.66.4.124:/nfs on /nfs type nfs4 (rw,relatime,vers=4.1,rsize=1048576,wsize=1048576,namlen=255,soft,proto=tcp,port=0,timeo=15,retrans=1,sec=sys,clientaddr=10.66.5.190,local_lock=none,addr=10.66.4.124)

If the setup were "a local ext4 FS on one host, and then exported as NFS to the second host", the guest image ownership would already be changed to root:root after the FIRST migration.

Comment 5 Lili Zhu 2018-09-04 09:36:37 UTC
Tested with libvirt-4.5.0-8.virtcov.el7.x86_64.
Now only one migration is needed: after running "continue" in gdb, the migration fails.

1. Start a guest on the source host with the guest disk image located on NFS:
# ll /mnt/nfs/lizhu/images/rhel7.6-GUI.img
-rw-------. 1 qemu qemu 10739318784 Sep  4 05:31 /mnt/nfs/lizhu/images/rhel7.6-GUI.img

2. Attach gdb to libvirtd on the destination host and set a breakpoint on qemuMigrationDstFinish.

3. Migrate the guest to the target:
# virsh migrate avocado-vt-vm1 qemu+ssh://10.73.73.112/system --verbose --live

4. Wait until gdb hits the breakpoint.

5. Kill the QEMU process on the source host.

6. Run the "continue" command in gdb.

7. Check the output of the migration command:
# virsh migrate avocado-vt-vm1 qemu+ssh://10.73.73.112/system --verbose --live 
Migration: [100 %]2018-09-04 09:32:46.794+0000: 14588: info : libvirt version: 4.5.0, package: 8.virtcov.el7 (Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>, 2018-09-03-10:48:21, x86-034.build.eng.bos.redhat.com)
2018-09-04 09:32:46.794+0000: 14588: info : hostname: ***
2018-09-04 09:32:46.794+0000: 14588: warning : virDomainMigrateVersion3Full:3249 : Guest avocado-vt-vm1 probably left in 'paused' state on source
error: internal error: unable to execute QEMU command 'cont': Could not reopen qcow2 layer: Could not read qcow2 header: Permission denied

8. Check the guest image ownership:
# ll /mnt/nfs/lizhu/images/rhel7.6-GUI.img 
-rw-------. 1 root root 10739318784 Sep  4 05:32 /mnt/nfs/lizhu/images/rhel7.6-GUI.img

Comment 6 Jiri Denemark 2018-09-04 10:16:14 UTC
This matches what I described in comment #2. We need to enhance the monitor
EOF handler a bit.

Comment 7 Jiri Denemark 2019-04-25 09:56:43 UTC
This bug is going to be addressed in the next major release.

Comment 8 Jaroslav Suchanek 2020-02-18 14:29:51 UTC
Can you please try to reproduce it with the current RHEL-AV build? Thanks.

Comment 9 Fangge Jin 2020-02-25 09:21:44 UTC
Tested with libvirt-6.0.0-6.virtcov.el8.x86_64 and qemu-kvm-4.2.0-11.module

Following the same steps as in comment 5, the problem can still be reproduced:

# virsh -k0 migrate rhev qemu+ssh://xxxxx/system --live --verbose
Migration: [100 %]error: internal error: unable to execute QEMU command 'cont': Could not reopen file: Permission denied

Comment 12 RHEL Program Management 2021-02-15 07:41:01 UTC
After evaluating this issue, there are no plans to address it further or fix it in an upcoming release.  Therefore, it is being closed.  If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.