Bug 1057645 - ownership of diskimage changes during livemigration, livemigration with kvm/libvirt fails
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: libvirt
Version: 20
Hardware: Unspecified
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ---
Assignee: Libvirt Maintainers
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: 1058032 1286213
 
Reported: 2014-01-24 14:35 UTC by bernhard.glomm
Modified: 2016-04-26 19:14 UTC (History)
20 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 1058032 (view as bug list)
Environment:
Last Closed: 2015-05-31 18:48:45 UTC
Type: Bug


Attachments (Terms of Use)
gluster logs during a migration (9.86 KB, application/x-compressed-tar)
2014-01-27 14:34 UTC, Paul Boven
no flags Details
gluster logs of migration host1 (2.71 MB, text/tgz)
2014-01-27 16:49 UTC, bernhard.glomm
no flags Details
gluster logs of migration host2 (64.24 KB, text/tgz)
2014-01-27 16:50 UTC, bernhard.glomm
no flags Details

Description bernhard.glomm 2014-01-24 14:35:27 UTC
Description of problem: 
During live migration of a vm whose disk image is a file on a gluster FUSE mount, the disk file's ownership changes from libvirt-qemu/kvm to root/root. Given the 700 permissions of the disk file, the receiving libvirt-qemu can no longer access (at least can no longer write to) that file.


Version-Release number of selected component (if applicable):
OS: ubuntu 13.04
QEMU: QEMU emulator version 1.4.0 (Debian 1.4.0+dfsg-1expubuntu4) 
      (from ubuntu repo)
LIBVIRTD: libvirtd (libvirt) 1.0.2 
          (from ubuntu repo)
GLUSTER: glusterfs 3.4.2 built on Jan 11 2014 03:21:47
         (from semiosis repo)


How reproducible:
100%


Steps to Reproduce:
1. place the disk image of a virtual machine on a replica 2 volume that is
   mounted on both hosts with
   mount -t glusterfs <my_own_ip>:/<volume_name> <mount_point>
2. start on both hosts a "watch tree -pfungiA <mount_point>" (or a level above)
3. observe ownership of <mount_point>/<disk.image>
4. ownership is root/root, as is normal for a stopped vm
5. start vm
6. ownership changes to libvirt-qemu/kvm
7. trigger migration and watch the ownership of the disk.image
8. ownership changes from libvirt-qemu/kvm to root/root
9. At the same time the filesystem inside the vm becomes readonly


Actual results:
live migration leaves the migrated vm unusable until it is shut down and
restarted on either of the two machines


Expected results:
live migration should work


Additional info:
the disk file is completely inaccessible on the receiving side:
a "file <disk.image>" or "md5sum <disk.image>" returns permission denied.
On the sending side, on the other hand, there is no such problem; file and md5sum on <disk.image> work as expected.
Changing storage.owner-(u|g)id on the volume to libvirt-qemu/kvm doesn't help.

Volume Name: glfs_atom01
Type: Replicate
Volume ID: f28f0f62-37b3-4b10-8e86-9b373f4c0e75
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 172.24.1.11:/ecopool/fs_atom01
Brick2: 172.24.1.13:/ecopool/fs_atom01
Options Reconfigured:
storage.owner-gid: 104
storage.owner-uid: 107
network.remote-dio: enable

To make sure the rest of my setup is alright,
I set up a drbd/ocfs2 stack on the same machines.
With that I can live-migrate the vm with undisturbed
dbench/bonnie++ running inside.
The downside of that setup: dbench reports only 80MB/s
instead of 240MB/s with glusterfs,
plus the configuration overhead and maintenance effort
for drbd and ocfs2.

There is a similar report here:
http://epboven.home.xs4all.nl/gluster-migrate.html
suggesting that the problem was introduced in the change from gluster 3.4.0 to gluster 3.4.1.

Comment 1 bernhard.glomm 2014-01-24 15:09:36 UTC
at exactly 15:00 UTC I started the migration.
Inside the vm I had 
for i in `seq 1 100`; do echo `date` >> /tmp/test.file; sleep 1; done
running;
after 7 seconds it returned with:
echo: write error: Read-only file system
I checked all logfiles on both machines but found only this in
srv-vms-mnt_atom01.log (the brick log)
on the receiving side

[2014-01-24 15:00:07.848387] W [client-rpc-fops.c:471:client3_3_open_cbk] 0-glfs_atom01-client-1: remote operation failed: Permission denied. Path: /atom01.img (74885dd0-6ff5-4ded-b5b4-d2f477e5bd6d)
[2014-01-24 15:00:07.848549] W [client-rpc-fops.c:471:client3_3_open_cbk] 0-glfs_atom01-client-0: remote operation failed: Permission denied. Path: /atom01.img (74885dd0-6ff5-4ded-b5b4-d2f477e5bd6d)
[2014-01-24 15:00:07.848590] W [fuse-bridge.c:2167:fuse_writev_cbk] 0-glusterfs-fuse: 341318: WRITE => -1 (Permission denied)
[2014-01-24 15:00:07.849288] W [fuse-bridge.c:2167:fuse_writev_cbk] 0-glusterfs-fuse: 341320: WRITE => -1 (Permission denied)
[2014-01-24 15:00:07.849535] W [fuse-bridge.c:2167:fuse_writev_cbk] 0-glusterfs-fuse: 341322: WRITE => -1 (Permission denied)
[2014-01-24 15:00:12.719313] W [fuse-bridge.c:2167:fuse_writev_cbk] 0-glusterfs-fuse: 341324: WRITE => -1 (Permission denied)
[2014-01-24 15:00:12.719530] W [fuse-bridge.c:2167:fuse_writev_cbk] 0-glusterfs-fuse: 341326: WRITE => -1 (Permission denied)
[2014-01-24 15:00:12.719866] W [fuse-bridge.c:2167:fuse_writev_cbk] 0-glusterfs-fuse: 341328: WRITE => -1 (Permission denied)
[2014-01-24 15:00:12.720111] W [fuse-bridge.c:2167:fuse_writev_cbk] 0-glusterfs-fuse: 341330: WRITE => -1 (Permission denied)
[2014-01-24 15:00:12.720360] W [fuse-bridge.c:2167:fuse_writev_cbk] 0-glusterfs-fuse: 341332: WRITE => -1 (Permission denied)

Comment 2 bernhard.glomm 2014-01-24 15:20:14 UTC
/var/log/libvirt/qemu/atom01.log

libvirt logs on the sending side:

2014-01-24 14:53:33.052+0000: starting up
LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/bin:/usr/sbin:/sbin:/bin QEMU_AUDIO_DRV=none /usr/bin/kvm-spice -name atom01 -S -M pc-i440fx-1.4 -m 1024 -smp 1,sockets=1,cores=1,threads=1 -uuid a20b4267-22b8-fc91-a4ea-938a5ad4a889 -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/atom01.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-reboot -no-shutdown -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=/srv/vms/mnt_atom01/atom01.img,if=none,id=drive-virtio-disk0,format=raw,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=2 -netdev tap,fd=27,id=hostnet0,vhost=on,vhostfd=28 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:fa:ce:31,bus=pci.0,addr=0x3,bootindex=1 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -vnc 127.0.0.1:1,password -vga std -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5
W: kvm binary is deprecated, please use qemu-system-x86_64 instead
char device redirected to /dev/pts/2 (label charserial0)
2014-01-24 15:00:03.208+0000: shutting down
qemu: terminating on signal 15 from pid 2170

----------------------

libvirt logs on the receiving side:

2014-01-24 15:00:00.986+0000: starting up
LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/bin:/usr/sbin:/sbin:/bin QEMU_AUDIO_DRV=none /usr/bin/kvm-spice -name atom01 -S -M pc-i440fx-1.4 -m 1024 -smp 1,sockets=1,cores=1,threads=1 -uuid a20b4267-22b8-fc91-a4ea-938a5ad4a889 -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/atom01.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-reboot -no-shutdown -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=/srv/vms/mnt_atom01/atom01.img,if=none,id=drive-virtio-disk0,format=raw,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=2 -netdev tap,fd=23,id=hostnet0,vhost=on,vhostfd=24 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:fa:ce:31,bus=pci.0,addr=0x3,bootindex=1 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -vnc 127.0.0.1:0,password -vga std -incoming tcp:0.0.0.0:49159 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5
W: kvm binary is deprecated, please use qemu-system-x86_64 instead
char device redirected to /dev/pts/4 (label charserial0)

Comment 3 Vijay Bellur 2014-01-26 15:09:40 UTC
Would it be possible to attach tgz of glusterfs log directories from both nodes to this bug report?

Comment 4 Paul Boven 2014-01-27 14:34:35 UTC
Created attachment 856093 [details]
gluster logs during a migration

Attached are logfiles at DEBUG level, made during a migration of the guest 'kvmhost'. The migration itself starts at 14:00:00 and succeeds, but results in the migrated guest not being able to access its image on the destination server. The setup is Ubuntu 13.04, Gluster-3.4.1 from the Ubuntu PPA (semiosis). Also included is the /var/lib/libvirt/qemu/kvmtest.log from the destination server.

Comment 5 bernhard.glomm 2014-01-27 16:49:03 UTC
Created attachment 856149 [details]
gluster logs of migration host1

virsh migrate --verbose --live --p2p --domain atom01 --desturi qemu+ssh://192.168.242.93/system

the diskimage of atom01 resides on glustervolume "glfs_atom01"

root@ping[/0]:~ # gluster volume info glfs_atom01
 
Volume Name: glfs_atom01
Type: Replicate
Volume ID: f28f0f62-37b3-4b10-8e86-9b373f4c0e75
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 172.24.1.11:/ecopool/fs_atom01
Brick2: 172.24.1.13:/ecopool/fs_atom01
Options Reconfigured:
network.remote-dio: enable
storage.owner-uid: 107
storage.owner-gid: 104
diagnostics.client-log-level: DEBUG

root@ping[/0]:~ # id libvirt-qemu
uid=107(libvirt-qemu) gid=104(kvm) groups=104(kvm)

Comment 6 bernhard.glomm 2014-01-27 16:50:00 UTC
Created attachment 856150 [details]
gluster logs of migration host2

logs of the receiving host

Comment 7 Paul Boven 2014-01-28 16:33:49 UTC
The libvirt wiki states that during a migration, libvirt will change the ownership of the guest image unless it detects that the image is on a shared filesystem. Looking at the libvirt code, there is detection for NFS, GFS2 and SMB/CIFS, but not for Gluster. Because libvirt does not detect that the storage is on a shared filesystem, the originating host chowns the image back to root:root at the end of a successful migration, whereas the destination host chowns it to libvirt-qemu:kvm. This is in fact a race condition, so the difference in behaviour between 3.4.0 and 3.4.1 could come down to timing differences.

http://wiki.libvirt.org/page/Migration_fails_because_disk_image_cannot_be_found

Workaround: 
* stop your guests
* stop libvirt-bin
* edit /etc/libvirt/qemu.conf - it contains a commented-out entry 'dynamic_ownership=1', which is the default. Change the value to 0 and remove the comment.
* Do a chown to libvirt-qemu:kvm for all your stopped images. 
* Start the service libvirt-bin again
* Bring up the guests
* Repeat on the other half of your cluster
* Test a live migration - for me, they work again.

You now have to take care of properly setting the ownership of a guest image yourself (presumably only once when you create it).
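
As a sketch (untested here; the service name and config path follow Ubuntu's libvirt-bin packaging, and the guest/image names are the ones from this report), the workaround amounts to:

```shell
virsh shutdown atom01                      # stop your guests
service libvirt-bin stop                   # stop libvirt-bin

# uncomment dynamic_ownership and set it to 0 in qemu.conf
sed -i 's/^#\?dynamic_ownership *= *1/dynamic_ownership = 0/' /etc/libvirt/qemu.conf

# ownership is now your responsibility: set it once per stopped image
chown libvirt-qemu:kvm /srv/vms/mnt_atom01/atom01.img

service libvirt-bin start                  # start the service again
virsh start atom01                         # bring up the guests
# repeat on the other half of the cluster, then test a live migration
```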

Other possible solutions:

JoeJulian suggested using libgfapi, giving libvirt direct access without having to go through the filesystem. This is the preferred setup for libvirt+gluster and should also result in better I/O performance. I haven't tested this yet, but it's high on my to-do list.
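
For reference, a libgfapi-backed disk in the libvirt domain XML would look roughly like this (volume and image names taken from this report; requires a qemu built with gfapi support):

```xml
<disk type='network' device='disk'>
  <driver name='qemu' type='raw' cache='none'/>
  <source protocol='gluster' name='glfs_atom01/atom01.img'>
    <host name='172.24.1.11' port='24007'/>
  </source>
  <target dev='vda' bus='virtio'/>
</disk>
```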

Submit a patch to libvirt so it can detect that the filesystem is Gluster: statfs() will only show 'FUSE', but we could then use getxattr() to see whether a gluster-specific attribute is set (suggested by kkeithley), e.g. trusted.glusterfs.volume-id.
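
From the shell, the proposed check can be sketched like this (mount point from this report; getfattr comes from the attr package, and whether trusted.glusterfs.volume-id is exposed on the client mount to a root caller is an assumption to verify):

```shell
MOUNT=/srv/vms/mnt_atom01

# statfs() only reports a generic FUSE filesystem type here
stat -f -c '%T' "$MOUNT"

# the gluster-specific marker suggested above: trusted.glusterfs.volume-id
if getfattr -n trusted.glusterfs.volume-id --absolute-names -e hex "$MOUNT" \
     > /dev/null 2>&1; then
    echo "gluster shared filesystem: skip the ownership change on migration"
fi
```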

Comment 8 bernhard.glomm 2014-01-29 10:22:04 UTC
I can confirm the workaround above.
It would be nice if libvirt could be patched as described, so that we no longer have to worry about this problem.

Comment 9 Pranith Kumar K 2014-01-29 11:34:46 UTC
Based on comment https://bugzilla.redhat.com/show_bug.cgi?id=1057645#c7 assigning the bug to libvirt

Comment 10 Fedora End Of Life 2015-05-29 10:39:48 UTC
This message is a reminder that Fedora 20 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 20. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as EOL if it remains open with a Fedora 'version'
of '20'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not
able to fix it before Fedora 20 reached end of life. If you would still like
to see this bug fixed and are able to reproduce it against a later version
of Fedora, you are encouraged to change the 'version' to a later Fedora
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 11 Cole Robinson 2015-05-31 18:48:45 UTC
libvirt migration has had a lot of bug fixes since f20, so I'm assuming this is fixed. Closing as CURRENTRELEASE; please reopen if anyone can reproduce this with a newer Fedora.

