Description of problem:
During live migration of a VM that uses a file on a gluster FUSE mount as its disk image, the disk file changes ownership from libvirt-qemu/kvm to root/root. Given the 700 permissions on the disk file, the receiving libvirt-qemu can no longer access (at least can no longer write to) that file.

Version-Release number of selected component (if applicable):
OS: Ubuntu 13.04
QEMU: QEMU emulator version 1.4.0 (Debian 1.4.0+dfsg-1expubuntu4) (from ubuntu repo)
LIBVIRTD: libvirtd (libvirt) 1.0.2 (from ubuntu repo)
GLUSTER: glusterfs 3.4.2 built on Jan 11 2014 03:21:47 (from semiosis repo)

How reproducible:
100%

Steps to Reproduce:
1. on a replica 2 volume, mounted like "mount -t glusterfs <my_own_ip>:/<volume_name> <mount_point>", place the image of a virtual machine
2. start on both hosts a "watch tree -pfungiA <mount_point>" (or a level above)
3. observe the ownership of <mount_point>/<disk.image>
4. it is root/root, as is regular for a stopped VM
5. start the VM
6. ownership changes to libvirt-qemu/kvm
7. trigger the migration and watch the ownership of the disk image
8. ownership changes from libvirt-qemu/kvm back to root/root
9. at the same time, the filesystem inside the VM becomes read-only

Actual results:
Live migration leaves the migrated VM unusable until it is shut down and restarted on either of the two machines.

Expected results:
Live migration should work.

Additional info:
The disk file is completely inaccessible on the receiving side; a "file <disk.image>" or "md5sum <disk.image>" gives "Permission denied". On the sending side, on the other hand, there is no such problem; file and md5sum on <disk.image> work as expected. Changing storage.owner-(u|g)id to libvirt-qemu/kvm on the volume doesn't help.
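As an alternative to "watch tree", the ownership flip can be observed with a plain stat loop. This is a minimal sketch; the image path is taken from the logs later in this report, so substitute your own:

```shell
# Poll owner, group and mode of the disk image once per second on both
# hosts while the migration runs; the chown back to root:root shows up here.
# Path is an example from this report -- adjust to your own image.
watch -n 1 stat -c '%U:%G %a %n' /srv/vms/mnt_atom01/atom01.img
```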
Volume Name: glfs_atom01
Type: Replicate
Volume ID: f28f0f62-37b3-4b10-8e86-9b373f4c0e75
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 172.24.1.11:/ecopool/fs_atom01
Brick2: 172.24.1.13:/ecopool/fs_atom01
Options Reconfigured:
storage.owner-gid: 104
storage.owner-uid: 107
network.remote-dio: enable

To make sure the rest of my setup is alright, I set up a drbd/ocfs2 stack on the same machines. With that, I can live migrate the VM with dbench/bonnie++ running undisturbed inside. The downside of that setup: dbench reports only 80MB/s instead of 240MB/s with glusterfs, plus the configuration overhead and maintenance effort of drbd and ocfs2.

There is a similar report here: http://epboven.home.xs4all.nl/gluster-migrate.html suggesting that the problem was introduced in the change from gluster 3.4.0 to gluster 3.4.1.
At exactly 15:00 UTC I started the migration. Inside the VM I had

for i in `seq 1 100`; do echo `date` >> /tmp/test.file; sleep 1; done

running; after 7 seconds it returned with:

echo: write error: Read-only file system

I checked all logfiles on both machines but found only this in srv-vms-mnt_atom01.log (the brick log) on the receiving side:

[2014-01-24 15:00:07.848387] W [client-rpc-fops.c:471:client3_3_open_cbk] 0-glfs_atom01-client-1: remote operation failed: Permission denied. Path: /atom01.img (74885dd0-6ff5-4ded-b5b4-d2f477e5bd6d)
[2014-01-24 15:00:07.848549] W [client-rpc-fops.c:471:client3_3_open_cbk] 0-glfs_atom01-client-0: remote operation failed: Permission denied. Path: /atom01.img (74885dd0-6ff5-4ded-b5b4-d2f477e5bd6d)
[2014-01-24 15:00:07.848590] W [fuse-bridge.c:2167:fuse_writev_cbk] 0-glusterfs-fuse: 341318: WRITE => -1 (Permission denied)
[2014-01-24 15:00:07.849288] W [fuse-bridge.c:2167:fuse_writev_cbk] 0-glusterfs-fuse: 341320: WRITE => -1 (Permission denied)
[2014-01-24 15:00:07.849535] W [fuse-bridge.c:2167:fuse_writev_cbk] 0-glusterfs-fuse: 341322: WRITE => -1 (Permission denied)
[2014-01-24 15:00:12.719313] W [fuse-bridge.c:2167:fuse_writev_cbk] 0-glusterfs-fuse: 341324: WRITE => -1 (Permission denied)
[2014-01-24 15:00:12.719530] W [fuse-bridge.c:2167:fuse_writev_cbk] 0-glusterfs-fuse: 341326: WRITE => -1 (Permission denied)
[2014-01-24 15:00:12.719866] W [fuse-bridge.c:2167:fuse_writev_cbk] 0-glusterfs-fuse: 341328: WRITE => -1 (Permission denied)
[2014-01-24 15:00:12.720111] W [fuse-bridge.c:2167:fuse_writev_cbk] 0-glusterfs-fuse: 341330: WRITE => -1 (Permission denied)
[2014-01-24 15:00:12.720360] W [fuse-bridge.c:2167:fuse_writev_cbk] 0-glusterfs-fuse: 341332: WRITE => -1 (Permission denied)
/var/log/libvirt/qemu/atom01.log

libvirt logs on the sending side:

2014-01-24 14:53:33.052+0000: starting up
LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/bin:/usr/sbin:/sbin:/bin QEMU_AUDIO_DRV=none /usr/bin/kvm-spice -name atom01 -S -M pc-i440fx-1.4 -m 1024 -smp 1,sockets=1,cores=1,threads=1 -uuid a20b4267-22b8-fc91-a4ea-938a5ad4a889 -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/atom01.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-reboot -no-shutdown -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=/srv/vms/mnt_atom01/atom01.img,if=none,id=drive-virtio-disk0,format=raw,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=2 -netdev tap,fd=27,id=hostnet0,vhost=on,vhostfd=28 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:fa:ce:31,bus=pci.0,addr=0x3,bootindex=1 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -vnc 127.0.0.1:1,password -vga std -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5
W: kvm binary is deprecated, please use qemu-system-x86_64 instead
char device redirected to /dev/pts/2 (label charserial0)
2014-01-24 15:00:03.208+0000: shutting down
qemu: terminating on signal 15 from pid 2170

----------------------

libvirt logs on the receiving side:

2014-01-24 15:00:00.986+0000: starting up
LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/bin:/usr/sbin:/sbin:/bin QEMU_AUDIO_DRV=none /usr/bin/kvm-spice -name atom01 -S -M pc-i440fx-1.4 -m 1024 -smp 1,sockets=1,cores=1,threads=1 -uuid a20b4267-22b8-fc91-a4ea-938a5ad4a889 -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/atom01.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-reboot -no-shutdown -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=/srv/vms/mnt_atom01/atom01.img,if=none,id=drive-virtio-disk0,format=raw,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=2 -netdev tap,fd=23,id=hostnet0,vhost=on,vhostfd=24 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:fa:ce:31,bus=pci.0,addr=0x3,bootindex=1 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -vnc 127.0.0.1:0,password -vga std -incoming tcp:0.0.0.0:49159 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5
W: kvm binary is deprecated, please use qemu-system-x86_64 instead
char device redirected to /dev/pts/4 (label charserial0)
Would it be possible to attach a tgz of the glusterfs log directories from both nodes to this bug report?
Created attachment 856093 [details]
gluster logs during a migration

Attached are logfiles at DEBUG level, made during a migration of the guest 'kvmhost'. The migration itself starts at 14:00:00 and succeeds, but results in the migrated guest not being able to access its image on the destination server. The setup is Ubuntu 13.04, Gluster-3.4.1 from the Ubuntu PPA (semiosis). Also included is /var/lib/libvirt/qemu/kvmtest.log from the destination server.
Created attachment 856149 [details]
gluster logs of migration host1

virsh migrate --verbose --live --p2p --domain atom01 --desturi qemu+ssh://192.168.242.93/system

The disk image of atom01 resides on gluster volume "glfs_atom01":

root@ping[/0]:~ # gluster volume info glfs_atom01

Volume Name: glfs_atom01
Type: Replicate
Volume ID: f28f0f62-37b3-4b10-8e86-9b373f4c0e75
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 172.24.1.11:/ecopool/fs_atom01
Brick2: 172.24.1.13:/ecopool/fs_atom01
Options Reconfigured:
network.remote-dio: enable
storage.owner-uid: 107
storage.owner-gid: 104
diagnostics.client-log-level: DEBUG

root@ping[/0]:~ # id libvirt-qemu
uid=107(libvirt-qemu) gid=104(kvm) groups=104(kvm)
Created attachment 856150 [details]
gluster logs of migration host2

Logs of the receiving host.
The libvirt wiki states that during a migration, libvirt will change the ownership of the guest image unless it detects that the image is on a shared filesystem. Looking at the libvirt code, it can detect NFS, GFS2 and SMB/CIFS, but not Gluster. Since libvirt does not detect that the storage is on a shared filesystem, the originating host chowns the image back to root:root at the end of a successful migration, while the destination host chowns it to libvirt-qemu:kvm. This is in fact a race condition, so the difference in behaviour between 3.4.0 and 3.4.1 could come down to timing differences.

http://wiki.libvirt.org/page/Migration_fails_because_disk_image_cannot_be_found

Workaround:
* stop your guests
* stop libvirt-bin
* edit /etc/libvirt/qemu.conf - it contains a commented-out entry 'dynamic_ownership=1', which is the default; change this to 0 and remove the comment
* chown all your stopped images to libvirt-qemu:kvm
* start the libvirt-bin service again
* bring up the guests
* repeat on the other half of your cluster
* test a live migration - for me, they work again

You now have to take care of properly setting the ownership of a guest image yourself (presumably only once, when you create it).

Other possible solutions:
* JoeJulian suggested using libgfapi, giving libvirt direct access to the volume without going through the filesystem. This is the preferred setup for libvirt+gluster and should also result in better I/O performance. I haven't tested this yet, but it's high on my to-do list.
* Submit a patch to libvirt so it can detect that the filesystem is Gluster. statfs() will only show 'FUSE', but we could then use getxattr to see if a gluster-specific attribute is set (suggested by kkeithley), for example trusted.glusterfs.volume-id.
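The workaround steps above can be sketched as a shell sequence. This is a sketch under assumptions: stock Ubuntu paths and service name, the image path is an example from this report, and the sed pattern assumes the usual commented-out qemu.conf line:

```shell
# Stop your guests first (not shown), then stop libvirt.
service libvirt-bin stop

# Uncomment dynamic_ownership and set it to 0 so libvirt stops
# chown'ing disk images around migrations.
sed -i 's/^#\?dynamic_ownership *= *1/dynamic_ownership = 0/' /etc/libvirt/qemu.conf

# With dynamic ownership off, set the ownership manually, once per image.
chown libvirt-qemu:kvm /srv/vms/mnt_atom01/atom01.img

service libvirt-bin start
```

Repeat on the other half of the cluster before testing a migration.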
I can confirm the workaround above. It would be nice if libvirt could be patched as described, so we won't have to worry about this problem in the future.
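A rough sketch of the detection idea from the previous comment: statfs() only reports a generic FUSE filesystem, but /proc/mounts records the FUSE subtype, and probing for a gluster-specific xattr is another option. Whether trusted.glusterfs.volume-id is readable through the client mount (rather than only on the bricks) may depend on the gluster version, so treat both checks as illustrations, not a tested patch; the mount point is an example from this report:

```shell
# Check the FUSE subtype recorded in /proc/mounts for the mount point.
grep ' /srv/vms/mnt_atom01 fuse.glusterfs ' /proc/mounts \
  && echo "fuse.glusterfs mount"

# Probe for a gluster-specific xattr (may not be exposed by every version).
getfattr -n trusted.glusterfs.volume-id /srv/vms/mnt_atom01 2>/dev/null \
  && echo "gluster xattr present"
```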
Based on comment https://bugzilla.redhat.com/show_bug.cgi?id=1057645#c7, assigning the bug to libvirt.
This message is a reminder that Fedora 20 is nearing its end of life. Approximately four weeks from now, Fedora will stop maintaining and issuing updates for Fedora 20. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a Fedora 'version' of '20'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version.

Thank you for reporting this issue, and we are sorry that we were not able to fix it before Fedora 20 reached end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.
libvirt migration has had a lot of bug fixes since F20, so I'm assuming this is fixed. Closing as CURRENTRELEASE; please reopen if anyone can reproduce it with a newer Fedora.