Description of problem:
During live migration of a VM that uses a file on a gluster FUSE mount as its disk image, the disk file changes ownership from libvirt-qemu/kvm to root/root. Given the 700 permissions on the disk file, the receiving libvirt-qemu can no longer access (at least can no longer write to) that file.

Version-Release number of selected component (if applicable):
OS: Ubuntu 13.04
QEMU: QEMU emulator version 1.4.0 (Debian 1.4.0+dfsg-1expubuntu4) (from ubuntu repo)
LIBVIRTD: libvirtd (libvirt) 1.0.2 (from ubuntu repo)
GLUSTER: glusterfs 3.4.2 built on Jan 11 2014 03:21:47 (from semiosis repo)

How reproducible:
100%

Steps to Reproduce:
1. on a replica 2 volume, mounted like "mount -t glusterfs <my_own_ip>:/<volume_name> <mount_point>", place the image of a virtual machine
2. start on both hosts a "watch tree -pfungiA <mount_point>" (or a level above)
3. observe the ownership of <mount_point>/<disk.image>
4. it is root/root, as is regular for a stopped VM
5. start the VM
6. ownership changes to libvirt-qemu/kvm
7. trigger the migration and watch the ownership of the disk image
8. ownership changes from libvirt-qemu/kvm back to root/root
9. at the same time, the filesystem inside the VM becomes read-only

Actual results:
Live migration leaves the migrated VM unusable until it is shut down and restarted on either of the two machines.

Expected results:
Live migration should work.

Additional info:
The disk file is completely inaccessible on the receiving side; a "file <disk.image>" or "md5sum <disk.image>" gives "Permission denied". On the sending side, on the other hand, there is no such problem; file and md5sum on <disk.image> work as expected. Changing storage.owner-(u|g)id to libvirt-qemu/kvm on the volume doesn't help.
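As an alternative to "watch tree", the ownership flip can be observed with a plain stat loop. This is a minimal sketch; the image path is taken from the logs later in this report, so substitute your own:

```shell
# Poll owner, group and mode of the disk image once per second on both
# hosts while the migration runs; the chown back to root:root shows up here.
# Path is an example from this report -- adjust to your own image.
watch -n 1 stat -c '%U:%G %a %n' /srv/vms/mnt_atom01/atom01.img
```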
Volume Name: glfs_atom01
Type: Replicate
Volume ID: f28f0f62-37b3-4b10-8e86-9b373f4c0e75
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 172.24.1.11:/ecopool/fs_atom01
Brick2: 172.24.1.13:/ecopool/fs_atom01
Options Reconfigured:
storage.owner-gid: 104
storage.owner-uid: 107
network.remote-dio: enable

To make sure the rest of my setup is alright, I set up a drbd/ocfs2 stack on the same machines. With that, I can live migrate the VM with dbench/bonnie++ running undisturbed inside. The downside of that setup: dbench reports only 80MB/s instead of 240MB/s with glusterfs, plus the configuration overhead and maintenance effort of drbd and ocfs2.

There is a similar report here: http://epboven.home.xs4all.nl/gluster-migrate.html suggesting that the problem was introduced in the change from gluster 3.4.0 to gluster 3.4.1.
At exactly 15:00 UTC I started the migration. Inside the VM I had

for i in `seq 1 100`; do echo `date` >> /tmp/test.file; sleep 1; done

running; after 7 seconds it returned with:

echo: write error: Read-only file system

I checked all logfiles on both machines but found only this in srv-vms-mnt_atom01.log (the brick log) on the receiving side:

[2014-01-24 15:00:07.848387] W [client-rpc-fops.c:471:client3_3_open_cbk] 0-glfs_atom01-client-1: remote operation failed: Permission denied. Path: /atom01.img (74885dd0-6ff5-4ded-b5b4-d2f477e5bd6d)
[2014-01-24 15:00:07.848549] W [client-rpc-fops.c:471:client3_3_open_cbk] 0-glfs_atom01-client-0: remote operation failed: Permission denied. Path: /atom01.img (74885dd0-6ff5-4ded-b5b4-d2f477e5bd6d)
[2014-01-24 15:00:07.848590] W [fuse-bridge.c:2167:fuse_writev_cbk] 0-glusterfs-fuse: 341318: WRITE => -1 (Permission denied)
[2014-01-24 15:00:07.849288] W [fuse-bridge.c:2167:fuse_writev_cbk] 0-glusterfs-fuse: 341320: WRITE => -1 (Permission denied)
[2014-01-24 15:00:07.849535] W [fuse-bridge.c:2167:fuse_writev_cbk] 0-glusterfs-fuse: 341322: WRITE => -1 (Permission denied)
[2014-01-24 15:00:12.719313] W [fuse-bridge.c:2167:fuse_writev_cbk] 0-glusterfs-fuse: 341324: WRITE => -1 (Permission denied)
[2014-01-24 15:00:12.719530] W [fuse-bridge.c:2167:fuse_writev_cbk] 0-glusterfs-fuse: 341326: WRITE => -1 (Permission denied)
[2014-01-24 15:00:12.719866] W [fuse-bridge.c:2167:fuse_writev_cbk] 0-glusterfs-fuse: 341328: WRITE => -1 (Permission denied)
[2014-01-24 15:00:12.720111] W [fuse-bridge.c:2167:fuse_writev_cbk] 0-glusterfs-fuse: 341330: WRITE => -1 (Permission denied)
[2014-01-24 15:00:12.720360] W [fuse-bridge.c:2167:fuse_writev_cbk] 0-glusterfs-fuse: 341332: WRITE => -1 (Permission denied)
/var/log/libvirt/qemu/atom01.log

libvirt logs on the sending side:

2014-01-24 14:53:33.052+0000: starting up
LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/bin:/usr/sbin:/sbin:/bin QEMU_AUDIO_DRV=none /usr/bin/kvm-spice -name atom01 -S -M pc-i440fx-1.4 -m 1024 -smp 1,sockets=1,cores=1,threads=1 -uuid a20b4267-22b8-fc91-a4ea-938a5ad4a889 -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/atom01.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-reboot -no-shutdown -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=/srv/vms/mnt_atom01/atom01.img,if=none,id=drive-virtio-disk0,format=raw,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=2 -netdev tap,fd=27,id=hostnet0,vhost=on,vhostfd=28 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:fa:ce:31,bus=pci.0,addr=0x3,bootindex=1 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -vnc 127.0.0.1:1,password -vga std -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5
W: kvm binary is deprecated, please use qemu-system-x86_64 instead
char device redirected to /dev/pts/2 (label charserial0)
2014-01-24 15:00:03.208+0000: shutting down
qemu: terminating on signal 15 from pid 2170

----------------------

libvirt logs on the receiving side:

2014-01-24 15:00:00.986+0000: starting up
LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/bin:/usr/sbin:/sbin:/bin QEMU_AUDIO_DRV=none /usr/bin/kvm-spice -name atom01 -S -M pc-i440fx-1.4 -m 1024 -smp 1,sockets=1,cores=1,threads=1 -uuid a20b4267-22b8-fc91-a4ea-938a5ad4a889 -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/atom01.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-reboot -no-shutdown -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=/srv/vms/mnt_atom01/atom01.img,if=none,id=drive-virtio-disk0,format=raw,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=2 -netdev tap,fd=23,id=hostnet0,vhost=on,vhostfd=24 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:fa:ce:31,bus=pci.0,addr=0x3,bootindex=1 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -vnc 127.0.0.1:0,password -vga std -incoming tcp:0.0.0.0:49159 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5
W: kvm binary is deprecated, please use qemu-system-x86_64 instead
char device redirected to /dev/pts/4 (label charserial0)
Would it be possible to attach a tgz of the glusterfs log directories from both nodes to this bug report?
Created attachment 856093 [details]
gluster logs during a migration

Attached are logfiles at DEBUG level, made during a migration of the guest 'kvmhost'. The migration itself starts at 14:00:00 and succeeds, but results in the migrated guest not being able to access its image on the destination server. The setup is Ubuntu 13.04, Gluster-3.4.1 from the Ubuntu PPA (semiosis). Also included is /var/lib/libvirt/qemu/kvmtest.log from the destination server.
Created attachment 856149 [details]
gluster logs of migration host1

virsh migrate --verbose --live --p2p --domain atom01 --desturi qemu+ssh://192.168.242.93/system

The disk image of atom01 resides on gluster volume "glfs_atom01":

root@ping[/0]:~ # gluster volume info glfs_atom01

Volume Name: glfs_atom01
Type: Replicate
Volume ID: f28f0f62-37b3-4b10-8e86-9b373f4c0e75
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 172.24.1.11:/ecopool/fs_atom01
Brick2: 172.24.1.13:/ecopool/fs_atom01
Options Reconfigured:
network.remote-dio: enable
storage.owner-uid: 107
storage.owner-gid: 104
diagnostics.client-log-level: DEBUG

root@ping[/0]:~ # id libvirt-qemu
uid=107(libvirt-qemu) gid=104(kvm) groups=104(kvm)
Created attachment 856150 [details]
gluster logs of migration host2

Logs of the receiving host.
The libvirt wiki states that during a migration, libvirt will change the ownership of the guest image unless it detects that the image is on a shared filesystem. Looking at the libvirt code, it can detect NFS, GFS2 and SMB/CIFS, but not Gluster. Since libvirt does not detect that the storage is on a shared filesystem, the originating host chowns the image back to root:root at the end of a successful migration, while the destination host chowns it to libvirt-qemu:kvm. This is in fact a race condition, so the difference in behaviour between 3.4.0 and 3.4.1 could come down to timing differences.

http://wiki.libvirt.org/page/Migration_fails_because_disk_image_cannot_be_found

Workaround:
* stop your guests
* stop libvirt-bin
* edit /etc/libvirt/qemu.conf - it contains a commented-out entry 'dynamic_ownership=1', which is the default; change this to 0 and remove the comment
* chown all your stopped images to libvirt-qemu:kvm
* start the libvirt-bin service again
* bring up the guests
* repeat on the other half of your cluster
* test a live migration - for me, they work again

You now have to take care of properly setting the ownership of a guest image yourself (presumably only once, when you create it).

Other possible solutions:
* JoeJulian suggested using libgfapi, giving libvirt direct access to the volume without going through the filesystem. This is the preferred setup for libvirt+gluster and should also result in better I/O performance. I haven't tested this yet, but it's high on my to-do list.
* Submit a patch to libvirt so it can detect that the filesystem is Gluster. statfs() will only show 'FUSE', but we could then use getxattr to see if a gluster-specific attribute is set (suggested by kkeithley), for example trusted.glusterfs.volume-id.
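The workaround steps above can be sketched as a shell sequence. This is a sketch under assumptions: stock Ubuntu paths and service name, the image path is an example from this report, and the sed pattern assumes the usual commented-out qemu.conf line:

```shell
# Stop your guests first (not shown), then stop libvirt.
service libvirt-bin stop

# Uncomment dynamic_ownership and set it to 0 so libvirt stops
# chown'ing disk images around migrations.
sed -i 's/^#\?dynamic_ownership *= *1/dynamic_ownership = 0/' /etc/libvirt/qemu.conf

# With dynamic ownership off, set the ownership manually, once per image.
chown libvirt-qemu:kvm /srv/vms/mnt_atom01/atom01.img

service libvirt-bin start
```

Repeat on the other half of the cluster before testing a migration.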
I can confirm the workaround above. It would be nice if libvirt could be patched as described, so we won't have to worry about this problem in the future.
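A rough sketch of the detection idea from the previous comment: statfs() only reports a generic FUSE filesystem, but /proc/mounts records the FUSE subtype, and probing for a gluster-specific xattr is another option. Whether trusted.glusterfs.volume-id is readable through the client mount (rather than only on the bricks) may depend on the gluster version, so treat both checks as illustrations, not a tested patch; the mount point is an example from this report:

```shell
# Check the FUSE subtype recorded in /proc/mounts for the mount point.
grep ' /srv/vms/mnt_atom01 fuse.glusterfs ' /proc/mounts \
  && echo "fuse.glusterfs mount"

# Probe for a gluster-specific xattr (may not be exposed by every version).
getfattr -n trusted.glusterfs.volume-id /srv/vms/mnt_atom01 2>/dev/null \
  && echo "gluster xattr present"
```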
Based on comment https://bugzilla.redhat.com/show_bug.cgi?id=1057645#c7, assigning the bug to libvirt.
This message is a reminder that Fedora 20 is nearing its end of life. Approximately four weeks from now, Fedora will stop maintaining and issuing updates for Fedora 20. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a Fedora 'version' of '20'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version.

Thank you for reporting this issue, and we are sorry that we were not able to fix it before Fedora 20 reached end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.
libvirt migration has had a lot of bug fixes since F20, so I'm assuming this is fixed. Closing as CURRENTRELEASE; please reopen if anyone can reproduce it with a newer Fedora.