+++ This bug was initially created as a clone of Bug #987555 +++

Description of problem:

Starting with GlusterFS 3.4, glusterfsd uses the IANA-defined ephemeral port range (49152 and upward). If you happen to use the same network for storage and for qemu-kvm live migration, you sometimes get a port conflict, and the live migration aborts.

Here's the log of a failed live migration on the destination host:

2013-07-23 15:54:32.619+0000: starting up
LC_ALL=C PATH=/sbin:/usr/sbin:/bin:/usr/bin QEMU_AUDIO_DRV=none /usr/libexec/qemu-kvm -name ipasserelle -S -M rhel6.4.0 -enable-kvm -m 2048 -smp 2,sockets=2,cores=1,threads=1 -uuid 8505958b-8227-0a46-91a7-41d3247544e2 -nodefconfig -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/ipasserelle.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=/var/lib/libvirt/images/gluster/ipasserelle.img,if=none,id=drive-virtio-disk0,format=qcow2,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x6,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -drive if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -netdev tap,fd=22,id=hostnet0,vhost=on,vhostfd=23 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:2b:14:d7,bus=pci.0,addr=0x3 -netdev tap,fd=24,id=hostnet1,vhost=on,vhostfd=25 -device virtio-net-pci,netdev=hostnet1,id=net1,mac=52:54:00:6d:4f:52,bus=pci.0,addr=0x4 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -device usb-tablet,id=input0 -vnc 127.0.0.1:0 -vga cirrus -device intel-hda,id=sound0,bus=pci.0,addr=0x5 -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -incoming tcp:[::]:49152 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x7
char device redirected to /dev/pts/2
inet_listen_opts: bind(ipv6,::,49152): Address already in use
inet_listen_opts: FAILED
Migrate: Failed to bind socket
Migration failed. Exit code tcp:[::]:49152(-1), exiting.
2013-07-23 15:54:33.016+0000: shutting down

[root@dd9 ~]# netstat -laputen | grep :49152
tcp 0 0 0.0.0.0:49152 0.0.0.0:* LISTEN 0 82349 1927/glusterfsd
tcp 0 0 127.0.0.1:1015 127.0.0.1:49152 ESTABLISHED 0 82555 1952/glusterfs
tcp 0 0 10.90.25.138:49152 10.90.25.137:1016 ESTABLISHED 0 82473 1927/glusterfsd
tcp 0 0 10.90.25.138:1021 10.90.25.137:49152 ESTABLISHED 0 82344 1952/glusterfs
tcp 0 0 127.0.0.1:49152 127.0.0.1:1008 ESTABLISHED 0 82725 1927/glusterfsd
tcp 0 0 127.0.0.1:49152 127.0.0.1:1015 ESTABLISHED 0 82556 1927/glusterfsd
tcp 0 0 10.90.25.138:49152 10.90.25.137:1010 ESTABLISHED 0 89092 1927/glusterfsd
tcp 0 0 127.0.0.1:1008 127.0.0.1:49152 ESTABLISHED 0 82724 2069/glusterfs
tcp 0 0 10.90.25.138:1018 10.90.25.137:49152 ESTABLISHED 0 82784 2115/glusterfs

The exact same setup with GlusterFS 3.3.2 works like a charm.

Version-Release number of selected component (if applicable):
Host is CentOS 6.4 x86_64
gluster 3.4.0-2 (glusterfs glusterfs-server glusterfs-fuse), from the gluster.org RHEL repo
libvirt 0.10.2-18
qemu-kvm-rhev 0.12.1.2-2.355.el6.5

How reproducible:
Not always, but frequently enough.

Steps to Reproduce:
- Two hosts with a replicated GlusterFS volume (both are gluster server and client)
- Libvirt on both nodes
- One private network used for gluster and live migration
- While GlusterFS is working, try to live migrate a qemu-kvm VM, using the standard migration (virsh migrate --live vm qemu+ssh://user@other_node/system)
- From time to time (not always), the migration will fail because the qemu process on the destination host cannot bind to the chosen port

Actual results:
Live migration fails.

Expected results:
Live migration shouldn't be bothered by Gluster.

Additional info:
An option to configure the first port, or the port range, used by Gluster would avoid this situation.

--- Additional comment from Daniel on 2013-07-24 05:41:31 EDT ---

One more piece of information: I have three GlusterFS volumes between the two nodes, and the first three migrations fail. As qemu (or libvirt, I'm not sure which one chooses the incoming migration port) increments the port number at each migration attempt, the fourth migration succeeds (and the following migrations succeed too).

--- Additional comment from Caerie Houchins on 2013-10-02 17:18:14 EDT ---

We just hit this bug in a new setup today, confirming it still exists.
qemu-kvm-0.12.1.2-2.355.0.1.el6.centos.7.x86_64
glusterfs-3.4.0-8.el6.x86_64
CentOS release 6.4 (Final)
Linux SERVERNAME 2.6.32-358.18.1.el6.x86_64 #1 SMP Wed Aug 28 17:19:38 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

--- Additional comment from Gianluca Cecchi on 2013-10-09 18:52:29 EDT ---

Same problem with oVirt 3.3 and Fedora 19 as hypervisors. See here: http://wiki.libvirt.org/page/FAQ
The range 49152-49215 was already used by libvirt years before the Gluster change from 3.3 to 3.4. How could you miss it, and worse, not change it at least for 3.4.1, given that this bugzilla was opened in July? At the very least, could you provide a way to configure gluster to use another range, so that two nodes that are both servers and clients can use a different one? You are limiting GlusterFS adoption itself, as no one would deploy oVirt on GlusterFS without migration available.
Thanks for reading
Gianluca

--- Additional comment from Kaleb KEITHLEY on 2013-10-11 07:54:00 EDT ---

Out of curiosity, why isn't this a bug in qemu-kvm? Shouldn't qemu-kvm be trying another port if 49152 (or any other port) is in use?
And using portmapper to register the port it does end up using?

--- Additional comment from Anand Avati on 2013-10-11 08:07:18 EDT ---

REVIEW: http://review.gluster.org/6076 (xlators/mgmt/glusterd: ports conflict with qemu live migration) posted (#1) for review on release-3.4 by Kaleb KEITHLEY (kkeithle)

--- Additional comment from Gianluca Cecchi on 2013-10-11 08:34:15 EDT ---

From http://en.wikipedia.org/wiki/List_of_TCP_and_UDP_port_numbers :

"Dynamic, private or ephemeral ports
The range 49152-65535 (2^15+2^14 to 2^16-1), above the registered ports, contains dynamic or private ports that cannot be registered with IANA.[133] This range is used for custom or temporary purposes and for automatic allocation of ephemeral ports."

[133] https://tools.ietf.org/html/rfc6335

If a new project starts to use a range, in my opinion it has to consider that it is not the only project in the world, now or in the future... ;-) Why couldn't libvirt and GlusterFS reserve ports via IANA, so that /etc/services could be updated and other projects could query the current status before picking a new range? It's very much like the 192.168.1.x private network used by everyone. The latest reserved port is 49151, so why not start right at 49152... ;-) There are quite a few ranges available up to 65535, aren't there? Just my two eurocents.

--- Additional comment from Gianluca Cecchi on 2013-10-11 08:35:59 EDT ---

So, as 49152-65535 cannot be registered with IANA, why not try a range below 49151 that is still free, or ask IANA to extend the registry, or at least coordinate in a way that avoids overlap?
Thanks,
Gianluca
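If the glusterd build carries a configurable starting port (the patch posted above at review.gluster.org/6076 went in this direction, and later GlusterFS releases expose a `base-port` option in glusterd.vol), the brick ports can be moved above the 49152-49215 range that libvirt uses for incoming migrations. A minimal sketch of /etc/glusterfs/glusterd.vol, assuming the `base-port` option is supported by your version (verify against your build before relying on it):

```
volume management
    type mgmt/glusterd
    option working-directory /var/lib/glusterd
    # Assumed option: start allocating brick ports above the
    # 49152-49215 migration range used by libvirt.
    option base-port 50152
end-volume
```

glusterd must be restarted for this to take effect, and bricks will come up on new ports; `gluster volume status` shows the port each brick actually listens on.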
This is now fixed by v1.1.3-188-g0196845 and v1.1.3-189-ge3ef20d:

commit 0196845d3abd0d914cf11f7ad6c19df8b47c32ed
Author: Wang Yufei <james.wangyufei>
Date:   Fri Oct 11 11:27:13 2013 +0800

    qemu: Avoid assigning unavailable migration ports

    https://bugzilla.redhat.com/show_bug.cgi?id=1019053

    When we migrate vms concurrently, there's a chance that libvirtd on
    destination assigns the same port for different migrations, which will
    lead to migration failure during prepare phase on destination. So we use
    virPortAllocator here to solve the problem.

    Signed-off-by: Wang Yufei <james.wangyufei>
    Signed-off-by: Jiri Denemark <jdenemar>

commit e3ef20d7f7fee595ac4fc6094e04b7d65ee0583a
Author: Jiri Denemark <jdenemar>
Date:   Tue Oct 15 15:26:52 2013 +0200

    qemu: Make migration port range configurable

    https://bugzilla.redhat.com/show_bug.cgi?id=1019053
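The first commit's fix boils down to probing whether a port is actually bindable before handing it out, instead of assuming start, start+1, ... are free. This is a minimal Python sketch of that idea, not libvirt's actual C code (the function names here are invented for illustration):

```python
import socket

def port_is_free(port: int) -> bool:
    """Try to bind the port on the wildcard address; if bind() fails,
    someone else (e.g. glusterfsd) already holds it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            s.bind(("", port))
            return True
        except OSError:
            return False

def allocate_migration_port(start: int = 49152, end: int = 49215) -> int:
    """Walk the configured range and return the first port that actually
    binds, skipping any port another process has taken."""
    for port in range(start, end + 1):
        if port_is_free(port):
            return port
    raise RuntimeError(f"no free migration port in {start}-{end}")
```

The second commit then makes the range itself configurable: in libvirt releases that carry it, the `migration_port_min` and `migration_port_max` settings in /etc/libvirt/qemu.conf let you move migrations away from the ports Gluster occupies.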