Bug 1326270

Summary: Migration failed when setting vnc_auto_unix_socket = 1
Product: Red Hat Enterprise Linux 7
Reporter: Fangge Jin <fjin>
Component: libvirt
Assignee: Martin Kletzander <mkletzan>
Status: CLOSED ERRATA
QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium
Docs Contact:
Priority: medium
Version: 7.3
CC: dyuan, mkletzan, mzhan, rbalakri, yafu, zhanghm.zhm, zhanghongming, zpeng
Target Milestone: rc
Keywords: Upstream
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version: libvirt-1.3.4-1.el7
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-11-03 18:42:01 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
  libvirtd log on both the source host and target host, and qemu log on target host (flags: none)

Description Fangge Jin 2016-04-12 09:55:11 UTC
Created attachment 1146316
libvirtd log on both the source host and target host, and qemu log on target host

Description of problem:
1) Migration fails when vnc_auto_unix_socket = 1 is set and the domain IDs currently in use differ between the source host and the target host (e.g. the highest domain ID in use is 4 on the source host but 1 on the target host).
# virsh migrate rhel7.2-1030 qemu+ssh://10.66.4.113/system --live --verbose
error: internal error: process exited while connecting to monitor: 2016-04-12T07:54:18.328793Z qemu-kvm: -vnc unix:/var/lib/libvirt/qemu/domain-4-rhel7.2-1030/vnc.sock: Failed to start VNC server: Failed to bind socket to /var/lib/libvirt/qemu/domain-4-rhel7.2-1030/vnc.sock: No such file or directory

2) Restoring a managed-saved domain fails.
# virsh managedsave rhel7.2-1030
# virsh start rhel7.2-1030
error: Failed to start domain rhel7.2-1030
error: internal error: process exited while connecting to monitor: 2016-04-12T08:35:53.536685Z qemu-kvm: -vnc unix:/var/lib/libvirt/qemu/domain-4-rhel7.2-1030/vnc.sock: Failed to start VNC server: Failed to bind socket to /var/lib/libvirt/qemu/domain-4-rhel7.2-1030/vnc.sock: No such file or directory
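
Both failures refer to a socket path that embeds a domain ID. The following is a rough sketch of why such a path stops being valid once the domain gets a different ID. The path pattern is inferred only from the error messages in this report, and vnc_sock_path is a hypothetical helper for illustration, not a libvirt interface.

```shell
# Illustrative helper (not a libvirt API): build the per-domain VNC socket
# path using the pattern seen in the error messages of this report.
vnc_sock_path() {
    # $1 = domain ID on the host, $2 = domain name
    printf '/var/lib/libvirt/qemu/domain-%s-%s/vnc.sock\n' "$1" "$2"
}

src=$(vnc_sock_path 4 rhel7.2-1030)   # ID assigned on the source host
dst=$(vnc_sock_path 2 rhel7.2-1030)   # ID assigned on the target host

echo "source: $src"
echo "target: $dst"
if [ "$src" != "$dst" ]; then
    # Assumption: the target only has a directory for its own domain ID,
    # which would explain the "No such file or directory" error.
    echo "paths differ: reusing the source path on the target cannot work"
fi
```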

Version-Release number of selected component:
libvirt-1.3.3-1.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
0. Set vnc_auto_unix_socket = 1 on both the source and target hosts, and restart the libvirtd service.
Start a different number of guests on each host so that the domain IDs in use differ, e.g. start three guests on the source host and one guest on the target host.

1. Set the guest graphics type to vnc:
# virsh dumpxml rhel7.2-1030
...
    <graphics type='vnc' port='-1' autoport='yes'/>
...

2. Start guest rhel7.2-1030 on the source host (its domain ID will be 4) and migrate it to the target host (where its domain ID will be 2):
# virsh migrate rhel7.2-1030 qemu+ssh://10.66.4.113/system --live --verbose
error: internal error: process exited while connecting to monitor: 2016-04-12T07:54:18.328793Z qemu-kvm: -vnc unix:/var/lib/libvirt/qemu/domain-4-rhel7.2-1030/vnc.sock: Failed to start VNC server: Failed to bind socket to /var/lib/libvirt/qemu/domain-4-rhel7.2-1030/vnc.sock: No such file or directory

3. Do a managedsave and restore:
# virsh managedsave rhel7.2-1030
# virsh start rhel7.2-1030
error: Failed to start domain rhel7.2-1030
error: internal error: process exited while connecting to monitor: 2016-04-12T08:35:53.536685Z qemu-kvm: -vnc unix:/var/lib/libvirt/qemu/domain-4-rhel7.2-1030/vnc.sock: Failed to start VNC server: Failed to bind socket to /var/lib/libvirt/qemu/domain-4-rhel7.2-1030/vnc.sock: No such file or directory
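
Step 0 above could be scripted along these lines. This is a minimal sketch, assuming the setting lives in /etc/libvirt/qemu.conf (the report does not name the file) and that GNU sed is available; enable_vnc_auto_unix_socket is a hypothetical helper name. The demonstration runs against a scratch file so that nothing under /etc is modified.

```shell
# Hypothetical helper: enable vnc_auto_unix_socket in a qemu.conf-style file.
# $1 = path to the config file (assumed to be /etc/libvirt/qemu.conf on the hosts).
enable_vnc_auto_unix_socket() {
    conf=$1
    if grep -Eq '^[#[:space:]]*vnc_auto_unix_socket[[:space:]]*=' "$conf"; then
        # Uncomment or overwrite an existing (possibly commented-out) entry.
        sed -i -E 's/^[#[:space:]]*vnc_auto_unix_socket[[:space:]]*=.*/vnc_auto_unix_socket = 1/' "$conf"
    else
        # No entry at all: append one.
        printf 'vnc_auto_unix_socket = 1\n' >> "$conf"
    fi
}

# Demonstration on a scratch file, so the real configuration is untouched.
demo_conf=$(mktemp)
printf '#vnc_auto_unix_socket = 1\n' > "$demo_conf"
enable_vnc_auto_unix_socket "$demo_conf"
cat "$demo_conf"
```

On the real hosts the helper would be run as root against the actual configuration file, followed by a restart of the libvirtd service (for example with systemctl restart libvirtd on RHEL 7).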


Actual results:
Migration fails.
Restoring a managed-saved domain fails.

Expected results:
Migration succeeds.
The domain is restored from its managed-save state successfully.

Comment 2 Martin Kletzander 2016-04-26 14:34:31 UTC
Fix posted upstream:

https://www.redhat.com/archives/libvir-list/2016-April/msg01733.html

Comment 3 Martin Kletzander 2016-04-28 14:20:15 UTC
Fixed upstream with v1.3.4-rc1-6-g55320c23dd16:
commit 55320c23dd163e75eb61ed6bea2f339ccfeff4f9
Author: Martin Kletzander <mkletzan>
Date:   Tue Apr 26 14:27:16 2016 +0200

    qemu: Regenerate VNC socket paths

Comment 5 yafu 2016-06-02 06:54:46 UTC
Reproduced on build libvirt-1.3.3-1.el7.x86_64; verification passed on build libvirt-1.3.4-1.el7.x86_64.

Steps:
Scenario 1:
0. Set vnc_auto_unix_socket = 1 on both the source and target hosts, and restart the libvirtd service.
Start a different number of guests on each host so that the domain IDs in use differ, e.g. start three guests on the source host and one guest on the target host.

1. Prepare a running guest with graphics type vnc:
# virsh dumpxml rhel7.2
...
    <graphics type='vnc' socket='/var/lib/libvirt/qemu/domain-3-rhel7.2/vnc.sock'>
...

2. Migrate the guest:
# virsh migrate rhel7.2 qemu+ssh://10.66.4.148/system --live --verbose

3. After the migration completes, check the VNC socket path on the target host:
# virsh dumpxml rhel7.2 | grep vnc
...
  <graphics type='vnc' socket='/var/lib/libvirt/qemu/domain-2-rhel7.2/vnc.sock'>
...

Scenario 2:
1. Set vnc_auto_unix_socket = 1 and restart the libvirtd service.

2. Prepare a running guest with graphics type vnc:
# virsh dumpxml rhel7.2
...
    <graphics type='vnc' socket='/var/lib/libvirt/qemu/domain-3-rhel7.2/vnc.sock'>
...

3. Do a managedsave:
# virsh managedsave rhel7.2
Domain rhel7.2 state saved by libvirt

4. Start the guest:
# virsh start rhel7.2
Domain rhel7.2 started

5. Check the VNC socket path:
# virsh dumpxml rhel7.2
...
    <graphics type='vnc' socket='/var/lib/libvirt/qemu/domain-4-rhel7.2/vnc.sock'>
...
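
The checks in both scenarios come down to reading the domain ID out of the socket path. A small sketch of that check follows; vnc_sock_domain_id is a hypothetical helper, and the path layout is assumed from the output shown in this report.

```shell
# Hypothetical helper: read `virsh dumpxml` output on stdin and print the
# domain ID embedded in the VNC socket path (the N in .../domain-N-<name>/vnc.sock).
vnc_sock_domain_id() {
    sed -n "s|.*<graphics type='vnc' socket='[^']*/domain-\([0-9][0-9]*\)-[^']*/vnc\.sock'.*|\1|p"
}

# Sample line taken from the verification output above.
sample="    <graphics type='vnc' socket='/var/lib/libvirt/qemu/domain-4-rhel7.2/vnc.sock'>"
printf '%s\n' "$sample" | vnc_sock_domain_id
```

On a live host, the output of virsh dumpxml rhel7.2 | vnc_sock_domain_id could be compared with virsh domid rhel7.2. With the fixed build the two values would be expected to match after a migration or a restore.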

Comment 7 errata-xmlrpc 2016-11-03 18:42:01 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2016-2577.html