Bug 1347077

Summary: vhost-user: A socket file is not deleted after VM's port is detached.
Product: Red Hat Enterprise Linux 7 Reporter: Steve Shin <jonshin>
Component: qemu-kvm-rhev Assignee: Marc-Andre Lureau <marcandre.lureau>
Status: CLOSED ERRATA QA Contact: Pei Zhang <pezhang>
Severity: medium Docs Contact:
Priority: high    
Version: 7.2 CC: ailan, atragler, chayang, dmaley, editucci, huding, jdonohue, jherrman, john.joyce, joycej, juzhang, knoel, marcandre.lureau, mst, pbonzini, pezhang, victork, virt-maint, xfu, xiywang
Target Milestone: rc Keywords: ZStream
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard: hot
Fixed In Version: qemu-kvm-rhev-2.6.0-12.el7 Doc Type: Bug Fix
Doc Text:
Previously, deleting a guest virtual machine set up as a vhost-user server caused the socket file on the host machine to be preserved. This update adjusts the guest clean-up mechanism to ensure that in the described situation, the socket file is deleted as expected.
Story Points: ---
Clone Of:
: 1351892 (view as bug list) Environment:
Last Closed: 2016-11-07 21:18:33 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1154739, 1351892, 1364088    

Description Steve Shin 2016-06-16 01:26:57 UTC
Description of problem:
vhost-user: A socket file is not deleted after VM's port is detached. 

Version-Release number of selected component (if applicable):


How reproducible:
Easy to reproduce

Steps to Reproduce:
1. Set up a vhost-user VM (server mode)
2. Delete the vhost-user VM
3. Check the socket file


Actual results:
- The socket file still exists.

Expected results:
- The socket file should be deleted after VM's port is detached.

Additional info:
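For context, a minimal sketch of the underlying behaviour on Linux: binding a Unix domain socket to a path creates a socket file, and closing the listening socket does not remove that file unless the server also unlinks it. This is an illustration only, not QEMU's code; the helper names `listen_on`, `close_server` and `demo` are hypothetical, and the file name `vubr.sock` is borrowed from the reproduction steps later in this report.

```python
import os
import socket
import tempfile


def listen_on(path):
    """Bind a listening Unix stream socket; this creates the socket file."""
    srv = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    srv.bind(path)
    srv.listen(1)
    return srv


def close_server(srv, path, unlink=True):
    """Close the listener and remove the socket file only if asked to."""
    srv.close()
    if unlink and os.path.exists(path):
        os.unlink(path)


def demo():
    """Return (file_left_behind_without_unlink, file_removed_with_unlink)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "vubr.sock")

        # Close without unlinking: the socket file stays on disk.
        srv = listen_on(path)
        close_server(srv, path, unlink=False)
        leaked = os.path.exists(path)

        # Remove the stale file so the path can be bound again.
        os.unlink(path)

        # Close with unlinking: the socket file is gone afterwards.
        srv = listen_on(path)
        close_server(srv, path, unlink=True)
        cleaned = not os.path.exists(path)
        return leaked, cleaned


if __name__ == "__main__":
    print(demo())
```

The first value corresponds to the "Actual results" above, the second to the "Expected results".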

Comment 2 Marc-Andre Lureau 2016-06-16 11:42:16 UTC
Series sent upstream:
https://lists.nongnu.org/archive/html/qemu-devel/2016-06/msg04523.html

Comment 3 Ademar Reis 2016-06-21 14:48:13 UTC
v2 of the series:
https://lists.nongnu.org/archive/html/qemu-devel/2016-06/msg04743.html

Comment 4 Ademar Reis 2016-06-21 16:27:44 UTC
The latest 7.2 scratch build available for testing is *-2.3.0-31.el7_2.16, with the
backports of reconnect & socket cleanup:
http://people.redhat.com/~mlureau/bz1322087/

Or from Brew:
https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=11242337

Comment 9 joycej 2016-07-06 17:08:52 UTC
We don't seem to be hitting this issue when testing with: qemu-img-rhev-2.3.0-31.el7_2.16.test.x86_64.rpm
which was provided as a fix for:
  https://bugzilla.redhat.com/show_bug.cgi?id=1322087

My original understanding was that there would be two separate patches for these 2 problems. From the notes above, it seems the plan was that 2.16 would solve both issues. Please confirm.

Comment 10 Ademar Reis 2016-07-07 16:23:03 UTC
(In reply to joycej from comment #9)
> We don't seem to be hitting this issue when testing with:
> qemu-img-rhev-2.3.0-31.el7_2.16.test.x86_64.rpm
> which was provided as a fix for:
>   https://bugzilla.redhat.com/show_bug.cgi?id=1322087

That's expected. The test package includes the temporary fixes for both issues. By temporary, I mean patches that were submitted upstream, but were not reviewed or merged in the official branch.

> My original understanding was there would be two separate patches for these
> 2 problems.   I the notes above it seems it was plan that 2.16 would solve
> both issues.   Please confirm.

That's for the packages that include the final patches (reviewed and merged in our official branch). We already have builds for the reconnect support, but the fd-leak fix (this BZ) is taking longer because during testing a regression on ARM was detected (doesn't impact Cisco, but required a new, different patch).

Comment 11 Miroslav Rezanina 2016-07-08 08:40:08 UTC
Fix included in qemu-kvm-rhev-2.6.0-12.el7

Comment 12 Pei Zhang 2016-07-09 12:14:37 UTC
Reproduced:
Versions:
host:
3.10.0-460.el7.x86_64
qemu-kvm-rhev-2.6.0-11.el7.x86_64

guest:
3.10.0-456.el7.x86_64

Steps:
1. Run a slirp/vlan in a background process
# /usr/libexec/qemu-kvm \
-net none \
-net socket,vlan=0,udp=localhost:4444,localaddr=localhost:5555 \
-net user,vlan=0

2. Start qemu with vhost-user in server mode
# /usr/libexec/qemu-kvm  -m 1024 -smp 2 \
-object memory-backend-file,id=mem,size=1024M,mem-path=/dev/hugepages,share=on \
-numa node,memdev=mem -mem-prealloc \
-chardev socket,id=char0,path=/tmp/vubr.sock,server \
-netdev type=vhost-user,id=mynet1,chardev=char0,vhostforce \
-device virtio-net-pci,netdev=mynet1,mac=54:52:00:1a:2c:01 \
/home/pezhang/rhel7.3.qcow2 \
-monitor stdio \
-vga std -vnc :10

QEMU waiting for connection on: disconnected:unix:/tmp/vubr.sock,server

3. Start vubr as the vhost-user client
# ./vhost-user-bridge -c

4. Check socket file
# ll /tmp/vubr.sock 
srwxr-xr-x 1 root root 0 Jul  9 07:33 /tmp/vubr.sock

5. Shutdown guest
(qemu) system_powerdown 
or in guest:
# shutdown -h now 

6. Check the socket file again; it still exists
# ll /tmp/vubr.sock 
srwxr-xr-x 1 root root 0 Jul  9 07:33 /tmp/vubr.sock

So this bug has been reproduced.
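The manual `ll /tmp/vubr.sock` check in steps 4 and 6 could also be scripted. The following is a minimal sketch, assuming only the Python standard library; the function name `socket_file_state` is hypothetical and is not part of any QEMU tooling.

```python
import os
import stat


def socket_file_state(path):
    """Return 'socket', 'missing', or 'other' for the given path."""
    try:
        # lstat so that a symlink is reported as 'other', not followed.
        mode = os.lstat(path).st_mode
    except FileNotFoundError:
        return "missing"
    return "socket" if stat.S_ISSOCK(mode) else "other"


if __name__ == "__main__":
    # Path taken from the reproduction steps above.
    print(socket_file_state("/tmp/vubr.sock"))
```

With the affected build, the check would be expected to report `socket` both before and after the guest shuts down; with a fixed build, it would be expected to report `missing` after shutdown.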


Verified: 
Versions: (other versions unchanged)
qemu-kvm-rhev-2.6.0-12.el7.x86_64

Steps:
1. Run a slirp/vlan in a background process
2. Start qemu with vhost-user in server mode
3. Start vubr as the vhost-user client
4. Check the socket file
# ll /tmp/vubr.sock 
srwxr-xr-x 1 root root 0 Jul  9 07:44 /tmp/vubr.sock

5. Shutdown guest
(qemu) system_powerdown : this step causes a new issue, described below.
or
in guest:
# shutdown -h now 

6. Check the socket file again; it has been deleted.
# ll /tmp/vubr.sock 
ls: cannot access /tmp/vubr.sock: No such file or directory

So I think the 'delete issue' has been fixed. But this fix caused a new issue:
In Step 5, if the guest is shut down using "(qemu)system_powerdown", qemu reports 'Segmentation fault' after the guest shuts down.

So I filed a new bug to track this new issue:
Bug 1354090 - Boot guest with vhostuser server mode, QEMU prompt 'Segmentation fault' after executing '(qemu)system_powerdown'

Comment 13 Marc-Andre Lureau 2016-07-09 12:40:08 UTC
(In reply to Pei Zhang from comment #12)
> So I file a new bug to track this new issue:
> Bug 1354090 - Boot guest with vhostuser server mode, QEMU prompt
> 'Segmentation fault' after executing '(qemu)system_powerdown'

Yes, the issue was found by Paolo yesterday: http://patchwork.ozlabs.org/patch/646457/

Since the previous patches have been applied, it's probably good to have another bug for the crash.

Comment 15 Pei Zhang 2016-07-22 10:04:56 UTC
Re-verified this bug with qemu-kvm-rhev-2.6.0-15.el7.x86_64.
Steps 1-6 above all work as expected, and no errors occur. Thank you.

Comment 17 errata-xmlrpc 2016-11-07 21:18:33 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-2673.html