Bug 1243684

Summary: Virsh client doesn't print an error message when the connection is reset by the server in some cases.
Product: Red Hat Enterprise Linux 7 Reporter: Fangge Jin <fjin>
Component: libvirt Assignee: Jiri Denemark <jdenemar>
Status: CLOSED ERRATA QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 7.2 CC: dyuan, jdenemar, lhuang, lizhu, mzhan, rbalakri
Target Milestone: rc   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: libvirt-1.3.1-1.el7 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-11-03 18:19:47 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Fangge Jin 2015-07-16 06:03:47 UTC
Description:
 Virsh client doesn't print an error message when the connection is reset by the server in some cases.

Version:
libvirt-1.2.17-2.el7.x86_64
qemu-kvm-rhev-2.3.0-9.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
0.Prepare two hosts: server and client.

1.On client:
# virsh -k0 -c qemu+ssh://10.66.6.6/system
root@10.66.6.6's password:
Welcome to virsh, the virtualization interactive terminal.

Type:  'help' for help with commands
       'quit' to quit

virsh #

2.Set an iptables rule on the client:
# iptables -A INPUT -s 10.66.6.6 -j DROP

3.Within 30s, issue the "list" command on the client; it will hang:
virsh # list

4.After 30s, when the connection has been reset by the server, clear the iptables rule on the client:
# iptables -F

5.Wait several minutes; "list" will return without an error:
virsh # list
 Id    Name                           State
----------------------------------------------------

6.Issue "list" again; it will reconnect to the server and list the running domains on the server:
virsh # list
root@10.66.6.6's password:
error: Reconnected to the hypervisor
 Id    Name                           State
----------------------------------------------------
 47    rhel7-3                        running
 48    rhel7-4                        running
 51    rhel7-1                        running


Actual results:
In step 5, "list" returns without an error.

Expected results:
In step 5, "list" should return with the following error:
# list
error: Failed to list domains
error: Cannot recv data: Ncat: Broken pipe.: Connection reset by peer

Additional info:
If I issue "list" AFTER 30s instead of within 30s in step 3, "list" returns the expected error in step 5.
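
The 30s window in steps 3 and 4 is consistent with libvirtd's server-side keepalive settings. The report does not state the server's configuration, so the sketch below assumes the documented libvirtd.conf defaults (keepalive_interval = 5, keepalive_count = 5), under which the daemon closes a silent connection after roughly keepalive_interval * (keepalive_count + 1) seconds. The function name is illustrative, not part of libvirt.

```python
def server_keepalive_timeout(interval_s: int, count: int) -> int:
    """Approximate seconds of client silence before libvirtd closes the connection.

    Follows the rule of thumb from libvirtd.conf:
    keepalive_interval * (keepalive_count + 1).
    """
    return interval_s * (count + 1)


# Assumed defaults from libvirtd.conf; the bug report does not list them.
DEFAULT_KEEPALIVE_INTERVAL_S = 5
DEFAULT_KEEPALIVE_COUNT = 5

timeout = server_keepalive_timeout(DEFAULT_KEEPALIVE_INTERVAL_S, DEFAULT_KEEPALIVE_COUNT)
print(f"server closes a silent connection after about {timeout}s")  # about 30s
```

Because virsh is started with -k0, client-side keepalive is disabled, so only the server notices the dead connection within this window.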

Comment 3 Jiri Denemark 2016-03-03 12:18:58 UTC
This should be fixed upstream since v1.2.19-114-g035947e:

commit 035947eb8743306a421b3c71f26c75f749000dfc
Author:     Jiri Denemark <jdenemar>
AuthorDate: Tue Sep 15 16:46:07 2015 +0200
Commit:     Jiri Denemark <jdenemar>
CommitDate: Wed Sep 23 13:09:50 2015 +0200

    virsh: Notify users about disconnects
    
    After my "client rpc: Report proper error for keepalive disconnections"
    patch, virsh would no long print a warning when it closes a connection
    to a daemon after a keepalive timeout. Although the warning
    
        virsh # 2015-09-15 10:59:26.729+0000: 642080: info :
        libvirt version: 1.2.19
        2015-09-15 10:59:26.729+0000: 642080: warning :
        virKeepAliveTimerInternal:143 : No response from client
        0x7efdc0a46730 after 1 keepalive messages in 2 seconds
    
    was pretty ugly, it was still useful. This patch brings the useful part
    back while making it much nicer:
    
    virsh # error: Disconnected from qemu:///system due to keepalive timeout
    
    Signed-off-by: Jiri Denemark <jdenemar>

Comment 5 Lili Zhu 2016-09-04 13:50:08 UTC

When I execute step 5, "list" does return an error:
error: Disconnected from qemu+ssh://10.66.4.117/system due to I/O error

According to your comment, it should report an error like this:
error: Disconnected from qemu:///system due to keepalive timeout


I do not know why. Could you please check whether this error is acceptable?

Comment 6 Jiri Denemark 2016-09-05 11:30:53 UTC
Yeah, both errors are OK.
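
Both messages are plausible because the text depends on the reason the connection was closed, not on the test steps alone. The sketch below illustrates that mapping in Python. The reason codes mirror libvirt's virConnectCloseReason enum, and the two message strings are the ones quoted in this report. The function name and the fallback wording are hypothetical, and this is not the actual virsh implementation.

```python
from enum import IntEnum


class CloseReason(IntEnum):
    """Local mirror of libvirt's virConnectCloseReason values."""
    ERROR = 0      # VIR_CONNECT_CLOSE_REASON_ERROR: I/O error on the connection
    EOF = 1        # VIR_CONNECT_CLOSE_REASON_EOF: end of file from the server
    KEEPALIVE = 2  # VIR_CONNECT_CLOSE_REASON_KEEPALIVE: keepalive timeout
    CLIENT = 3     # VIR_CONNECT_CLOSE_REASON_CLIENT: closed by the client itself


# Only the two messages seen in this bug report are listed.
_MESSAGES = {
    CloseReason.ERROR: "Disconnected from {uri} due to I/O error",
    CloseReason.KEEPALIVE: "Disconnected from {uri} due to keepalive timeout",
}


def disconnect_message(uri: str, reason: CloseReason) -> str:
    """Build a virsh-style error line for a closed connection."""
    # Fallback wording is an assumption made for this sketch.
    template = _MESSAGES.get(reason, "Disconnected from {uri} (reason: " + reason.name + ")")
    return "error: " + template.format(uri=uri)


print(disconnect_message("qemu+ssh://10.66.4.117/system", CloseReason.ERROR))
print(disconnect_message("qemu:///system", CloseReason.KEEPALIVE))
```

With -k0 the client never runs its own keepalive timer, so a reset triggered by the server would be expected to surface as an I/O error rather than a keepalive timeout. In libvirt-python, a handler receiving such a reason code can be registered with conn.registerCloseCallback(cb, opaque).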

Comment 7 Lili Zhu 2016-09-06 12:31:45 UTC
Reproduced with libvirt-1.2.17-13.el7_2.5.x86_64.

Verified with the packages:
libvirt-2.0.0-6.el7.x86_64

Test steps:
0.Prepare two hosts: server and client.

1.On client:
# virsh -k0 -c qemu+ssh://10.66.4.117/system
root@10.66.4.117's password:
Welcome to virsh, the virtualization interactive terminal.

Type:  'help' for help with commands
       'quit' to quit

virsh #

2.Set an iptables rule on the client:
# iptables -A INPUT -s 10.66.4.117 -j DROP

3.Within 30s, issue the "list" command on the client; it will hang:
virsh # list

4.After 30s, when the connection has been reset by the server, clear the iptables rule on the client:
# iptables -F

5.Wait several minutes for "list" to return:
virsh # list


Test results:
error: Disconnected from qemu+ssh://10.66.4.117/system due to I/O error

The virsh client now prints an error message when the connection is reset by the server before the server's keepalive times out.
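
A check like the one above could be automated by scanning captured virsh output for the disconnect line. This is a minimal sketch in Python: the regular expression is derived only from the two messages quoted in this report, and the function name is hypothetical.

```python
import re
from typing import Optional, Tuple

# Matches lines such as:
#   error: Disconnected from qemu+ssh://10.66.4.117/system due to I/O error
# The pattern is not anchored at line start because virsh may print the
# message right after its interactive prompt.
DISCONNECT_RE = re.compile(
    r"error: Disconnected from (?P<uri>\S+) due to (?P<reason>.+)$",
    re.MULTILINE,
)


def find_disconnect(output: str) -> Optional[Tuple[str, str]]:
    """Return (uri, reason) from the first disconnect message, or None."""
    match = DISCONNECT_RE.search(output)
    if match is None:
        return None
    return match.group("uri"), match.group("reason").strip()


fixed = "error: Disconnected from qemu+ssh://10.66.4.117/system due to I/O error\n"
broken = " Id    Name                           State\n" + "-" * 52 + "\n"

print(find_disconnect(fixed))
print(find_disconnect(broken))
```

The first call reports the URI and reason from the fixed build's output; the second returns None for the silent empty listing that the unfixed build produced in step 5.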

Comment 9 errata-xmlrpc 2016-11-03 18:19:47 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2016-2577.html