Bug 1373859

Summary: Sys::Virt::CLOSE_REASON_EOF in perl-Sys-Virt can not be triggered any more
Product: Red Hat Enterprise Linux 7
Reporter: Dan Zheng <dzheng>
Component: libvirt
Assignee: Jiri Denemark <jdenemar>
Status: CLOSED ERRATA
QA Contact: Dan Zheng <dzheng>
Severity: medium
Docs Contact:
Priority: unspecified
Version: 7.3
CC: dyuan, jdenemar, rbalakri, weizhan, xuzhang, zpeng
Target Milestone: rc
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version: libvirt-3.9.0-1.el7
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-04-10 10:39:40 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Dan Zheng 2016-09-07 09:58:12 UTC
Description of problem:

An event with Sys::Virt::CLOSE_REASON_EOF can no longer be triggered.

Version-Release number of selected component (if applicable):
libvirt-2.0.0-6.el7.x86_64
qemu-kvm-rhev-2.6.0-22.el7.x86_64
kernel-3.10.0-495.el7.x86_64
perl-Sys-Virt-2.0.0-1.el7.x86_64


How reproducible:
100%

Steps to Reproduce:
libvirt-tck contains a script (scripts/event/050-close-callback.t) like the one below.
It runs 'systemctl stop libvirtd' to make perl-Sys-Virt deliver a close event with the reason "Sys::Virt::CLOSE_REASON_EOF". This worked in RHEL 7.2 and earlier, but it no longer works: the reported reason is always 'Sys::Virt::CLOSE_REASON_ERROR'.

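use strict;
use warnings;
use Test::More;
use Sys::Virt;
use Sys::Virt::Event;

# Close reason recorded by the callback below.
my $reason;

Sys::Virt::Event::register_default();

diag "1. Get Sys::Virt::CLOSE_REASON_EOF";
my $conn = Sys::Virt->new(uri => "qemu:///system", readonly => 1);

$conn->register_close_callback(
    sub {
        my $con = shift;
        $reason = shift;
        print "1. Closed reason=$reason\n";
    });

# Stop the daemon so that the server side of the connection goes away.
system("systemctl stop libvirtd");

# Process events so that the close callback can fire.
Sys::Virt::Event::run_default();

$conn->unregister_close_callback();

is ($reason, Sys::Virt::CLOSE_REASON_EOF, "Get connect close reason EOF");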


Actual results:
Sys::Virt::CLOSE_REASON_ERROR

Expected results:
Sys::Virt::CLOSE_REASON_EOF

Additional info:

It may be related to this commit. 
commit adf3be57df74f9298ce246f5d9ea78fef518104f
Date:   Tue Sep 15 16:45:41 2015 +0200
client rpc: Process pending data on error

Comment 2 Jiri Denemark 2017-05-02 14:56:40 UTC
Patch sent upstream for review: https://www.redhat.com/archives/libvir-list/2017-May/msg00036.html

Comment 3 Jiri Denemark 2017-05-03 14:29:52 UTC
Fixed upstream by

commit 42faf316ec9db2a1343088e12b70c2fd3a24cbe8
Refs: [master], [fixes], {origin/master}, {origin/HEAD}, v3.3.0-rc1-5-g42faf316e
Author:     Jiri Denemark <jdenemar>
AuthorDate: Tue May 2 16:39:57 2017 +0200
Commit:     Jiri Denemark <jdenemar>
CommitDate: Tue May 2 18:53:24 2017 +0200

    client: Report proper close reason

    When we get a POLLHUP or VIR_EVENT_HANDLE_HANGUP event for a client, we
    still want to read from the socket to process any accumulated data. But
    doing so inevitably results in an error and a call to
    virNetClientMarkClose before we get to processing the hangup event (and
    another call to virNetClientMarkClose). However the close reason passed
    to the second virNetClientMarkClose call is ignored because another one
    was already set. We need to pass the correct close reason when marking
    the socket to be closed for the first time.

    https://bugzilla.redhat.com/show_bug.cgi?id=1373859

    Signed-off-by: Jiri Denemark <jdenemar>
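
The "first reason wins" behaviour described in the commit message can be sketched as follows. This is only an illustrative model, not libvirt's code: `demo_client`, `demo_mark_close` and the two `demo_hangup_*` helpers are hypothetical names, and the real virNetClientMarkClose does more than record a reason. The numeric values 0 (ERROR) and 1 (EOF) follow the test output quoted later in this report.

```c
#include <stdbool.h>

/* Reason values as printed by the test in this report. */
enum close_reason {
    CLOSE_REASON_ERROR = 0,
    CLOSE_REASON_EOF = 1
};

/* Hypothetical stand-in for the client state; not libvirt's real struct. */
struct demo_client {
    bool want_close;
    enum close_reason reason;
};

/* Only the first call records a reason; later calls leave it untouched. */
void demo_mark_close(struct demo_client *c, enum close_reason reason)
{
    if (c->want_close)
        return;
    c->want_close = true;
    c->reason = reason;
}

/* Sequence described as the bug: the read attempted after a hangup fails
 * and marks the client with ERROR before the hangup itself is processed. */
enum close_reason demo_hangup_before_fix(void)
{
    struct demo_client c = { false, CLOSE_REASON_ERROR };
    demo_mark_close(&c, CLOSE_REASON_ERROR); /* failed read comes first */
    demo_mark_close(&c, CLOSE_REASON_EOF);   /* hangup handling: ignored */
    return c.reason;
}

/* Sequence after the fix: the first mark already carries EOF because a
 * hangup was seen on the socket. */
enum close_reason demo_hangup_after_fix(void)
{
    struct demo_client c = { false, CLOSE_REASON_ERROR };
    demo_mark_close(&c, CLOSE_REASON_EOF);
    demo_mark_close(&c, CLOSE_REASON_EOF);
    return c.reason;
}
```

In this model `demo_hangup_before_fix()` yields ERROR and `demo_hangup_after_fix()` yields EOF, which matches the change in behaviour reported in this bug.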

Comment 5 Dan Zheng 2017-12-07 08:08:11 UTC
Test packages:
perl-Sys-Virt-3.9.0-2.el7.x86_64
libvirt-3.9.0-3.el7.x86_64
qemu-kvm-rhev-2.10.0-8.el7.x86_64


The original problem is resolved, but there is a new failure: 'Sys::Virt::CLOSE_REASON_ERROR' can no longer be triggered.

Below is the code that hits the failure.
 
diag "2. Get Sys::Virt::CLOSE_REASON_ERROR";
$conn = Sys::Virt->new(uri => $uri, readonly => 1);
$conn->register_close_callback(
    sub {
        my $con = shift ;
        $reason = shift ;
        print "2. Closed reason=$reason\n";
    });

system("iptables -A INPUT -s $hostip -j DROP && sleep 40 && iptables -D INPUT -s $hostip -j DROP &");
system("sleep 10");

ok_error(sub {$conn->list_domains();}, "I/O error", Sys::Virt::Error::ERR_SYSTEM_ERROR);

$conn->unregister_close_callback();

is ($reason, Sys::Virt::CLOSE_REASON_ERROR, "Get connect close reason ERROR");


Output:
# 2. Get Sys::Virt::CLOSE_REASON_ERROR
2. Closed reason=1
ok 2 - I/O error
not ok 3 - Get connect close reason ERROR
#   Failed test 'Get connect close reason ERROR'
#   at scripts/event/050-close-callback.t line 128.
#          got: '1'                 <=== means Sys::Virt::CLOSE_REASON_EOF
#     expected: '0'                 <=== means Sys::Virt::CLOSE_REASON_ERROR
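
For reading numeric reasons like the ones in the output above, a small lookup helper can be sketched as below. The values 0 (ERROR) and 1 (EOF) are confirmed by that output; 2 (KEEPALIVE) and 3 (CLIENT) are an assumption based on libvirt's virConnectCloseReason enum and should be checked against the installed headers. `demo_close_reason_name` is a hypothetical helper, not part of libvirt or Sys::Virt.

```c
#include <stddef.h>

/* Map a numeric close reason to a short name.
 * 0 and 1 match the test output in this report; 2 and 3 are assumed
 * from libvirt's virConnectCloseReason enum. Unknown values give NULL. */
const char *demo_close_reason_name(int reason)
{
    switch (reason) {
    case 0:
        return "ERROR";
    case 1:
        return "EOF";
    case 2:
        return "KEEPALIVE";
    case 3:
        return "CLIENT";
    default:
        return NULL;
    }
}
```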

Jiri, do you think this is a side effect of the fix or another bug?

Comment 6 Jiri Denemark 2017-12-07 13:37:15 UTC
I think the current result is correct. The client will be notified that the connection has been closed by the server, and thus the reported reason will be EOF.

Comment 7 Dan Zheng 2017-12-22 01:32:56 UTC
Jiri,

In the past, 'stop libvirtd' caused the connection to be closed with reason Sys::Virt::CLOSE_REASON_EOF, and blocking the network with iptables caused the connection to be closed with reason Sys::Virt::CLOSE_REASON_ERROR, which means an I/O error.

But now both scenarios return the reason Sys::Virt::CLOSE_REASON_EOF. This confuses me. Could you recommend a scenario that triggers Sys::Virt::CLOSE_REASON_ERROR after this fix? Thanks.

Comment 8 Jiri Denemark 2018-01-19 16:12:36 UTC
I guess the client itself could close the file descriptor libvirt uses for communication with the server. Of course, such behavior would be a very bad bug in a real client, but it's about the only way I can think of which *could* result in I/O error.

Actually, another option could be playing with the target specified to iptables: you could use REJECT with some clever reason. But I'm not sure how the kernel is going to deal with it, that is, whether it propagates an error to libvirt or just reports the connection as hung up.
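
The difference between the two outcomes discussed in this comment can be seen at the plain socket level, independent of libvirt. The sketch below is a minimal illustration under stated assumptions: it uses a local `socketpair()` rather than a real libvirt connection, and both function names are hypothetical. It shows only what `read()` returns in each case, not how libvirt maps that result to a close reason.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Peer closes its end: read() on our end returns 0, i.e. end of file.
 * Returns the read() result, or -2 if the socket pair cannot be created. */
long demo_read_after_peer_close(void)
{
    int sv[2];
    char buf[16];
    ssize_t n;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -2;
    close(sv[1]);                        /* the "server" side goes away */
    n = read(sv[0], buf, sizeof(buf));   /* no data left, peer closed */
    close(sv[0]);
    return (long)n;
}

/* We close our own descriptor and then read from it: read() fails and
 * errno is set. Returns that errno value (0 if read() did not fail),
 * or -2 if the socket pair cannot be created. */
int demo_errno_after_local_close(void)
{
    int sv[2];
    char buf[16];
    ssize_t n;
    int err;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -2;
    close(sv[0]);                        /* client closes its own fd */
    errno = 0;
    n = read(sv[0], buf, sizeof(buf));   /* descriptor is no longer valid */
    err = (n < 0) ? errno : 0;
    close(sv[1]);
    return err;
}
```

In a single-threaded program the first function returns 0 (end of file) and the second returns EBADF, which is the kind of I/O error the client-side close would produce.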

Comment 9 Dan Zheng 2018-01-30 08:00:31 UTC
OK, we will find another way to verify Sys::Virt::CLOSE_REASON_ERROR.
Marking this bug as verified since the original problem is fixed.

Comment 13 errata-xmlrpc 2018-04-10 10:39:40 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:0704