Bug 813819

Summary: Unable to disable sending keep-alive messages
Product: Red Hat Enterprise Linux 6
Reporter: Radek Novacek <rnovacek>
Component: libvirt
Assignee: Peter Krempa <pkrempa>
Status: CLOSED ERRATA
QA Contact: Virtualization Bugs <virt-bugs>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 6.3
CC: acathrow, ajia, berrange, dallan, dyasny, dyuan, mzhan, ovasik, rwu, whuang
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: libvirt-0.10.0-0rc0.el6
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-02-21 07:11:52 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Radek Novacek 2012-04-18 13:37:09 UTC
With the update of libvirt in RHEL 6.3, it started to require sending keep-alive messages. virt-who doesn't run a main loop, so I want to disable sending keep-alive messages.

Documentation for method "virConnectSetKeepAlive" says:
When interval is <= 0, no keepalive messages will be sent.

So I tried to set a negative interval, and it fails with:
libvirt.libvirtError: invalid argument: negative or zero interval make no sense

So the actual behaviour contradicts the documentation, and I'm not able to disable sending keep-alive messages.

Note: This changed in RHEL 6.3; it worked fine in 6.2.

Version-Release number of selected component (if applicable):
libvirt-0.9.10-11.el6.x86_64

Comment 1 Daniel Berrangé 2012-04-18 13:41:52 UTC
This code in libvirt.c is bogus:

    if (interval <= 0) {
        virLibConnError(VIR_ERR_INVALID_ARG,
                        _("negative or zero interval make no sense"));
        goto error;
    }

AFAICT from the Git history, this check has been there since day 1, so I'm sceptical that this API worked as you described in 6.2.
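
To make the mismatch concrete, here is a small Python model contrasting the check quoted above with the behaviour the documentation promises. It is not libvirt code; the function names are hypothetical and exist only for illustration.

```python
def keepalive_enabled(interval):
    """Documented rule: keepalive messages are sent only when interval > 0."""
    return interval > 0


def set_keepalive_rejecting(interval, count):
    """Model of the quoted check: a non-positive interval is an error."""
    if interval <= 0:
        raise ValueError("negative or zero interval make no sense")
    return keepalive_enabled(interval)


def set_keepalive_documented(interval, count):
    """Model of the documented behaviour: a non-positive interval disables keepalive."""
    return keepalive_enabled(interval)
```

With the rejecting model, there is no argument value that turns keepalive off, which is exactly the complaint in the description.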

Comment 2 Daniel Berrangé 2012-04-18 13:42:46 UTC
Also note that if you don't register any main loop impl with libvirt, then it should automatically skip usage of keep alive. The problem only arises if you register a main loop impl, but then don't run it.

Comment 4 Radek Novacek 2012-04-19 06:04:08 UTC
Sorry, I expressed myself incorrectly. The connection dying because the
keep-alive messages are not sent is a new thing in 6.3. Calling
"virConnectSetKeepAlive" is how I tried to fix that.

Comment 5 Daniel Berrangé 2012-04-19 08:29:17 UTC
Does your application register a main loop implementation with libvirt? If it does not, then keep alive should not even have been activated.

Comment 6 Radek Novacek 2012-04-19 11:52:12 UTC
I think I nailed down the issue. This hasn't changed between 6.2 and 6.3, sorry. It was my fault for mixing threads and forking. I started the event loop before the double fork when starting the daemon, which caused the keep-alive messages to get lost.
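
A minimal sketch of the underlying effect, assuming a POSIX system. It uses a plain background thread as a stand-in for the libvirt event loop thread (no libvirt calls are made, and the names are hypothetical): a thread started before fork() does not exist in the child process, so nothing would be left there to service keep-alive messages.

```python
import os
import threading


def worker_alive_in_child():
    """Start a background thread, fork, and report whether it is alive in the child."""
    stop = threading.Event()
    worker = threading.Thread(target=stop.wait, name="fakeEventLoop")
    worker.daemon = True
    worker.start()

    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: only the thread that called fork() is carried over.
        os.close(read_fd)
        alive = any(t.name == "fakeEventLoop" and t.is_alive()
                    for t in threading.enumerate())
        os.write(write_fd, b"1" if alive else b"0")
        os._exit(0)

    # Parent: collect the child's answer, then shut the worker down.
    os.close(write_fd)
    answer = os.read(read_fd, 1)
    os.close(read_fd)
    os.waitpid(pid, 0)
    stop.set()
    worker.join()
    return answer == b"1"
```

Starting the event loop thread only after the final fork of the daemonization sequence avoids the problem.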

Nevertheless the documentation is still wrong and should be changed. But that's not 6.3 stuff, feel free to close this bug.

Comment 7 Daniel Berrangé 2012-04-19 13:16:53 UTC
> Nevertheless the documentation is still wrong and should be changed. But that's
> not 6.3 stuff, feel free to close this bug.

Actually the code is wrong, so we should still fix this problem

Comment 8 Peter Krempa 2012-04-26 09:53:11 UTC
Fixed upstream:
commit 6446a9e20cc65561ce6061742baf35a3a63d5ba1
Author: Peter Krempa <pkrempa>
Date:   Tue Apr 24 16:38:41 2012 +0200

    keepalive: Add ability to disable keepalive messages

Comment 9 Huang Wenlong 2012-05-11 06:27:20 UTC
Hi,

I will try to verify this bug when the new build is out, so could you provide a method to verify it?
Thanks very much.

Wenlong

Comment 10 dyuan 2012-06-20 07:06:55 UTC
Test with libvirt-0.9.10-11.el6 and libvirt-0.9.10-21.el6:

set interval=-1 in libvirtd.conf
keepalive_interval = -1
keepalive_count = 0

run the event-test.py
# python /usr/share/doc/libvirt-python-0.9.10/events-python/event-test.py 

get the following msg in libvirtd.log:

2012-06-20 06:23:36.547+0000: 21614: debug : virKeepAliveNew:244 : client=0x11c5db0, interval=-1, count=0
2012-06-20 06:23:36.547+0000: 21614: debug : virKeepAliveNew:277 : RPC_KEEPALIVE_NEW: ka=0x11c5b70 client=0x11c5db0 refs=2
2012-06-20 06:23:36.548+0000: 21614: debug : virKeepAliveCheckMessage:408 : ka=0x11c5b70, client=0x11c5db0, msg=0x7f7e4e546010
2012-06-20 06:23:36.550+0000: 21614: debug : virKeepAliveCheckMessage:408 : ka=0x11c5b70, client=0x11c5db0, msg=0x7f7e4e505010
2012-06-20 06:23:36.550+0000: 21616: debug : virKeepAliveStart:346 : Keepalive messages disabled by configuration

It seems that interval<0 has taken effect, but I can't get the "negative or zero interval make no sense" error.

I also noticed that the check was removed upstream and that " * Note: This API function controls only keepalive messages sent by the client." now appears for virConnectSetKeepAlive.
I'm not sure whether we can test it with event-test.py as the client. Can you help confirm how we can trigger this error functionally? Then we can reproduce and verify it once it's ON_QA. Thanks.

Comment 12 dyuan 2012-07-24 05:39:29 UTC
Hi, rnovacek

Can you provide some steps to reproduce this bug? Thanks.

Comment 13 Radek Novacek 2012-07-24 07:41:41 UTC
To get this "negative or zero interval make no sense" error in an unfixed version of libvirt, run this Python script:

import libvirt
v = libvirt.open("")
v.setKeepAlive(-1, 1)

To test setting a positive keep-alive interval, you need an event loop in the client. You can start it with this script:

import threading

import libvirt

eventLoopThread = None

def virEventLoopNativeRun():
    # Run libvirt's default event loop forever.
    while True:
        libvirt.virEventRunDefaultImpl()

def virEventLoopNativeStart():
    # Register the default event loop implementation, then run it in a daemon thread.
    global eventLoopThread
    libvirt.virEventRegisterDefaultImpl()
    eventLoopThread = threading.Thread(target=virEventLoopNativeRun, name="libvirtEventLoop")
    eventLoopThread.setDaemon(True)
    eventLoopThread.start()

virEventLoopNativeStart()

To test whether the interval is correct, you might try stopping the event loop, waiting longer than the keep-alive interval, and checking whether the connection breaks. But I'm not sure it will work; this is only my assumption.
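
The idea above could be sketched roughly as follows. This is an untested assumption, like the idea itself: the helper name and the one-second margin are hypothetical, and the wait of interval * (count + 1) seconds follows the timeout described in the virConnectSetKeepAlive documentation. The function expects an already opened libvirt connection whose event loop has been stopped, and uses only the connection's isAlive() method.

```python
import time


def connection_survives_idle(conn, interval, count, margin=1, sleep=time.sleep):
    """Wait past the keepalive timeout, then ask whether conn is still alive.

    conn is expected to be a libvirt connection object (virConnect).
    The sleep parameter is injectable so the helper can be exercised
    without actually waiting.
    """
    # The connection should be closed after roughly interval * (count + 1)
    # seconds without a response; add a small margin on top.
    sleep(interval * (count + 1) + margin)
    return bool(conn.isAlive())
```

Whether a client with a stopped event loop actually notices the broken connection this way is not confirmed anywhere in this report.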

Comment 14 Peter Krempa 2012-07-24 13:20:37 UTC
While developing that functionality I used a trivial libvirt client written in C that enabled keepalives, manually called the event loop a few times, then disabled them and ran the event loop again. Unfortunately I can't find it anymore.

To test this you need to disable server side keepalives, as they could interfere with client keepalives.

Comment 16 Huang Wenlong 2012-08-06 05:51:42 UTC
Verified this bug with:

libvirt-0.10.0-0rc0.el6.x86_64

terminal1: 
# python /usr/share/doc/libvirt-python-*/events-python/event-test.py
Using uri:qemu:///system

terminal2: 

# ./t.py 
libvir: Remote Driver error : internal error the caller doesn't support keepalive protocol; perhaps it's missing event loop implementation
Traceback (most recent call last):
  File "./t.py", line 5, in <module>
    v.setKeepAlive(-1, 1)
  File "/usr/lib64/python2.6/site-packages/libvirt.py", line 3150, in setKeepAlive
    if ret == -1: raise libvirtError ('virConnectSetKeepAlive() failed', conn=self)
libvirt.libvirtError: internal error the caller doesn't support keepalive protocol; perhaps it's missing event loop implementation


# cat t.py
#!/usr/bin/python

import libvirt
v = libvirt.open("")
v.setKeepAlive(-1, 1)
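
The error in this run is about the missing event loop implementation, not about the interval value. A hedged sketch of a variant that registers the default event loop before opening the connection, so that the call reaches the interval handling: the helper name and the default URI are assumptions, and libvirt is imported lazily because it requires the libvirt-python bindings.

```python
def disable_keepalive(uri="qemu:///system"):
    """Open uri with an event loop registered and disable client keepalive."""
    import libvirt  # requires libvirt-python

    # The event loop implementation must be registered before the
    # connection is opened, otherwise keepalive is not supported on it.
    libvirt.virEventRegisterDefaultImpl()
    conn = libvirt.open(uri)
    try:
        # interval <= 0 means "send no keepalive messages".
        # Returns 0 on success, or 1 if the remote side lacks keepalive support.
        return conn.setKeepAlive(-1, 0)
    finally:
        conn.close()
```

On an unfixed libvirt this call would be expected to raise libvirtError with "negative or zero interval make no sense"; on a fixed build it should return without raising.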

Comment 17 errata-xmlrpc 2013-02-21 07:11:52 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2013-0276.html