Bug 816465 - libvirtd may die after restart the service
Summary: libvirtd may die after restart the service
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: libvirt
Version: 6.3
Hardware: Unspecified
OS: Unspecified
high
medium
Target Milestone: rc
: ---
Assignee: Laine Stump
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2012-04-26 07:21 UTC by yanbing du
Modified: 2012-06-20 06:57 UTC (History)
9 users (show)

Fixed In Version: libvirt-0.9.10-17.el6
Doc Type: Bug Fix
Doc Text:
(This is a bug in new functionality, and has never been in a released version of libvirt.)
Clone Of:
Environment:
Last Closed: 2012-06-20 06:57:42 UTC
Target Upstream Version:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2012:0748 0 normal SHIPPED_LIVE Low: libvirt security, bug fix, and enhancement update 2012-06-19 19:31:38 UTC

Description yanbing du 2012-04-26 07:21:13 UTC
Description of problem:
In few cases, after restart libvirtd service, libvirtd may die. And there's an error message in libvirtd log:
error : virNetlinkEventServiceStart:359 : No error message provided

Version-Release number of selected component (if applicable):
libvirt-0.9.10-14.el6.x86_64
qemu-kvm-0.12.1.2-2.277.el6.x86_64
kernel-2.6.32-262.el6.x86_64

How reproducible:
1/10

Steps to Reproduce:
1. Prepare a running domain
# virsh list
 Id    Name                           State
----------------------------------------------------
 100   vr-rhel6u2-x86_64-kvm          running

2. Run the script(During restart the libvirtd service, do some actions, like save/restore a domain)

# cat repro.sh
#! /bin/bash

for i in {1..100}; do
    service libvirtd restart
    virsh save vr-rhel6u2-x86_64-kvm /tmp/test.save
    virsh restore /tmp/test.save
    service libvirtd status
done 
  
Actual results:
Sometimes, step2 ouput :
libvirtd (pid  9244) is running...
Stopping libvirtd daemon:                                  [  OK  ]
Starting libvirtd daemon:                                  [  OK  ]
error: Failed to reconnect to the hypervisor
error: failed to get domain 'vr-rhel6u2-x86_64-kvm'
error: invalid connection pointer in virDomainLookupByName
error: Failed to reconnect to the hypervisor

error: Failed to reconnect to the hypervisor
error: no valid connection
error: Failed to connect socket to '/var/run/libvirt/libvirt-sock': No such file or directory

libvirtd dead but subsys locked
Stopping libvirtd daemon:                                  [FAILED]
Starting libvirtd daemon:                                  [  OK  ]


Expected results:
libvirtd can always restart well.

Additional info:
libvirtd.log
------
2012-04-26 07:00:13.606+0000: 9410: info : libvirt version: 0.9.10, package: 14.el6 (Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>, 2012-04-24-00:57:34, x86-004.build.bos.redhat.com)
2012-04-26 07:00:13.606+0000: 9410: error : virNetlinkEventServiceStart:359 : No error message provided
------

Comment 2 Laine Stump 2012-04-26 20:05:52 UTC
The first problem is that the error message isn't informative. That is fixed with

commit 9586925bacc9f37a00c6afb5bb376f9025214898
Author: Laine Stump <laine@laine.org>
Date:   Thu Apr 26 14:35:26 2012 -0400

    util: fix error messages in virNetlinkEventServiceStart
    
    Some of the error messages in this function should have been
    virReportSystemError (since they have an errno they want to log), but
    were mistakenly written as netlinkError, which expects a libvirt error
    code instead. The result was that when one of the errors was
    encountered, "No error message provided" would be printed instead of
    something meaningful.

Fixing this allows us to see the real error message:

error : virNetlinkEventServiceStart:359 : cannot connect to netlink socket:
                                          Address already in use

This is occurring during the following code:

    srv->netlinknh = nl_handle_alloc();

    if (!srv->netlinknh) {
        virReportSystemError(errno,
                             "%s", _("cannot allocate nlhandle for virNetlinkEvent server"));
        goto error_locked;
    }

    if (nl_connect(srv->netlinknh, NETLINK_ROUTE) < 0) {
        virReportSystemError(errno,
                             "%s", _("cannot connect to netlink socket"));
        goto error_server;
    }

It's the call to nl_connect that's returning the error. Since it happens immediately on startup, but doesn't happen every time, it doesn't appear to be a resource leak.

As a matter of fact, this is starting to smell more an more like a repeat of Bug 677724.

Comment 3 Laine Stump 2012-04-27 20:08:24 UTC
The failure within nl_connect() is coming from a call to bind(), which returns EADDRINUSE. I looked at the kernel source for netlink_bind(), and it will only return EADDRINUSE if the hashtable of currently bound AF_NETLINK sockets already contains an entry that exactly matches the nl_pid that is passed in the nl_sockaddr struct.

In the case of the bind that comes from nl_connect(), this nl_pid is set by nl_handle_alloc() (which is called by libvirt just before nl_connect()), and it is set to the pid of the process, + "some number between 0 & 1024" << 22. The first nl_handle_alloc() after a process starts will end up with nl_pid == the process pid, the 2nd will get "process pid" + (1 << 22), etc (until some nl_handle is closed/destroyed).

I instrumented both the libvirt and libnl sources, and found that in this case that is failing, 1) the nl_pid sent to bind() is always == the process' pid, and 2) it is the first call to nl_connect from libvirt itself (and apparently from any libraries libvirt is linking in - I also instrumented the call to the openDriver call for netcf, which is the only other library used by libvirt that uses libnl, and it is called much later).

Based on the fact that this is the first call to libnl by this process, and an examination of the bind implementation that shows it will only return EADDRINUSE if there is an entry with an exactly matching pid, I decided to try out the hypothesis that there is some other code called during libvirt startup that is directly doing a bind() of an AF_NETLINK socket, without going through libnl. It would actually make sense for such a call to put the process' pid into the nl_sockaddr, which would create a conflict. My test was to modify the code in libnl to always skip over the value  pid + (0 << 22), and always start with "pid + (1 << 22)". Without this change, 100 libvirtd restarts would consistently result in 8-10 failures. With the change, I was unable to reproduce the failure.

Of course the only thing this proves is that the failure really is due to finding a matching nl_pid in the hashtable. I haven't yet been able to determine why.

Comment 4 Laine Stump 2012-04-27 21:48:31 UTC
I just did a different test - after changing libnl back to its original state (starts binding with "pid + (0 << 22)"), I modified the code in libvirt to sleep for 5 seconds and retry the nl_connect once if it failed. In my test run of 100 libvirtd restarts, this produced 15 failures of the initial nl_connect() attempt, but 0 failures on the 2nd attempt.

Although I don't have direct evidence, this, along with the previous test, tells me that there is probably some library used by libvirt that temporarily does a bind to an AF_NETLINK socket (*not* through libnl), thus occupying the slot for "pid"; in the meantime, libnl doesn't know about it (yet believes that it has full control over all use of AF_NETLINK sockets). *Most* of the time, this phantom piece of code closes its socket before  virNetlinkEventServiceStart() is called, so there is no conflict, but sometimes it wins the race, attempts to bind to the same address and fails.

We can work around this in libvirt in a couple of different ways:

1) put a short sleep + retry in for cases when the initial nl_connect fails.

2) Move the call to virNetlinkEventServiceStart() up to before any other threads are started in main().

Of course neither of these methods would be of any use if we ever happened to link to a library that used bind(AF_NETLINK,...) for a socket that it kept open long-term. This points out that the real problem is in the fact that libnl believes it has exclusive use of AF_NETLINK sockets, but it doesn't. assigning an nl_pid to a particular nlhandle shouldn't be done until bind is called during nl_connect() (the problem is that this would change the semantics of nl_socket_get_local_port(), which expects nl_pid to be set as soon as the nlhandle is allocated :-( ).

Comment 5 Laine Stump 2012-05-01 17:49:17 UTC
Experimenting with option (2) above shows that it isn't feasible, because the EventHandler code is used by virNetlinkEventServiceStart, and its callback functions aren't registered until virNetServerNew() is called, and that's also where the other threads are started up. So, if we want to keep the call to virNetlinkEventServiceStart() in main(), calling it prior to virNetServerNew() is too early, but by the time virNetServerNew() returns, whatever other thread is creating the temporary competition for the netlink socket with nl_pid of "libvirtd's pid" will already be started.

On the other hand, option (1) with a sleep time of 500msec achieves a 100% restart success rate in my tests (in the same test, 21 of 100 restarts failed the first call to nl_connect()). So doing a 1-2 second sleep seems like a fairly reliable fix.

One other option I hadn't mentioned before is:

3) prior to allocating the nl_handle that virNetlinkEventServiceStart actually uses, allocate an extra nl_handle that we never attempt to connect. Internally to libnl, this has the effect of reserving the address for "pid of libvirtd", but not attempting to actually bind to it. My experiments have shown that this is also 100% successful in eliminating the failures, without needing a sleep or retry.

I'm cleaning up both versions of patch right now, and will send them upstream for comment.

Comment 7 Laine Stump 2012-05-07 08:31:19 UTC
An update: 

I did find the "other library code" that was binding to a netlink socket directly (not through libnl) - is the glibc function getaddrinfo. See the description here:

  https://www.redhat.com/archives/libvir-list/2012-May/msg00202.html
  https://www.redhat.com/archives/libvir-list/2012-May/msg00238.html

Once that was discovered, it became apparent that the workaround option (2) above is only succeeding by coincidence, and that the only viable alternative is (3).

However, testing by IBM showed that method (3) by itself wasn't enough, because lldpad (used for 802.1Qbg) sends unsolicited netlink messages to libvirtd using the nl_pid given as the source in messages previously received from libvirtd, and libvirtd's virNetlinkCommand() has been hardcoding that field to getpid() in all outgoing messages; the result is that, with a patch for (3) applied, lldpad->libvirtd communication is broken.

To fix this, I expanded the patchset to allow specifying a non-default nl_pid in outgoing netlink messages:

  https://www.redhat.com/archives/libvir-list/2012-May/msg00340.html

Preliminary testing by IBM shows that lldpad->libvirtd communication works properly with this new patch series, but I'm still awaiting a final confirmation.

Comment 8 Laine Stump 2012-05-07 18:31:25 UTC
IBM has confirmed that lldpad<-->libvirtd communication still works properly with these patches in place, and my own testing shows 100% success rate in libvirtd start/restart. The following patches were just pushed upstream:

commit 642973135c54b93242c4548ef27d591b52b0994c
Author: Laine Stump <laine@laine.org>
Date:   Thu May 3 10:39:04 2012 -0400

    util: fix libvirtd startup failure due to netlink error
    
commit cca7bb1fb583459824f7a42be0406d0833a80593
Author: Laine Stump <laine@laine.org>
Date:   Fri May 4 12:58:36 2012 -0400

    util: allow specifying both src and dst pid in virNetlinkCommand

commit c99e93758d44e7c6d62dbd3d064b6df61d1fefb2
Author: Laine Stump <laine@laine.org>
Date:   Fri May 4 13:19:51 2012 -0400

    util: function to get local nl_pid used by netlink event socket
    
commit cc0737713a8e1ad37fc9d5393b4ea514f7708138
Author: Laine Stump <laine@laine.org>
Date:   Fri May 4 13:48:20 2012 -0400

    util: set src_pid for virNetlinkCommand when appropriate

Comment 10 Laine Stump 2012-05-08 20:32:56 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
(This is a bug in new functionality, and has never been in a released version of libvirt.)

Comment 12 hongming 2012-05-11 03:06:27 UTC
Version
libvirt-0.9.10-18.el6.x86_64
qemu-kvm-0.12.1.2-2.277.el6.x86_64


Steps to verify: 
1. # virsh list --all
 Id    Name                           State
----------------------------------------------------
 1     rhel6u3                        running

2. # cat repro.sh
#!/bin/bash 
for i in {1..100}; do
    service libvirtd restart
    virsh save rhel6u3 /tmp/test.save
    virsh restore /tmp/test.save
    service libvirtd status
done 


3. sh repro.sh

Results
libvirtd can always restart well. There is no any error occurs.

Comment 14 errata-xmlrpc 2012-06-20 06:57:42 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2012-0748.html


Note You need to log in before you can comment on or make changes to this bug.