Description of problem:
If collectd is started while libvirtd is not running, it is supposed to retry the libvirt connection at increasing intervals. The retries never succeed, even after libvirtd has come up.

Version-Release number of selected component (if applicable):
4.5.1-2.1.fc10

How reproducible:
Every time

Steps to Reproduce:
1. service libvirtd stop
2. service collectd start
3. service libvirtd start

Actual results:
Step 2 results in:

[root@localhost log]# service collectd start
Starting collectd: libvir: Remote error : unable to connect to '/var/run/libvirt/libvirt-sock-ro': Connection refused
[ OK ]

It would be good if that error message were not printed to stdout, since collectd is expected to retry once the libvirt socket is available.

/var/log/collectd.log after step 2 says:

[2009-01-21 16:45:30] connection failed: unable to connect to '/var/run/libvirt/libvirt-sock-ro': Connection refused
[2009-01-21 16:45:30] libvirt plugin: Not connected. Use Connection in config file to supply connection URI. For more information see <http://libvirt.org/uri.html>
[2009-01-21 16:45:30] read-function of plugin `libvirt' failed. Will suspend it for 10 seconds.

After step 3 the log says:

[2009-01-21 16:47:54] libvirt plugin: Not connected. Use Connection in config file to supply connection URI. For more information see <http://libvirt.org/uri.html>
[2009-01-21 16:47:54] read-function of plugin `libvirt' failed. Will suspend it for 20 seconds.
[2009-01-21 16:48:14] libvirt plugin: Not connected. Use Connection in config file to supply connection URI. For more information see <http://libvirt.org/uri.html>
[2009-01-21 16:48:14] read-function of plugin `libvirt' failed. Will suspend it for 40 seconds.
...

This continues even though /var/run/libvirt/libvirt-sock-ro is now present.

Expected results:
After libvirtd is started, the connection should succeed.

Additional info:
Also, if libvirtd is running when collectd is started and libvirtd is later stopped and started again, collectd never reconnects.
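To make the expected behaviour concrete, here is a minimal sketch in Python of a read cycle that retries the connection each time it runs instead of only once at start-up. It is an illustration only and not collectd's actual code (the plugin is written in C); the class name `ReconnectingReader`, the `connect` callable, and `invalidate()` are all hypothetical names introduced for this example.

```python
from typing import Callable, Optional


class ReconnectingReader:
    """Retry the connection on every read cycle, not only at start-up.

    `connect` is a hypothetical factory that returns a connection object
    or raises OSError (for example ConnectionRefusedError) on failure.
    """

    def __init__(self, connect: Callable[[], object]) -> None:
        self._connect = connect
        self._conn: Optional[object] = None

    def read(self) -> bool:
        """Return True if a connection is available for this read cycle."""
        if self._conn is None:
            try:
                self._conn = self._connect()
            except OSError:
                # Still unavailable; the caller may suspend the plugin
                # for a while and call read() again later.
                return False
        return True

    def invalidate(self) -> None:
        """Forget a dead connection so the next read() reconnects."""
        self._conn = None
```

With this shape, the growing suspend interval seen in the log would only delay the next attempt. The first `read()` after libvirtd is up would connect, and calling `invalidate()` when a read fails on an established connection would cover the stop/start case from the additional info.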
Agreed, this is going to be a problem. Any idea how high a priority this fix is for oVirt?
For oVirt this is generally a problem at Node startup: collectd starts right after libvirtd, and occasionally the libvirtd read-only socket is not ready yet when collectd starts. So it is an intermittent problem. What we can do is put something in rc.local to restart collectd, in the hope that the read-only socket will be there the second time. This is not an optimal solution, though, since the connection can still go dead later (for example if libvirtd is restarted) and collectd will not reconnect.
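A slightly more robust variant of the rc.local workaround would wait until the socket actually accepts connections before restarting collectd, rather than restarting blindly. The following is a rough sketch under that assumption; the helper names `wait_for_unix_socket` and `restart_collectd_when_ready`, the 30 second timeout, and the polling interval are made up for illustration. Only the socket path and the service name come from this report.

```python
import socket
import subprocess
import time


def wait_for_unix_socket(path: str, timeout: float = 30.0,
                         interval: float = 0.5) -> bool:
    """Poll a UNIX socket until it accepts a connection or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        try:
            sock.connect(path)
            return True
        except OSError:
            # Socket file missing or nobody listening yet.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
        finally:
            sock.close()


def restart_collectd_when_ready() -> None:
    """Restart collectd only once the libvirtd read-only socket is up."""
    if wait_for_unix_socket("/var/run/libvirt/libvirt-sock-ro"):
        subprocess.call(["service", "collectd", "restart"])
```

This only narrows the startup race. It does nothing for the case where libvirtd is restarted later, which still needs a fix in the plugin itself.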
collectd-4.5.4-2.fc11 has been submitted as an update for Fedora 11. http://admin.fedoraproject.org/updates/collectd-4.5.4-2.fc11
collectd-4.5.4-2.fc11 has been pushed to the Fedora 11 testing repository. If problems still persist, please make note of it in this bug report. If you want to test the update, you can install it with su -c 'yum --enablerepo=updates-testing update collectd'. You can provide feedback for this update here: http://admin.fedoraproject.org/updates/F11/FEDORA-2009-8503
collectd-4.5.4-2.fc11 has been pushed to the Fedora 11 stable repository. If problems still persist, please make note of it in this bug report.