Description of problem:
With the latest vdsm build from git (vdsm-4.10.0-0.452.git87594e3.fc17.x86_64, F17), vdsmd.service never starts on its own after a reboot. I had a look at journalctl and found this:

systemd-vdsmd[538]: vdsm: Failed to define network filters on libvirt[FAILED]

[root@node ~]# service vdsmd status
Redirecting to /bin/systemctl status vdsmd.service
vdsmd.service - Virtual Desktop Server Manager
   Loaded: loaded (/usr/lib/systemd/system/vdsmd.service; enabled)
   Active: failed (Result: exit-code) since Fri, 21 Sep 2012 12:13:01 +0200; 4min 56s ago
   Process: 543 ExecStart=/lib/systemd/systemd-vdsmd start (code=exited, status=1/FAILURE)
   CGroup: name=systemd:/system/vdsmd.service

Sep 21 12:12:55 node.abes.fr systemd-vdsmd[543]: Note: Forwarding request to 'systemctl disable libvirt-guests.service'.
Sep 21 12:12:56 node.abes.fr systemd-vdsmd[543]: vdsm: libvirt already configured for vdsm [ OK ]
Sep 21 12:12:56 node.abes.fr systemd-vdsmd[543]: Starting wdmd...
Sep 21 12:12:56 node.abes.fr systemd-vdsmd[543]: Starting sanlock...
Sep 21 12:12:56 node.abes.fr systemd-vdsmd[543]: Starting iscsid:
Sep 21 12:13:01 node.abes.fr systemd-vdsmd[543]: Starting libvirtd (via systemctl): [ OK ]

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Register a VM with ovirt-engine.
2. The node reboots at the end of the bootstrap.
3. After the reboot, vdsmd has not started.

Actual results:
No vdsmd process is running, and the node isn't present in the node list.

Expected results:
The node is up and ready to run VMs.

Additional info:
The network is okay, and when the daemon is started manually, it comes up.

I have seen this bz:
vdsmd: set nwfilter on ovirt-node
ovirt-node is shipped with .pyc only. Do not try running a missing executable.

I'm not sure it is the same issue: running python /usr/share/vdsm/nwfilter.pyc manually returns nothing, so there must not be an error there.
I guess it could come from a dependent daemon that has not started yet when vdsmd needs it. That would explain why we can start it successfully by hand once the boot has finished.
Can you spot any other vdsm-related messages in journalctl? Could you edit your /lib/systemd/systemd-vdsmd, remove the redirection to /dev/null from

python /usr/share/vdsm/nwfilter.pyc > /dev/null 2>&1

and retry, so we can get a better idea of what's broken?
I have redirected the output of python /usr/share/vdsm/nwfilter.pyc to /tmp/nwfilter.log, but after rebooting, the file is empty as well. Running python /usr/share/vdsm/nwfilter.pyc by hand returns nothing, which is what is expected. Nevertheless, I found some more information in journalctl this morning:

Sep 24 09:49:53 node.abes.fr systemd-vdsmd[542]: Starting libvirtd (via systemctl): [ OK ]
Sep 24 09:49:53 node.abes.fr systemd-vdsmd[542]: libvir: XML-RPC error : Failed to connect socket to '/var/run/libvirt/libvirt-sock': No such file or directory
Sep 24 09:49:53 node.abes.fr systemd-vdsmd[542]: Traceback (most recent call last):
Sep 24 09:49:53 node.abes.fr systemd-vdsmd[542]: File "/usr/share/vdsm/nwfilter.py", line 83, in <module>
Sep 24 09:49:53 node.abes.fr systemd-vdsmd[542]: main()
Sep 24 09:49:53 node.abes.fr systemd-vdsmd[542]: File "/usr/share/vdsm/nwfilter.py", line 32, in main
Sep 24 09:49:53 node.abes.fr systemd-vdsmd[542]: conn = libvirtconnection.get()
Sep 24 09:49:53 node.abes.fr systemd-vdsmd[542]: File "/usr/lib64/python2.7/site-packages/vdsm/libvirtconnection.py", line 113, in get
Sep 24 09:49:53 node.abes.fr systemd-vdsmd[542]: conn = libvirt.openAuth('qemu:///system', auth, 0)
Sep 24 09:49:53 node.abes.fr systemd-vdsmd[542]: File "/usr/lib64/python2.7/site-packages/libvirt.py", line 102, in openAuth
Sep 24 09:49:53 node.abes.fr systemd-vdsmd[542]: if ret is None:raise libvirtError('virConnectOpenAuth() failed')
Sep 24 09:49:53 node.abes.fr systemd-vdsmd[542]: libvirt.libvirtError: Failed to connect socket to '/var/run/libvirt/libvirt-sock': No such file or directory
Sep 24 09:49:53 node.abes.fr systemd-vdsmd[542]: vdsm: Failed to define network filters on libvirt[FAILED]

This seems to be a libvirtd socket issue, but looking at that file after boot, nothing seems abnormal:

[root@node ~]# ll /var/run/libvirt/libvirt-sock
srwxrwx---. 1 root kvm 0 Sep 24 10:03 /var/run/libvirt/libvirt-sock

Manually restarting vdsmd always succeeds.
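The traceback shows the libvirt connection being attempted in the same second that libvirtd is reported as started, before its socket exists. One generic way to tolerate that kind of startup race is to retry the connection a few times. The following is only a minimal sketch of that idea, not the code used by vdsm or by any of its patches; the helper name `retry` and its parameters are hypothetical.

```python
import time


def retry(operation, attempts=5, delay=1.0, exceptions=(Exception,),
          sleep=time.sleep):
    """Call operation() until it succeeds or the attempts are used up.

    Hypothetical helper for illustration only. `operation` is any
    zero-argument callable; `exceptions` lists the error types that
    should trigger another attempt. The last failure is re-raised.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except exceptions:
            if attempt == attempts:
                raise  # out of attempts: surface the original error
            sleep(delay)  # give the daemon time to create its socket
```

In the context of the traceback, `operation` would presumably be `libvirtconnection.get` and `exceptions` would be `(libvirt.libvirtError,)`, but that wiring is an assumption and has not been tested against vdsm.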
Hi,

Thanks for your report. There are patches upstream (below), available for review, which should fix your issue. Feel free to comment there to help get them merged sooner.

vdsmd: await for libvirt with systemd, too
http://gerrit.ovirt.org/#/c/8162/

vdsmd.init: verify if libvirt socket file exists
http://gerrit.ovirt.org/#/c/8175/
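The second patch title refers to verifying that the libvirt socket file exists. The patch itself is not reproduced here; as a rough illustration of what such a check can look like, the sketch below polls a path until a UNIX socket appears or a timeout expires. The function name `wait_for_socket`, its defaults, and the use of Python 3 (`time.monotonic`) are assumptions made for this example.

```python
import os
import stat
import time


def wait_for_socket(path, timeout=10.0, interval=0.5):
    """Return True once `path` exists and is a UNIX socket.

    Hypothetical helper for illustration only. Polls every `interval`
    seconds and returns False if no socket shows up within `timeout`.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            if stat.S_ISSOCK(os.stat(path).st_mode):
                return True
        except OSError:
            pass  # path does not exist yet (or is not accessible)
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A caller could run `wait_for_socket('/var/run/libvirt/libvirt-sock')` before trying to connect and fail with a clear message if it returns False.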
Test executed:

Cloning the vdsm source code:
# git clone git://gerrit.ovirt.org/vdsm

Building the rpms (without patches):
# cd vdsm
# ./autogen.sh --system && make && make rpm
# cd ~/rpmbuild/RPMS/x86_64/
# rpm -ivh vdsm-python-4.10.0-0.468.git5ac6d2c.fc17.x86_64.rpm
# rpm -ivh vdsm-4.10.0-0.468.git5ac6d2c.fc17.x86_64.rpm

== Reboot the system ==

After the reboot, check the vdsmd status:
# systemctl status vdsmd.service
(it should show as failed)

To fix, apply the patches below [1], rebuild and reinstall the rpms, and reboot the system; vdsmd will then start correctly.

[1]
vdsmd: await for libvirt with systemd, too
http://gerrit.ovirt.org/#/c/8162/

vdsmd.init: verify if libvirt socket file exists
http://gerrit.ovirt.org/#/c/8175/
Suggested by Moti:
http://gerrit.ovirt.org/#/c/8245/4

Merged upstream, moved to ON_QA.