Bug 859483 - vdsmd.service never starts alone after rebooting.
Summary: vdsmd.service never starts alone after rebooting.
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: oVirt
Classification: Retired
Component: vdsm
Version: 3.1 GA
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 3.2
Assignee: Douglas Schilling Landgraf
QA Contact: Haim
URL:
Whiteboard: infra
Depends On:
Blocks:
Reported: 2012-09-21 16:09 UTC by exploit
Modified: 2014-01-13 00:54 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-02-15 06:46:20 UTC
oVirt Team: ---
Embargoed:



Description exploit 2012-09-21 16:09:34 UTC
Description of problem:
In the latest vdsm build from git (vdsm-4.10.0-0.452.git87594e3.fc17.x86_64, F17), vdsmd.service never starts alone after rebooting.
I had a look at journalctl and found this:

systemd-vdsmd[538]: vdsm: Failed to define network filters on libvirt[FAILED]

[root@node ~]# service vdsmd status
Redirecting to /bin/systemctl  status vdsmd.service
vdsmd.service - Virtual Desktop Server Manager
          Loaded: loaded (/usr/lib/systemd/system/vdsmd.service; enabled)
          Active: failed (Result: exit-code) since Fri, 21 Sep 2012 12:13:01 +0200; 4min 56s ago
         Process: 543 ExecStart=/lib/systemd/systemd-vdsmd start (code=exited, status=1/FAILURE)
          CGroup: name=systemd:/system/vdsmd.service

Sep 21 12:12:55 node.abes.fr systemd-vdsmd[543]: Note: Forwarding request to 'systemctl disable libvirt-guests.service'.
Sep 21 12:12:56 node.abes.fr systemd-vdsmd[543]: vdsm: libvirt already configured for vdsm [  OK  ]
Sep 21 12:12:56 node.abes.fr systemd-vdsmd[543]: Starting wdmd...
Sep 21 12:12:56 node.abes.fr systemd-vdsmd[543]: Starting sanlock...
Sep 21 12:12:56 node.abes.fr systemd-vdsmd[543]: Starting iscsid:
Sep 21 12:13:01 node.abes.fr systemd-vdsmd[543]: Starting libvirtd (via systemctl):  [  OK  ] 

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. Register a VM with ovirt-engine
2. The node reboots at the end of the bootstrap
3. After the reboot, vdsmd has not started
  
Actual results:
No vdsmd process is running, and the node isn't present in the node list

Expected results:
The node is up and ready to run VMs

Additional info:
The network is fine, and when the daemon is started manually it comes up.
I have seen this bz: vdsmd: set nwfilter on ovirt-node

    ovirt-node is shipped with .pyc only. Do not try running a missing
    executable. 
I'm not sure it is the same issue: running python /usr/share/vdsm/nwfilter.pyc manually prints nothing, so there is probably no error there.
I guess it could come from a dependent daemon that has not started yet when vdsmd needs it. That would explain why vdsmd can be started successfully by hand once the boot has finished.

Comment 1 Dan Kenigsberg 2012-09-23 11:40:41 UTC
Can you spot any other vdsm-related stuff on journalctl?

Would you edit your /lib/systemd/systemd-vdsmd and remove the /dev/null-throwing parts from

 python /usr/share/vdsm/nwfilter.pyc > /dev/null 2>&1

and retry so we can have a better guess what's broken.

Comment 2 exploit 2012-09-24 08:11:20 UTC
I redirected the output of python /usr/share/vdsm/nwfilter.pyc to /tmp/nwfilter.log, but after rebooting the file is empty as well. Running python /usr/share/vdsm/nwfilter.pyc manually prints nothing, which is what is expected.
Nevertheless, I found some more information in journalctl this morning:

Sep 24 09:49:53 node.abes.fr systemd-vdsmd[542]: Starting libvirtd (via systemctl):  [  OK  ]
Sep 24 09:49:53 node.abes.fr systemd-vdsmd[542]: libvir: XML-RPC error : Failed to connect socket to '/var/run/libvirt/libvirt-sock': No such file or directory
Sep 24 09:49:53 node.abes.fr systemd-vdsmd[542]: Traceback (most recent call last):
Sep 24 09:49:53 node.abes.fr systemd-vdsmd[542]: File "/usr/share/vdsm/nwfilter.py", line 83, in <module>
Sep 24 09:49:53 node.abes.fr systemd-vdsmd[542]: main()
Sep 24 09:49:53 node.abes.fr systemd-vdsmd[542]: File "/usr/share/vdsm/nwfilter.py", line 32, in main
Sep 24 09:49:53 node.abes.fr systemd-vdsmd[542]: conn = libvirtconnection.get()
Sep 24 09:49:53 node.abes.fr systemd-vdsmd[542]: File "/usr/lib64/python2.7/site-packages/vdsm/libvirtconnection.py", line 113, in get
Sep 24 09:49:53 node.abes.fr systemd-vdsmd[542]: conn = libvirt.openAuth('qemu:///system', auth, 0)
Sep 24 09:49:53 node.abes.fr systemd-vdsmd[542]: File "/usr/lib64/python2.7/site-packages/libvirt.py", line 102, in openAuth
Sep 24 09:49:53 node.abes.fr systemd-vdsmd[542]: if ret is None:raise libvirtError('virConnectOpenAuth() failed')
Sep 24 09:49:53 node.abes.fr systemd-vdsmd[542]: libvirt.libvirtError: Failed to connect socket to '/var/run/libvirt/libvirt-sock': No such file or directory
Sep 24 09:49:53 node.abes.fr systemd-vdsmd[542]: vdsm: Failed to define network filters on libvirt[FAILED]

It seems to be a libvirtd socket issue, but looking at this file after boot, nothing seems abnormal:
[root@node ~]# ll /var/run/libvirt/libvirt-sock
srwxrwx---. 1 root kvm 0 Sep 24 10:03 /var/run/libvirt/libvirt-sock

and manually restarting vdsmd always succeeds.

Comment 3 Douglas Schilling Landgraf 2012-09-24 19:38:26 UTC
Hi,

Thanks for your report. 
There are patches upstream (below) available for review which should fix your issue. Feel free to comment there to help get them merged sooner.

vdsmd: await for libvirt with systemd, too
http://gerrit.ovirt.org/#/c/8162/

vdsmd.init: verify if libvirt socket file exists
http://gerrit.ovirt.org/#/c/8175/
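
The idea suggested by the two patch titles above (wait for libvirt, and check that its socket is there, before running the nwfilter step) can be sketched roughly as follows. This is only an illustrative Python 3 sketch and not the content of the gerrit changes: the helper names socket_accepts and wait_for_socket and the timeout and interval values are assumptions, and only the socket path is taken from the traceback in comment 2.

```python
import socket
import time

# Socket path as reported in the traceback in this bug.
LIBVIRT_SOCK = "/var/run/libvirt/libvirt-sock"


def socket_accepts(path):
    """Return True if a UNIX stream socket at `path` accepts a connection."""
    s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        s.connect(path)
        return True
    except OSError:
        # Covers both "No such file or directory" and "Connection refused".
        return False
    finally:
        s.close()


def wait_for_socket(path, timeout=10.0, interval=0.2):
    """Poll until `path` accepts connections or `timeout` seconds elapse.

    Hypothetical helper: the timeout and interval defaults are arbitrary.
    """
    deadline = time.monotonic() + timeout
    while True:
        if socket_accepts(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Calling wait_for_socket(LIBVIRT_SOCK) before anything that opens qemu:///system would turn the boot-time race into a bounded wait. Probing with a real connect, rather than only testing that the file exists, also covers the window in which the socket file is present but libvirtd is not yet listening.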

Comment 4 Douglas Schilling Landgraf 2012-09-24 19:50:02 UTC
Test executed:

Cloning vdsm source code:
# git clone git://gerrit.ovirt.org/vdsm

Building the rpms (Without patches):
# cd vdsm
# ./autogen.sh --system && make && make rpm

# cd ~/rpmbuild/RPMS/x86_64/
# rpm -ivh vdsm-python-4.10.0-0.468.git5ac6d2c.fc17.x86_64.rpm
# rpm -ivh vdsm-4.10.0-0.468.git5ac6d2c.fc17.x86_64.rpm

== Reboot the system ==

After the reboot, check vdsmd status:

# systemctl status vdsmd.service (it should show as failed)

To fix, apply the patches below [1], rebuild and reinstall the rpms, and reboot the system; vdsmd will then start correctly.

[1]
vdsmd: await for libvirt with systemd, too
http://gerrit.ovirt.org/#/c/8162/

vdsmd.init: verify if libvirt socket file exists
http://gerrit.ovirt.org/#/c/8175/

Comment 5 Igor Lvovsky 2012-10-02 10:20:37 UTC
Suggested by Moti:
http://gerrit.ovirt.org/#/c/8245/4
Merged to upstream, Moved to ON_QA

