Bug 859483

Summary: vdsmd.service never starts alone after rebooting.
Product: [Retired] oVirt
Component: vdsm
Version: 3.1 GA
Hardware: x86_64
OS: Linux
Status: CLOSED CURRENTRELEASE
Severity: medium
Priority: unspecified
Target Milestone: ---
Target Release: 3.2
Whiteboard: infra
Reporter: exploit
Assignee: Douglas Schilling Landgraf <dougsland>
QA Contact: Haim <hateya>
CC: abaron, acathrow, bazulay, danken, dougsland, dyasny, iheim, ilvovsky, mgoldboi, yeylon, ykaul
Doc Type: Bug Fix
Type: Bug
Last Closed: 2013-02-15 06:46:20 UTC

Description exploit 2012-09-21 16:09:34 UTC
Description of problem:
In the latest vdsm build from git (vdsm-4.10.0-0.452.git87594e3.fc17.x86_64, F17), vdsmd.service never starts on its own after rebooting.
I had a look at journalctl and found this:

systemd-vdsmd[538]: vdsm: Failed to define network filters on libvirt[FAILED]

[root@node ~]# service vdsmd status
Redirecting to /bin/systemctl  status vdsmd.service
vdsmd.service - Virtual Desktop Server Manager
          Loaded: loaded (/usr/lib/systemd/system/vdsmd.service; enabled)
          Active: failed (Result: exit-code) since Fri, 21 Sep 2012 12:13:01 +0200; 4min 56s ago
         Process: 543 ExecStart=/lib/systemd/systemd-vdsmd start (code=exited, status=1/FAILURE)
          CGroup: name=systemd:/system/vdsmd.service

Sep 21 12:12:55 node.abes.fr systemd-vdsmd[543]: Note: Forwarding request to 'systemctl disable libvirt-guests.service'.
Sep 21 12:12:56 node.abes.fr systemd-vdsmd[543]: vdsm: libvirt already configured for vdsm [  OK  ]
Sep 21 12:12:56 node.abes.fr systemd-vdsmd[543]: Starting wdmd...
Sep 21 12:12:56 node.abes.fr systemd-vdsmd[543]: Starting sanlock...
Sep 21 12:12:56 node.abes.fr systemd-vdsmd[543]: Starting iscsid:
Sep 21 12:13:01 node.abes.fr systemd-vdsmd[543]: Starting libvirtd (via systemctl):  [  OK  ] 

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. Register a VM with ovirt-engine.
2. The node reboots at the end of the bootstrap.
3. After the reboot, vdsmd has not started.
  
Actual results:
No vdsmd process is running, and the node is not present in the node list.

Expected results:
The node is up and ready to run VMs.

Additional info:
The network is okay; when the daemon is started manually, it comes up.
I have seen this bz: vdsmd: set nwfilter on ovirt-node

    ovirt-node is shipped with .pyc only. Do not try running a missing
    executable. 
I'm not sure it is the same issue. Running python /usr/share/vdsm/nwfilter.pyc manually returns nothing, so there does not seem to be an error there.
I guess it could come from a dependent daemon that has not started yet when vdsmd needs it. That would explain why vdsmd can be started successfully by hand once the boot has finished.

Comment 1 Dan Kenigsberg 2012-09-23 11:40:41 UTC
Can you spot any other vdsm-related stuff on journalctl?

Would you edit your /lib/systemd/systemd-vdsmd and remove the /dev/null-throwing parts from

 python /usr/share/vdsm/nwfilter.pyc > /dev/null 2>&1

and retry, so we can have a better guess at what's broken?

Comment 2 exploit 2012-09-24 08:11:20 UTC
I have redirected the output of python /usr/share/vdsm/nwfilter.pyc to /tmp/nwfilter.log, but after rebooting the file is empty as well. Executing python /usr/share/vdsm/nwfilter.pyc manually returns nothing, which is expected.
Nevertheless, I found some more information in journalctl this morning:

Sep 24 09:49:53 node.abes.fr systemd-vdsmd[542]: Starting libvirtd (via systemctl):  [  OK  ]
Sep 24 09:49:53 node.abes.fr systemd-vdsmd[542]: libvir: XML-RPC error : Failed to connect socket to '/var/run/libvirt/libvirt-sock': No such file or directory
Sep 24 09:49:53 node.abes.fr systemd-vdsmd[542]: Traceback (most recent call last):
Sep 24 09:49:53 node.abes.fr systemd-vdsmd[542]: File "/usr/share/vdsm/nwfilter.py", line 83, in <module>
Sep 24 09:49:53 node.abes.fr systemd-vdsmd[542]: main()
Sep 24 09:49:53 node.abes.fr systemd-vdsmd[542]: File "/usr/share/vdsm/nwfilter.py", line 32, in main
Sep 24 09:49:53 node.abes.fr systemd-vdsmd[542]: conn = libvirtconnection.get()
Sep 24 09:49:53 node.abes.fr systemd-vdsmd[542]: File "/usr/lib64/python2.7/site-packages/vdsm/libvirtconnection.py", line 113, in get
Sep 24 09:49:53 node.abes.fr systemd-vdsmd[542]: conn = libvirt.openAuth('qemu:///system', auth, 0)
Sep 24 09:49:53 node.abes.fr systemd-vdsmd[542]: File "/usr/lib64/python2.7/site-packages/libvirt.py", line 102, in openAuth
Sep 24 09:49:53 node.abes.fr systemd-vdsmd[542]: if ret is None:raise libvirtError('virConnectOpenAuth() failed')
Sep 24 09:49:53 node.abes.fr systemd-vdsmd[542]: libvirt.libvirtError: Failed to connect socket to '/var/run/libvirt/libvirt-sock': No such file or directory
Sep 24 09:49:53 node.abes.fr systemd-vdsmd[542]: vdsm: Failed to define network filters on libvirt[FAILED]

This seems to be a libvirtd socket issue, but when I look at the file afterwards, nothing seems abnormal:
[root@node ~]# ll /var/run/libvirt/libvirt-sock
srwxrwx---. 1 root kvm 0 Sep 24 10:03 /var/run/libvirt/libvirt-sock

and manually restarting vdsmd always succeeds.
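
The log above is consistent with a startup race: libvirtd is reported as started, yet /var/run/libvirt/libvirt-sock does not exist at the moment nwfilter.py tries to connect, while the socket is present a few minutes later. One way to guard against that is to poll for the socket before connecting. The following is only a minimal Python 3 sketch of that idea; the names is_unix_socket and wait_for_socket, the timeout and the polling interval are hypothetical and are not taken from vdsm.

```python
import os
import stat
import time


def is_unix_socket(path):
    """Return True if `path` exists and is a UNIX domain socket."""
    try:
        return stat.S_ISSOCK(os.stat(path).st_mode)
    except OSError:
        # Missing file, permission problem, etc.: treat as "not ready".
        return False


def wait_for_socket(path, timeout=10.0, interval=0.5, check=is_unix_socket):
    """Poll until check(path) is true or `timeout` seconds have elapsed.

    Hypothetical helper for illustration only. Returns True when the
    check passed, False when the timeout expired first.
    """
    deadline = time.monotonic() + timeout
    while True:
        if check(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


if __name__ == "__main__":
    # Socket path taken from the traceback in this report.
    ready = wait_for_socket("/var/run/libvirt/libvirt-sock", timeout=5.0)
    print("libvirt socket ready" if ready else "libvirt socket still missing")
```

A start script could call such a helper after starting libvirtd and only then run the network filter step, failing with a clear message if the socket never appears.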

Comment 3 Douglas Schilling Landgraf 2012-09-24 19:38:26 UTC
Hi,

Thanks for your report. 
There are patches upstream (below) available for review which should fix your issue. Feel free to comment there to help get them merged sooner.

vdsmd: await for libvirt with systemd, too
http://gerrit.ovirt.org/#/c/8162/

vdsmd.init: verify if libvirt socket file exists
http://gerrit.ovirt.org/#/c/8175/

Comment 4 Douglas Schilling Landgraf 2012-09-24 19:50:02 UTC
Test executed:

Cloning vdsm source code:
# git clone git://gerrit.ovirt.org/vdsm

Building the rpms (Without patches):
# cd vdsm
# ./autogen.sh --system && make && make rpm

# cd ~/rpmbuild/RPMS/x86_64/
# rpm -ivh vdsm-python-4.10.0-0.468.git5ac6d2c.fc17.x86_64.rpm
# rpm -ivh vdsm-4.10.0-0.468.git5ac6d2c.fc17.x86_64.rpm

== Reboot the system ==

After the reboot, check vdsmd status:

# systemctl status vdsmd.service (it should show as failed)

To fix, apply the patches below [1], rebuild and reinstall the rpms, and reboot the system; vdsmd will then start correctly.

[1]
vdsmd: await for libvirt with systemd, too
http://gerrit.ovirt.org/#/c/8162/

vdsmd.init: verify if libvirt socket file exists
http://gerrit.ovirt.org/#/c/8175/
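
The idea suggested by the title of the first patch, waiting for libvirt before using it, can also be illustrated as a retry loop around the connection call. This Python 3 sketch is not the content of either Gerrit change; the helper name retry_call and its parameters are assumptions made for illustration.

```python
import time


def retry_call(func, attempts=5, delay=1.0, exceptions=(Exception,)):
    """Call func() until it succeeds or `attempts` calls have failed.

    Hypothetical helper for illustration only. The exception raised by
    the last failed attempt is propagated to the caller.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except exceptions:
            if attempt == attempts:
                raise
            # Give the service some time to come up before trying again.
            time.sleep(delay)
```

Under this sketch, the failing call from the traceback in comment 2, libvirtconnection.get(), could be wrapped as retry_call(libvirtconnection.get, exceptions=(libvirt.libvirtError,)), so that a socket that is not there yet leads to a short wait instead of an immediate failure.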

Comment 5 Igor Lvovsky 2012-10-02 10:20:37 UTC
Suggested by Moti:
http://gerrit.ovirt.org/#/c/8245/4
Merged to upstream, Moved to ON_QA