Bug 1222154 - el6: libvirtd configurator configures upstart but does not disable libvirtd's sysv job
Summary: el6: libvirtd configurator configures upstart but does not disable libvirtd's...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: vdsm
Version: 3.5.1
Hardware: Unspecified
OS: Unspecified
urgent
urgent
Target Milestone: ---
: 3.5.3
Assignee: Yaniv Bronhaim
QA Contact: Jiri Belka
URL:
Whiteboard: infra
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2015-05-15 22:14 UTC by Marina Kalinin
Modified: 2019-08-15 04:36 UTC (History)
22 users (show)

Fixed In Version: vdsm-4.16.18-1.el6ev
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-06-15 13:19:23 UTC
oVirt Team: Infra
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
libvirtd -v logs (extracted from libvirtd.log) (2.94 MB, text/plain)
2015-05-21 14:23 UTC, Pavel Zhukov
no flags Details
libvirtd -v logs (extracted from libvirtd.log) (2.99 MB, text/plain)
2015-05-21 14:26 UTC, Pavel Zhukov
no flags Details


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2015:1103 0 normal SHIPPED_LIVE vdsm 3.5.3 - bug fix and enhancement update 2015-06-15 17:18:53 UTC
oVirt gerrit 41144 0 ovirt-3.5 MERGED vdsm-tool: is-configured: on el6, we must disable sysv's libvirtd job Never

Description Marina Kalinin 2015-05-15 22:14:46 UTC
Description of problem:
Libvirt does not start on RHEL 6.6  RHEV host.
libvirt-0.10.2-46.el6_6.6.x86_64   Wed May 13 18:00:57 2015
vdsm-4.16.13.1-1.el6ev.x86_64      Wed May 13 18:01:04 2015

Happened on 2 hosts - one clean new install, another, the one I am attaching the logs from was upgraded from rhev 3.4 latest.

Here are the details
Host won't join the RHEV-M because vdsm isn't running, which in turn is because libvirt is failing to load. I've had the customer increase the logging for libvirt and he's rebooted the system, but libvirt is failing to start:

~~~
May 15 12:16:51  init: libvirtd main process ended, respawning
May 15 12:16:51  init: libvirtd main process (428) terminated with status 1
May 15 12:16:51  init: libvirtd respawning too fast, stopped
~~~

The cause of the failure seems to be some sort of failure reading some necessary files:
~~~
$ tail var/log/libvirt/libvirtd.log                                              
2015-05-14 15:33:32.998+0000: 20528: error : virDomainObjGetMetadata:16670 : metadata not found: Requested metadata element is not present
2015-05-14 15:33:48.072+0000: 20530: error : virDomainObjGetMetadata:16670 : metadata not found: Requested metadata element is not present
2015-05-14 15:34:03.123+0000: 20529: error : virDomainObjGetMetadata:16670 : metadata not found: Requested metadata element is not present
2015-05-14 15:34:18.160+0000: 20528: error : virDomainObjGetMetadata:16670 : metadata not found: Requested metadata element is not present
2015-05-14 15:34:33.199+0000: 20526: error : virDomainObjGetMetadata:16670 : metadata not found: Requested metadata element is not present
2015-05-14 15:34:48.263+0000: 20529: error : virDomainObjGetMetadata:16670 : metadata not found: Requested metadata element is not present
2015-05-14 15:35:03.299+0000: 20528: error : virDomainObjGetMetadata:16670 : metadata not found: Requested metadata element is not present
2015-05-14 15:35:40.040+0000: 20525: error : virNetSocketReadWire:1194 : End of file while reading data: Input/output error
2015-05-14 18:23:48.913+0000: 20301: info : libvirt version: 0.10.2, package: 46.el6_6.4 (Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>, 2015-03-20-09:33:22, x86-026.build.eng.bos.redhat.com)
2015-05-14 18:23:48.913+0000: 20301: error : virNWFilterSnoopLeaseFileRefresh:1903 : open("/var/run/libvirt/network/nwfilter.ltmp"): No such file or directory
~~~

Here is the output from libvirt.log after increasing the log level:
~~~
$ grep -A 16 -B 16 'output error' var/log/libvirt/libvirtd.log
2015-05-15 18:26:32.519+0000: 20613: debug : virEventPollCalculateTimeout:320 : Calculate expiry of 1 timers
2015-05-15 18:26:32.519+0000: 20613: debug : virEventPollCalculateTimeout:346 : Timeout at 0 due in -1 ms
2015-05-15 18:26:32.519+0000: 20613: debug : virEventPollRunOnce:614 : EVENT_POLL_RUN: nhandles=9 timeout=-1
2015-05-15 18:26:32.521+0000: 20613: debug : virEventPollRunOnce:625 : Poll got 1 event(s)
2015-05-15 18:26:32.521+0000: 20613: debug : virEventPollDispatchTimeouts:410 : Dispatch 1
2015-05-15 18:26:32.521+0000: 20613: debug : virEventPollDispatchHandles:455 : Dispatch 9
2015-05-15 18:26:32.521+0000: 20613: debug : virEventPollDispatchHandles:469 : i=0 w=1
2015-05-15 18:26:32.521+0000: 20613: debug : virEventPollDispatchHandles:469 : i=1 w=2
2015-05-15 18:26:32.521+0000: 20613: debug : virEventPollDispatchHandles:469 : i=2 w=3
2015-05-15 18:26:32.521+0000: 20613: debug : virEventPollDispatchHandles:469 : i=3 w=4
2015-05-15 18:26:32.521+0000: 20613: debug : virEventPollDispatchHandles:469 : i=4 w=5
2015-05-15 18:26:32.521+0000: 20613: debug : virEventPollDispatchHandles:469 : i=5 w=6
2015-05-15 18:26:32.521+0000: 20613: debug : virEventPollDispatchHandles:469 : i=6 w=7
2015-05-15 18:26:32.521+0000: 20613: debug : virEventPollDispatchHandles:469 : i=7 w=8
2015-05-15 18:26:32.521+0000: 20613: debug : virEventPollDispatchHandles:469 : i=8 w=15
2015-05-15 18:26:32.521+0000: 20613: debug : virEventPollDispatchHandles:483 : EVENT_POLL_DISPATCH_HANDLE: watch=15 events=9
2015-05-15 18:26:32.521+0000: 20613: error : virNetSocketReadWire:1194 : End of file while reading data: Input/output error
2015-05-15 18:26:32.521+0000: 20613: debug : virEventPollCleanupTimeouts:501 : Cleanup 1
2015-05-15 18:26:32.521+0000: 20613: debug : virEventPollCleanupHandles:549 : Cleanup 9
2015-05-15 18:26:32.521+0000: 20613: debug : virNetServerClientClose:623 : client=0x278f6c0
2015-05-15 18:26:32.521+0000: 20613: debug : virKeepAliveStop:320 : RPC_KEEPALIVE_STOP: ka=0x278fee0 client=0x278f6c0
2015-05-15 18:26:32.521+0000: 20613: debug : virObjectRef:168 : OBJECT_REF: obj=0x278f6c0
2015-05-15 18:26:32.521+0000: 20613: debug : virObjectUnref:135 : OBJECT_UNREF: obj=0x278fee0
2015-05-15 18:26:32.521+0000: 20613: debug : virObjectUnref:137 : OBJECT_DISPOSE: obj=0x278fee0
2015-05-15 18:26:32.521+0000: 20613: debug : virObjectUnref:135 : OBJECT_UNREF: obj=0x278f6c0
2015-05-15 18:26:32.521+0000: 20613: debug : virObjectUnref:135 : OBJECT_UNREF: obj=0x278f6c0
2015-05-15 18:26:32.521+0000: 20613: debug : virObjectRef:168 : OBJECT_REF: obj=0x278f6c0
2015-05-15 18:26:32.521+0000: 20613: debug : daemonRemoveAllClientStreams:480 : stream=(nil)
2015-05-15 18:26:32.521+0000: 20613: debug : virObjectUnref:135 : OBJECT_UNREF: obj=0x278f6c0
2015-05-15 18:26:32.521+0000: 20613: debug : virEventPollRemoveHandle:175 : EVENT_POLL_REMOVE_HANDLE: watch=15
2015-05-15 18:26:32.521+0000: 20613: debug : virEventPollRemoveHandle:188 : mark delete 8 20
2015-05-15 18:26:32.521+0000: 20613: debug : virEventPollInterruptLocked:697 : Skip interrupt, 0 2131273824
2015-05-15 18:26:32.521+0000: 20613: debug : virNetMessageFree:73 : msg=0x2791520 nfds=0 cb=(nil)
~~~

SELinux is permissive:
~~~
$ cat etc/sysconfig/selinux 
# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
#       enforcing - SELinux security policy is enforced.
#       permissive - SELinux prints warnings instead of enforcing.
#       disabled - SELinux is fully disabled.
SELINUX=permissive
# SELINUXTYPE= type of policy in use. Possible values are:
#       targeted - Only targeted network daemons are protected.
#       strict - Full SELinux protection.
SELINUXTYPE=targeted
~~~

vdsm error:
~~~
MainThread::DEBUG::2015-05-13 14:45:19,954::upgrade::91::upgrade::(apply_upgrade) Running upgrade upgrade-unified-persistence
MainThread::DEBUG::2015-05-13 14:45:19,955::utils::739::root::(execCmd) /sbin/ip route show to 0.0.0.0/0 table all (cwd None)
MainThread::DEBUG::2015-05-13 14:45:19,966::utils::759::root::(execCmd) SUCCESS: <err> = ''; <rc> = 0
MainThread::DEBUG::2015-05-13 14:45:19,991::libvirtconnection::150::root::(get) trying to connect libvirt
MainThread::ERROR::2015-05-13 14:45:20,021::upgrade::95::upgrade::(apply_upgrade) Failed to run upgrade-unified-persistence
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/vdsm/tool/upgrade.py", line 93, in apply_upgrade
    upgrade.run(ns, args)
  File "/usr/lib/python2.6/site-packages/vdsm/tool/unified_persistence.py", line 169, in run
    run()
  File "/usr/lib/python2.6/site-packages/vdsm/tool/unified_persistence.py", line 43, in run
    networks, bondings = _getNetInfo()
  File "/usr/lib/python2.6/site-packages/vdsm/tool/unified_persistence.py", line 129, in _getNetInfo
    netinfo = NetInfo()
  File "/usr/lib/python2.6/site-packages/vdsm/netinfo.py", line 820, in __init__
    _netinfo = get()
  File "/usr/lib/python2.6/site-packages/vdsm/netinfo.py", line 730, in get
    d['networks'] = _libvirtNets2vdsm(nets, gateways, ipv6routes, ipaddrs)
  File "/usr/lib/python2.6/site-packages/vdsm/netinfo.py", line 687, in _libvirtNets2vdsm
    dhcp4 = getDhclientIfaces(_DHCLIENT_LEASES_GLOBS)
  File "/usr/lib/python2.6/site-packages/vdsm/netinfo.py", line 673, in getDhclientIfaces
    interfaces.update(_parseLeaseFile(leaseFile, ipv6))
  File "/usr/lib/python2.6/site-packages/vdsm/netinfo.py", line 639, in _parseLeaseFile
    expiryTime = _parseExpiryTime(line[len(EXPIRE):end])
  File "/usr/lib/python2.6/site-packages/vdsm/netinfo.py", line 617, in _parseExpiryTime
    return datetime.strptime(expiryTime, '%w %Y/%m/%d %H:%M:%S')
  File "/usr/lib64/python2.6/_strptime.py", line 325, in _strptime
    (data_string, format))
ValueError: time data 'never' does not match format '%w %Y/%m/%d %H:%M:%S'
~~~

Comment 3 Ido Barkan 2015-05-18 05:19:37 UTC
also look like a vdsm bug. so, is libvirt and vdsm both down?
Can you print the output of 'service vdsmd status' and 'service libvritd status'

Comment 4 Marina Kalinin 2015-05-18 14:48:28 UTC
Both. Vdsm cannot start, cause libvirt is down.

Comment 5 Ján Tomko 2015-05-19 08:59:10 UTC
The libvirt error from comment 1 (version el6_6.4):
2015-05-14 18:23:48.913+0000: 20301: info : libvirt version: 0.10.2, package: 46.el6_6.4 (Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>, 2015-03-20-09:33:22, x86-026.build.eng.bos.redhat.com)
2015-05-14 18:23:48.913+0000: 20301: error : virNWFilterSnoopLeaseFileRefresh:1903 : open("/var/run/libvirt/network/nwfilter.ltmp"): No such file or directory

was fixed in version el6_6.5 for bug 1208619 [6.6.z] and bug 1200991 [6.7]

(The metadata error is harmless and will be fixed by bug 1220474 and
virNetSocketReadWire:1194 : End of file while reading data: Input/output error
is logged when a connection to libvirtd is unexpectedly closed)


The sosreport shows (version el6_6.6):
* the kernel booting at 14:24:42 EDT

* libvirtd starting at 14:25:39, then forking as 20613
2015-05-15 18:25:39.319+0000: 20601: info : libvirt version: 0.10.2, package: 46.el6_6.6 (Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>, 2015-04-10-12:18:44, x86-026.build
.eng.bos.redhat.com)
2015-05-15 18:25:39.321+0000: 20613: debug : main:1068 : Ensuring run dir '/var/run/libvirt' exists

* another libvirtd starting at 14:25:52, failing to acquire PID path
2015-05-15 18:25:52.861+0000: 21663: info : libvirt version: 0.10.2, package: 46.el6_6.6 (Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>, 2015-04-10-12:18:44, x86-026.build.eng.bos.redhat.com)
2015-05-15 18:25:52.862+0000: 21663: error : virPidFileAcquirePath:410 : Failed to acquire pid file '/var/run/libvirtd.pid': Resource temporarily unavailable

and a lot of other attempts to start it, but there is no record of the 20613 libvirtd shutting down and ps auxwww still shows it as running:
 root     20613  0.8  0.0 899956 10808 ?        Sl   14:25   0:01 libvirtd --daemon --listen


I don't know why init tries to start it again, but libvirtd should be up and running on that system - does it respond to virsh commands?
Could this is similar to bug 1198187 where libvirtd was started too early by vdsm?

The vdsm error in comment 1 does not seem to be related to libvirt.

Comment 6 Pavel Zhukov 2015-05-19 09:09:45 UTC
Jan,
Something definitely wrong with libvirtd. I'm not sure if we should blame vdsm here:

# libvirtd -v 2>&1 | tee /tmp/no_daemon.log
2015-05-19 03:01:40.827+0000: 1866: info : libvirt version: 0.10.2, package: 46.el6_6.6 (Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>, 2015-04-10-12:18:44, x86-026.build.eng.bos.redhat.com)
2015-05-19 03:01:40.827+0000: 1866: debug : virFileClose:72 : Closed fd 3
2015-05-19 03:01:40.827+0000: 1866: debug : virFileClose:72 : Closed fd 5
2015-05-19 03:01:40.827+0000: 1866: debug : virFileClose:72 : Closed fd 6
2015-05-19 03:01:40.827+0000: 1866: debug : virFileClose:72 : Closed fd 7
2015-05-19 03:01:40.827+0000: 1866: debug : virFileClose:72 : Closed fd 8
2015-05-19 03:01:40.827+0000: 1866: debug : virFileClose:72 : Closed fd 9
2015-05-19 03:01:40.827+0000: 1866: debug : virFileClose:72 : Closed fd 10
2015-05-19 03:01:40.827+0000: 1866: debug : virFileClose:72 : Closed fd 11
2015-05-19 03:01:40.827+0000: 1866: debug : virFileClose:72 : Closed fd 12
2015-05-19 03:01:40.827+0000: 1866: debug : virFileClose:72 : Closed fd 14
2015-05-19 03:01:40.827+0000: 1866: debug : virFileClose:72 : Closed fd 15
2015-05-19 03:01:40.827+0000: 1866: debug : virFileClose:72 : Closed fd 17
2015-05-19 03:01:40.830+0000: 1867: info : libvirt version: 0.10.2, package: 46.el6_6.6 (Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>, 2015-04-10-12:18:44, x86-026.build.eng.bos.redhat.com)
2015-05-19 03:01:40.830+0000: 1867: debug : virFileClose:72 : Closed fd 3
2015-05-19 03:01:40.830+0000: 1867: debug : virFileClose:72 : Closed fd 5
2015-05-19 03:01:40.830+0000: 1867: debug : virFileClose:72 : Closed fd 6
2015-05-19 03:01:40.830+0000: 1867: debug : virFileClose:72 : Closed fd 7
2015-05-19 03:01:40.830+0000: 1867: debug : virFileClose:72 : Closed fd 8
2015-05-19 03:01:40.830+0000: 1867: debug : virFileClose:72 : Closed fd 9
2015-05-19 03:01:40.830+0000: 1867: debug : virFileClose:72 : Closed fd 10
2015-05-19 03:01:40.830+0000: 1867: debug : virFileClose:72 : Closed fd 11
2015-05-19 03:01:40.830+0000: 1867: debug : virFileClose:72 : Closed fd 12

and so on and so far....

Comment 7 Ján Tomko 2015-05-19 11:25:26 UTC
Libvirtd shouldn't log the version multiple times on the same log output unless it's been started again (which it should not do on its own).

Can you share the corresponding logs from /var/log/libvirt/libvirtd.log (or log_outputs from libvirtd.conf)? (After the logging configuration is parsed, libvirtd switches from logging to stderr to logging to the specified outputs)

Did you run that while there already was an instance of libvirtd running (or a leftover pid file) on the system?

Comment 8 Pavel Zhukov 2015-05-19 13:12:59 UTC
(In reply to Ján Tomko from comment #7)
> Libvirtd shouldn't log the version multiple times on the same log output
> unless it's been started again (which it should not do on its own).
Of course it shouldn't and It doesn't usually (in working envs)
> 
> Can you share the corresponding logs from /var/log/libvirt/libvirtd.log (or
> log_outputs from libvirtd.conf)? (After the logging configuration is parsed,
> libvirtd switches from logging to stderr to logging to the specified outputs)
You can find libvirtd.log in previous comments (nothing was changed). I've checked if we can see some useful error in early initialization step (helped me pretty often in the past) but no luck. 
> 
> Did you run that while there already was an instance of libvirtd running (or
> a leftover pid file) on the system?
It should not be unless there is bug in init scripts:
# service libvirtd status
libvirtd (pid  20613) is running...
# service libvirtd stop
Stopping libvirtd daemon:                                  [  OK  ]
# service libvirtd status
libvirtd is stopped
# service vdsmd status
VDS daemon is not running
libvirtd -v => https://bugzilla.redhat.com/show_bug.cgi?id=1222154#c6

Comment 9 Pavel Zhukov 2015-05-19 13:24:21 UTC
(In reply to Ján Tomko from comment #7)

> Did you run that while there already was an instance of libvirtd running (or
> a leftover pid file) on the system?

We have checked that thare are not either libvirt or vdsm processes on the host running (killed superVdsm only). No success! Libvirtd are respawning itself in the loop with version output (like before) even if it's stated not as daemon.

Comment 10 Ján Tomko 2015-05-19 14:23:57 UTC
Libvirtd cannot respawn itself on its own.

I thought the sosreport's log was created by system startup, not running libvirtd -v manually. There are no errors indicating a bug in libvirt in the sosreport's libvirtd.log, libvirtd refuses to start because it's already running.


Looking at the sosreport, libvirtd is set to be started by both init:

$ ls etc/rc.d/rc3.d/*libvirtd
etc/rc.d/rc3.d/S97libvirtd

and by upstart:

$ head etc/init/libvirtd.conf 
# libvirtd upstart job
#
# XXX wait for rc to get all dependent initscripts started
# from sysv libvirtd initscript: Required-Start: $network messagebus
start on stopped rc RUNLEVEL=[345]
stop on runlevel [!345]

Only one of those should start by default. Is this the default configuration after installation, or was libvirtd set up to start by both by user intervention?

Comment 11 Marina Kalinin 2015-05-19 21:22:50 UTC
Dan, 
Can you please clarify the flow here?

Thank you!

Comment 13 Dan Kenigsberg 2015-05-19 23:19:53 UTC
(In reply to Ján Tomko from comment #10)

> Looking at the sosreport, libvirtd is set to be started by both init:
> 
> $ ls etc/rc.d/rc3.d/*libvirtd
> etc/rc.d/rc3.d/S97libvirtd
> 
> and by upstart:
> 
> $ head etc/init/libvirtd.conf 
> # libvirtd upstart job
> #
> # XXX wait for rc to get all dependent initscripts started
> # from sysv libvirtd initscript: Required-Start: $network messagebus
> start on stopped rc RUNLEVEL=[345]
> stop on runlevel [!345]
> 
> Only one of those should start by default. Is this the default configuration
> after installation, or was libvirtd set up to start by both by user
> intervention?

on el6, vdsm configures libvirt as an upstart job (that is able to respawn itself) and `chkconfig libvirtd off` to disable the sysv job.

HOWEVER, I don't see where the latter step (sysv disable) is done, now with the new lib/vdsm/tool/configurators/libvirt.py.

Mooli, can you take a look?


(and Marina, the 'never' issue deserves its own network bug. repeating it here only confuses)

Comment 14 Pavel Zhukov 2015-05-20 06:14:14 UTC
(In reply to Dan Kenigsberg from comment #13)
> 
> on el6, vdsm configures libvirt as an upstart job (that is able to respawn
> itself) and `chkconfig libvirtd off` to disable the sysv job.
Dan, seems like it's not true anymore (checked both RHEL-H and RHEV-H hosts):
# cat /etc/redhat-release 
Red Hat Enterprise Virtualization Hypervisor 6.6 (20150421.0.el6ev)

# chkconfig --list | grep libvirt   
libvirtd       	0:off	1:off	2:on	3:on	4:on	5:on	6:off

# ls -l /etc/rc.d/rc3.d/S97libvirtd 
lrwxrwxrwx. 1 root root 18 2015-04-21 06:32 /etc/rc.d/rc3.d/S97libvirtd -> ../init.d/libvirtd

upstart config is there as well

Comment 15 Yaniv Bronhaim 2015-05-20 09:47:52 UTC
Indeed it looks like we dropped that chkconfig off in the new libvirt.py configurator. Can you please confirm that https://gerrit.ovirt.org/41144 solves the bug?

Comment 17 Marina Kalinin 2015-05-20 19:00:52 UTC
Interesting enough, on my setup, with all same versions, RHEL 6.6 host and vdsm-4.16.13.1-1.el6ev, my libvirtd is off.
I asked the customer to disable libvirtd in chkconfig and try again.
Now working on the dhcp lease issue, will open a separate bug for it.

Comment 18 Marina Kalinin 2015-05-20 19:17:05 UTC
https://bugzilla.redhat.com/show_bug.cgi?id=1223530

Comment 21 Pavel Zhukov 2015-05-21 06:47:05 UTC
Disabling of libvirtd service did not solve the problem:

2015-05-19 13:17:04.423+0000: 24252: debug : virFileClose:72 : Closed fd 5
<== NO LOGS BETWEEN
2015-05-20 19:16:43.764+0000: 22296: info : libvirt version: 0.10.2, package: 46.el6_6.6 (Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>, 2015-04-10-12:18:44, x86-026.build.eng.bos.redhat.com)
2015-05-20 19:16:43.764+0000: 22296: info : libvirt version: 0.10.2, package: 46.el6_6.6 (Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>, 2015-04-10-12:18:44, x86-026.build.eng.bos.redhat.com)
2015-05-20 19:21:28.661+0000: 26052: info : libvirt version: 0.10.2, package: 46.el6_6.6 (Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>, 2015-04-10-12:18:44, x86-026.build.eng.bos.redhat.com)
2015-05-20 19:21:29.097+0000: 27190: info : libvirt version: 0.10.2, package: 46.el6_6.6 (Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>, 2015-04-10-12:18:44, x86-026.build.eng.bos.redhat.com)


If we don't have any libvirtd processes running at all and vdsmd is stopped and try to start libvird from the shell directly it prints "libvirt version" every few seconds. Doesn't it mean the problem is in libvirtd process itself?

Comment 22 Dan Kenigsberg 2015-05-21 09:23:36 UTC
Pavel, what is the output of

  service libvirtd status
  initctl status libvirtd

just before you run

  libvirtd --listen

SOMETHING is start libvirtd constantly, and I suspect it is initctl's upstart.

Comment 23 Pavel Zhukov 2015-05-21 10:33:04 UTC
Dan, We're not able to reproduce the issue at home. 
(In reply to Dan Kenigsberg from comment #22)
> Pavel, what is the output of
> 
>   service libvirtd status
>   initctl status libvirtd
> 
sysv service is stopped process doesn't exist
https://bugzilla.redhat.com/show_bug.cgi?id=1222154#c6 

> just before you run
> 
>   libvirtd --listen
> 
> SOMETHING is start libvirtd constantly, and I suspect it is initctl's
> upstart.

I know this is weird but if libvirtd is already started (by something) we should see something like (stderr):

# libvirtd -v
2015-05-21 10:07:11.953+0000: 16557: info : libvirt version: 0.10.2, package: 46.el6_6.4 (Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>, 2015-03-20-09:33:22, x86-026.build.eng.bos.redhat.com)
2015-05-21 10:07:11.953+0000: 16557: error : virPidFileAcquirePath:410 : Failed to acquire pid file '/var/run/libvirtd.pid': Resource temporarily unavailable

If libvirtd was started by upstrart/sysv/systemd it writes "libvirtd version" to log file (not stderr). Right?  But we see repeaded logs in stderr.
         if (godaemon) {     <== if -d options was specified not the case here
 <..omitted..>
        } else {                                                     |{                                                                    
            if (virAsprintf(&tmp, "%d:stderr", virLogGetDefaultPriori|    int tmp;        << == stderrr                                                 
ty()) < 0)                                                           |    int len;                                                         
                goto no_memory;                                      |                                                                     
        }

Comment 24 Ján Tomko 2015-05-21 13:51:49 UTC
The snippet of code you pasted only shows the defaults that are applied if no log outputs are configured.

The log_outputs option in /etc/libvirt/libvirtd.conf and the LIBVIRT_LOG_OUTPUTS environment variable can both affect libvirt's log outputs. Even if they're set, libvirt can still report some messages on stderr early in the startup phase, but most of the messages will go to the configured logs (including the 'failed to acquire pid' message).

Comment 25 Pavel Zhukov 2015-05-21 14:23:42 UTC
Created attachment 1028216 [details]
libvirtd -v logs (extracted from libvirtd.log)

Comment 26 Pavel Zhukov 2015-05-21 14:26:11 UTC
Created attachment 1028219 [details]
libvirtd -v logs (extracted from libvirtd.log)

Comment 27 Jiri Belka 2015-05-22 12:16:01 UTC
I could not reproduce at all - clean RHEL 6.6 including ZStream installation and I either did add the host into 3.5.3 env (vdsm was from vt14.4 - same version of libvirt and vdsm as written above) or I installed vdsm manually on the host and rebooted and then added into RHEVM. It was working all the time as expected.

Comment 30 Dan Kenigsberg 2015-05-26 11:12:34 UTC
After installation and `vdsm-tool configure`, el6 must not have libvirtd configured to run on boot. This can be verified by seeing only "off" on 

  chkconfig --list libvirtd

Before this fix, you would have seen "on" there, which leads to libvirtd starting first by sysv, and then by upstart. The second libvirtd fails to start, trashes libvirtd.log, and attempts to start again.

Comment 31 Fabian Deutsch 2015-05-26 19:28:27 UTC
We are still seeing this on RHEV-H with vdsm-4.16.17-1.el6ev.x86_64:

[root@node1 admin]# cat /etc/system-release
Red Hat Enterprise Virtualization Hypervisor release 6.7 (20150526.16.el6ev)
[root@node1 admin]# rpm -q vdsm
vdsm-4.16.17-1.el6ev.x86_64
[root@node1 admin]# chkconfig --list libvirtd
libvirtd       	0:off	1:off	2:on	3:on	4:on	5:on	6:off
[root@node1 admin]# 

Shold it be fixed in the vdsm version?

Comment 33 Dan Kenigsberg 2015-05-27 16:14:32 UTC
Fabian, is it fixed when you run `vdsm-tool configure` from the command line?

Yaniv, can you check this?

Comment 34 Fabian Deutsch 2015-05-27 17:44:25 UTC
According to Tolik's findings, this bug is not fixed with the current patch.

Comment 37 Yaniv Bronhaim 2015-05-28 14:25:07 UTC
(In reply to Fabian Deutsch from comment #34)
> According to Tolik's findings, this bug is not fixed with the current patch.

this specifically relates to node and being handled in #1225202

Comment 39 Jiri Belka 2015-05-28 14:54:25 UTC
Provide exact reproduction steps, there is no clear understanding what is the cause. Or should we just check libvirtd is 'off'?

Comment 40 Yaniv Bronhaim 2015-05-28 15:06:23 UTC
over el6 try to run "chkconfig libvirtd on", then run service vdsmd start - vdsm should fail to start.
then run - vdsm-tool configure --force - which should set everything properly for vdsm. 
then you can verify with - chkconfig libvirt - that vdsm actually disabled systemctl control for libvirtd.
then run vdsm and see that all run smoothly.

over el7\fedora you should not see any effect

Comment 41 Jiri Belka 2015-05-29 11:23:30 UTC
ok, vdsm-4.16.18-1.el6ev.x86_64

tested flow in #40 and also with just clean install and adding into rhevm.

Comment 44 Dan Kenigsberg 2015-06-12 11:56:32 UTC
Pavel, this bug can theoretically lead to migration failure, but it is not probable or clear. Can you provide more information?

Comment 45 Pavel Zhukov 2015-06-12 12:14:44 UTC
Dan,
No I cann't. destination host has been reinstalled and all logs are gone

Comment 47 errata-xmlrpc 2015-06-15 13:19:23 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-1103.html


Note You need to log in before you can comment on or make changes to this bug.