Bug 990980 - [host-deploy] network service fails to start dhclient already runs
Summary: [host-deploy] network service fails to start dhclient already runs
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-host-deploy
Classification: oVirt
Component: Plugins.VDSM
Version: master
Hardware: x86_64
OS: Linux
unspecified
urgent
Target Milestone: ---
: 1.1.0
Assignee: Alon Bar-Lev
QA Contact: Pavel Stehlik
URL:
Whiteboard: network
Depends On:
Blocks: 988995
TreeView+ depends on / blocked
 
Reported: 2013-08-01 10:24 UTC by Martin Pavlik
Modified: 2016-02-10 19:52 UTC (History)
9 users (show)

Fixed In Version: is10
Clone Of:
Environment:
Last Closed: 2013-09-23 12:14:53 UTC
oVirt Team: Network
Embargoed:


Attachments (Terms of Use)
log_collector (3.89 MB, application/x-xz)
2013-08-01 10:24 UTC, Martin Pavlik
no flags Details
journal (36.91 KB, text/plain)
2013-08-01 15:18 UTC, Martin Pavlik
no flags Details


Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 17831 0 None None None Never

Description Martin Pavlik 2013-08-01 10:24:49 UTC
Created attachment 781521 [details]
log_collector

Description of problem:
Host installation fails with error 

Failed to install Host dell-08. Failed to execute stage 'Closing up': Command '/bin/systemctl' failed to execute.

Version-Release number of selected component (if applicable):
vdsm-4.12.0-10.git295a069.fc19.x86_64
oVirt Engine Version: 3.3.0-0.3.beta1.fc19

How reproducible:
100%

Steps to Reproduce:
1. install new (fresh OS) to oVirt

Actual results:
host install fails

Expected results:
host installed

Additional info:


2013-08-01 12:07:17 DEBUG otopi.plugins.otopi.services.systemd plugin.execute:446 execute-output: ('/bin/systemctl', 'start', 'network.service') stderr:
Job for network.service failed. See 'systemctl status network.service' and 'journalctl -xn' for details.

2013-08-01 12:07:17 DEBUG otopi.context context._executeMethod:132 method exception
Traceback (most recent call last):
  File "/tmp/ovirt-nlj3MIM0ao/pythonlib/otopi/context.py", line 122, in _executeMethod
    method['method']()
  File "/tmp/ovirt-nlj3MIM0ao/otopi-plugins/ovirt-host-deploy/vdsm/packages.py", line 178, in _start
    self.services.state('network', True)
  File "/tmp/ovirt-nlj3MIM0ao/otopi-plugins/otopi/services/systemd.py", line 138, in state
    'start' if state else 'stop'
  File "/tmp/ovirt-nlj3MIM0ao/otopi-plugins/otopi/services/systemd.py", line 77, in _executeServiceCommand
    raiseOnError=raiseOnError
  File "/tmp/ovirt-nlj3MIM0ao/pythonlib/otopi/plugin.py", line 451, in execute
    command=args[0],
RuntimeError: Command '/bin/systemctl' failed to execute
2013-08-01 12:07:17 ERROR otopi.context context._executeMethod:141 Failed to execute stage 'Closing up': Command '/bin/systemctl' failed to execute




vdsm.log
MainThread::WARNING::2013-08-01 12:07:18,932::clientIF::204::vds::(_prepareBindings) Unable to load the json rpc server module. Please make sure it is installed.

Comment 1 Alon Bar-Lev 2013-08-01 11:14:39 UTC
Hi,

Can you please provide output of:
 systemctl status network.service
 journalctl -xn

Thanks!

Comment 2 Martin Pavlik 2013-08-01 12:30:38 UTC
[root@dell-r210ii-08 ~]# systemctl status network.service
network.service - LSB: Bring up/down networking
   Loaded: loaded (/etc/rc.d/init.d/network)
   Active: failed (Result: exit-code) since Thu 2013-08-01 12:23:43 CEST; 2h 4min ago
  Process: 23817 ExecStart=/etc/rc.d/init.d/network start (code=exited, status=1/FAILURE)

Aug 01 12:23:43 dell-r210ii-08.rhev.lab.eng.brq.redhat.com network[23817]: RTNETLINK answer...
Aug 01 12:23:43 dell-r210ii-08.rhev.lab.eng.brq.redhat.com network[23817]: RTNETLINK answer...
Aug 01 12:23:43 dell-r210ii-08.rhev.lab.eng.brq.redhat.com network[23817]: RTNETLINK answer...
Aug 01 12:23:43 dell-r210ii-08.rhev.lab.eng.brq.redhat.com network[23817]: RTNETLINK answer...
Aug 01 12:23:43 dell-r210ii-08.rhev.lab.eng.brq.redhat.com network[23817]: RTNETLINK answer...
Aug 01 12:23:43 dell-r210ii-08.rhev.lab.eng.brq.redhat.com network[23817]: RTNETLINK answer...
Aug 01 12:23:43 dell-r210ii-08.rhev.lab.eng.brq.redhat.com network[23817]: RTNETLINK answer...
Aug 01 12:23:43 dell-r210ii-08.rhev.lab.eng.brq.redhat.com systemd[1]: network.service: con...
Aug 01 12:23:43 dell-r210ii-08.rhev.lab.eng.brq.redhat.com systemd[1]: Failed to start LSB:...
Aug 01 12:23:43 dell-r210ii-08.rhev.lab.eng.brq.redhat.com systemd[1]: Unit network.service...
[root@dell-r210ii-08 ~]#  journalctl -xn
-- Logs begin at Thu 2013-08-01 11:40:38 CEST, end at Thu 2013-08-01 14:25:01 CEST. --
Aug 01 13:25:01 dell-r210ii-08.rhev.lab.eng.brq.redhat.com CROND[24054]: (root) CMD (/usr/sbin
Aug 01 13:40:01 dell-r210ii-08.rhev.lab.eng.brq.redhat.com CROND[24058]: (root) CMD (/usr/sbin
Aug 01 13:55:01 dell-r210ii-08.rhev.lab.eng.brq.redhat.com CROND[24064]: (root) CMD (/usr/sbin
Aug 01 14:01:01 dell-r210ii-08.rhev.lab.eng.brq.redhat.com CROND[24068]: (root) CMD (run-parts
Aug 01 14:01:01 dell-r210ii-08.rhev.lab.eng.brq.redhat.com run-parts(/etc/cron.hourly)[24071]:
Aug 01 14:01:01 dell-r210ii-08.rhev.lab.eng.brq.redhat.com run-parts(/etc/cron.hourly)[24077]:
Aug 01 14:01:01 dell-r210ii-08.rhev.lab.eng.brq.redhat.com run-parts(/etc/cron.hourly)[24079]:
Aug 01 14:01:01 dell-r210ii-08.rhev.lab.eng.brq.redhat.com run-parts(/etc/cron.hourly)[24085]:
Aug 01 14:10:01 dell-r210ii-08.rhev.lab.eng.brq.redhat.com CROND[24089]: (root) CMD (/usr/sbin
Aug 01 14:25:01 dell-r210ii-08.rhev.lab.eng.brq.redhat.com CROND[24094]: (root) CMD (/usr/sbin
lines 1-11/11 (END)...skipping...
-- Logs begin at Thu 2013-08-01 11:40:38 CEST, end at Thu 2013-08-01 14:25:01 CEST. --
Aug 01 13:25:01 dell-r210ii-08.rhev.lab.eng.brq.redhat.com CROND[24054]: (root) CMD (/usr/sbin/logrotate /etc/logrotate.d/libvirtd)
Aug 01 13:40:01 dell-r210ii-08.rhev.lab.eng.brq.redhat.com CROND[24058]: (root) CMD (/usr/sbin/logrotate /etc/logrotate.d/libvirtd)
Aug 01 13:55:01 dell-r210ii-08.rhev.lab.eng.brq.redhat.com CROND[24064]: (root) CMD (/usr/sbin/logrotate /etc/logrotate.d/libvirtd)
Aug 01 14:01:01 dell-r210ii-08.rhev.lab.eng.brq.redhat.com CROND[24068]: (root) CMD (run-parts /etc/cron.hourly)
Aug 01 14:01:01 dell-r210ii-08.rhev.lab.eng.brq.redhat.com run-parts(/etc/cron.hourly)[24071]: starting 0anacron
Aug 01 14:01:01 dell-r210ii-08.rhev.lab.eng.brq.redhat.com run-parts(/etc/cron.hourly)[24077]: finished 0anacron
Aug 01 14:01:01 dell-r210ii-08.rhev.lab.eng.brq.redhat.com run-parts(/etc/cron.hourly)[24079]: starting vdsm-logrotate
Aug 01 14:01:01 dell-r210ii-08.rhev.lab.eng.brq.redhat.com run-parts(/etc/cron.hourly)[24085]: finished vdsm-logrotate
Aug 01 14:10:01 dell-r210ii-08.rhev.lab.eng.brq.redhat.com CROND[24089]: (root) CMD (/usr/sbin/logrotate /etc/logrotate.d/libvirtd)
Aug 01 14:25:01 dell-r210ii-08.rhev.lab.eng.brq.redhat.com CROND[24094]: (root) CMD (/usr/sbin/logrotate /etc/logrotate.d/libvirtd)

Comment 3 Alon Bar-Lev 2013-08-01 14:24:20 UTC
The journalctl -xn  output is out of sync... can you go to the same time window or reproduce and immediately run it?

Thanks!

Comment 4 Martin Pavlik 2013-08-01 15:18:56 UTC
Created attachment 781655 [details]
journal

Comment 5 Alon Bar-Lev 2013-08-01 15:26:40 UTC
Aug 01 12:07:17 dell-r210ii-08.rhev.lab.eng.brq.redhat.com network[9036]: Bringing up loopback interface:  [  OK  ]
Aug 01 12:07:17 dell-r210ii-08.rhev.lab.eng.brq.redhat.com network[9036]: Bringing up interface em1:
Aug 01 12:07:17 dell-r210ii-08.rhev.lab.eng.brq.redhat.com dhclient[9181]: dhclient(1096) is already running - exiting.
Aug 01 12:07:17 dell-r210ii-08.rhev.lab.eng.brq.redhat.com dhclient[9181]: 
Aug 01 12:07:17 dell-r210ii-08.rhev.lab.eng.brq.redhat.com dhclient[9181]: This version of ISC DHCP is based on the release available
Aug 01 12:07:17 dell-r210ii-08.rhev.lab.eng.brq.redhat.com dhclient[9181]: on ftp.isc.org.  Features have been added and other changes
Aug 01 12:07:17 dell-r210ii-08.rhev.lab.eng.brq.redhat.com dhclient[9181]: have been made to the base software release in order to make
Aug 01 12:07:17 dell-r210ii-08.rhev.lab.eng.brq.redhat.com dhclient[9181]: it work better with this distribution.
Aug 01 12:07:17 dell-r210ii-08.rhev.lab.eng.brq.redhat.com dhclient[9181]: 
Aug 01 12:07:17 dell-r210ii-08.rhev.lab.eng.brq.redhat.com dhclient[9181]: Please report for this software via the Red Hat Bugzilla site:
Aug 01 12:07:17 dell-r210ii-08.rhev.lab.eng.brq.redhat.com dhclient[9181]: http://bugzilla.redhat.com
Aug 01 12:07:17 dell-r210ii-08.rhev.lab.eng.brq.redhat.com dhclient[9181]: 
Aug 01 12:07:17 dell-r210ii-08.rhev.lab.eng.brq.redhat.com dhclient[9181]: exiting.
Aug 01 12:07:17 dell-r210ii-08.rhev.lab.eng.brq.redhat.com network[9036]: Determining IP information for em1...dhclient(1096) is already running - exiting.
Aug 01 12:07:17 dell-r210ii-08.rhev.lab.eng.brq.redhat.com network[9036]: This version of ISC DHCP is based on the release available
Aug 01 12:07:17 dell-r210ii-08.rhev.lab.eng.brq.redhat.com network[9036]: on ftp.isc.org.  Features have been added and other changes
Aug 01 12:07:17 dell-r210ii-08.rhev.lab.eng.brq.redhat.com network[9036]: have been made to the base software release in order to make
Aug 01 12:07:17 dell-r210ii-08.rhev.lab.eng.brq.redhat.com network[9036]: it work better with this distribution.
Aug 01 12:07:17 dell-r210ii-08.rhev.lab.eng.brq.redhat.com network[9036]: Please report for this software via the Red Hat Bugzilla site:
Aug 01 12:07:17 dell-r210ii-08.rhev.lab.eng.brq.redhat.com network[9036]: http://bugzilla.redhat.com
Aug 01 12:07:17 dell-r210ii-08.rhev.lab.eng.brq.redhat.com network[9036]: exiting.
Aug 01 12:07:17 dell-r210ii-08.rhev.lab.eng.brq.redhat.com network[9036]: failed.
Aug 01 12:07:17 dell-r210ii-08.rhev.lab.eng.brq.redhat.com network[9036]: [FAILED]
Aug 01 12:07:17 dell-r210ii-08.rhev.lab.eng.brq.redhat.com systemd[1]: network.service: control process exited, code=exited status=1
Aug 01 12:07:17 dell-r210ii-08.rhev.lab.eng.brq.redhat.com systemd[1]: Failed to start LSB: Bring up/down networking.
-- Subject: Unit network.service has failed

Comment 6 Alon Bar-Lev 2013-08-08 10:56:31 UTC
Antoni,

I need help here...

Alon

Comment 7 Antoni Segura Puimedon 2013-08-08 11:33:55 UTC
Ah... This is bad. network.service is not that resilient.

I think that this is a bug of network.service, it shouldn't be considered as failed if some interface is already up (I think that it is only considered a fail if the BOOTPROTO=dhcp).

If we can, we should have network.service maintainers fix this. Otherwise, the workaround would be to manually (not via systemctl) do a /etc/rc.d/init.d/network restart

and systemctl enable network.service

Comment 8 Alon Bar-Lev 2013-08-08 12:50:55 UTC
commit 13607e94f23d29e6fa69997652ba31a112055e68
Author: Alon Bar-Lev <alonbl>
Date:   Thu Aug 8 15:00:39 2013 +0300

    vdsm: packages: ignore errors when starting network service
    
    Bug-Url: https://bugzilla.redhat.com/show_bug.cgi?id=990980
    Change-Id: Ib46f1cec395c69e860318101fd76370d5598bf57
    Signed-off-by: Alon Bar-Lev <alonbl>

Comment 9 Itamar Heim 2013-09-23 12:14:53 UTC
bulk closing, assuming verified bugs are in 3.3.


Note You need to log in before you can comment on or make changes to this bug.