Bug 1393094
| Summary: | f25: cloud-init systemd dependency loop for multi-user.target | ||
|---|---|---|---|
| Product: | [Fedora] Fedora | Reporter: | Dusty Mabe <dustymabe> |
| Component: | cloud-init | Assignee: | Garrett Holmstrom <gholms> |
| Status: | CLOSED ERRATA | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 25 | CC: | adimania, apevec, gholms, Jan.van.Eldik, lars, mattdm, shardy, s, steve.traylen, strigazi, walters |
| Target Milestone: | --- | ||
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | cloud-init-0.7.8-5.fc25 | Doc Type: | If docs needed, set a value |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2017-02-02 20:20:48 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
|
Description
Dusty Mabe
2016-11-08 20:43:10 UTC
Any update on this. I have noticed quite a few breaks on the atomic host. It's a bit random, but in slow vms, it happens a lot. The following are logs from OpenStack/Magnum's tests in the openstack infrastructure. Two vms are created using Fedora Atomic 20170106. In one node cloud init configures the network. [1] On the other node it doesn't. [2] Deployed in the same environment with the same qcow2 image. In an environment built with the same process, we get the error mentioned in the bug description: [3] Jan 16 13:03:02 localhost.localdomain systemd[1]: cloud-final.service: Job cloud-final.service/start deleted to break ordering cycle starting with cloud-init.target/start [1] http://logs.openstack.org/57/417457/22/check/gate-functional-dsvm-magnum-k8s-ubuntu-xenial/f300eb2/logs/cluster-nodes/node-test_replication_controller_apis-172.24.5.8/cloud-init-output.txt.gz [2] http://logs.openstack.org/57/417457/22/check/gate-functional-dsvm-magnum-k8s-ubuntu-xenial/f300eb2/logs/cluster-nodes/master-test_replication_controller_apis-172.24.5.5/cloud-config.txt.gz [3] http://logs.openstack.org/57/417457/22/check/gate-functional-dsvm-magnum-swarm-ubuntu-xenial/f976e1b/logs/cluster-nodes/master-test_start_stop_container_from_api-172.24.5.8/cloud-final.txt.gz Hi, This looks to be fixed in now release 0.7.9 version of cloud init. Please can we update 0.7.9? Building https://koji.fedoraproject.org/koji/taskinfo?taskID=17346019 I'll backport to f25 soon. > Building https://koji.fedoraproject.org/koji/taskinfo?taskID=17346019
>
> I'll backport to f25 soon.
That build failed
also here is a scratch build for f25 if anyone wants to grab it and test things out: https://koji.fedoraproject.org/koji/taskinfo?taskID=17393273 I found an issue with the new cloud-init - basically during the run of cloud-init it tries to set the hostname using hostnamectl but dbus hasn't started yet and it fails. We need to figure this out.
from the debug log:
```
2017-01-25 08:59:38,338 - util.py[DEBUG]: Running module set_hostname (<module 'cloudinit.config.cc_set_hostname' from '/usr/lib/python3.5/site-packages/cloudinit/config/cc_set_hostname.py'>) failed
Traceback (most recent call last):
File "/usr/lib/python3.5/site-packages/cloudinit/stages.py", line 776, in _run_modules
freq=freq)
File "/usr/lib/python3.5/site-packages/cloudinit/cloud.py", line 54, in run
return self._runners.run(name, functor, args, freq, clear_on_fail)
File "/usr/lib/python3.5/site-packages/cloudinit/helpers.py", line 187, in run
results = functor(*args)
File "/usr/lib/python3.5/site-packages/cloudinit/config/cc_set_hostname.py", line 47, in handle
cloud.distro.set_hostname(hostname, fqdn)
File "/usr/lib/python3.5/site-packages/cloudinit/distros/__init__.py", line 84, in set_hostname
self._write_hostname(writeable_hostname, self.hostname_conf_fn)
File "/usr/lib/python3.5/site-packages/cloudinit/distros/rhel.py", line 130, in _write_hostname
util.subp(['hostnamectl', 'set-hostname', str(hostname)])
File "/usr/lib/python3.5/site-packages/cloudinit/util.py", line 1850, in subp
cmd=args)
cloudinit.util.ProcessExecutionError: Unexpected error while running command.
Command: ['hostnamectl', 'set-hostname', 'cloudhost.localdomain']
Exit code: 1
Reason: -
Stdout: -
Stderr: Failed to create bus connection: No such file or directory
```
and from the journal you can see that dbus starts after cloud-init:
```
-bash-4.3# journalctl -u cloud-init | grep -P 'Starting|hostname'
Jan 25 08:59:36 localhost.localdomain systemd[1]: Starting Initial cloud-init job (metadata service crawler)...
Jan 25 08:59:38 localhost.localdomain cloud-init[1308]: 2017-01-25 08:59:38,331 - util.py[WARNING]: Failed to set the hostname to cloudhost.localdomain (cloudhost)
Jan 25 08:59:38 localhost.localdomain cloud-init[1308]: 2017-01-25 08:59:38,338 - util.py[WARNING]: Running module set_hostname (<module 'cloudinit.config.cc_set_hostname' from '/usr/lib/python3.5/site-packages/cloudinit/config/cc_set_hostname.py'>) failed
-bash-4.3#
-bash-4.3#
-bash-4.3# journalctl -u dbus | head -n 2
-- Logs begin at Wed 2017-01-25 08:59:29 UTC, end at Wed 2017-01-25 09:02:04 UTC. --
Jan 25 08:59:38 localhost.localdomain systemd[1]: Started D-Bus System Message Bus.
```
We need to file a bug upstream and/or send a patch to get this fixed for us.
I pulled the patch from the commit in the first comment of this BZ and applied it to cloud-init 0.7.8 in Fedora and built a scratch build:
https://koji.fedoraproject.org/koji/taskinfo?taskID=17405006
cloud-init 0.7.9 includes: DefaultDependencies=no Which means that it does *not* have the implicit dependency on basic.target. So with 0.7.8, running systemd-analyze dot cloud-init.service yields: "cloud-init.service"->"basic.target" [color="green"]; But in 0.7.9 there is no such dependency. dbus.service runs before basic.target, and this is why everything works in 0.7.8 but hostname setting fails in 0.7.9. With this build [1] from comment #9, I did a few runs in OpenStack's CI and I didn't see the error again. [1] https://koji.fedoraproject.org/koji/taskinfo?taskID=17405006 Filed the new issue as bug 1417025. cloud-init-0.7.8-5.fc25 has been submitted as an update to Fedora 25. https://bodhi.fedoraproject.org/updates/FEDORA-2017-1f675fd070 cloud-init-0.7.8-5.fc25 has been pushed to the Fedora 25 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2017-1f675fd070 cloud-init-0.7.8-5.fc25 has been pushed to the Fedora 25 stable repository. If problems still persist, please make note of it in this bug report. |