Bug 1420946

Summary: cloud-init conflicts with os-collect-config in a TripleO image
Product: Red Hat Enterprise Linux 7 Reporter: Juan Antonio Osorio <josorior>
Component: cloud-init Assignee: Lars Kellogg-Stedman <lars>
Status: CLOSED CURRENTRELEASE QA Contact:
Severity: high Docs Contact:
Priority: high    
Version: 7.3 CC: apevec, hmatsumo, huzhao, josorior, lars, lnatapov, mbracho, mburns, rcritten, sasha, scorcora, ykawada
Target Milestone: rc   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: 0.7.9-3 Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of: 1408589 Environment:
Last Closed: 2017-04-06 21:58:26 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1336504    

Description Juan Antonio Osorio 2017-02-09 23:13:48 UTC
When updating a TripleO CentOS image with the latest cloud-init, the overcloud deploy fails.

This shows up as the overcloud nodes being unable to execute any SoftwareConfig; those parts of the stack simply stall.

The logs show that both crond and os-collect-config were unable to start due to an ordering cycle in the systemd service files:

Feb 09 18:14:00 localhost.localdomain systemd[1]: microcode.service lacks both ExecStart= and ExecStop= setting. Refusing.
Feb 09 18:14:00 localhost.localdomain systemd[1]: Cannot add dependency job for unit microcode.service, ignoring: Unit is not loaded properly: Invalid argument.
Feb 09 18:14:00 localhost.localdomain systemd[1]: Found ordering cycle on multi-user.target/start
Feb 09 18:14:00 localhost.localdomain systemd[1]: Found dependency on crond.service/start
Feb 09 18:14:00 localhost.localdomain systemd[1]: Found dependency on os-collect-config.service/start
Feb 09 18:14:00 localhost.localdomain systemd[1]: Found dependency on cloud-final.service/start
Feb 09 18:14:00 localhost.localdomain systemd[1]: Found dependency on multi-user.target/start
Feb 09 18:14:00 localhost.localdomain systemd[1]: Breaking ordering cycle by deleting job crond.service/start
Feb 09 18:14:00 localhost.localdomain systemd[1]: Job crond.service/start deleted to break ordering cycle starting with multi-user.target/start
Feb 09 18:14:00 localhost.localdomain systemd[1]: Found ordering cycle on multi-user.target/start
Feb 09 18:14:00 localhost.localdomain systemd[1]: Found dependency on os-collect-config.service/start
Feb 09 18:14:00 localhost.localdomain systemd[1]: Found dependency on cloud-final.service/start
Feb 09 18:14:00 localhost.localdomain systemd[1]: Found dependency on multi-user.target/start
Feb 09 18:14:00 localhost.localdomain systemd[1]: Breaking ordering cycle by deleting job os-collect-config.service/start
Feb 09 18:14:00 localhost.localdomain systemd[1]: Job os-collect-config.service/start deleted to break ordering cycle starting with multi-user.target/start
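
To make the chain in the log above easier to read, here is a minimal Python sketch that collects each "Found ordering cycle" report into a list of jobs, in the order systemd printed them. The function name extract_cycles and the SAMPLE constant are hypothetical names introduced for illustration; the sample text is the first cycle report copied from the log above.

```python
import re

# Patterns for the two systemd message forms quoted in the log above.
CYCLE_START = re.compile(r"Found ordering cycle on (\S+)")
CYCLE_DEP = re.compile(r"Found dependency on (\S+)")


def extract_cycles(log_text):
    """Return one list of jobs per 'Found ordering cycle' report.

    Each list starts with the job the cycle was found on, followed by the
    jobs from the 'Found dependency on' lines, in the order they were logged.
    """
    cycles = []
    for line in log_text.splitlines():
        start = CYCLE_START.search(line)
        if start:
            cycles.append([start.group(1)])
            continue
        dep = CYCLE_DEP.search(line)
        if dep and cycles:
            cycles[-1].append(dep.group(1))
    return cycles


# First cycle report from the log above (hypothetical constant name).
SAMPLE = """\
Feb 09 18:14:00 localhost.localdomain systemd[1]: Found ordering cycle on multi-user.target/start
Feb 09 18:14:00 localhost.localdomain systemd[1]: Found dependency on crond.service/start
Feb 09 18:14:00 localhost.localdomain systemd[1]: Found dependency on os-collect-config.service/start
Feb 09 18:14:00 localhost.localdomain systemd[1]: Found dependency on cloud-final.service/start
Feb 09 18:14:00 localhost.localdomain systemd[1]: Found dependency on multi-user.target/start
Feb 09 18:14:00 localhost.localdomain systemd[1]: Breaking ordering cycle by deleting job crond.service/start
"""

for cycle in extract_cycles(SAMPLE):
    print(" -> ".join(cycle))
```

The printed chain begins and ends with multi-user.target/start, which is what makes it a cycle. The sketch only restates what the log already says; it does not inspect the unit files themselves.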
 

We last reproduced this error with cloud-init-0.7.9-2.el7.x86_64.

It seems that all 0.7.9-* builds fail.

We got it to work with 0.7.6-9.

Comment 2 Lars Kellogg-Stedman 2017-03-07 20:04:24 UTC
There is a scratch build available here that should resolve this problem:

  https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=12706804

It would be great if you could test that out and confirm that it works. Thanks!

Comment 3 Lars Kellogg-Stedman 2017-03-07 21:02:23 UTC
For QE: To verify this fix:

Boot a system with both os-collect-config and cloud-init-0.7.9-3 installed.  Examine the output of "journalctl -b -u multi-user.target" and verify that there are no messages of the form:

    Found ordering cycle on multi-user.target/start
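
The check above could also be scripted. The following is a rough Python sketch, assuming Python 3.7 or later and that journalctl is on the PATH of the booted system; the function names cycle_messages and boot_has_ordering_cycle are hypothetical and only illustrate the manual step described in this comment.

```python
import subprocess

# Message prefix that should be absent after the fix.
CYCLE_MARKER = "Found ordering cycle on"


def cycle_messages(journal_text):
    """Return the journal lines that report an ordering cycle."""
    return [line for line in journal_text.splitlines() if CYCLE_MARKER in line]


def boot_has_ordering_cycle():
    """Run the journalctl command from this comment for the current boot.

    Returns True if any ordering-cycle message is present.
    """
    result = subprocess.run(
        ["journalctl", "-b", "-u", "multi-user.target"],
        capture_output=True,
        text=True,
        check=True,
    )
    return bool(cycle_messages(result.stdout))
```

On a system with the fixed package, calling boot_has_ordering_cycle() would be expected to return False. The text-matching part is kept in its own function so it can be tried against saved journal output without running journalctl.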

Comment 4 Rob Crittenden 2017-03-13 22:04:25 UTC
Scratch build works for me.

Comment 5 Lars Kellogg-Stedman 2017-03-14 21:18:20 UTC
Thanks!

Comment 6 Juan Antonio Osorio 2017-03-27 11:16:01 UTC
Worked for me.