Using https://kojipkgs.fedoraproject.org//work/tasks/5585/25835585/Fedora-Cloud-Base-28-20180320.n.0.x86_64.qcow2, things boot and cloud-init runs, but it creates a bogus route pointing the metadata service to the local instance (which of course fails).

[  OK  ] Reached target Network.
[ 17.249056] cloud-init[924]: Cloud-init v. 17.1 running 'init' at Tue, 20 Mar 2018 16:26:32 +0000. Up 10.62 seconds.
[ 17.250631] cloud-init[924]: ci-info: ++++++++++++++++++++++++++++++Net device info+++++++++++++++++++++++++++++++
[ 17.252014] cloud-init[924]: ci-info: +--------+------+--------------+---------------+-------+-------------------+
[ 17.253460] cloud-init[924]: ci-info: | Device |  Up  |   Address    |      Mask     | Scope |     Hw-Address    |
[ 17.254789] cloud-init[924]: ci-info: +--------+------+--------------+---------------+-------+-------------------+
[ 17.256118] cloud-init[924]: ci-info: | eth0:  | True | 172.25.64.52 | 255.255.240.0 |   .   | fa:16:3e:11:cb:02 |
[ 17.257429] cloud-init[924]: ci-info: | eth0:  | True |      .       |       .       |   d   | fa:16:3e:11:cb:02 |
[ 17.258761] cloud-init[924]: ci-info: |  lo:   | True |  127.0.0.1   |   255.0.0.0   |   .   |         .         |
[ 17.260064] cloud-init[924]: ci-info: |  lo:   | True |      .       |       .       |   d   |         .         |
[ 17.261368] cloud-init[924]: ci-info: +--------+------+--------------+---------------+-------+-------------------+
[ 17.262745] cloud-init[924]: ci-info: +++++++++++++++++++++++++++++Route IPv4 info+++++++++++++++++++++++++++++
[ 17.264081] cloud-init[924]: ci-info: +-------+-------------+-------------+---------------+-----------+-------+
[ 17.265400] cloud-init[924]: ci-info: | Route | Destination |   Gateway   |    Genmask    | Interface | Flags |
[ 17.266710] cloud-init[924]: ci-info: +-------+-------------+-------------+---------------+-----------+-------+
[ 17.268085] cloud-init[924]: ci-info: |   0   |   0.0.0.0   | 172.25.64.1 |    0.0.0.0    |    eth0   |   UG  |
[ 17.269401] cloud-init[924]: ci-info: |   1   | 169.254.0.0 |   0.0.0.0   |  255.255.0.0  |    eth0   |   U   |
[ 17.271197] cloud-init[924]: ci-info: |   2   | 172.25.64.0 |   0.0.0.0   | 255.255.240.0 |    eth0   |   U   |
[ 17.272526] cloud-init[924]: ci-info: +-------+-------------+-------------+---------------+-----------+-------+
[ 17.273895] cloud-init[924]: 2018-03-20 16:26:38,796 - url_helper.py[WARNING]: Calling 'http://169.254.169.254/2009-04-04/meta-data/instance-id' failed [2/120s]: request error [HTTPConnectionPool(host='169.254.169.254', port=80): Max retries exceeded with url: /2009-04-04/meta-data/instance-id (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fb01c7aaeb8>: Failed to establish a new connection: [Errno 113] No route to host',))]
[ 20.320776] cloud-init[924]: 2018-03-20 16:26:41,868 - url_helper.py[WARNING]: Calling 'http://169.254.169.254/2009-04-04/meta-data/instance-id' failed [5/120s]: request error [HTTPConnectionPool(host='169.254.169.254', port=80): Max retries exceeded with url: /2009-04-04/meta-data/instance-id (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fb01c7c8710>: Failed to establish a new connection: [Errno 113] No route to host',))]
...tons more of those...
[ 194.551942] cloud-init[924]: 2018-03-20 16:28:42,435 - DataSourceEc2.py[CRITICAL]: Giving up on md from ['http://169.254.169.254/2009-04-04/meta-data/instance-id'] after 126 seconds
[ 194.559970] cloud-init[924]: 2018-03-20 16:28:42,443 - url_helper.py[WARNING]: Calling 'http://172.25.64.3/latest/meta-data/instance-id' failed [0/120s]: request error [HTTPConnectionPool(host='172.25.64.3', port=80): Max retries exceeded with url: /latest/meta-data/instance-id (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fb01c7c8780>: Failed to establish a new connection: [Errno 111] Connection refused',))]
...

At this point the instance is up and can be pinged fine, but since it couldn't reach the metadata service, no ssh keys are set up, etc. So you cannot log in. Eventually it times out again and ssh starts, but no ssh keys have been injected, so you still cannot log in.

Note that this is a very old OpenStack (RHOS 5). It would be good to know if this problem happens on newer clouds too.
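To make the effect of the extra route concrete, here is a minimal sketch of longest-prefix route selection using the three routes from the "Route IPv4 info" table in the boot log above. It is a simplification and not how the kernel is implemented: metrics and scopes are ignored, and the helper name select_route is made up for illustration.

```python
import ipaddress

# Routes copied from the "Route IPv4 info" table in the boot log:
# (destination network, gateway, interface). None means on-link (no gateway).
ROUTES = [
    ("0.0.0.0/0", "172.25.64.1", "eth0"),
    ("169.254.0.0/16", None, "eth0"),
    ("172.25.64.0/20", None, "eth0"),  # 255.255.240.0 is a /20
]


def select_route(dest, routes):
    """Return the matching route with the longest prefix (hypothetical helper).

    Simplified model: real route lookup also considers metrics and scope.
    """
    addr = ipaddress.ip_address(dest)
    candidates = [
        (ipaddress.ip_network(net), gw, dev)
        for net, gw, dev in routes
        if addr in ipaddress.ip_network(net)
    ]
    return max(candidates, key=lambda route: route[0].prefixlen)


if __name__ == "__main__":
    net, gw, dev = select_route("169.254.169.254", ROUTES)
    print(f"169.254.169.254 -> {net} gateway={gw} dev={dev}")
```

With the 169.254.0.0/16 entry present, the metadata address is treated as directly reachable on eth0 rather than being sent to 172.25.64.1; dropping that entry from ROUTES makes the lookup fall back to the default route.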
Proposed as a Blocker for 28-beta by Fedora user kevin using the blocker tracking app because: "The cloud-init package must be functional for release blocking cloud images."
+1 blocker, obviously hits the criterion right on the nose.
cloud-init has a disable_ec2_metadata switch one can use to block access to the EC2 metadata service, but it shouldn't be enabled by default and it *definitely* shouldn't be applied this early in the boot process. Do you happen to have the rest of the VM's boot-time output handy?
Created attachment 1410821: boot logs from cloud instance

Here are the logs...
+1 blocker if this is indeed true for all installations. If it turns out that it's limited to old versions of OpenStack, I'll revise that.
Agreed with Stephen, there, I'm +1 assuming it's a general failure (has anyone tested EC2 yet?)
Has anyone tested with other/newer clouds? If it's happening in every instance, then +1 blocker.
That's +3, setting accepted - if it's shown that this doesn't happen on other clouds, I will drop accepted status for a revote.
The route also gets added on EC2, although due to networking differences between OpenStack and EC2, it does not actually break the metadata gathering there. Regardless, this means that the networking behaviour changed, and it breaks on clouds that aren't set up for these routes.

Cloud image from Fedora-28-20180321.n.0:

Mar 21 22:52:14 ip-172-30-2-30.ec2.internal cloud-init[957]: ci-info: ++++++++++++++++++++++++++++Route IPv4 info+++++++++++++++++++++++++++++
Mar 21 22:52:14 ip-172-30-2-30.ec2.internal cloud-init[957]: ci-info: +-------+-------------+------------+---------------+-----------+-------+
Mar 21 22:52:14 ip-172-30-2-30.ec2.internal cloud-init[957]: ci-info: | Route | Destination |  Gateway   |    Genmask    | Interface | Flags |
Mar 21 22:52:14 ip-172-30-2-30.ec2.internal cloud-init[957]: ci-info: +-------+-------------+------------+---------------+-----------+-------+
Mar 21 22:52:14 ip-172-30-2-30.ec2.internal cloud-init[957]: ci-info: |   0   |   0.0.0.0   | 172.30.2.1 |    0.0.0.0    |    eth0   |   UG  |
Mar 21 22:52:14 ip-172-30-2-30.ec2.internal cloud-init[957]: ci-info: |   1   | 169.254.0.0 |  0.0.0.0   |  255.255.0.0  |    eth0   |   U   |
Mar 21 22:52:14 ip-172-30-2-30.ec2.internal cloud-init[957]: ci-info: |   2   |  172.30.2.0 |  0.0.0.0   | 255.255.255.0 |    eth0   |   U   |
Mar 21 22:52:14 ip-172-30-2-30.ec2.internal cloud-init[957]: ci-info: +-------+-------------+------------+---------------+-----------+-------+

Current live Fedora 27 cloud image:

Mar 21 22:56:42 ip-172-30-2-41.ec2.internal cloud-init[825]: ci-info: ++++++++++++++++++++++++++++Route IPv4 info+++++++++++++++++++++++++++++
Mar 21 22:56:42 ip-172-30-2-41.ec2.internal cloud-init[825]: ci-info: +-------+-------------+------------+---------------+-----------+-------+
Mar 21 22:56:42 ip-172-30-2-41.ec2.internal cloud-init[825]: ci-info: | Route | Destination |  Gateway   |    Genmask    | Interface | Flags |
Mar 21 22:56:42 ip-172-30-2-41.ec2.internal cloud-init[825]: ci-info: +-------+-------------+------------+---------------+-----------+-------+
Mar 21 22:56:42 ip-172-30-2-41.ec2.internal cloud-init[825]: ci-info: |   0   |   0.0.0.0   | 172.30.2.1 |    0.0.0.0    |    eth0   |   UG  |
Mar 21 22:56:42 ip-172-30-2-41.ec2.internal cloud-init[825]: ci-info: |   1   |  172.30.2.0 |  0.0.0.0   | 255.255.255.0 |    eth0   |   U   |
Mar 21 22:56:42 ip-172-30-2-41.ec2.internal cloud-init[825]: ci-info: +-------+-------------+------------+---------------+-----------+-------+
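For anyone who wants to compare route tables from other clouds in the same way, a small sketch of the comparison follows. The two sets are transcribed by hand from the EC2 tables above as (destination, gateway, genmask) tuples; the variable names are illustrative only.

```python
# Routes transcribed from the two "Route IPv4 info" tables above,
# as (destination, gateway, genmask). All of them are on eth0.
F28_ROUTES = {
    ("0.0.0.0", "172.30.2.1", "0.0.0.0"),
    ("169.254.0.0", "0.0.0.0", "255.255.0.0"),
    ("172.30.2.0", "0.0.0.0", "255.255.255.0"),
}
F27_ROUTES = {
    ("0.0.0.0", "172.30.2.1", "0.0.0.0"),
    ("172.30.2.0", "0.0.0.0", "255.255.255.0"),
}

# Set difference shows what the Fedora 28 image adds or drops.
added = F28_ROUTES - F27_ROUTES
removed = F27_ROUTES - F28_ROUTES

if __name__ == "__main__":
    print("added in F28:  ", sorted(added))
    print("removed in F28:", sorted(removed))
```

The only difference is the 169.254.0.0/255.255.0.0 entry on the Fedora 28 image.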
So, after some debug images and poking around, I am pretty sure the bug is this:

- cloud-init overwrites our /etc/sysconfig/network file with:

    # Created by cloud-init on instance boot automatically, do not edit.
    #
    NETWORKING=yes

- This means that the 2 lines we add in the kickstart are gone:

    NOZEROCONF=yes
    DEVTIMEOUT=10

- network starts, and NOZEROCONF is not set, so (as you can see in ifup-eth):

    # Add Zeroconf route.
    if [ -z "${NOZEROCONF}" -a "${ISALIAS}" = "no" -a "${REALDEVICE}" != "lo" ]; then
        ip route add 169.254.0.0/16 dev ${REALDEVICE} metric $((1000 + $(cat /sys/class/net/${REALDEVICE}/ifindex))) scope link
    fi

- Now the metadata route is hosed: 169.254.169.254 matches the new on-link 169.254.0.0/16 route instead of going out via the default gateway.
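One possible direction, sketched below purely as an illustration and not as the change that actually shipped in cloud-init: merge the new settings into whatever is already in /etc/sysconfig/network instead of replacing the file. The function names and the sample file contents are assumptions made for the example (in particular, the kickstart-written file is assumed to contain only these three lines).

```python
# Header text as shown in the overwritten file above.
HEADER = (
    "# Created by cloud-init on instance boot automatically, do not edit.\n"
    "#\n"
)


def parse_sysconfig(text):
    """Parse KEY=value lines, skipping blank lines and comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def merge_sysconfig(existing_text, updates):
    """Return new file contents that keep existing keys (hypothetical helper).

    Keys in `updates` override existing values; all other keys survive.
    """
    settings = parse_sysconfig(existing_text)
    settings.update(updates)
    body = "".join(f"{key}={value}\n" for key, value in settings.items())
    return HEADER + body


if __name__ == "__main__":
    # Assumed contents of the file as written by the kickstart.
    kickstart_file = "NETWORKING=yes\nNOZEROCONF=yes\nDEVTIMEOUT=10\n"
    print(merge_sysconfig(kickstart_file, {"NETWORKING": "yes"}), end="")
```

With a merge like this, NOZEROCONF=yes and DEVTIMEOUT=10 survive the rewrite, so the ifup-eth check quoted above would skip the zeroconf route.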
So, how are we gonna fix it?
cloud-init-17.1-3.fc28 has been submitted as an update to Fedora 28. https://bodhi.fedoraproject.org/updates/FEDORA-2018-86137b2be8
The fix for this was pulled into the Beta-1.1 (Beta RC1) compose. Can anyone confirm the fix in the Cloud images from that compose? Thanks.
cloud-init-17.1-4.fc28 has been pushed to the Fedora 28 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2018-86137b2be8
I tested: http://dl.fedoraproject.org/pub/alt/stage/28_Beta-1.1/Cloud/x86_64/images/Fedora-Cloud-Base-28_Beta-1.1.x86_64.qcow2 and http://dl.fedoraproject.org/pub/alt/stage/28_Beta-1.1/Cloud/ppc64le/images/Fedora-Cloud-Base-28_Beta-1.1.ppc64le.qcow2 and they boot fine and I can log in to them. So, the fix works here as far as I can tell.
Tested with Fedora-Cloud-Base-28_Beta-1.1.x86_64 on EC2 and locally with testcloud, boot and login work OK!
cloud-init-17.1-4.fc28 has been pushed to the Fedora 28 stable repository. If problems still persist, please make note of it in this bug report.