Bug 1795383 - tripleo node restart fail due to error from cloud-init.
Summary: tripleo node restart fail due to error from cloud-init.
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-common
Version: 16.0 (Train)
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: beta
Target Release: 16.1 (Train on RHEL 8.2)
Assignee: Adriano Petrich
QA Contact: Alexander Chuzhoy
URL:
Whiteboard:
Duplicates: 1791949
Depends On: 1768770 1802152
Blocks:
Reported: 2020-01-27 20:02 UTC by Toure Dunnon
Modified: 2020-04-02 16:42 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-04-02 16:42:11 UTC
Target Upstream Version:
Embargoed:


Description Toure Dunnon 2020-01-27 20:02:06 UTC

Comment 1 Toure Dunnon 2020-01-27 20:03:25 UTC
Description of problem:
After deploying OSP16, rebooting any of the active nodes triggers an error from
cloud-init that prevents most cloud services from starting normally. This problem
did not exist under OSP13, which runs a slightly older package.

cloud-init-18.5-3.el7.x86_64 <-- older package, Python 2 based.


cloud-init-18.5-7.el8.noarch <-- newer package, Python 3 based.

---------------------------------------------------------------------------------



 ] Started Ceph Monitor.
[   35.673378] cloud-init[3837]: Cloud-init v. 18.5 running 'init' at Thu, 16 Jan 2020 18:11:51 +0000. Up 35.54 seconds.
[   35.673675] cloud-init[3837]: ci-info: +++++++++++++++++++++++++++++++++++++++++++Net device info++++++++++++++++++++++++++++++++++++++++++++
[   35.673737] cloud-init[3837]: ci-info: +----------------+-------+------------------------------+---------------+--------+-------------------+
[   35.673780] cloud-init[3837]: ci-info: |     Device     |   Up  |           Address            |      Mask     | Scope  |     Hw-Address    |
[   35.673826] cloud-init[3837]: ci-info: +----------------+-------+------------------------------+---------------+--------+-------------------+
[   35.673883] cloud-init[3837]: ci-info: |     br-ex      |  True |          10.0.0.122          | 255.255.255.0 | global | 52:54:00:71:1d:1a |
[   35.673915] cloud-init[3837]: ci-info: |     br-ex      |  True |  fe80::5054:ff:fe71:1d1a/64  |       .       |  link  | 52:54:00:71:1d:1a |
[   35.673957] cloud-init[3837]: ci-info: |     br-int     | False |              .               |       .       |   .    | 6e:f3:e8:95:2d:49 |
[   35.674006] cloud-init[3837]: ci-info: |  br-isolated   |  True |  fe80::5054:ff:fe0a:a7b0/64  |       .       |  link  | 52:54:00:0a:a7:b0 |
[   35.674060] cloud-init[3837]: ci-info: |      ens3      |  True |        192.168.24.37         | 255.255.255.0 | global | 52:54:00:ff:a9:9e |
[   35.674099] cloud-init[3837]: ci-info: |      ens3      |  True |  fe80::5054:ff:feff:a99e/64  |       .       |  link  | 52:54:00:ff:a9:9e |
[   35.674146] cloud-init[3837]: ci-info: |      ens4      |  True |  fe80::5054:ff:fe0a:a7b0/64  |       .       |  link  | 52:54:00:0a:a7:b0 |
[   35.674177] cloud-init[3837]: ci-info: |      ens5      |  True |  fe80::5054:ff:fe71:1d1a/64  |       .       |  link  | 52:54:00:71:1d:1a |
[   35.674208] cloud-init[3837]: ci-info: | genev_sys_6081 |  True | fe80::348d:afff:fe7e:b51a/64 |       .       |  link  | 36:8d:af:7e:b5:1a |
[   35.674244] cloud-init[3837]: ci-info: |       lo       |  True |          127.0.0.1           |   255.0.0.0   |  host  |         .         |
[   35.674275] cloud-init[3837]: ci-info: |       lo       |  True |           ::1/128            |       .       |  host  |         .         |
[   35.674311] cloud-init[3837]: ci-info: |   ovs-system   | False |              .               |       .       |   .    | 0e:6f:7a:be:e7:48 |
[   35.674343] cloud-init[3837]: ci-info: |     vlan20     |  True |         172.17.1.22          | 255.255.255.0 | global | 6a:7c:9a:9a:76:c2 |
[   35.674386] cloud-init[3837]: ci-info: |     vlan20     |  True | fe80::687c:9aff:fe9a:76c2/64 |       .       |  link  | 6a:7c:9a:9a:76:c2 |
[   35.674419] cloud-init[3837]: ci-info: |     vlan30     |  True |         172.17.3.94          | 255.255.255.0 | global | 92:8c:06:6f:0b:12 |
[   35.674451] cloud-init[3837]: ci-info: |     vlan30     |  True |  fe80::908c:6ff:fe6f:b12/64  |       .       |  link  | 92:8c:06:6f:0b:12 |
[   35.674508] cloud-init[3837]: ci-info: |     vlan40     |  True |         172.17.4.42          | 255.255.255.0 | global | 2e:b1:8e:17:8b:00 |
[   35.674547] cloud-init[3837]: ci-info: |     vlan40     |  True | fe80::2cb1:8eff:fe17:8b00/64 |       .       |  link  | 2e:b1:8e:17:8b:00 |
[   35.674592] cloud-init[3837]: ci-info: |     vlan50     |  True |         172.17.2.44          | 255.255.255.0 | global | 0e:72:95:e6:0d:43 |
[   35.674637] cloud-init[3837]: ci-info: |     vlan50     |  True |  fe80::c72:95ff:fee6:d43/64  |       .       |  link  | 0e:72:95:e6:0d:43 |
[   35.674668] cloud-init[3837]: ci-info: +----------------+-------+------------------------------+---------------+--------+-------------------+
[   35.674710] cloud-init[3837]: ci-info: ++++++++++++++++++++++++++++++++Route IPv4 info+++++++++++++++++++++++++++++++++
[   35.674744] cloud-init[3837]: ci-info: +-------+-----------------+--------------+-----------------+-----------+-------+
[   35.674774] cloud-init[3837]: ci-info: | Route |   Destination   |   Gateway    |     Genmask     | Interface | Flags |
[   35.674804] cloud-init[3837]: ci-info: +-------+-----------------+--------------+-----------------+-----------+-------+
[   35.674832] cloud-init[3837]: ci-info: |   0   |     0.0.0.0     |   10.0.0.1   |     0.0.0.0     |   br-ex   |   UG  |
[   35.674860] cloud-init[3837]: ci-info: |   1   |     10.0.0.0    |   0.0.0.0    |  255.255.255.0  |   br-ex   |   U   |
[   35.674890] cloud-init[3837]: ci-info: |   2   | 169.254.169.254 | 192.168.24.1 | 255.255.255.255 |    ens3   |  UGH  |
[   35.674921] cloud-init[3837]: ci-info: |   3   |    172.17.1.0   |   0.0.0.0    |  255.255.255.0  |   vlan20  |   U   |
[   35.674954] cloud-init[3837]: ci-info: |   4   |    172.17.2.0   |   0.0.0.0    |  255.255.255.0  |   vlan50  |   U   |
[   35.674984] cloud-init[3837]: ci-info: |   5   |    172.17.3.0   |   0.0.0.0    |  255.255.255.0  |   vlan30  |   U   |
[   35.675025] cloud-init[3837]: ci-info: |   6   |    172.17.4.0   |   0.0.0.0    |  255.255.255.0  |   vlan40  |   U   |
[   35.675062] cloud-init[3837]: ci-info: |   7   |   192.168.24.0  |   0.0.0.0    |  255.255.255.0  |    ens3   |   U   |
[   35.675100] cloud-init[3837]: ci-info: +-------+-----------------+--------------+-----------------+-----------+-------+
[   35.675132] cloud-init[3837]: ci-info: +++++++++++++++++++++++++++++++++Route IPv6 info++++++++++++++++++++++++++++++++++
[   35.675185] cloud-init[3837]: ci-info: +-------+---------------------+-------------------------+----------------+-------+
[   35.675216] cloud-init[3837]: ci-info: | Route |     Destination     |         Gateway         |   Interface    | Flags |
[   35.675264] cloud-init[3837]: ci-info: +-------+---------------------+-------------------------+----------------+-------+
[   35.675292] cloud-init[3837]: ci-info: |   9   | 2620:52:0:13b8::/64 |            ::           |     br-ex      |   Ue  |
[   35.675323] cloud-init[3837]: ci-info: |   11  |      fe80::/64      |            ::           | genev_sys_6081 |   U   |
[   35.675354] cloud-init[3837]: ci-info: |   12  |      fe80::/64      |            ::           |     br-ex      |   U   |
[   35.675402] cloud-init[3837]: ci-info: |   13  |      fe80::/64      |            ::           |  br-isolated   |   U   |
[   35.675450] cloud-init[3837]: ci-info: |   14  |      fe80::/64      |            ::           |      ens3      |   U   |
[   35.675497] cloud-init[3837]: ci-info: |   15  |      fe80::/64      |            ::           |      ens4      |   U   |
[   35.675542] cloud-init[3837]: ci-info: |   16  |      fe80::/64      |            ::           |      ens5      |   U   |
[   35.675587] cloud-init[3837]: ci-info: |   17  |      fe80::/64      |            ::           |     vlan20     |   U   |
[   35.675619] cloud-init[3837]: ci-info: |   18  |      fe80::/64      |            ::           |     vlan30     |   U   |
[   35.675652] cloud-init[3837]: ci-info: |   19  |      fe80::/64      |            ::           |     vlan40     |   U   |
[   35.675690] cloud-init[3837]: ci-info: |   20  |      fe80::/64      |            ::           |     vlan50     |   U   |
[   35.675741] cloud-init[3837]: ci-info: |   21  |         ::/0        | fe80::5054:ff:fe55:2f63 |     br-ex      |  UGe  |
[   35.675775] cloud-init[3837]: ci-info: |   23  |        local        |            ::           |     vlan50     |   U   |
[   35.675805] cloud-init[3837]: ci-info: |   24  |        local        |            ::           |     vlan40     |   U   |
[   35.675844] cloud-init[3837]: ci-info: |   25  |        local        |            ::           | genev_sys_6081 |   U   |
[   35.675881] cloud-init[3837]: ci-info: |   26  |        local        |            ::           |  br-isolated   |   U   |
[   35.675921] cloud-init[3837]: ci-info: |   27  |        local        |            ::           |      ens4      |   U   |
[   35.675966] cloud-init[3837]: ci-info: |   28  |        local        |            ::           |     br-ex      |   U   |
[   35.775037] serial8250: too much work for irq4
[   35.676012] cloud-init[3837]: ci-info: |   29  |        local        |            ::           |      ens5      |   U   |
[   35.676056] cloud-init[3837]: ci-info: |   30  |        local        |            ::           |      ens3      |   U   |
[   35.676091] cloud-init[3837]: ci-info: |   31  |        local        |            ::           |     vlan20     |   U   |
[   35.676124] cloud-init[3837]: ci-info: |   32  |        local        |            ::           |     vlan30     |   U   |
[   35.676177] cloud-init[3837]: ci-info: |   33  |       ff00::/8      |            ::           | genev_sys_6081 |   U   |
[   35.676242] cloud-init[3837]: ci-info: |   34  |       ff00::/8      |            ::           |     br-ex      |   U   |
[   35.676271] cloud-init[3837]: ci-info: |   35  |       ff00::/8      |            ::           |  br-isolated   |   U   |
[   35.676312] cloud-init[3837]: ci-info: |   36  |       ff00::/8      |            ::           |      ens3      |   U   |
[   35.676349] cloud-init[3837]: ci-info: |   37  |       ff00::/8      |            ::           |      ens4      |   U   |
[   35.676383] cloud-init[3837]: ci-info: |   38  |       ff00::/8      |            ::           |      ens5      |   U   |
[   35.676414] cloud-init[3837]: ci-info: |   39  |       ff00::/8      |            ::           |     vlan20     |   U   |
[   35.676443] cloud-init[3837]: ci-info: |   40  |       ff00::/8      |            ::           |     vlan30     |   U   |
[   35.676474] cloud-init[3837]: ci-info: |   41  |       ff00::/8      |            ::           |     vlan40     |   U   |
[   35.676532] cloud-init[3837]: ci-info: |   42  |       ff00::/8      |            ::           |     vlan50     |   U   |
[   35.676588] cloud-init[3837]: ci-info: +-------+---------------------+-------------------------+----------------+-------+
[   35.676617] cloud-init[3837]: 2020-01-16 18:11:51,531 - util.py[WARNING]: failed stage init
[   35.798450] cloud-init[3837]: failed run of stage init
[   35.798614] cloud-init[3837]: ------------------------------------------------------------
[   35.798649] cloud-init[3837]: Traceback (most recent call last):
[   35.798680] cloud-init[3837]:   File "/usr/lib/python3.6/site-packages/cloudinit/cmd/main.py", line 652, in status_wrapper
[   35.798732] cloud-init[3837]:     ret = functor(name, args)
[   35.798771] cloud-init[3837]:   File "/usr/lib/python3.6/site-packages/cloudinit/cmd/main.py", line 362, in main_init
[   35.798802] cloud-init[3837]:     init.apply_network_config(bring_up=bool(mode != sources.DSMODE_LOCAL))
[   35.798833] cloud-init[3837]:   File "/usr/lib/python3.6/site-packages/cloudinit/stages.py", line 649, in apply_network_config
[   35.798864] cloud-init[3837]:     netcfg, src = self._find_networking_config()
[   35.798894] cloud-init[3837]:   File "/usr/lib/python3.6/site-packages/cloudinit/stages.py", line 636, in _find_networking_config
[   35.798924] cloud-init[3837]:     if self.datasource and hasattr(self.datasource, 'network_config'):
[   35.798955] cloud-init[3837]:   File "/usr/lib/python3.6/site-packages/cloudinit/sources/DataSourceConfigDrive.py", line 155, in network_config
[   35.798985] cloud-init[3837]:     self.network_json, known_macs=self.known_macs)
[   35.799046] cloud-init[3837]:   File "/usr/lib/python3.6/site-packages/cloudinit/sources/helpers/openstack.py", line 655, in convert_net_json
[   35.799112] cloud-init[3837]:     known_macs = net.get_interfaces_by_mac()
[   35.799152] cloud-init[3837]:   File "/usr/lib/python3.6/site-packages/cloudinit/net/__init__.py", line 596, in get_interfaces_by_mac
[   35.799184] cloud-init[3837]:     (name, ret[mac], mac))
[   35.799234] cloud-init[3837]: RuntimeError: duplicate mac found! both 'br-ex' and 'ens5' have mac '52:54:00:71:1d:1a'
[   35.799270] cloud-init[3837]: ------------------------------------------------------------
[  OK  ] Started Ceph Manager.
[FAILED] Failed to start Initial cloud-init job (metadata service crawler).
See 'systemctl status cloud-init.service' for details.

Version-Release number of selected component (if applicable):

cloud-init-18.5-7.el8.noarch

How reproducible:

Perform a standard deployment of OSP16. Once all nodes and services have been deployed,
reboot one of the deployed overcloud nodes; the above message will be logged.

Steps to Reproduce:
1. Install the current version of RHOS16
2. Reboot a node from the successful deployment.
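
The duplicate named in the traceback can be checked on a deployed node before rebooting it. The following is a minimal sketch, assuming the standard Linux sysfs layout in which each interface exposes its MAC in /sys/class/net/<iface>/address. The function name shared_macs is hypothetical and is not part of cloud-init.

```python
import os
from collections import defaultdict


def shared_macs(sysfs_net="/sys/class/net"):
    """Return {mac: [interface names]} for every MAC used by more than one interface."""
    by_mac = defaultdict(list)
    for name in sorted(os.listdir(sysfs_net)):
        address_file = os.path.join(sysfs_net, name, "address")
        try:
            with open(address_file) as handle:
                mac = handle.read().strip().lower()
        except OSError:
            # Entry is not an interface directory, or the file is unreadable.
            continue
        # Skip empty and all-zero addresses (the loopback device reports zeros).
        if mac and mac != "00:00:00:00:00:00":
            by_mac[mac].append(name)
    return {mac: names for mac, names in by_mac.items() if len(names) > 1}


if __name__ == "__main__":
    for mac, names in sorted(shared_macs().items()):
        print(mac, ", ".join(names))
```

On a node matching the log above, br-ex and ens5 would be expected to appear together under 52:54:00:71:1d:1a, and br-isolated and ens4 under 52:54:00:0a:a7:b0.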

Comment 2 Bob Fournier 2020-01-27 22:19:13 UTC
Isn't this the same issue as https://bugzilla.redhat.com/show_bug.cgi?id=1768770 ?

Comment 3 Bob Fournier 2020-01-27 22:36:03 UTC
It looks like cloud-init isn't distinguishing between bridges/bonds and physical interfaces, so it flags a duplicate MAC when it's a normal config. It looks like there is a cloud-init patch in progress - https://bugzilla.redhat.com/show_bug.cgi?id=1768770#c3.

This is also a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=1791949.
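
To illustrate the behaviour described above, here is a rough sketch of a MAC-to-interface lookup that raises on the first duplicate, plus an optional filter that ignores bridges and bonds. It is not the cloud-init code and not the patch tracked in bug 1768770. The function interfaces_by_mac, the skip_kinds parameter, and the "physical"/"bridge" labels are assumptions made for this example; only the interface names and MACs come from the log in Comment 1.

```python
def interfaces_by_mac(interfaces, skip_kinds=()):
    """Map each MAC to one interface name, raising on a duplicate.

    `interfaces` is an iterable of (name, mac, kind) tuples. `kind` is a
    hypothetical label used only in this sketch.
    """
    seen = {}
    for name, mac, kind in interfaces:
        if kind in skip_kinds:
            # Ignore interface kinds that legitimately share a MAC.
            continue
        if mac in seen:
            raise RuntimeError(
                "duplicate mac found! both '%s' and '%s' have mac '%s'"
                % (name, seen[mac], mac)
            )
        seen[mac] = name
    return seen


# Names and MACs are taken from the log in Comment 1; the kinds are assumed.
NODE_INTERFACES = [
    ("ens3", "52:54:00:ff:a9:9e", "physical"),
    ("ens5", "52:54:00:71:1d:1a", "physical"),
    ("br-ex", "52:54:00:71:1d:1a", "bridge"),
]

if __name__ == "__main__":
    try:
        interfaces_by_mac(NODE_INTERFACES)
    except RuntimeError as err:
        print(err)
    # With bridges and bonds skipped, the lookup completes without error.
    print(interfaces_by_mac(NODE_INTERFACES, skip_kinds=("bridge", "bond")))
```

Without the filter, the lookup fails as soon as it reaches br-ex, because ens5 already holds the same MAC. With bridges skipped, each MAC maps to a single physical interface.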

Comment 4 Steve Baker 2020-01-27 22:45:05 UTC
*** Bug 1791949 has been marked as a duplicate of this bug. ***

Comment 6 Bob Fournier 2020-01-28 13:48:53 UTC
Can we get more info about the ramifications of this bug? Does it make the node unusable after a reboot? Or is it just that cloud-init fails to start on the rebooted node (which is probably acceptable, as cloud-init is not relied on to do config after a reboot)?

This issue was seen in https://bugzilla.redhat.com/show_bug.cgi?id=1726165#c3 and at the time it was determined that it wasn't really a problem.

Comment 7 Toure Dunnon 2020-01-28 14:50:57 UTC
This bug was opened to ensure that the OpenStack release organization is kept
apprised of this issue, as it will affect our upcoming release.

Comment 8 Toure Dunnon 2020-01-28 14:51:42 UTC
I will test the brew package to see if that will correct this problem and report back.

Comment 10 pweeks 2020-01-28 17:36:45 UTC
Removing blocker flag - issue has been around since July, discovered in OSP15.
Does not block cloudOps (via pradk).
Does not block hardprov (via bfournier).
Also see https://bugzilla.redhat.com/show_bug.cgi?id=1726165#c3 from bandini.
There is a fix upstream which is being delivered in RHEL 8.2, https://bugzilla.redhat.com/show_bug.cgi?id=1768770

Comment 15 Bob Fournier 2020-02-20 12:40:51 UTC
https://bugzilla.redhat.com/show_bug.cgi?id=1768770 is ON_QA in RHEL 8.2. Moving this to 16.1, which will use RHEL 8.2.

Comment 19 Bob Fournier 2020-04-02 16:42:11 UTC
The fix for https://bugzilla.redhat.com/show_bug.cgi?id=1768770 has been verified for RHEL 8.2. Closing this out.

