Bug 1777529 - [OSP16] Undercloud deployment fails due to dhcp timeout on eth0
Summary: [OSP16] Undercloud deployment fails due to dhcp timeout on eth0
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 16.0 (Train)
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: beta
Target Release: 16.0 (Train on RHEL 8.1)
Assignee: Michele Baldessari
QA Contact: Sasha Smolyak
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2019-11-27 18:14 UTC by Roman Safronov
Modified: 2020-02-06 14:43 UTC (History)
CC List: 9 users

Fixed In Version: openstack-tripleo-heat-templates-11.3.1-0.20191129120356.3d9ae93.el8ost
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-02-06 14:42:58 UTC
Target Upstream Version:
Embargoed:



Links
System ID Private Priority Status Summary Last Updated
OpenStack gerrit 696625 0 'None' MERGED Move 'Ensure network service is enabled' after os-net-config has run 2021-02-04 09:49:19 UTC
Red Hat Product Errata RHEA-2020:0283 0 None None None 2020-02-06 14:43:58 UTC

Description Roman Safronov 2019-11-27 18:14:15 UTC
Description of problem:
Undercloud deployment fails while validating that the network is up: eth0 fails to obtain an IP address via DHCP.

Link to failed job:
https://rhos-qe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/DFG/view/network/view/networking-ovn/job/DFG-network-networking-ovn-16_director-rhel-virthost-3cont_2comp-ipv4-geneve/13/


Version-Release number of selected component (if applicable):
16.0-RHEL-8/RHOS_TRUNK-16.0-RHEL-8-20191126.n.2


How reproducible:
The job failed 3 times in a row. Until Nov 26 it had been passing.


Steps to Reproduce:
1. Execute deployment job https://rhos-qe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/DFG/view/network/view/networking-ovn/job/DFG-network-networking-ovn-16_director-rhel-virthost-3cont_2comp-ipv4-geneve/



Actual results:
Deployment fails

Expected results:
Deployment succeeds

Additional info:


From undercloud_install.log

TASK [Ensure network service is enabled] ***************************************
Wednesday 27 November 2019  12:37:20 -0500 (0:00:00.275)       0:00:38.187 ****
fatal: [undercloud-0]: FAILED! => {"changed": false, "msg": "Unable to start service network: Job for network.service failed because the control process exited with error code.\nSee \"systemctl status network.service\" and \"journalctl -xe\" for details.\n"}

NO MORE HOSTS LEFT *************************************************************




From journalctl


-- Unit network.service has begun starting up.
Nov 27 12:37:21 undercloud-0.redhat.local network[15473]: WARN      : [network] You are using 'network' service provided by 'network-scripts', which are now deprecated.
Nov 27 12:37:21 undercloud-0.redhat.local network[15486]: You are using 'network' service provided by 'network-scripts', which are now deprecated.
Nov 27 12:37:21 undercloud-0.redhat.local network[15473]: WARN      : [network] 'network-scripts' will be removed in one of the next major releases of RHEL.
Nov 27 12:37:21 undercloud-0.redhat.local network[15487]: 'network-scripts' will be removed in one of the next major releases of RHEL.
Nov 27 12:37:21 undercloud-0.redhat.local network[15473]: WARN      : [network] It is advised to switch to 'NetworkManager' instead for network management.
Nov 27 12:37:21 undercloud-0.redhat.local network[15488]: It is advised to switch to 'NetworkManager' instead for network management.
Nov 27 12:37:21 undercloud-0.redhat.local NetworkManager[1396]: <info>  [1574876241.2541] audit: op="connections-reload" pid=15520 uid=0 result="success"
Nov 27 12:37:21 undercloud-0.redhat.local network[15473]: Bringing up loopback interface:  [  OK  ]
Nov 27 12:37:21 undercloud-0.redhat.local NetworkManager[1396]: <info>  [1574876241.5780] audit: op="connections-load" args="/etc/sysconfig/network-scripts/ifcfg-eth0" pid=15617 uid=0 resul>
Nov 27 12:37:21 undercloud-0.redhat.local NetworkManager[1396]: <info>  [1574876241.6707] agent-manager: req[0x557fd5b978a0, :1.86/nmcli-connect/0]: agent registered
Nov 27 12:37:21 undercloud-0.redhat.local NetworkManager[1396]: <info>  [1574876241.6733] device (eth0): Activation: starting connection 'System eth0' (5fb06bd0-0bb0-7ffb-45f1-d6edd65f3e03)
Nov 27 12:37:21 undercloud-0.redhat.local NetworkManager[1396]: <info>  [1574876241.6734] audit: op="connection-activate" uuid="5fb06bd0-0bb0-7ffb-45f1-d6edd65f3e03" name="System eth0" pid=>
Nov 27 12:37:21 undercloud-0.redhat.local NetworkManager[1396]: <info>  [1574876241.6737] device (eth0): state change: disconnected -> prepare (reason 'none', sys-iface-state: 'managed')
Nov 27 12:37:21 undercloud-0.redhat.local NetworkManager[1396]: <info>  [1574876241.6747] device (eth0): state change: prepare -> config (reason 'none', sys-iface-state: 'managed')
Nov 27 12:37:21 undercloud-0.redhat.local NetworkManager[1396]: <info>  [1574876241.6818] device (eth0): state change: config -> ip-config (reason 'none', sys-iface-state: 'managed')
Nov 27 12:37:21 undercloud-0.redhat.local NetworkManager[1396]: <info>  [1574876241.6826] dhcp4 (eth0): activation: beginning transaction (timeout in 45 seconds)
Nov 27 12:38:07 undercloud-0.redhat.local NetworkManager[1396]: <warn>  [1574876287.4415] dhcp4 (eth0): request timed out
Nov 27 12:38:07 undercloud-0.redhat.local NetworkManager[1396]: <info>  [1574876287.4416] dhcp4 (eth0): state changed unknown -> timeout
Nov 27 12:38:07 undercloud-0.redhat.local NetworkManager[1396]: <info>  [1574876287.4508] dhcp4 (eth0): canceled DHCP transaction
Nov 27 12:38:07 undercloud-0.redhat.local NetworkManager[1396]: <info>  [1574876287.4509] dhcp4 (eth0): state changed timeout -> done
Nov 27 12:38:07 undercloud-0.redhat.local NetworkManager[1396]: <info>  [1574876287.4514] device (eth0): state change: ip-config -> failed (reason 'ip-config-unavailable', sys-iface-state: >
Nov 27 12:38:07 undercloud-0.redhat.local NetworkManager[1396]: <warn>  [1574876287.4540] device (eth0): Activation: failed for connection 'System eth0'
Nov 27 12:38:07 undercloud-0.redhat.local NetworkManager[1396]: <info>  [1574876287.4545] device (eth0): state change: failed -> disconnected (reason 'none', sys-iface-state: 'managed')
Nov 27 12:38:07 undercloud-0.redhat.local network[15473]: Bringing up interface eth0:  Error: Connection activation failed: IP configuration could not be reserved (no available address, tim>
Nov 27 12:38:07 undercloud-0.redhat.local network[15473]: Hint: use 'journalctl -xe NM_CONNECTION=5fb06bd0-0bb0-7ffb-45f1-d6edd65f3e03 + NM_DEVICE=eth0' to get more details.
Nov 27 12:38:07 undercloud-0.redhat.local network[15473]: [FAILED]
Nov 27 12:38:07 undercloud-0.redhat.local NetworkManager[1396]: <info>  [1574876287.5217] audit: op="connections-load" args="/etc/sysconfig/network-scripts/ifcfg-eth1" pid=15643 uid=0 resul>
Nov 27 12:38:07 undercloud-0.redhat.local network[15473]: Bringing up interface eth1:  [  OK  ]
Nov 27 12:38:07 undercloud-0.redhat.local NetworkManager[1396]: <info>  [1574876287.6361] audit: op="connections-load" args="/etc/sysconfig/network-scripts/ifcfg-eth2" pid=15665 uid=0 resul>
Nov 27 12:38:07 undercloud-0.redhat.local network[15473]: Bringing up interface eth2:  [  OK  ]
Nov 27 12:38:07 undercloud-0.redhat.local systemd[1]: network.service: Control process exited, code=exited status=1
Nov 27 12:38:07 undercloud-0.redhat.local systemd[1]: network.service: Failed with result 'exit-code'.
Nov 27 12:38:07 undercloud-0.redhat.local systemd[1]: Failed to start LSB: Bring up/down networking.
-- Subject: Unit network.service has failed
-- Defined-By: systemd
-- Support: https://access.redhat.com/support
-- 
-- Unit network.service has failed.
-- 
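
The journal excerpt above shows the dhcp4 transaction on eth0 starting and then timing out. A minimal sketch, assuming NetworkManager log lines shaped like the ones quoted here, of how the elapsed time per device could be extracted. The function name `dhcp_timeouts` and the regular expression are illustrative only and are not part of any tool mentioned in this report.

```python
import re

# Matches the bracketed epoch timestamp and the dhcp4 message as they appear
# in the NetworkManager journal lines quoted in this report.
DHCP_LINE = re.compile(r"\[(?P<ts>\d+\.\d+)\] dhcp4 \((?P<dev>[^)]+)\): (?P<msg>.*)")


def dhcp_timeouts(lines):
    """Return {device: seconds from 'beginning transaction' to 'request timed out'}."""
    started = {}
    elapsed = {}
    for line in lines:
        m = DHCP_LINE.search(line)
        if not m:
            continue
        ts, dev, msg = float(m["ts"]), m["dev"], m["msg"]
        if "beginning transaction" in msg:
            started[dev] = ts
        elif "request timed out" in msg and dev in started:
            elapsed[dev] = round(ts - started.pop(dev), 2)
    return elapsed


# The two relevant lines, copied from the journal excerpt in this report.
sample = [
    "Nov 27 12:37:21 undercloud-0.redhat.local NetworkManager[1396]: <info>  "
    "[1574876241.6826] dhcp4 (eth0): activation: beginning transaction (timeout in 45 seconds)",
    "Nov 27 12:38:07 undercloud-0.redhat.local NetworkManager[1396]: <warn>  "
    "[1574876287.4415] dhcp4 (eth0): request timed out",
]

print(dhcp_timeouts(sample))
```

For the two lines above the difference is roughly 45.76 seconds, which matches the 45-second timeout announced in the log.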




[stack@undercloud-0 ~]$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 52:54:00:60:81:4c brd ff:ff:ff:ff:ff:ff
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 52:54:00:d3:88:6a brd ff:ff:ff:ff:ff:ff
    inet 172.16.0.33/24 brd 172.16.0.255 scope global dynamic noprefixroute eth1
       valid_lft 3180sec preferred_lft 3180sec
    inet6 fe80::5054:ff:fed3:886a/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever
4: eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 52:54:00:92:a3:d6 brd ff:ff:ff:ff:ff:ff
    inet 10.0.0.60/24 brd 10.0.0.255 scope global dynamic noprefixroute eth2
       valid_lft 3180sec preferred_lft 3180sec
    inet6 2620:52:0:13b8::fe:ce/128 scope global dynamic noprefixroute 
       valid_lft 3064sec preferred_lft 3064sec
    inet6 fe80::5054:ff:fe92:a3d6/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever
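
The `ip a` output above shows eth0 up but without an IPv4 address, while eth1 and eth2 have one. A rough sketch, assuming the plain-text `ip a` layout shown here, of a check that lists non-loopback interfaces with no `inet` line. The helper name `interfaces_without_ipv4` is hypothetical, and the sample is abbreviated from the output in this report.

```python
import re


def interfaces_without_ipv4(ip_a_output):
    """Return names of non-loopback interfaces that have no 'inet' (IPv4) line."""
    has_ipv4 = {}
    current = None
    for line in ip_a_output.splitlines():
        # Interface header lines look like "2: eth0: <BROADCAST,...> mtu 1500 ..."
        header = re.match(r"^\d+:\s+([^:@\s]+)", line)
        if header:
            current = header.group(1)
            has_ipv4[current] = False
        elif current and line.strip().startswith("inet "):
            # "inet " (with trailing space) excludes "inet6" lines.
            has_ipv4[current] = True
    return [name for name, ok in has_ipv4.items() if not ok and name != "lo"]


# Abbreviated from the 'ip a' output in this report.
sample_ip_a = """\
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    inet 127.0.0.1/8 scope host lo
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 52:54:00:60:81:4c brd ff:ff:ff:ff:ff:ff
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    inet 172.16.0.33/24 brd 172.16.0.255 scope global dynamic noprefixroute eth1
4: eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    inet 10.0.0.60/24 brd 10.0.0.255 scope global dynamic noprefixroute eth2
    inet6 fe80::5054:ff:fe92:a3d6/64 scope link noprefixroute
"""

print(interfaces_without_ipv4(sample_ip_a))
```

On the abbreviated sample this reports only eth0, consistent with the DHCP failure in the journal.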

Comment 1 Roman Safronov 2019-11-27 18:22:47 UTC
Raising priority because the issue appears to be a blocker. It also affects other users; see the OSP16 deployments on the customized job page:
https://rhos-qe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/OSPD-Customized-Deployment-virt/

Comment 13 errata-xmlrpc 2020-02-06 14:42:58 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2020:0283

