Bug 1702685
Summary: | Network doesn't come up at boot time after reboot on overcloud nodes | ||||||
---|---|---|---|---|---|---|---|
Product: | Red Hat OpenStack | Reporter: | Marius Cornea <mcornea> | ||||
Component: | openstack-tripleo-common | Assignee: | Adriano Petrich <apetrich> | ||||
Status: | CLOSED ERRATA | QA Contact: | Alexander Chuzhoy <sasha> | ||||
Severity: | urgent | Docs Contact: | |||||
Priority: | high | ||||||
Version: | 15.0 (Stein) | CC: | apetrich, atonner, bfournie, dbecker, dsneddon, emacchi, hjensas, mburns, morazi, racedoro, sasha, sclewis, slinaber, ssmolyak | ||||
Target Milestone: | beta | Keywords: | AutomationBlocker, Regression, Triaged | ||||
Target Release: | 15.0 (Stein) | ||||||
Hardware: | Unspecified | ||||||
OS: | Unspecified | ||||||
Whiteboard: | |||||||
Fixed In Version: | openstack-tripleo-common-10.7.1-0.20190522180807.438b9fb.el8ost | Doc Type: | No Doc Update | ||||
Doc Text: | Story Points: | --- | |||||
Clone Of: | Environment: | ||||||
Last Closed: | 2019-09-21 11:21:34 UTC | Type: | Bug | ||||
Regression: | --- | Mount Type: | --- | ||||
Documentation: | --- | CRM: | |||||
Verified Versions: | Category: | --- | |||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
Cloudforms Team: | --- | Target Upstream Version: | |||||
Embargoed: | |||||||
Attachments: |
|
This is an issue with the network interfaces not being restarted; see also https://bugzilla.redhat.com/show_bug.cgi?id=1667265, which was opened against Fedora 29 but exhibits the same status of the network service after reboot as the screenshot:

$ systemctl status network.service
● network.service - LSB: Bring up/down networking
   Loaded: loaded (/etc/rc.d/init.d/network; generated)
   Active: inactive (dead)
     Docs: man:systemd-sysv-generator(8)

I'm not sure what we can do about this in OSP; it's a RHEL 8 issue.

*** Bug 1701866 has been marked as a duplicate of this bug. ***

(In reply to Bob Fournier from comment #1)
> This is an issue with the network interfaces not being restarted; see also
> https://bugzilla.redhat.com/show_bug.cgi?id=1667265, which was opened
> against Fedora 29 but exhibits the same status of the network service after
> reboot as the screenshot:
>
> $ systemctl status network.service
> ● network.service - LSB: Bring up/down networking
>    Loaded: loaded (/etc/rc.d/init.d/network; generated)
>    Active: inactive (dead)
>      Docs: man:systemd-sysv-generator(8)
>
> I'm not sure what we can do about this in OSP; it's a RHEL 8 issue.

Can we perhaps enable the network service from the OSP side?

I thought I fixed that with https://review.opendev.org/#/q/topic:bug/1823353+(status:open+OR+status:merged). I wonder whether the image change was taken into account when building the new images.

Also, a note for myself: I missed fixing the undercloud as well. I'll send a patch.

I wasn't able to reproduce on either the undercloud or the overcloud. However, I'm hitting https://bugzilla.redhat.com/show_bug.cgi?id=1701866.

Marius, can you try again and show me a reproducer?

(In reply to Emilien Macchi from comment #6)
> I wasn't able to reproduce on either the undercloud or the overcloud. However, I'm
> hitting https://bugzilla.redhat.com/show_bug.cgi?id=1701866.
>
> Marius, can you try again and show me a reproducer?
I've got a reproducer:

[root@controller-0 heat-admin]# systemctl status network
● network.service - LSB: Bring up/down networking
   Loaded: loaded (/etc/rc.d/init.d/network; generated)
   Active: inactive (dead)
     Docs: man:systemd-sysv-generator(8)

openstack-tripleo-puppet-elements-10.3.1-0.20190420090433.9ba1438.el8ost.noarch

The patch is present:

[root@undercloud-0 stack]# cat /usr/share/tripleo-puppet-elements/overcloud-base/post-install.d/51-enable-network-service
#!/bin/bash
set -eux
set -o pipefail

# https://launchpad.net/bugs/1823353
systemctl enable network
systemctl start network

## Images version
rhosp-director-images-x86_64-15.0-20190423.1.el8ost.noarch
rhosp-director-images-15.0-20190423.1.el8ost.noarch
rhosp-director-images-ipa-x86_64-15.0-20190423.1.el8ost.noarch

I could reboot my overcloud node today without any workaround... I'm a bit confused why it fails for me. You confirm the reboot doesn't work, right? If yes, can you try to reboot after running "systemctl enable network" and report back?
Thanks

(In reply to Emilien Macchi from comment #8)
> I could reboot my overcloud node today without any workaround... I'm a bit
> confused why it fails for me. You confirm the reboot doesn't work, right? If
> yes, can you try to reboot after running "systemctl enable network" and
> report back?
> Thanks

Yes, after rebooting one of the controller nodes it's not reachable over the network. I can confirm that after manually running systemctl enable network and rebooting, the node is reachable at boot time.

https://review.opendev.org/#/c/655758/ will fix the issue.

(In reply to Emilien Macchi from comment #10)
> https://review.opendev.org/#/c/655758/ will fix the issue.

How do I test it? Do I need it on the undercloud only, or in the mistral executor container as well?

The fix is in FIV (Fixed In Version), but the bug didn't get updated, so updating now.
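The manual workaround confirmed in this thread (running systemctl enable network before rebooting) could be wrapped in a small idempotent helper. The following is only a sketch under stated assumptions, not the fix that was merged: the function name ensure_enabled and the SYSTEMCTL override variable are hypothetical names introduced for illustration, and the helper assumes that systemctl is-enabled reports the state of the SysV-generated network unit correctly on the node.

```shell
#!/bin/bash
# Sketch only: ensure a unit is enabled so that it is started at boot.
# SYSTEMCTL is a hypothetical override hook for this example; it defaults
# to the real systemctl binary.
SYSTEMCTL="${SYSTEMCTL:-systemctl}"

ensure_enabled() {
    local unit="$1"
    # `systemctl is-enabled --quiet UNIT` exits 0 when the unit is enabled.
    if "$SYSTEMCTL" is-enabled --quiet "$unit"; then
        echo "$unit already enabled"
    else
        # Enable the unit; report success only if the enable call succeeded.
        "$SYSTEMCTL" enable "$unit" && echo "$unit enabled"
    fi
}

# Example (run as root on an overcloud node):
#   ensure_enabled network
```

Running the helper a second time should report that the unit is already enabled, which makes it safe to call repeatedly.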
(undercloud) [stack@undercloud-0 ~]$ dnf list installed openstack-tripleo-common
Installed Packages
openstack-tripleo-common.noarch 10.7.1-0.20190525000410.71c099f.el8ost @rhelosp-15.0-trunk

(undercloud) [stack@undercloud-0 ~]$ . ./stackrc
(undercloud) [stack@undercloud-0 ~]$ openstack server list
+--------------------------------------+--------------+--------+------------------------+----------------+------------+
| ID                                   | Name         | Status | Networks               | Image          | Flavor     |
+--------------------------------------+--------------+--------+------------------------+----------------+------------+
| 307f181e-842b-4328-b95e-4e64ef5f43de | ceph-2       | ACTIVE | ctlplane=192.168.24.8  | overcloud-full | ceph       |
| a6cfdcea-c98a-429c-aa2b-59eb969b7164 | compute-1    | ACTIVE | ctlplane=192.168.24.16 | overcloud-full | compute    |
| 016b729d-6ed7-4363-aa8c-2b3965ff7a91 | ceph-0       | ACTIVE | ctlplane=192.168.24.6  | overcloud-full | ceph       |
| 2f3b9fa6-6049-4fb5-a05d-14d3ee6965ca | controller-2 | ACTIVE | ctlplane=192.168.24.12 | overcloud-full | controller |
| 551ce9c2-e255-4e24-ad2e-e181873acaaa | controller-0 | ACTIVE | ctlplane=192.168.24.20 | overcloud-full | controller |
| aca677ee-a9a3-42fc-9715-a975ab74d447 | controller-1 | ACTIVE | ctlplane=192.168.24.15 | overcloud-full | controller |
| e9575990-0c91-430a-9b24-7380375361a4 | compute-0    | ACTIVE | ctlplane=192.168.24.10 | overcloud-full | compute    |
| 05c29a3e-2691-4bc7-8211-d872a80e3aee | ceph-1       | ACTIVE | ctlplane=192.168.24.23 | overcloud-full | ceph       |
+--------------------------------------+--------------+--------+------------------------+----------------+------------+

(undercloud) [stack@undercloud-0 ~]$ openstack baremetal node list
+--------------------------------------+--------------+--------------------------------------+-------------+--------------------+-------------+
| UUID                                 | Name         | Instance UUID                        | Power State | Provisioning State | Maintenance |
+--------------------------------------+--------------+--------------------------------------+-------------+--------------------+-------------+
| 6d5436e8-c1ce-44e8-bbef-3f50217bd7ea | ceph-0       | 016b729d-6ed7-4363-aa8c-2b3965ff7a91 | power on    | active             | False       |
| 00984634-7eb8-4dd9-a524-6d6430be0d5a | ceph-1       | 307f181e-842b-4328-b95e-4e64ef5f43de | power on    | active             | False       |
| e3235602-a400-4e36-bd01-3f3e4d31acf3 | ceph-2       | 05c29a3e-2691-4bc7-8211-d872a80e3aee | power on    | active             | False       |
| 62291b36-fa7b-4367-8af1-e338588549cf | compute-0    | e9575990-0c91-430a-9b24-7380375361a4 | power on    | active             | False       |
| 633aa813-6e0f-488b-b92c-258137771434 | compute-1    | a6cfdcea-c98a-429c-aa2b-59eb969b7164 | power on    | active             | False       |
| 4ca41f32-2c0f-4c0c-aee7-87b3d2ddd7b3 | controller-0 | 2f3b9fa6-6049-4fb5-a05d-14d3ee6965ca | power on    | active             | False       |
| b3d2c723-a9b3-478d-8dae-04efa42ce5da | controller-1 | 551ce9c2-e255-4e24-ad2e-e181873acaaa | power on    | active             | False       |
| d8e93647-66b9-4c61-acc8-82078fcd587f | controller-2 | aca677ee-a9a3-42fc-9715-a975ab74d447 | power on    | active             | False       |
| 7a658110-a136-4a34-a99e-9e2cc45b54cf | ironic-0     | None                                 | power off   | available          | False       |
| 6d1b2a74-cc7a-45c5-a011-26fc5486a2c1 | ironic-1     | None                                 | power off   | available          | False       |
+--------------------------------------+--------------+--------------------------------------+-------------+--------------------+-------------+

(undercloud) [stack@undercloud-0 ~]$ openstack baremetal node reboot d8e93647-66b9-4c61-acc8-82078fcd587f
(undercloud) [stack@undercloud-0 ~]$ ping 192.168.24.15
PING 192.168.24.15 (192.168.24.15) 56(84) bytes of data.
From 192.168.24.1 icmp_seq=9 Destination Host Unreachable
From 192.168.24.1 icmp_seq=10 Destination Host Unreachable
From 192.168.24.1 icmp_seq=11 Destination Host Unreachable
From 192.168.24.1 icmp_seq=12 Destination Host Unreachable
From 192.168.24.1 icmp_seq=13 Destination Host Unreachable
From 192.168.24.1 icmp_seq=14 Destination Host Unreachable

openstack baremetal node list
+--------------------------------------+--------------+--------------------------------------+-------------+--------------------+-------------+
| UUID                                 | Name         | Instance UUID                        | Power State | Provisioning State | Maintenance |
+--------------------------------------+--------------+--------------------------------------+-------------+--------------------+-------------+
| 6d5436e8-c1ce-44e8-bbef-3f50217bd7ea | ceph-0       | 016b729d-6ed7-4363-aa8c-2b3965ff7a91 | power on    | active             | False       |
| 00984634-7eb8-4dd9-a524-6d6430be0d5a | ceph-1       | 307f181e-842b-4328-b95e-4e64ef5f43de | power on    | active             | False       |
| e3235602-a400-4e36-bd01-3f3e4d31acf3 | ceph-2       | 05c29a3e-2691-4bc7-8211-d872a80e3aee | power on    | active             | False       |
| 62291b36-fa7b-4367-8af1-e338588549cf | compute-0    | e9575990-0c91-430a-9b24-7380375361a4 | power on    | active             | False       |
| 633aa813-6e0f-488b-b92c-258137771434 | compute-1    | a6cfdcea-c98a-429c-aa2b-59eb969b7164 | power on    | active             | False       |
| 4ca41f32-2c0f-4c0c-aee7-87b3d2ddd7b3 | controller-0 | 2f3b9fa6-6049-4fb5-a05d-14d3ee6965ca | power on    | active             | False       |
| b3d2c723-a9b3-478d-8dae-04efa42ce5da | controller-1 | 551ce9c2-e255-4e24-ad2e-e181873acaaa | power on    | active             | False       |
| d8e93647-66b9-4c61-acc8-82078fcd587f | controller-2 | aca677ee-a9a3-42fc-9715-a975ab74d447 | power on    | active             | False       |
| 7a658110-a136-4a34-a99e-9e2cc45b54cf | ironic-0     | None                                 | power off   | available          | False       |
| 6d1b2a74-cc7a-45c5-a011-26fc5486a2c1 | ironic-1     | None                                 | power off   | available          | False       |
+--------------------------------------+--------------+--------------------------------------+-------------+--------------------+-------------+

(undercloud) [stack@undercloud-0 ~]$ ping 192.168.24.15
PING 192.168.24.15 (192.168.24.15) 56(84) bytes of data.
64 bytes from 192.168.24.15: icmp_seq=1 ttl=64 time=1.21 ms
64 bytes from 192.168.24.15: icmp_seq=2 ttl=64 time=0.358 ms
64 bytes from 192.168.24.15: icmp_seq=3 ttl=64 time=0.308 ms
^C
--- 192.168.24.15 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 18ms
rtt min/avg/max/mdev = 0.308/0.626/1.213/0.415 ms

(undercloud) [stack@undercloud-0 ~]$ ssh heat-admin@192.168.24.15
Warning: Permanently added '192.168.24.15' (ECDSA) to the list of known hosts.
[heat-admin@controller-1 ~]$ uptime
 12:38:56 up 1 min, 1 user, load average: 25.95, 6.80, 2.30
[heat-admin@controller-1 ~]$ exit

Successfully rebooted the overcloud controller and found it accessible after the reboot.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2019:2811 |
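The verification above (reboot the node, then ping until it answers) can be approximated by a polling loop. This is a minimal sketch, assuming ICMP reachability from the undercloud is an acceptable signal that the network came up at boot; the names wait_for_host, probe_host and the PROBE override variable are hypothetical and were not used in the original verification.

```shell
#!/bin/bash
# Sketch only: poll a host until it answers, or give up after N attempts.

# Default probe: one ICMP echo request with a 1 second reply timeout.
probe_host() {
    ping -c 1 -W 1 "$1" >/dev/null 2>&1
}

# PROBE is a hypothetical override hook so another check can be swapped in.
PROBE="${PROBE:-probe_host}"

wait_for_host() {
    local host="$1"
    local attempts="${2:-60}"   # maximum number of probes
    local delay="${3:-5}"       # seconds to sleep after a failed probe
    local i=1
    while [ "$i" -le "$attempts" ]; do
        if "$PROBE" "$host"; then
            echo "$host reachable after $i attempt(s)"
            return 0
        fi
        sleep "$delay"
        i=$((i + 1))
    done
    echo "$host still unreachable after $attempts attempt(s)" >&2
    return 1
}

# Example, after `openstack baremetal node reboot <node UUID>`:
#   wait_for_host 192.168.24.15 60 5
```

A non-zero exit status after the attempts are exhausted would correspond to the failure described in this bug, where the node never becomes reachable after the reboot.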
Created attachment 1558201
overcloud node after reboot console

Description of problem:
Network doesn't come up at boot time after reboot on overcloud nodes.

Version-Release number of selected component (if applicable):
15 -p RHOS_TRUNK-15.0-RHEL-8-20190423.n.1

How reproducible:
100%

Steps to Reproduce:
1. Deploy OSP15 overcloud
2. SSH to one of the overcloud nodes and run reboot

Actual results:
The node isn't accessible via SSH after reboot because the network service is down.

Expected results:
The node is accessible via SSH after reboot.

Additional info:
Attaching console screenshot.