Bug 1600178
Summary: Neutron routers become unavailable after rebooting networker nodes post minor update
Product: Red Hat OpenStack
Component: puppet-tripleo
Version: 10.0 (Newton)
Target Release: 10.0 (Newton)
Target Milestone: z10
Status: CLOSED ERRATA
Severity: urgent
Priority: urgent
Hardware: Unspecified
OS: Unspecified
Keywords: Triaged, ZStream
Reporter: Marius Cornea <mcornea>
Assignee: Sofer Athlan-Guyot <sathlang>
QA Contact: Amit Ugol <augol>
CC: aguetta, amuller, bhaley, ccamacho, dbecker, gkumar, jamsmith, jfrancoa, jhardee, jjoyce, jpretori, jschluet, kiyyappa, majopela, mburns, mcornea, morazi, ojanas, pcaruana, pmorey, sathlang, slinaber, sputhenp, tvignaud
Fixed In Version: puppet-tripleo-5.6.8-17.el7ost
Doc Type: Bug Fix
Doc Text:
Previously, Neutron routers became unavailable after a minor update when the Networker role was in use. This was caused by a change in the way the Neutron server node on the overcloud (the host parameter) was identified: the value could be overwritten. As a result, the old L3 agent and its attached floating IPs became unavailable.
With this update, the Neutron host parameter does not change on nodes implementing the Networker role during the update, and the floating IPs remain available.
Last Closed: 2018-11-26 18:00:29 UTC
Type: Bug
Description by Marius Cornea, 2018-07-11 15:26:43 UTC:
Created attachment 1458140 [details]
sosreport-networker-0
Created attachment 1458141 [details]
sosreport-networker-1
Some notes: the issue didn't reproduce on a fresh latest OSP10 deployment.

The odd thing I see in the l3-agent log on networker-1 is:

    2018-07-11 15:07:47.266 3722 DEBUG neutron.agent.l3.agent [req-f1f57f60-e241-43fb-b06b-cebc838896f7 - - - - -] Starting fullsync periodic_sync_routers_task periodic_sync_routers_task
    [...]
    2018-07-11 15:07:53.127 3722 DEBUG neutron.agent.l3.agent [req-f1f57f60-e241-43fb-b06b-cebc838896f7 - - - - -] periodic_sync_routers_task successfully completed fetch_and_sync_all_routers

But there was no work done, and no message back to the server notifying it of the current state of routers. For example, right before the restart there was:

    2018-07-11 14:56:12.411 249422 DEBUG neutron.agent.l3.ha [-] Updating server with HA routers states {'461ca428-e6de-4fc2-b571-507a96476a83': 'active', 'd8074e7e-b108-4c62-aa95-b49e72867562': 'active', '7c3dd049-cc73-43cf-9e8d-649da3ecacea': 'active', '03ca67c8-eece-44ee-9a51-fd7adff0d85f': 'active', 'edccfa87-1ea6-4937-a0ef-2f4c0d8551fe': 'active', '788fcbae-6bb4-4458-9298-81dadefe8dc5': 'active', 'a8b24af1-3d7a-4ff5-baf0-f8577ed77a23': 'active', '448cdab2-5bbe-41d7-a2e9-5ad60596b107': 'active', 'f4c26283-f4a9-4a65-be11-5b77cf330286': 'active', '7b9920fb-5299-4388-b74c-fe7ab6d068a4': 'active', '7b35e807-52a5-49cf-96f6-facf88456488': 'active', '6fc34db3-4b90-4556-98c2-d5366c0685d1': 'active', '75da816e-ac8b-4c45-b8fa-6f6f6158e6af': 'active'} notify_server

I didn't see an l3-agent log in the networker-0 sosreport to correlate things. It's as if there were no routers returned in the full sync call.

I reproduced the issue (it only shows up after a minor update).
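The "Updating server with HA routers states" line above carries a plain Python dict of router UUIDs to HA states, so it can be pulled out of a log line and summarized mechanically. A minimal sketch (the log line is shortened here to two routers; ast.literal_eval is used because the payload is printed as a Python literal, not JSON):

```python
import ast
import re

# A shortened copy of the neutron.agent.l3.ha log line quoted above.
log_line = (
    "2018-07-11 14:56:12.411 249422 DEBUG neutron.agent.l3.ha [-] "
    "Updating server with HA routers states "
    "{'461ca428-e6de-4fc2-b571-507a96476a83': 'active', "
    "'d8074e7e-b108-4c62-aa95-b49e72867562': 'active'} notify_server"
)

def ha_states(line):
    """Extract the router-UUID -> ha_state mapping from a log line."""
    match = re.search(r"(\{.*\})", line)
    # The dict is printed as a Python literal, so literal_eval parses it.
    return ast.literal_eval(match.group(1)) if match else {}

states = ha_states(log_line)
print(len(states), "routers reported,",
      sum(1 for s in states.values() if s == "active"), "active")
```

Comparing the router count in this line before and after the restart would show the empty full-sync result described above.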
After rebooting the networker-1 node:

[stack@undercloud-0 ~]$ neutron l3-agent-list-hosting-router 4c373aff-7755-40c5-80b7-289475fd9008
+--------------------------------------+-------------------------+----------------+-------+----------+
| id                                   | host                    | admin_state_up | alive | ha_state |
+--------------------------------------+-------------------------+----------------+-------+----------+
| 3d07375f-3e6a-41bb-9712-f814c5008807 | networker-0.localdomain | True           | :-)   | active   |
| 61e270cb-586c-49df-b925-2d9d58bbe70d | networker-1.localdomain | True           | xxx   | standby  |
+--------------------------------------+-------------------------+----------------+-------+----------+

Attaching /var/log/neutron from networker-1.

Created attachment 1458253 [details]
neutron.tar.gz
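The table above can be evaluated mechanically: an HA router stays reachable as long as at least one alive agent reports ha_state "active". A minimal sketch, with the two rows hardcoded from the listing above rather than fetched from the Neutron API:

```python
# Rows condensed from the l3-agent-list-hosting-router output above:
# (host, alive, ha_state). In the CLI listing, ":-)" means alive and
# "xxx" means the agent is down.
agents = [
    ("networker-0.localdomain", True,  "active"),
    ("networker-1.localdomain", False, "standby"),
]

def router_is_served(agents):
    """True if at least one alive agent holds the 'active' HA state."""
    return any(alive and state == "active" for _, alive, state in agents)

print(router_is_served(agents))  # True: networker-0 still serves the router
```

Here the router is still served because networker-0's agent is alive and active; only the rebooted networker-1 agent is down.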
Note: this only seems to be happening when you have separate Networker nodes. I wasn't able to reproduce it with monolithic controllers.

OK, so what happens is that after the update and reboot the hostname for the agents changes (see that the networker-1.localdomain agents are down while the networker-1 agents are up):

[stack@undercloud-0 ~]$ neutron agent-list
+--------------------------------------+--------------------+-------------------------+-------------------+-------+----------------+---------------------------+
| id                                   | agent_type         | host                    | availability_zone | alive | admin_state_up | binary                    |
+--------------------------------------+--------------------+-------------------------+-------------------+-------+----------------+---------------------------+
| 03e37225-5d81-4acc-87dd-287afdec9e09 | Metadata agent     | networker-1.localdomain |                   | xxx   | True           | neutron-metadata-agent    |
| 37d9c9a1-99ab-4744-aa78-5b3948743670 | Open vSwitch agent | networker-0.localdomain |                   | :-)   | True           | neutron-openvswitch-agent |
| 3c591ad5-8a2d-40db-9eb6-2d3a1251e201 | DHCP agent         | networker-0.localdomain | nova              | :-)   | True           | neutron-dhcp-agent        |
| 402cb673-47a8-43d7-b81c-69330c8d4e45 | Open vSwitch agent | compute-0.localdomain   |                   | :-)   | True           | neutron-openvswitch-agent |
| 496f9a91-5b51-486f-81a1-31b526ccbf65 | L3 agent           | networker-0.localdomain | nova              | :-)   | True           | neutron-l3-agent          |
| 55da449a-3d94-4165-89fc-499025050f25 | DHCP agent         | networker-1             | nova              | :-)   | True           | neutron-dhcp-agent        |
| 88e7f6aa-3520-4c52-8f63-f3dd3fd08172 | Metadata agent     | networker-0.localdomain |                   | :-)   | True           | neutron-metadata-agent    |
| 8a0c92ed-cdc7-4701-9fbf-86f3a4589fe9 | L3 agent           | networker-1.localdomain | nova              | xxx   | True           | neutron-l3-agent          |
| 94292a2f-d8e4-480a-bb21-e8df37f97154 | Open vSwitch agent | compute-4.localdomain   |                   | :-)   | True           | neutron-openvswitch-agent |
| 9816c1be-f4f9-46e3-b905-f8efaacb54b1 | Open vSwitch agent | networker-1.localdomain |                   | xxx   | True           | neutron-openvswitch-agent |
| 9be3b9ba-5cb1-42b9-a1de-f8ce1748d444 | Open vSwitch agent | networker-1             |                   | :-)   | True           | neutron-openvswitch-agent |
| bb3a5435-e9e2-4c01-b717-989d28c14486 | Open vSwitch agent | compute-2.localdomain   |                   | :-)   | True           | neutron-openvswitch-agent |
| d4783830-25d9-4a4e-955d-55e5670c5bb5 | DHCP agent         | networker-1.localdomain | nova              | xxx   | True           | neutron-dhcp-agent        |
| df386d89-d052-4052-8e23-2abe8ea03f4f | L3 agent           | networker-1             | nova              | :-)   | True           | neutron-l3-agent          |
| e24a5ce2-a0ee-4733-93ec-6c191190782f | Open vSwitch agent | compute-1.localdomain   |                   | :-)   | True           | neutron-openvswitch-agent |
| e27638e7-2b3c-4ba7-a293-3a79395ec47b | Metadata agent     | networker-1             |                   | :-)   | True           | neutron-metadata-agent    |
| e295f9e6-6180-4531-985e-786e61f2b4d8 | Open vSwitch agent | compute-3.localdomain   |                   | :-)   | True           | neutron-openvswitch-agent |
+--------------------------------------+--------------------+-------------------------+-------------------+-------+----------------+---------------------------+

On networker-1 the host parameter in neutron.conf now holds the short name:

[root@networker-1 heat-admin]# grep -v ^# /etc/neutron/neutron.conf | grep -v ^$
[DEFAULT]
auth_strategy=keystone
core_plugin=ml2
service_plugins=router,qos,trunk
allow_overlapping_ips=True
host=networker-1
global_physnet_mtu=1496
dhcp_agents_per_network=2
debug=True
log_dir=/var/log/neutron
rpc_backend=rabbit
control_exchange=neutron
[agent]
root_helper=sudo neutron-rootwrap /etc/neutron/rootwrap.conf
[cors]
[cors.subdomain]
[database]
[keystone_authtoken]
[matchmaker_redis]
[nova]
[oslo_concurrency]
lock_path=$state_path/lock
[oslo_messaging_amqp]
[oslo_messaging_notifications]
[oslo_messaging_rabbit]
rabbit_hosts=172.17.1.17:5672,172.17.1.18:5672,172.17.1.27:5672
rabbit_use_ssl=False
rabbit_userid=guest
rabbit_password=8rabVQE2unyVyvb3uwmhBXPuV
rabbit_ha_queues=True
heartbeat_timeout_threshold=60
[oslo_messaging_zmq]
[oslo_middleware]
[oslo_policy]
[qos]
[quotas]
[ssl]

Adding DFG:Upgrades as I think this is related to BZ#1499201.

Moving Networking DFG to observer / secondary DFG.
If the hostname changes on the agents, that explains the problem, and it seems to be more Upgrades DFG related. Marius, do you think that RHBZ 1499201 might not have been resolved in all cases?

(In reply to Assaf Muller from comment #11)
> Moving Networking DFG to observer / secondary DFG. If the hostname changes
> on the agents that explains the problem, and seems to be more Upgrades DFG
> related. Marius do you think that RHBZ 1499201 might not have been resolved
> in all cases?

Yes, I could only spot this issue when using a Networker role, so I suspect this is affecting only deployments involving custom roles (not the monolithic controllers).

Hi,

So we definitely have the host change:

    Jul 11 01:10:16 networker-0 os-collect-config[3201]: [2018-07-11 01:10:15,354] (heat-config) [DEBUG] [2018-07-11 01:09:53,868] (heat-config) [DEBUG] Running FACTER_heat_outputs_path="/var/run/heat-config/heat-config-puppet/c0d7893b-22e4-4e0f-9099-fbe275ccb2d7" FACTER_fqdn="networker-0.localdomain" FACTER_deploy_config_name="NetworkerDeployment_Step3" puppet apply --detailed-exitcodes --logdest console --modulepath /etc/puppet/modules:/opt/stack/puppet-modules:/usr/share/openstack-puppet/modules --debug --logdest /var/log/puppet/heat-debug.log /var/lib/heat-config/heat-config-puppet/c0d7893b-22e4-4e0f-9099-fbe275ccb2d7.pp
    /Stage[main]/Neutron/Neutron_config[DEFAULT/host]/value: value changed ['networker-0.localdomain'] to ['networker-0']
    Jul 11 01:10:16 networker-0 os-collect-config[3201]: Debug: Loading facts from /usr/share/openstack-puppet/modules/tripleo/lib/facter/current_config_hosts.rb

So current_config_hosts.rb is loaded, and it output networker-0 as a result of its "introspection", which is unexpected. I will need to deploy the same env to see what is happening in this context.
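The flip recorded in the puppet log above (DEFAULT/host changing from 'networker-0.localdomain' to 'networker-0') can be reverted by writing the old FQDN back into neutron.conf and restarting the Neutron services on the node. A minimal sketch of that edit using Python's standard configparser; it operates on a throwaway sample file rather than the real /etc/neutron/neutron.conf, and the restart step is not shown:

```python
import configparser
import os
import tempfile

# Sample of the relevant part of neutron.conf, holding the short
# hostname that puppet wrote during the minor update.
sample = "[DEFAULT]\ncore_plugin=ml2\nhost=networker-1\n"

def restore_host(path, fqdn):
    """Rewrite DEFAULT/host back to the previous fully qualified name."""
    # interpolation=None: values like $state_path must pass through verbatim.
    cfg = configparser.ConfigParser(interpolation=None)
    cfg.read(path)
    cfg["DEFAULT"]["host"] = fqdn
    with open(path, "w") as fh:
        cfg.write(fh)

# Work on a temporary copy; on a real node this would target
# /etc/neutron/neutron.conf followed by a Neutron service restart.
path = os.path.join(tempfile.mkdtemp(), "neutron.conf")
with open(path, "w") as fh:
    fh.write(sample)

restore_host(path, "networker-1.localdomain")

check = configparser.ConfigParser(interpolation=None)
check.read(path)
print(check["DEFAULT"]["host"])  # networker-1.localdomain
```

Note that configparser normalizes whitespace and drops comments when rewriting, so on a production file a targeted sed or crudini edit may be preferable; this sketch only illustrates the change itself.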
As a workaround, the workaround scripts in https://bugzilla.redhat.com/show_bug.cgi?id=1499201 should work here as well (as a temporary fix):
- change the host value in neutron.conf back to the previous one and restart Neutron on the Networker role
- use "Workaround to get l3ha routers rescheduled" to get the new value in

Hi,

So the networker replacement test doesn't seem to use the overcloud stack update command but some form of added templates to a deploy command. First, I wonder if this change of testing protocol invalidates the patch, as we're still not sure whether it works for update. Second, I need logs, or better yet an environment, to determine whether the problem is similar to what we saw during update. Lastly, I would like to know which version of RHEL we're using for the OSP10 deployment.

Thanks,

Hi,

So here's the rpm build. If it's delivered as a hotfix, it has to be installed on all the overcloud nodes. One way to do that is:

------8<-------
RPM_PATH=[path to the rpm downloaded on the undercloud]
. ~/stackrc
openstack server list -f json > ~/server.json
jq -r '.[] | .Networks' ~/server.json | cut -d= -f2 > ~/ips.txt

for ip in $(cat ~/ips.txt); do scp $RPM_PATH heat-admin@${ip}: ; done
for ip in $(cat ~/ips.txt); do ssh heat-admin@${ip} yum install -y ./$(basename $RPM_PATH) ; done
------>8-------

There are others; it's just given as an example.

(In reply to Sofer Athlan-Guyot from comment #27)

Hi,

Mistake in the example script (missing sudo), sorry:

------8<-------
RPM_PATH=[path to the rpm downloaded on the undercloud]
. ~/stackrc
openstack server list -f json > ~/server.json
jq -r '.[] | .Networks' ~/server.json | cut -d= -f2 > ~/ips.txt

for ip in $(cat ~/ips.txt); do scp $RPM_PATH heat-admin@${ip}: ; done
for ip in $(cat ~/ips.txt); do ssh heat-admin@${ip} sudo yum install -y ./$(basename $RPM_PATH) ; done
------>8-------

> Hi,
> so here's the rpm build.
> If it's delivered as an hotfix, it has to be installed on all the overcloud
> nodes.
> One way to do that is: [...]
> there are others, it's just given as an example.

*** Bug 1638303 has been marked as a duplicate of this bug. ***

Moving this to verified, as Marius has already tested the patch and the issue was solved when shipped on the client side.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:3673

Hi,

Please check comment [1], especially the part on the undercloud configuration needed for scale-out nodes to get a host parameter with the FQDN [2]. The fact that cloud-init set it to the short name indicates that the undercloud may not be configured properly:

    2019-04-02 14:31:43,422 - cc_set_hostname.py[DEBUG]: Setting the hostname to m1pl-st-comp0-12 (m1pl-st-comp0-12)
    2019-04-02 14:31:43,422 - util.py[DEBUG]: Running command ['hostnamectl', 'set-hostname', 'm1pl-st-comp0-12'] with allowed return codes [0] (shell=False, capture=True)

So first check the undercloud configuration and make sure it matches what is described in [2]. Please report your findings. For better tracking I think it would be better to open a new bz, like "scale out compute node have short name host parameter, blocking ffu".

Thanks,

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1657692#c21
[2] https://access.redhat.com/solutions/2089051
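The cloud-init log above shows the node being handed a short hostname. A quick sanity check can confirm whether a name is fully qualified before it ends up as the Neutron host parameter; a minimal sketch (the is_fqdn helper is hypothetical, not part of any OpenStack tooling):

```python
def is_fqdn(hostname):
    """True when the hostname carries a domain part, e.g. node.localdomain."""
    # Hypothetical helper: a fully qualified name has a non-empty label
    # before the first dot and a non-empty domain after it.
    head, _, domain = hostname.partition(".")
    return bool(head) and bool(domain)

# The short name set by cloud-init in the log above fails the check,
# while the FQDN form expected for the host parameter passes.
print(is_fqdn("m1pl-st-comp0-12"))         # False
print(is_fqdn("networker-0.localdomain"))  # True
```

Running such a check against `hostnamectl --static` output on each scale-out node would flag the misconfiguration described in [2] before the deploy proceeds.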