Bug 1657889 - [OSP10] service hostname gets updated to FQDN, making already spawned instances fail to start
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: puppet-tripleo
Version: 10.0 (Newton)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: z11
Target Release: 10.0 (Newton)
Assignee: Martin Schuppert
QA Contact: nlevinki
URL:
Whiteboard:
Duplicates: 1641682
Depends On:
Blocks:
Reported: 2018-12-10 16:32 UTC by Martin Schuppert
Modified: 2019-06-04 14:54 UTC (History)

Fixed In Version: puppet-tripleo-5.6.8-22.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-04-30 16:58:51 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2019:0921 0 None None None 2019-04-30 16:59:10 UTC

Description Martin Schuppert 2018-12-10 16:32:24 UTC
Description of problem:

As in BZ1600178, the nova host= parameter changes to the FQDN, which makes already spawned instances fail to start after an update/upgrade because they still reference the old service host.

Here we track an automated way to update the host reference of existing instances.


(overcloud) [stack@undercloud-0 ~]$ openstack server list --long                           
+--------------------------------------+------+--------+------------+-------------+---------------------+------------+--------------------------------------+-------------+--------------------------------------+-------------------+-----------+------------+
| ID                                   | Name | Status | Task State | Power State | Networks            | Image Name | Image ID                             | Flavor Name | Flavor ID                            | Availability Zone | Host      | Properties |
+--------------------------------------+------+--------+------------+-------------+---------------------+------------+--------------------------------------+-------------+--------------------------------------+-------------------+-----------+------------+
| e097ee21-aa84-483a-a5b5-2b2f26a348be | test | ACTIVE | None       | Running     | private=192.168.0.3 | cirros     | 8d9717d5-20f3-4240-a9f7-7770515c9117 | m1.small    | ebf406c7-57a8-404d-b419-dbc9ad68027c | nova              | compute-1 |            |
+--------------------------------------+------+--------+------------+-------------+---------------------+------------+--------------------------------------+-------------+--------------------------------------+-------------------+-----------+------------+
                                                                                           
(overcloud) [stack@undercloud-0 ~]$ nova service-list                                      
+--------------------------------------+------------------+--------------------------+----------+---------+-------+----------------------------+-----------------+-------------+
| Id                                   | Binary           | Host                     | Zone     | Status  | State | Updated_at                 | Disabled Reason | Forced down |
+--------------------------------------+------------------+--------------------------+----------+---------+-------+----------------------------+-----------------+-------------+
| f390beea-cc02-4a0f-921c-036a773ef26d | nova-scheduler   | controller-0.localdomain | internal | enabled | up    | 2018-12-10T15:28:46.000000 | -               | False       |
| 282e83b4-536a-4f83-985a-e5dbec2fdb67 | nova-consoleauth | controller-0.localdomain | internal | enabled | up    | 2018-12-10T15:28:37.000000 | -               | False       |
| b82ce887-aca3-4c4d-8046-8786da128ce7 | nova-conductor   | controller-0.localdomain | internal | enabled | up    | 2018-12-10T15:28:46.000000 | -               | False       |
| 959d45b1-6856-4faf-b00a-ff8b927083c9 | nova-compute     | compute-1                | nova     | enabled | up    | 2018-12-10T15:28:39.000000 | -               | False       |
| 0559e3a8-d502-4bf2-944c-076c15417fa2 | nova-compute     | compute-0.localdomain    | nova     | enabled | up    | 2018-12-10T15:28:45.000000 | -               | False       |
+--------------------------------------+------------------+--------------------------+----------+---------+-------+----------------------------+-----------------+-------------+
                                                                                           
                                                                                           
                                                                                           
(overcloud) [stack@undercloud-0 ~]$ nova stop test                                         
                                                                                           
[root@compute-1 nova]# egrep ^host= /var/lib/config-data/puppet-generated/nova_libvirt/etc/nova/nova.conf
host=compute-1.localdomain                                                                 
                                                                                           
[root@compute-1 nova]# docker restart nova_compute                                                                                       
                                                                                                   
(overcloud) [stack@undercloud-0 ~]$ nova service-list                                      
+--------------------------------------+------------------+--------------------------+----------+---------+-------+----------------------------+-----------------+-------------+
| Id                                   | Binary           | Host                     | Zone     | Status  | State | Updated_at                 | Disabled Reason | Forced down |
+--------------------------------------+------------------+--------------------------+----------+---------+-------+----------------------------+-----------------+-------------+
| f390beea-cc02-4a0f-921c-036a773ef26d | nova-scheduler   | controller-0.localdomain | internal | enabled | up    | 2018-12-10T15:31:56.000000 | -               | False       |
| 282e83b4-536a-4f83-985a-e5dbec2fdb67 | nova-consoleauth | controller-0.localdomain | internal | enabled | up    | 2018-12-10T15:31:57.000000 | -               | False       |
| b82ce887-aca3-4c4d-8046-8786da128ce7 | nova-conductor   | controller-0.localdomain | internal | enabled | up    | 2018-12-10T15:31:56.000000 | -               | False       |
| 959d45b1-6856-4faf-b00a-ff8b927083c9 | nova-compute     | compute-1                | nova     | enabled | down  | 2018-12-10T15:30:39.000000 | -               | False       |
| 0559e3a8-d502-4bf2-944c-076c15417fa2 | nova-compute     | compute-0.localdomain    | nova     | enabled | up    | 2018-12-10T15:31:55.000000 | -               | False       |
| 2e250b21-ff62-4581-b844-b448bc49b8f4 | nova-compute     | compute-1.localdomain    | nova     | enabled | up    | 2018-12-10T15:31:58.000000 | -               | False       |
+--------------------------------------+------------------+--------------------------+----------+---------+-------+----------------------------+-----------------+-------------+
                                                                                                   
                                                                                           
(overcloud) [stack@undercloud-0 ~]$ nova show test | egrep "host"                          
| OS-EXT-SRV-ATTR:host                 | compute-1                                                |
| OS-EXT-SRV-ATTR:hostname             | test                                                     |
| OS-EXT-SRV-ATTR:hypervisor_hostname  | compute-1                                                |
| hostId                               | 12aa7552c1ce791f98be9cc648910d5dd61ee6c0c1e6650b018c8c69 |
| host_status                          | UNKNOWN                                                  |


MariaDB [nova]> select created_at,updated_at,deleted_at,id,host from services where deleted <>1;
+---------------------+---------------------+------------+----+--------------------------+
| created_at          | updated_at          | deleted_at | id | host                     |
+---------------------+---------------------+------------+----+--------------------------+
| 2018-12-10 11:12:10 | 2018-12-10 15:34:46 | NULL       |  1 | controller-0.localdomain |
| 2018-12-10 11:12:16 | 2018-12-10 15:34:47 | NULL       |  2 | controller-0.localdomain |
| 2018-12-10 11:12:25 | 2018-12-10 15:34:46 | NULL       |  3 | controller-0.localdomain |
| 2018-12-10 11:12:31 | 2018-12-10 15:30:39 | NULL       |  4 | compute-1                |
| 2018-12-10 11:12:36 | 2018-12-10 15:34:45 | NULL       |  5 | compute-0.localdomain    |
| 2018-12-10 11:12:41 | NULL                | NULL       |  6 | 172.17.1.12              |
| 2018-12-10 11:12:44 | NULL                | NULL       |  8 | controller-0.localdomain |
| 2018-12-10 15:30:51 | 2018-12-10 15:34:48 | NULL       | 10 | compute-1.localdomain    |
+---------------------+---------------------+------------+----+--------------------------+
8 rows in set (0.00 sec)      


In general we'd need to:
1) delete the new service entry
2) update the host of the old service entry to match the new FQDN (host= parameter)
3) update the instance to reference the new host (OS-EXT-SRV-ATTR:host and OS-EXT-SRV-ATTR:hypervisor_hostname)
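
As a rough illustration of the three steps above, here is a minimal Python sketch that runs them against in-memory SQLite stand-ins for the nova `services` and `instances` tables. The table layout is heavily simplified and the helper names (`make_toy_db`, `apply_three_steps`) are hypothetical; this is not the real nova schema and not the fix that shipped in puppet-tripleo. Step 3 is also generalized here to match on the old host instead of a single uuid.

```python
import sqlite3


def make_toy_db():
    """Build in-memory stand-ins for the nova `services` and `instances` tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE services (
            id INTEGER PRIMARY KEY, host TEXT,
            deleted INTEGER DEFAULT 0, deleted_at TEXT
        );
        CREATE TABLE instances (
            uuid TEXT PRIMARY KEY, host TEXT, node TEXT, deleted_at TEXT
        );
        INSERT INTO services (id, host) VALUES
            (4, 'compute-1'),
            (5, 'compute-0.localdomain'),
            (10, 'compute-1.localdomain');
        INSERT INTO instances (uuid, host, node) VALUES
            ('e097ee21-aa84-483a-a5b5-2b2f26a348be', 'compute-1', 'compute-1');
        """
    )
    return conn


def apply_three_steps(conn, old_host, new_host):
    """Apply steps 1) to 3) for one compute host; order matters."""
    cur = conn.cursor()
    # 1) soft-delete the service row that was created under the new FQDN
    cur.execute(
        "UPDATE services SET deleted = 1, deleted_at = datetime('now') "
        "WHERE host = ? AND deleted = 0",
        (new_host,),
    )
    # 2) rename the old service row so it carries the new FQDN
    cur.execute(
        "UPDATE services SET host = ? WHERE host = ? AND deleted = 0",
        (new_host, old_host),
    )
    # 3) point live instances at the new host and node
    cur.execute(
        "UPDATE instances SET host = ?, node = ? "
        "WHERE host = ? AND deleted_at IS NULL",
        (new_host, new_host, old_host),
    )
    conn.commit()
```

Running step 1 before step 2 is deliberate: in the reverse order both service rows would carry the FQDN and both would be soft-deleted.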



1) delete the new service entry
MariaDB [nova]> update services set deleted_at=now(), deleted=1 where id=10;
Query OK, 1 row affected (0.01 sec)
Rows matched: 1  Changed: 1  Warnings: 0
                              
2) update host of old service                                                             
MariaDB [nova]> update services set host='compute-1.localdomain' where id=4;                                                                
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0
 
MariaDB [nova]> select created_at,updated_at,deleted_at,id,host from services where deleted <>1;
+---------------------+---------------------+------------+----+--------------------------+
| created_at          | updated_at          | deleted_at | id | host                     |
+---------------------+---------------------+------------+----+--------------------------+
| 2018-12-10 11:12:10 | 2018-12-10 15:37:36 | NULL       |  1 | controller-0.localdomain |
| 2018-12-10 11:12:16 | 2018-12-10 15:37:37 | NULL       |  2 | controller-0.localdomain |
| 2018-12-10 11:12:25 | 2018-12-10 15:37:36 | NULL       |  3 | controller-0.localdomain |
| 2018-12-10 11:12:31 | 2018-12-10 15:30:39 | NULL       |  4 | compute-1.localdomain    |
| 2018-12-10 11:12:36 | 2018-12-10 15:37:35 | NULL       |  5 | compute-0.localdomain    |
| 2018-12-10 11:12:41 | NULL                | NULL       |  6 | 172.17.1.12              |
| 2018-12-10 11:12:44 | NULL                | NULL       |  8 | controller-0.localdomain |
+---------------------+---------------------+------------+----+--------------------------+
7 rows in set (0.00 sec)

3) update instance                                                
MariaDB [nova]> update instances set host='compute-1.localdomain',node='compute-1.localdomain' where uuid='e097ee21-aa84-483a-a5b5-2b2f26a348be';
Query OK, 1 row affected (0.00 sec)                               
Rows matched: 1  Changed: 1  Warnings: 0                          
                                                                  
(overcloud) [stack@undercloud-0 ~]$ nova show test | egrep "host"                                  
| OS-EXT-SRV-ATTR:host                 | compute-1.localdomain                                    |
| OS-EXT-SRV-ATTR:hostname             | test                                                     |
| OS-EXT-SRV-ATTR:hypervisor_hostname  | compute-1.localdomain                                    |
| hostId                               | 07b7b695c2c3a4df635a23e84aff9993fc91347edd40b84d0d9fbdba |
| host_status                          | UNKNOWN                                                  |
                                                                 
(overcloud) [stack@undercloud-0 ~]$ openstack server list --long                                                                                                                                                                                                           
+--------------------------------------+------+--------+------------+-------------+---------------------+------------+--------------------------------------+-------------+--------------------------------------+-------------------+-----------------------+------------+
| ID                                   | Name | Status | Task State | Power State | Networks            | Image Name | Image ID                             | Flavor Name | Flavor ID                            | Availability Zone | Host                  | Properties |
+--------------------------------------+------+--------+------------+-------------+---------------------+------------+--------------------------------------+-------------+--------------------------------------+-------------------+-----------------------+------------+
| e097ee21-aa84-483a-a5b5-2b2f26a348be | test | ACTIVE | None       | Running     | private=192.168.0.3 | cirros     | 8d9717d5-20f3-4240-a9f7-7770515c9117 | m1.small    | ebf406c7-57a8-404d-b419-dbc9ad68027c | nova              | compute-1.localdomain |            |
+--------------------------------------+------+--------+------------+-------------+---------------------+------------+--------------------------------------+-------------+--------------------------------------+-------------------+-----------------------+------------+

Comment 2 Martin Schuppert 2018-12-10 17:08:50 UTC
We could probably just do 3) and remove the old service entries in a later step (this could be a manual cleanup step).

3) update the instance to reference the new host (OS-EXT-SRV-ATTR:host and OS-EXT-SRV-ATTR:hypervisor_hostname)

Note: the OVS agent might also need an update of the port bindings, depending on the network type used:

MariaDB [ovs_neutron]> select * from ml2_port_bindings;
+--------------------------------------+--------------------------+----------+-----------+---------+---------------------------------------------------------------------------+--------+
| port_id                              | host                     | vif_type | vnic_type | profile | vif_details                                                               | status |
+--------------------------------------+--------------------------+----------+-----------+---------+---------------------------------------------------------------------------+--------+
| b0d10735-21e2-47aa-8c50-9124d30577b3 | compute-1.localdomain    | ovs      | normal    |         | {"port_filter": true, "datapath_type": "system", "ovs_hybrid_plug": true} | ACTIVE |
| f8d6edb4-b5a3-4761-b9d6-4655c97d34cc | controller-0.localdomain | ovs      | normal    |         | {"port_filter": true, "datapath_type": "system", "ovs_hybrid_plug": true} | ACTIVE |
+--------------------------------------+--------------------------+----------+-----------+---------+---------------------------------------------------------------------------+--------+
2 rows in set (0.00 sec)

Comment 3 Martin Schuppert 2018-12-11 10:55:25 UTC
Step 3 is enough. Four instances on two computes:

MariaDB [nova]> select created_at,updated_at,deleted_at,id,host,node from instances;                                                        
+---------------------+---------------------+---------------------+----+-----------+-----------------------+
| created_at          | updated_at          | deleted_at          | id | host      | node                  |
+---------------------+---------------------+---------------------+----+-----------+-----------------------+
| 2018-12-11 09:59:10 | 2018-12-11 10:06:04 | 2018-12-11 10:06:04 |  1 | NULL      | NULL                  |
| 2018-12-11 09:59:14 | 2018-12-11 10:06:04 | 2018-12-11 10:06:05 |  2 | NULL      | NULL                  |
| 2018-12-11 09:59:19 | 2018-12-11 10:06:05 | 2018-12-11 10:06:05 |  3 | NULL      | NULL                  |
| 2018-12-11 09:59:24 | 2018-12-11 10:06:05 | 2018-12-11 10:06:06 |  4 | NULL      | NULL                  |
| 2018-12-11 10:06:14 | 2018-12-11 10:09:24 | 2018-12-11 10:09:24 |  5 | NULL      | NULL                  |
| 2018-12-11 10:09:29 | 2018-12-11 10:19:22 | NULL                |  6 | compute-0 | compute-0.localdomain |
| 2018-12-11 10:09:44 | 2018-12-11 10:16:45 | NULL                |  7 | compute-1 | compute-1.localdomain |
| 2018-12-11 10:09:48 | 2018-12-11 10:16:48 | NULL                |  8 | compute-1 | compute-1.localdomain |
| 2018-12-11 10:09:52 | 2018-12-11 10:16:51 | NULL                |  9 | compute-0 | compute-0.localdomain |
+---------------------+---------------------+---------------------+----+-----------+-----------------------+
9 rows in set (0.00 sec)
 
$ nova show test1
+--------------------------------------+----------------------------------------------------------+
| Property                             | Value                                                    |
+--------------------------------------+----------------------------------------------------------+
| OS-DCF:diskConfig                    | MANUAL                                                   |
| OS-EXT-AZ:availability_zone          | nova                                                     |
| OS-EXT-SRV-ATTR:host                 | compute-0                                                |
| OS-EXT-SRV-ATTR:hostname             | test1                                                    |
| OS-EXT-SRV-ATTR:hypervisor_hostname  | compute-0.localdomain                                    |
| OS-EXT-SRV-ATTR:instance_name        | instance-00000006                                        |

Update the host column, setting it to the FQDN stored in the node column:

MariaDB [nova]> update instances set host=node where host <> node and deleted_at is NULL;

MariaDB [nova]> select created_at,updated_at,deleted_at,id,host,node from instances;
+---------------------+---------------------+---------------------+----+-----------------------+-----------------------+
| created_at          | updated_at          | deleted_at          | id | host                  | node                  |
+---------------------+---------------------+---------------------+----+-----------------------+-----------------------+
| 2018-12-11 09:59:10 | 2018-12-11 10:06:04 | 2018-12-11 10:06:04 |  1 | NULL                  | NULL                  |
| 2018-12-11 09:59:14 | 2018-12-11 10:06:04 | 2018-12-11 10:06:05 |  2 | NULL                  | NULL                  |
| 2018-12-11 09:59:19 | 2018-12-11 10:06:05 | 2018-12-11 10:06:05 |  3 | NULL                  | NULL                  |
| 2018-12-11 09:59:24 | 2018-12-11 10:06:05 | 2018-12-11 10:06:06 |  4 | NULL                  | NULL                  |
| 2018-12-11 10:06:14 | 2018-12-11 10:09:24 | 2018-12-11 10:09:24 |  5 | NULL                  | NULL                  |
| 2018-12-11 10:09:29 | 2018-12-11 10:19:22 | NULL                |  6 | compute-0.localdomain | compute-0.localdomain |
| 2018-12-11 10:09:44 | 2018-12-11 10:16:45 | NULL                |  7 | compute-1.localdomain | compute-1.localdomain |
| 2018-12-11 10:09:48 | 2018-12-11 10:16:48 | NULL                |  8 | compute-1.localdomain | compute-1.localdomain |
| 2018-12-11 10:09:52 | 2018-12-11 10:16:51 | NULL                |  9 | compute-0.localdomain | compute-0.localdomain |
+---------------------+---------------------+---------------------+----+-----------------------+-----------------------+
9 rows in set (0.00 sec)
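
Before running a bulk update like the one above, it can help to preview which rows it would touch. The following is a minimal Python sketch against an in-memory SQLite copy of a few rows from the table above; the helper names (`make_instances_table`, `preview_host_fix`, `apply_host_fix`) are hypothetical and the table is a simplified stand-in for the nova `instances` table. Deleted rows with NULL host and node are skipped, both by the `deleted_at IS NULL` filter and because `host <> node` does not match NULL values.

```python
import sqlite3


def make_instances_table():
    """In-memory stand-in holding rows 5 to 9 from the listing above."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE instances ("
        "id INTEGER PRIMARY KEY, host TEXT, node TEXT, deleted_at TEXT)"
    )
    conn.executemany(
        "INSERT INTO instances (id, host, node, deleted_at) VALUES (?, ?, ?, ?)",
        [
            (5, None, None, "2018-12-11 10:09:24"),
            (6, "compute-0", "compute-0.localdomain", None),
            (7, "compute-1", "compute-1.localdomain", None),
            (8, "compute-1", "compute-1.localdomain", None),
            (9, "compute-0", "compute-0.localdomain", None),
        ],
    )
    conn.commit()
    return conn


def preview_host_fix(conn):
    """Return (id, host, node) for live instances the update would change."""
    return conn.execute(
        "SELECT id, host, node FROM instances "
        "WHERE host <> node AND deleted_at IS NULL ORDER BY id"
    ).fetchall()


def apply_host_fix(conn):
    """Set host to node for live instances; return the number of changed rows."""
    cur = conn.execute(
        "UPDATE instances SET host = node "
        "WHERE host <> node AND deleted_at IS NULL"
    )
    conn.commit()
    return cur.rowcount
```

Comparing the length of the preview with the returned row count gives a cheap sanity check that only the expected instances were changed.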
 
$ nova show test1
+--------------------------------------+----------------------------------------------------------+
| Property                             | Value                                                    |
+--------------------------------------+----------------------------------------------------------+
| OS-DCF:diskConfig                    | MANUAL                                                   |
| OS-EXT-AZ:availability_zone          | nova                                                     |
| OS-EXT-SRV-ATTR:host                 | compute-0.localdomain                                    |
| OS-EXT-SRV-ATTR:hostname             | test1                                                    |
| OS-EXT-SRV-ATTR:hypervisor_hostname  | compute-0.localdomain                                    |
| OS-EXT-SRV-ATTR:instance_name        | instance-00000006                                        |

$ nova reset-state  --active test1
Reset state for server test1 succeeded; new state is active
$ nova stop test1
Request to stop server test1 has been accepted.
$ nova start test1
Request to start server test1 has been accepted.


Afterwards we can clean up the down service entries:
$ nova service-list
+----+------------------+--------------------------+----------+---------+-------+----------------------------+-----------------+
| Id | Binary           | Host                     | Zone     | Status  | State | Updated_at                 | Disabled Reason |
+----+------------------+--------------------------+----------+---------+-------+----------------------------+-----------------+
| 5  | nova-consoleauth | controller-0.localdomain | internal | enabled | up    | 2018-12-11T10:27:47.000000 | -               |
| 6  | nova-scheduler   | controller-0.localdomain | internal | enabled | up    | 2018-12-11T10:27:43.000000 | -               |
| 7  | nova-conductor   | controller-0.localdomain | internal | enabled | up    | 2018-12-11T10:27:44.000000 | -               |
| 11 | nova-compute     | compute-0                | nova     | enabled | down  | 2018-12-11T10:17:07.000000 | -               |
| 12 | nova-compute     | compute-1                | nova     | enabled | down  | 2018-12-11T10:17:42.000000 | -               |
| 13 | nova-compute     | compute-0.localdomain    | nova     | enabled | up    | 2018-12-11T10:27:46.000000 | -               |
| 14 | nova-compute     | compute-1.localdomain    | nova     | enabled | up    | 2018-12-11T10:27:44.000000 | -               |
+----+------------------+--------------------------+----------+---------+-------+----------------------------+-----------------+
$ nova service-delete 11
$ nova service-delete 12
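
To pick the cleanup candidates less manually, one option is to treat a nova-compute entry as stale only when it is down and an up entry exists for the same short name plus a domain suffix. A minimal Python sketch of that rule follows, using the service list above as plain data; the function name `stale_compute_services` and the dict layout are assumptions made for illustration, and the sketch only reports ids, it does not call nova.

```python
def stale_compute_services(services):
    """Return ids of down nova-compute entries that have an up FQDN twin.

    `services` is a list of dicts with the keys id, binary, host and state.
    """
    up_hosts = {
        s["host"]
        for s in services
        if s["binary"] == "nova-compute" and s["state"] == "up"
    }
    stale = []
    for s in services:
        if s["binary"] != "nova-compute" or s["state"] != "down":
            continue
        # stale only if some up host is "<short name>.<domain>"
        if any(h.startswith(s["host"] + ".") for h in up_hosts):
            stale.append(s["id"])
    return sorted(stale)


# Rows copied from the `nova service-list` output above.
SERVICES = [
    {"id": 5, "binary": "nova-consoleauth", "host": "controller-0.localdomain", "state": "up"},
    {"id": 6, "binary": "nova-scheduler", "host": "controller-0.localdomain", "state": "up"},
    {"id": 7, "binary": "nova-conductor", "host": "controller-0.localdomain", "state": "up"},
    {"id": 11, "binary": "nova-compute", "host": "compute-0", "state": "down"},
    {"id": 12, "binary": "nova-compute", "host": "compute-1", "state": "down"},
    {"id": 13, "binary": "nova-compute", "host": "compute-0.localdomain", "state": "up"},
    {"id": 14, "binary": "nova-compute", "host": "compute-1.localdomain", "state": "up"},
]

print(stale_compute_services(SERVICES))
```

A down compute service without an up FQDN counterpart is left alone, since it may simply be a host that is offline.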


Note: Neutron with OVS does not seem to be strict about the host reference in the port bindings when the host= entry changes in neutron.conf on the compute node. Networking still works with the short hostname as the host reference for existing ports. New ports get the FQDN reference:

MariaDB [ovs_neutron]> select * from ml2_port_bindings;
+--------------------------------------+--------------------------+----------+-----------+---------+------------------------------------------------+
| port_id                              | host                     | vif_type | vnic_type | profile | vif_details                                    |
+--------------------------------------+--------------------------+----------+-----------+---------+------------------------------------------------+
| 071aac12-2ead-48b3-ac25-89bd47f8827e | compute-1                | ovs      | normal    |         | {"port_filter": true, "ovs_hybrid_plug": true} |
| 65c4c4ee-10a6-4ee9-8bfe-2ca09d865932 | compute-0                | ovs      | normal    |         | {"port_filter": true, "ovs_hybrid_plug": true} |
| 95d8b3c9-52da-4a06-bd02-30b84e9558a1 | compute-1                | ovs      | normal    |         | {"port_filter": true, "ovs_hybrid_plug": true} |
| bc91eedf-3d24-4e99-8db4-ec079ed0337a | compute-1.localdomain    | ovs      | normal    |         | {"port_filter": true, "ovs_hybrid_plug": true} |
| d5edb47a-cb3d-418a-90e0-78c49f1dbf62 | compute-0                | ovs      | normal    |         | {"port_filter": true, "ovs_hybrid_plug": true} |
| f0118f75-20c4-4ed9-891d-5498b7b1c20b | controller-0.localdomain | ovs      | normal    |         | {"port_filter": true, "ovs_hybrid_plug": true} |
+--------------------------------------+--------------------------+----------+-----------+---------+------------------------------------------------+
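
If someone wants to see which bindings still carry the short hostname, the mixed state above can be listed with a small Python sketch like the one below. It works on plain (port_id, host) pairs copied from the table; the function name `short_name_bindings` and the list of FQDN hosts are assumptions for illustration. It only reports, since per the note above networking keeps working with the short names.

```python
def short_name_bindings(bindings, fqdn_hosts):
    """Return (port_id, host, fqdn) for bindings using the short name of a known FQDN."""
    by_short = {fqdn.split(".", 1)[0]: fqdn for fqdn in fqdn_hosts}
    return [
        (port_id, host, by_short[host])
        for port_id, host in bindings
        if host in by_short and host != by_short[host]
    ]


# (port_id, host) pairs copied from the ml2_port_bindings output above.
BINDINGS = [
    ("071aac12-2ead-48b3-ac25-89bd47f8827e", "compute-1"),
    ("65c4c4ee-10a6-4ee9-8bfe-2ca09d865932", "compute-0"),
    ("95d8b3c9-52da-4a06-bd02-30b84e9558a1", "compute-1"),
    ("bc91eedf-3d24-4e99-8db4-ec079ed0337a", "compute-1.localdomain"),
    ("d5edb47a-cb3d-418a-90e0-78c49f1dbf62", "compute-0"),
    ("f0118f75-20c4-4ed9-891d-5498b7b1c20b", "controller-0.localdomain"),
]
FQDN_HOSTS = ["compute-0.localdomain", "compute-1.localdomain", "controller-0.localdomain"]

for port_id, host, fqdn in short_name_bindings(BINDINGS, FQDN_HOSTS):
    print(f"{port_id}: {host} -> {fqdn}")
```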

Comment 5 Lukas Bezdicka 2019-01-28 12:29:08 UTC
*** Bug 1641682 has been marked as a duplicate of this bug. ***

Comment 17 errata-xmlrpc 2019-04-30 16:58:51 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0921

