Description of problem:
On an HA + Instance HA deployment of OSP14 (puddle 2018-10-08.4), overcloud instance creation fails with: No valid host was found

How reproducible:
There is automation for this test at:
https://rhos-ci-staging-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/DFG/view/pidone/view/instance-ha/job/DFG-pidone-instance-ha-14_director-rhel-virthost-3cont_2comp-ipv4-vxlan-instance-ha-test-suite/

Overcloud and undercloud SOS reports are at:
http://rhos-release.virt.bos.redhat.com/log/pkomarov_sosreports/BZ1640472/

Additional info:
Some docker containers, in particular nova-api, are unhealthy, and the openstack-nova-compute containers are stuck restarting:

(undercloud) [stack@undercloud-0 ~]$ ansible overcloud -mshell -b -a'docker ps |grep "unhealthy\|Restarting"'
[WARNING]: Found both group and host with same name: undercloud

overcloud-novacomputeiha-0 | SUCCESS | rc=0 >>
1c200146064d  192.168.24.1:8787/rhosp14/openstack-nova-compute:2018-10-08.4  "kolla_start"  47 hours ago  Restarting (1) 17 hours ago  nova_compute

overcloud-novacomputeiha-1 | SUCCESS | rc=0 >>
570f57b4c17f  192.168.24.1:8787/rhosp14/openstack-nova-compute:2018-10-08.4  "kolla_start"  47 hours ago  Restarting (1) 17 hours ago  nova_compute

controller-0 | SUCCESS | rc=0 >>
9c51edf48fbc  192.168.24.1:8787/rhosp14/openstack-nova-api:2018-10-08.4  "kolla_start"  47 hours ago  Up 47 hours (unhealthy)  nova_metadata
ae15fc9581ec  192.168.24.1:8787/rhosp14/openstack-heat-api-cfn:2018-10-08.4  "kolla_start"  47 hours ago  Up 47 hours (unhealthy)  heat_api_cfn
cc3b59e3e95b  192.168.24.1:8787/rhosp14/openstack-heat-api:2018-10-08.4  "kolla_start"  47 hours ago  Up 47 hours (unhealthy)  heat_api

controller-2 | SUCCESS | rc=0 >>
414f46d5bb00  192.168.24.1:8787/rhosp14/openstack-nova-api:2018-10-08.4  "kolla_start"  47 hours ago  Up 47 hours (unhealthy)  nova_metadata
17f772c3968c  192.168.24.1:8787/rhosp14/openstack-heat-api-cfn:2018-10-08.4  "kolla_start"  47 hours ago  Up 47 hours (unhealthy)  heat_api_cfn
cf347ef00ecd  192.168.24.1:8787/rhosp14/openstack-neutron-server:2018-10-08.4  "kolla_start"  47 hours ago  Up 47 hours (unhealthy)  neutron_api
0db2c0b1afe8  192.168.24.1:8787/rhosp14/openstack-heat-api:2018-10-08.4  "kolla_start"  47 hours ago  Up 47 hours (unhealthy)  heat_api

controller-1 | SUCCESS | rc=0 >>
552e905d6cbb  192.168.24.1:8787/rhosp14/openstack-nova-api:2018-10-08.4  "kolla_start"  47 hours ago  Up 47 hours (unhealthy)  nova_metadata
e06c85a37b8c  192.168.24.1:8787/rhosp14/openstack-heat-api-cfn:2018-10-08.4  "kolla_start"  47 hours ago  Up 47 hours (unhealthy)  heat_api_cfn
d8487659d15d  192.168.24.1:8787/rhosp14/openstack-neutron-server:2018-10-08.4  "kolla_start"  47 hours ago  Up 25 hours (unhealthy)  neutron_api
bbe41eac5d07  192.168.24.1:8787/rhosp14/openstack-heat-api:2018-10-08.4  "kolla_start"  47 hours ago  Up 47 hours (unhealthy)  heat_api
Issue seems due to this:

Oct 16 09:08:11 overcloud-novacomputeiha-0 dockerd-current[14673]: ++ cat /run_command
Oct 16 09:08:11 overcloud-novacomputeiha-0 dockerd-current[14673]: + CMD='/var/lib/nova/instanceha/check-run-nova-compute '
Oct 16 09:08:11 overcloud-novacomputeiha-0 dockerd-current[14673]: + ARGS=
Oct 16 09:08:11 overcloud-novacomputeiha-0 dockerd-current[14673]: + [[ ! -n '' ]]
Oct 16 09:08:11 overcloud-novacomputeiha-0 dockerd-current[14673]: + . kolla_extend_start
Oct 16 09:08:11 overcloud-novacomputeiha-0 dockerd-current[14673]: ++ [[ ! -d /var/log/kolla/nova ]]
Oct 16 09:08:11 overcloud-novacomputeiha-0 dockerd-current[14673]: +++ stat -c %a /var/log/kolla/nova
Oct 16 09:08:11 overcloud-novacomputeiha-0 dockerd-current[14673]: ++ [[ 2755 != \7\5\5 ]]
Oct 16 09:08:11 overcloud-novacomputeiha-0 dockerd-current[14673]: ++ chmod 755 /var/log/kolla/nova
Oct 16 09:08:11 overcloud-novacomputeiha-0 dockerd-current[14673]: ++ . /usr/local/bin/kolla_nova_extend_start
Oct 16 09:08:11 overcloud-novacomputeiha-0 dockerd-current[14673]: +++ [[ ! -d /var/lib/nova/instances ]]
Oct 16 09:08:11 overcloud-novacomputeiha-0 dockerd-current[14673]: Running command: '/var/lib/nova/instanceha/check-run-nova-compute '
Oct 16 09:08:11 overcloud-novacomputeiha-0 dockerd-current[14673]: + echo 'Running command: '\''/var/lib/nova/instanceha/check-run-nova-compute '\'''
Oct 16 09:08:11 overcloud-novacomputeiha-0 dockerd-current[14673]: + exec /var/lib/nova/instanceha/check-run-nova-compute
Oct 16 09:08:12 overcloud-novacomputeiha-0 dockerd-current[14673]: Traceback (most recent call last):
Oct 16 09:08:12 overcloud-novacomputeiha-0 dockerd-current[14673]: File "/var/lib/nova/instanceha/check-run-nova-compute", line 191, in <module>
Oct 16 09:08:12 overcloud-novacomputeiha-0 dockerd-current[14673]: connection = create_nova_connection(config.sections["placement"])
Oct 16 09:08:12 overcloud-novacomputeiha-0 dockerd-current[14673]: File "/var/lib/nova/instanceha/check-run-nova-compute", line 147, in create_nova_connection
Oct 16 09:08:12 overcloud-novacomputeiha-0 dockerd-current[14673]: region_name=options["os_region_name"][0],
Oct 16 09:08:12 overcloud-novacomputeiha-0 dockerd-current[14673]: KeyError: 'os_region_name'
OSP13:
[root@compute-0 nova]# crudini --get /var/lib/config-data/puppet-generated/nova_libvirt/etc/nova/nova.conf placement os_region_name
regionOne

OSP14:
[root@overcloud-novacompute-0 nova]# crudini --get /var/lib/config-data/puppet-generated/nova_libvirt/etc/nova/nova.conf placement os_region_name
Parameter not found: os_region_name

Indeed, in OSP14 it is commented out:
[root@overcloud-novacompute-0 nova]# grep -ir os_region
nova.conf:#os_region_name=<None>
This broke because os_region_name is now deprecated and was replaced by region_name in the following commit:

commit f2e72352b1376ce719614e9cad4e4c71a3f9c3d8
Author: Juan Antonio Osorio Robles <jaosorior>
Date:   Thu Oct 4 15:52:40 2018 +0300

    Fix placement region setting

    We were using a deprecated interfce to set this value. This uses the
    correct one.

    Closes-Bug: #1793665
    Change-Id: Ib7717911aba3267f855ac6682b0144bfe92034fb

diff --git a/puppet/services/nova-base.yaml b/puppet/services/nova-base.yaml
index f12b0d816dea..3e43b8cf7477 100644
--- a/puppet/services/nova-base.yaml
+++ b/puppet/services/nova-base.yaml
@@ -260,7 +260,7 @@ outputs:
             nova::placement::project_name: 'service'
             nova::placement::password: {get_param: NovaPassword}
             nova::placement::auth_url: {get_param: [EndpointMap, KeystoneInternal, uri_no_suffix]}
-            nova::placement::os_region_name: {get_param: KeystoneRegion}
+            nova::placement::region_name: {get_param: KeystoneRegion}
             nova::placement::os_interface: {get_param: NovaPlacementAPIInterface}
             nova::database_connection:
               make_url:

We need to use region_name:

- OSP14
[root@overcloud-novacompute-0 nova]# crudini --get /var/lib/config-data/puppet-generated/nova_libvirt/etc/nova/nova.conf placement region_name
regionOne

- OSP13
[root@compute-0 nova]# crudini --get /var/lib/config-data/puppet-generated/nova_libvirt/etc/nova/nova.conf placement region_name
Parameter not found: region_name
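The crudini checks above can also be scripted. The following is only a sketch, assuming the [placement] section can be read with Python's standard configparser; the helper name placement_region and the two inline sample configs are hypothetical and merely mirror the OSP13 and OSP14 outputs shown in this comment.

```python
import configparser


def placement_region(conf_text):
    """Return (option_name, value) for the region option set in [placement].

    Hypothetical helper: prefers the new 'region_name' option and falls
    back to the deprecated 'os_region_name'. Returns (None, None) when
    neither is set (commented-out lines are ignored by configparser).
    """
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(conf_text)
    for key in ("region_name", "os_region_name"):
        if parser.has_option("placement", key):
            return key, parser.get("placement", key)
    return None, None


# Sample snippets modelled on the crudini/grep output above (assumptions,
# not copies of the real nova.conf files).
OSP13_CONF = "[placement]\nos_region_name=regionOne\n"
OSP14_CONF = "[placement]\n#os_region_name=<None>\nregion_name=regionOne\n"

print(placement_region(OSP13_CONF))
print(placement_region(OSP14_CONF))
```

To check a deployed node, the text of /var/lib/config-data/puppet-generated/nova_libvirt/etc/nova/nova.conf could be passed in instead of the sample strings.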
Linked patch verified (https://review.openstack.org/611551).

#Redeployed using the attached patch:
[root@undercloud-0 ~]# diff /usr/share/openstack-tripleo-heat-templates/extraconfig/tasks/instanceha/check-run-nova-compute ./check-run-nova-compute_ORG
114,120d113
<     if 'region_name' in options:
<         region = options['region_name'][0]
<     elif 'os_region_name' in options:
<         region = options['os_region_name'][0]
<     else: # We actually try to make a client call even with an empty region
<         region = None
<
146c139
<         region_name=region,
---
>         region_name=options["os_region_name"][0],
154c147
<         region_name=region,
---
>         region_name=options["os_region_name"][0],

#Now instance creation is successful:
(overcloud) [stack@undercloud-0 ~]$ openstack server create --flavor m1.nano --image cirros-0.3.4-x86_64-disk --wait osvm
+-------------------------------------+-----------------------------------------------------------------+
| Field                               | Value                                                           |
+-------------------------------------+-----------------------------------------------------------------+
| OS-DCF:diskConfig                   | MANUAL                                                          |
| OS-EXT-AZ:availability_zone         | nova                                                            |
| OS-EXT-SRV-ATTR:host                | overcloud-novacomputeiha-0.localdomain                          |
| OS-EXT-SRV-ATTR:hypervisor_hostname | overcloud-novacomputeiha-0.localdomain                          |
| OS-EXT-SRV-ATTR:instance_name       | instance-00000002                                               |
| OS-EXT-STS:power_state              | Running                                                         |
| OS-EXT-STS:task_state               | None                                                            |
| OS-EXT-STS:vm_state                 | active                                                          |
...//...
#And nova_compute dockers on the computes are in healthy state:
(overcloud) [stack@undercloud-0 ~]$ ansible compute -mshell -b -a'docker ps |grep nova_compute'
[WARNING]: Found both group and host with same name: undercloud

overcloud-novacomputeiha-1 | SUCCESS | rc=0 >>
37652a3d01f2  192.168.24.1:8787/rhosp14/openstack-nova-compute:2018-10-10.3  "kolla_start"  4 hours ago  Up 4 hours (healthy)  nova_compute

overcloud-novacomputeiha-0 | SUCCESS | rc=0 >>
abebac17239f  192.168.24.1:8787/rhosp14/openstack-nova-compute:2018-10-10.3  "kolla_start"  4 hours ago  Up 4 hours (healthy)  nova_compute
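For reference, the region fallback that the patch adds to check-run-nova-compute can be isolated as a small standalone function. This is a sketch rather than the script's actual layout: the name resolve_region is hypothetical, and it assumes, as the script's own indexing (options["os_region_name"][0]) suggests, that options maps each option name to a list of values.

```python
def resolve_region(options):
    """Pick the placement region from parsed [placement] options.

    Assumes 'options' maps option names to lists of values, matching the
    options["os_region_name"][0] access seen in the traceback.
    """
    if "region_name" in options:
        # Option name used by the OSP14 templates.
        return options["region_name"][0]
    if "os_region_name" in options:
        # Deprecated option name still written by OSP13.
        return options["os_region_name"][0]
    # Neither option present: return None instead of raising KeyError,
    # so the client call is still attempted without a region.
    return None


print(resolve_region({"region_name": ["regionOne"]}))
print(resolve_region({"os_region_name": ["regionOne"]}))
print(resolve_region({}))
```

With this ordering, a nova.conf that carries only the deprecated key keeps working, and a config with neither key no longer crashes the container at startup.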
Verified - comment 7
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2019:0045