Description of problem: Deployment fails on one of the controller nodes. The deployment is on 3 controllers, 5 computes, and 2 ceph nodes at the moment. The error that causes the stack to fail comes from puppet on Controller-0:

Aug 11 11:24:29 localhost os-collect-config: ::Server[6000]): The default incoming_chmod set to 0644 may yield in error prone directories and will be changed in a later release.\u001b[0m\n\u001b[1;31mWarning: Scope(Swift::Storage::Server[6000]): The default outgoing_chmod set to 0644 may yield in error prone directories and will be changed in a later release.\u001b[0m\n\u001b[1;31mWarning: The package type's allow_virtual parameter will be changing its default value from false to true in a future release. If you do not want to allow virtual packages, please explicitly set allow_virtual to false.\n (at /usr/share/ruby/vendor_ruby/puppet/type.rb:816:in `set_default')\u001b[0m\n\u001b[1;31mError: Received error response from Keystone server at http://172.17.0.10:35357/v3/domains: Unauthorized\u001b[0m\n\u001b[1;31mError: /Stage[main]/Heat::Keystone::Domain/Heat_domain_id_setter[heat_domain_id]/ensure: change from absent to present failed: Received error response from Keystone server at http://172.17.0.10:35357/v3/domains: Unauthorized\u001b[0m\n", "deploy_status_code": 6}

The interesting part of that error (formatted nicely):

Error: Received error response from Keystone server at http://172.17.0.10:35357/v3/domains: Unauthorized
Error: /Stage[main]/Heat::Keystone::Domain/Heat_domain_id_setter[heat_domain_id]/ensure: change from absent to present failed: Received error response from Keystone server at http://172.17.0.10:35357/v3/domains: Unauthorized

The connection to the keystone service works, as does everything else in the network I was able to test. In the Overcloud keystone logs, I see there's a login request for ceilometer that returns a 401:Unauthorized and seems to be unrelated, but there don't seem to be other failing requests.
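For anyone who wants to pull the "interesting part" out of this kind of output without reading through the color codes, here is a minimal Python sketch. It assumes the input is the puppet stderr text, either already JSON-decoded (real ESC characters) or copied from /var/log/messages (where ESC is shown as ^[). The function name extract_puppet_errors is hypothetical and is not part of any TripleO tool.

```python
import re
from typing import List

# Matches ANSI color sequences such as ESC[1;31m and ESC[0m, both as a real
# ESC character and in the caret notation (^[) that /var/log/messages shows.
ANSI_COLOR = re.compile(r"(?:\x1b|\^\[)\[[0-9;]*m")


def extract_puppet_errors(text: str) -> List[str]:
    """Return the puppet 'Error: ...' messages found in text, colors removed.

    Hypothetical helper: any syslog prefix in front of 'Error: ' is dropped.
    """
    errors = []
    for line in ANSI_COLOR.sub("", text).splitlines():
        idx = line.find("Error: ")
        if idx != -1:
            errors.append(line[idx:].rstrip())
    return errors


if __name__ == "__main__":
    sample = (
        "\u001b[1;31mWarning: Scope(Swift::Storage::Server[6000]): The default "
        "outgoing_chmod set to 0644 may yield in error prone directories and "
        "will be changed in a later release.\u001b[0m\n"
        "\u001b[1;31mError: Received error response from Keystone server at "
        "http://172.17.0.10:35357/v3/domains: Unauthorized\u001b[0m\n"
    )
    for message in extract_puppet_errors(sample):
        print(message)
```

Warnings are dropped on purpose, since in the output above only the two Error lines explain the non-zero status code.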
Hitting this in the BAGL lab.

Aug 11 16:59:55 localhost os-collect-config: [2015-08-11 16:59:55,117] (heat-config) [ERROR] Error running /var/lib/heat-config/heat-config-puppet/43fe7cb8-0d1f-473a-b056-4364ee7f4e8d.pp. [6]
Aug 11 16:59:55 localhost os-collect-config: ^[[1;31mError: Received error response from Keystone server at http://172.21.33.11:35357/v3/domains: Unauthorized^[[0m
Hi, we had the same issue. Your undercloud needs to be able to communicate with your overcloud external network.
I don't think this is a networking or communication issue, because Keystone does respond, with "Unauthorized". I guess this is a domain configuration issue.
FWIW, I saw this error when I had not specified a value for --ntp-server during a deployment.
I believe this is caused by the heat domain creation script hitting asynchronous corner cases during multi-controller deployment. Once [1] and [2] are in use, this issue should go away. I will rebase [2] and run some tests.

[1] https://review.openstack.org/#/c/204541/
[2] https://review.openstack.org/#/c/180566/17
*** Bug 1251156 has been marked as a duplicate of this bug. ***
Is there a workaround for this issue? We have been hitting it on most of our deployments.
Any time I've seen this, it has been an underlying issue with the keystone setup on the controllers; the domain creation script just happens to be the first thing to use keystone. Can you confirm that the time on each host is in sync? Clocks that are out of sync can cause tokens to be seen as invalid.
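To make the time-sync check concrete, here is a rough Python sketch. It assumes you have already collected one epoch timestamp per host at roughly the same moment (for example by running `date +%s` on each node; how you collect them is up to you). The function name find_clock_skew, the host names, and the 5 second tolerance are all hypothetical choices for illustration, not values taken from keystone.

```python
import statistics
from typing import Dict, List, Tuple


def find_clock_skew(host_times: Dict[str, float],
                    tolerance_s: float = 5.0) -> List[Tuple[str, float]]:
    """Return (host, offset_seconds) for hosts far from the median clock.

    host_times maps a host name to the epoch seconds it reported.
    tolerance_s is an arbitrary illustrative threshold.
    """
    if not host_times:
        return []
    median = statistics.median(host_times.values())
    return sorted(
        (host, reported - median)
        for host, reported in host_times.items()
        if abs(reported - median) > tolerance_s
    )


if __name__ == "__main__":
    # Made-up readings: controller-2 is several minutes ahead of the others.
    readings = {
        "overcloud-controller-0": 1441835376.0,
        "overcloud-controller-1": 1441835377.0,
        "overcloud-controller-2": 1441835776.0,
    }
    for host, offset in find_clock_skew(readings):
        print(f"{host} is off by {offset:+.0f}s")
```

Comparing against the median rather than against a single reference host means one badly skewed node does not make every other node look wrong.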
Splitting this bug because it needs a fix in openstack-puppet-modules. This part will be for the openstack-tripleo-heat-templates fix.
The hwclock is in sync on all nodes. The error in the keystone log is:

2015-08-31 22:44:38.246 31739 WARNING keystone.common.wsgi [-] Authorization failed. Could not find user: ceilometer (Disable debug mode to suppress these details.) (Disable debug mode to suppress these details.) from 192.0.2.37

We are hitting this issue consistently on one of the setups. Would you suggest patching [1] into the overcloud image, and [2] into the heat-templates?
Can we get the os-cloud-config logs from the controllers, please?
Below a short output of the os-cloud-config logs (snippet including the error): Sep 09 17:49:36 overcloud-controller-0.localdomain os-collect-config[4816]: + write_entries /etc/cloud/templates/hosts.redhat.tmpl '192.168.3.14 overcloud-compute-0.localdomain overcloud-compute-0 Sep 09 17:49:36 overcloud-controller-0.localdomain os-collect-config[4816]: 192.168.3.13 overcloud-controller-0.localdomain overcloud-controller-0 overcloud Sep 09 17:49:36 overcloud-controller-0.localdomain os-collect-config[4816]: 192.168.3.12 overcloud-controller-1.localdomain overcloud-controller-1 overcloud Sep 09 17:49:36 overcloud-controller-0.localdomain os-collect-config[4816]: 192.168.3.15 overcloud-controller-2.localdomain overcloud-controller-2 overcloud Sep 09 17:49:36 overcloud-controller-0.localdomain os-collect-config[4816]: 192.168.4.11 overcloud-cephstorage-0.localdomain overcloud-cephstorage-0' Sep 09 17:49:36 overcloud-controller-0.localdomain os-collect-config[4816]: + local file=/etc/cloud/templates/hosts.redhat.tmpl Sep 09 17:49:36 overcloud-controller-0.localdomain os-collect-config[4816]: + local 'entries=192.168.3.14 overcloud-compute-0.localdomain overcloud-compute-0 Sep 09 17:49:36 overcloud-controller-0.localdomain os-collect-config[4816]: 192.168.3.13 overcloud-controller-0.localdomain overcloud-controller-0 overcloud Sep 09 17:49:36 overcloud-controller-0.localdomain os-collect-config[4816]: 192.168.3.12 overcloud-controller-1.localdomain overcloud-controller-1 overcloud Sep 09 17:49:36 overcloud-controller-0.localdomain os-collect-config[4816]: 192.168.3.15 overcloud-controller-2.localdomain overcloud-controller-2 overcloud Sep 09 17:49:36 overcloud-controller-0.localdomain os-collect-config[4816]: 192.168.4.11 overcloud-cephstorage-0.localdomain overcloud-cephstorage-0' Sep 09 17:49:36 overcloud-controller-0.localdomain os-collect-config[4816]: + '[' '!' 
-f /etc/cloud/templates/hosts.redhat.tmpl ']' Sep 09 17:49:36 overcloud-controller-0.localdomain os-collect-config[4816]: + grep -q '^# HEAT_HOSTS_START' /etc/cloud/templates/hosts.redhat.tmpl Sep 09 17:49:36 overcloud-controller-0.localdomain os-collect-config[4816]: ++ mktemp Sep 09 17:49:36 overcloud-controller-0.localdomain os-collect-config[4816]: + temp=/tmp/tmp.48G4ZjjtgM Sep 09 17:49:36 overcloud-controller-0.localdomain os-collect-config[4816]: + awk -v 'v=192.168.3.14 overcloud-compute-0.localdomain overcloud-compute-0 Sep 09 17:49:36 overcloud-controller-0.localdomain os-collect-config[4816]: 192.168.3.13 overcloud-controller-0.localdomain overcloud-controller-0 overcloud Sep 09 17:49:36 overcloud-controller-0.localdomain os-collect-config[4816]: 192.168.3.12 overcloud-controller-1.localdomain overcloud-controller-1 overcloud Sep 09 17:49:36 overcloud-controller-0.localdomain os-collect-config[4816]: 192.168.3.15 overcloud-controller-2.localdomain overcloud-controller-2 overcloud Sep 09 17:49:36 overcloud-controller-0.localdomain os-collect-config[4816]: 192.168.4.11 overcloud-cephstorage-0.localdomain overcloud-cephstorage-0' '/^# HEAT_HOSTS_START/ { Sep 09 17:49:36 overcloud-controller-0.localdomain os-collect-config[4816]: print $0 Sep 09 17:49:36 overcloud-controller-0.localdomain os-collect-config[4816]: print v Sep 09 17:49:36 overcloud-controller-0.localdomain os-collect-config[4816]: f=1 Sep 09 17:49:36 overcloud-controller-0.localdomain os-collect-config[4816]: }f &&!/^# HEAT_HOSTS_END$/{next}/^# HEAT_HOSTS_END$/{f=0}!f' /etc/cloud/templates/hosts.redhat.tmpl Sep 09 17:49:36 overcloud-controller-0.localdomain os-collect-config[4816]: + echo 'INFO: Updating hosts file /etc/cloud/templates/hosts.redhat.tmpl, check below for changes' Sep 09 17:49:36 overcloud-controller-0.localdomain os-collect-config[4816]: INFO: Updating hosts file /etc/cloud/templates/hosts.redhat.tmpl, check below for changes Sep 09 17:49:36 overcloud-controller-0.localdomain 
os-collect-config[4816]: + diff /etc/cloud/templates/hosts.redhat.tmpl /tmp/tmp.48G4ZjjtgM Sep 09 17:49:36 overcloud-controller-0.localdomain os-collect-config[4816]: + cat /tmp/tmp.48G4ZjjtgM Sep 09 17:49:36 overcloud-controller-0.localdomain os-collect-config[4816]: + write_entries /etc/hosts '192.168.3.14 overcloud-compute-0.localdomain overcloud-compute-0 Sep 09 17:49:36 overcloud-controller-0.localdomain os-collect-config[4816]: 192.168.3.13 overcloud-controller-0.localdomain overcloud-controller-0 overcloud Sep 09 17:49:36 overcloud-controller-0.localdomain os-collect-config[4816]: 192.168.3.12 overcloud-controller-1.localdomain overcloud-controller-1 overcloud Sep 09 17:49:36 overcloud-controller-0.localdomain os-collect-config[4816]: 192.168.3.15 overcloud-controller-2.localdomain overcloud-controller-2 overcloud Sep 09 17:49:36 overcloud-controller-0.localdomain os-collect-config[4816]: 192.168.4.11 overcloud-cephstorage-0.localdomain overcloud-cephstorage-0' Sep 09 17:49:36 overcloud-controller-0.localdomain os-collect-config[4816]: + local file=/etc/hosts Sep 09 17:49:36 overcloud-controller-0.localdomain os-collect-config[4816]: + local 'entries=192.168.3.14 overcloud-compute-0.localdomain overcloud-compute-0 Sep 09 17:49:36 overcloud-controller-0.localdomain os-collect-config[4816]: 192.168.3.13 overcloud-controller-0.localdomain overcloud-controller-0 overcloud Sep 09 17:49:36 overcloud-controller-0.localdomain os-collect-config[4816]: 192.168.3.12 overcloud-controller-1.localdomain overcloud-controller-1 overcloud Sep 09 17:49:36 overcloud-controller-0.localdomain os-collect-config[4816]: 192.168.3.15 overcloud-controller-2.localdomain overcloud-controller-2 overcloud Sep 09 17:49:36 overcloud-controller-0.localdomain os-collect-config[4816]: 192.168.4.11 overcloud-cephstorage-0.localdomain overcloud-cephstorage-0' Sep 09 17:49:36 overcloud-controller-0.localdomain os-collect-config[4816]: + '[' '!' 
-f /etc/hosts ']' Sep 09 17:49:36 overcloud-controller-0.localdomain os-collect-config[4816]: + grep -q '^# HEAT_HOSTS_START' /etc/hosts Sep 09 17:49:36 overcloud-controller-0.localdomain os-collect-config[4816]: ++ mktemp Sep 09 17:49:36 overcloud-controller-0.localdomain os-collect-config[4816]: + temp=/tmp/tmp.0ZqXoAQw2Z Sep 09 17:49:36 overcloud-controller-0.localdomain os-collect-config[4816]: + awk -v 'v=192.168.3.14 overcloud-compute-0.localdomain overcloud-compute-0 Sep 09 17:49:36 overcloud-controller-0.localdomain os-collect-config[4816]: 192.168.3.13 overcloud-controller-0.localdomain overcloud-controller-0 overcloud Sep 09 17:49:36 overcloud-controller-0.localdomain os-collect-config[4816]: 192.168.3.12 overcloud-controller-1.localdomain overcloud-controller-1 overcloud Sep 09 17:49:36 overcloud-controller-0.localdomain os-collect-config[4816]: 192.168.3.15 overcloud-controller-2.localdomain overcloud-controller-2 overcloud Sep 09 17:49:36 overcloud-controller-0.localdomain os-collect-config[4816]: 192.168.4.11 overcloud-cephstorage-0.localdomain overcloud-cephstorage-0' '/^# HEAT_HOSTS_START/ { Sep 09 17:49:36 overcloud-controller-0.localdomain os-collect-config[4816]: print $0 Sep 09 17:49:36 overcloud-controller-0.localdomain os-collect-config[4816]: print v Sep 09 17:49:36 overcloud-controller-0.localdomain os-collect-config[4816]: f=1 Sep 09 17:49:36 overcloud-controller-0.localdomain os-collect-config[4816]: }f &&!/^# HEAT_HOSTS_END$/{next}/^# HEAT_HOSTS_END$/{f=0}!f' /etc/hosts Sep 09 17:49:36 overcloud-controller-0.localdomain os-collect-config[4816]: + echo 'INFO: Updating hosts file /etc/hosts, check below for changes' Sep 09 17:49:36 overcloud-controller-0.localdomain os-collect-config[4816]: INFO: Updating hosts file /etc/hosts, check below for changes Sep 09 17:49:36 overcloud-controller-0.localdomain os-collect-config[4816]: + diff /etc/hosts /tmp/tmp.0ZqXoAQw2Z Sep 09 17:49:36 overcloud-controller-0.localdomain os-collect-config[4816]: + 
cat /tmp/tmp.0ZqXoAQw2Z Sep 09 17:49:36 overcloud-controller-0.localdomain os-collect-config[4816]: dib-run-parts Wed Sep 9 17:49:36 EDT 2015 51-hosts completed Sep 09 17:49:36 overcloud-controller-0.localdomain os-collect-config[4816]: dib-run-parts Wed Sep 9 17:49:36 EDT 2015 Running /usr/libexec/os-refresh-config/configure.d/55-heat-config Sep 09 17:49:36 overcloud-controller-0.localdomain os-collect-config[4816]: [2015-09-09 17:49:36,979] (heat-config) [WARNING] Skipping group os-apply-config with no hook script /var/lib/heat-config Sep 09 17:49:36 overcloud-controller-0.localdomain os-collect-config[4816]: [2015-09-09 17:49:36,980] (heat-config) [WARNING] Skipping group os-apply-config with no hook script /var/lib/heat-config Sep 09 17:49:36 overcloud-controller-0.localdomain os-collect-config[4816]: [2015-09-09 17:49:36,980] (heat-config) [WARNING] Skipping group os-apply-config with no hook script /var/lib/heat-config Sep 09 17:49:36 overcloud-controller-0.localdomain os-collect-config[4816]: [2015-09-09 17:49:36,980] (heat-config) [WARNING] Skipping config 73bed163-2b0b-4f0b-9551-8e9b7646dfe6, already deployed Sep 09 17:49:36 overcloud-controller-0.localdomain os-collect-config[4816]: [2015-09-09 17:49:36,980] (heat-config) [WARNING] To force-deploy, rm /var/run/heat-config/deployed/73bed163-2b0b-4f0b-95 Sep 09 17:49:36 overcloud-controller-0.localdomain os-collect-config[4816]: [2015-09-09 17:49:36,980] (heat-config) [WARNING] Skipping group os-apply-config with no hook script /var/lib/heat-config Sep 09 17:49:36 overcloud-controller-0.localdomain os-collect-config[4816]: [2015-09-09 17:49:36,981] (heat-config) [WARNING] Skipping group Heat::Ungrouped with no hook script /var/lib/heat-config Sep 09 17:49:36 overcloud-controller-0.localdomain os-collect-config[4816]: [2015-09-09 17:49:36,981] (heat-config) [WARNING] Skipping config 6b555ffc-4b74-460c-86de-6f0430b960cf, already deployed Sep 09 17:49:36 overcloud-controller-0.localdomain 
os-collect-config[4816]: [2015-09-09 17:49:36,981] (heat-config) [WARNING] To force-deploy, rm /var/run/heat-config/deployed/6b555ffc-4b74-460c-86 Sep 09 17:49:36 overcloud-controller-0.localdomain os-collect-config[4816]: [2015-09-09 17:49:36,981] (heat-config) [WARNING] Skipping config 75c52db4-5d51-47fb-ae4b-cd6b9934879e, already deployed Sep 09 17:49:36 overcloud-controller-0.localdomain os-collect-config[4816]: [2015-09-09 17:49:36,981] (heat-config) [WARNING] To force-deploy, rm /var/run/heat-config/deployed/75c52db4-5d51-47fb-ae Sep 09 17:49:36 overcloud-controller-0.localdomain os-collect-config[4816]: [2015-09-09 17:49:36,981] (heat-config) [WARNING] Skipping config d4899348-0202-40c4-9bb1-94814a7f2500, already deployed Sep 09 17:49:36 overcloud-controller-0.localdomain os-collect-config[4816]: [2015-09-09 17:49:36,981] (heat-config) [WARNING] To force-deploy, rm /var/run/heat-config/deployed/d4899348-0202-40c4-9b Sep 09 17:49:36 overcloud-controller-0.localdomain os-collect-config[4816]: [2015-09-09 17:49:36,981] (heat-config) [WARNING] Skipping config 08d76801-8739-4d9d-8283-ea8310ad882c, already deployed Sep 09 17:49:36 overcloud-controller-0.localdomain os-collect-config[4816]: [2015-09-09 17:49:36,981] (heat-config) [WARNING] To force-deploy, rm /var/run/heat-config/deployed/08d76801-8739-4d9d-82 Sep 09 17:49:36 overcloud-controller-0.localdomain os-collect-config[4816]: [2015-09-09 17:49:36,982] (heat-config) [DEBUG] Running /var/lib/heat-config/hooks/puppet < /var/run/heat-config/deployed Sep 09 17:51:01 overcloud-controller-0.localdomain os-collect-config[4816]: [2015-09-09 17:51:01,129] (heat-config) [INFO] {"deploy_stdout": "\u001b[mNotice: Compiled catalog for overcloud-controll Sep 09 17:51:01 overcloud-controller-0.localdomain os-collect-config[4816]: :Mon/Ceph::Mon[overcloud-controller-0]/Exec[rm-keyring-overcloud-controller-0]/returns: executed successfully\u001b[0m\n\ Sep 09 17:51:01 overcloud-controller-0.localdomain 
os-collect-config[4816]: s\u001b[0m\n", "deploy_stderr": "\u001b[1;31mWarning: Scope(Class[Keystone]): Execution of db_sync does not depend on $en Sep 09 17:51:01 overcloud-controller-0.localdomain os-collect-config[4816]: :Server[6000]): The default incoming_chmod set to 0644 may yield in error prone directories and will be changed in a late Sep 09 17:51:01 overcloud-controller-0.localdomain os-collect-config[4816]: [2015-09-09 17:51:01,130] (heat-config) [DEBUG] [2015-09-09 17:49:37,016] (heat-config) [DEBUG] Running FACTER_heat_outpu Sep 09 17:51:01 overcloud-controller-0.localdomain os-collect-config[4816]: [2015-09-09 17:51:01,124] (heat-config) [INFO] Return code 6 Sep 09 17:51:01 overcloud-controller-0.localdomain os-collect-config[4816]: [2015-09-09 17:51:01,124] (heat-config) [INFO] Notice: Compiled catalog for overcloud-controller-0.localdomain in environ Sep 09 17:51:01 overcloud-controller-0.localdomain os-collect-config[4816]: Notice: /Stage[main]/Main/Exec[galera-ready]/returns: executed successfully Sep 09 17:51:01 overcloud-controller-0.localdomain os-collect-config[4816]: Notice: /Stage[main]/Main/Exec[neutron-server-start-wait-stop]/returns: executed successfully Sep 09 17:51:01 overcloud-controller-0.localdomain os-collect-config[4816]: Notice: /Stage[main]/Heat::Keystone::Domain/Heat_config[DEFAULT/stack_domain_admin_password]/ensure: created Sep 09 17:51:01 overcloud-controller-0.localdomain os-collect-config[4816]: Notice: /Stage[main]/Heat::Keystone::Domain/Heat_config[DEFAULT/stack_domain_admin]/ensure: created Sep 09 17:51:01 overcloud-controller-0.localdomain os-collect-config[4816]: Notice: /Stage[main]/Ceph::Profile::Mon/Ceph::Mon[overcloud-controller-0]/File[/tmp/ceph-mon-keyring-overcloud-controller Sep 09 17:51:01 overcloud-controller-0.localdomain os-collect-config[4816]: Notice: /Stage[main]/Ceph::Profile::Mon/Ceph::Mon[overcloud-controller-0]/Exec[ceph-mon-mkfs-overcloud-controller-0]/retu Sep 09 17:51:01 
overcloud-controller-0.localdomain os-collect-config[4816]: Notice: /Stage[main]/Ceph::Profile::Mon/Ceph::Mon[overcloud-controller-0]/Exec[ceph-mon-mkfs-overcloud-controller-0]/retu Sep 09 17:51:01 overcloud-controller-0.localdomain os-collect-config[4816]: Notice: /Stage[main]/Ceph::Profile::Mon/Ceph::Mon[overcloud-controller-0]/Exec[ceph-mon-mkfs-overcloud-controller-0]/retu Sep 09 17:51:01 overcloud-controller-0.localdomain os-collect-config[4816]: Notice: /Stage[main]/Ceph::Profile::Mon/Ceph::Mon[overcloud-controller-0]/Exec[ceph-mon-mkfs-overcloud-controller-0]/retu Sep 09 17:51:01 overcloud-controller-0.localdomain os-collect-config[4816]: Notice: /Stage[main]/Ceph::Profile::Mon/Ceph::Mon[overcloud-controller-0]/Exec[ceph-mon-ceph.client.admin.keyring-overclo Sep 09 17:51:01 overcloud-controller-0.localdomain os-collect-config[4816]: Notice: /Stage[main]/Ceph::Profile::Mon/Ceph::Mon[overcloud-controller-0]/Service[ceph-mon-overcloud-controller-0]/ensure Sep 09 17:51:01 overcloud-controller-0.localdomain os-collect-config[4816]: Notice: /Stage[main]/Ceph::Profile::Mon/Ceph::Mon[overcloud-controller-0]/Exec[rm-keyring-overcloud-controller-0]/returns Sep 09 17:51:01 overcloud-controller-0.localdomain os-collect-config[4816]: Notice: /Stage[main]/Neutron::Agents::Ml2::Ovs/Service[neutron-ovs-agent-service]/ensure: ensure changed 'running' to 'st Sep 09 17:51:01 overcloud-controller-0.localdomain os-collect-config[4816]: Notice: /Stage[main]/Keystone::Roles::Admin/Keystone_tenant[admin]/ensure: created Sep 09 17:51:01 overcloud-controller-0.localdomain os-collect-config[4816]: Notice: /Stage[main]/Keystone::Roles::Admin/Keystone_tenant[services]/ensure: created Sep 09 17:51:01 overcloud-controller-0.localdomain os-collect-config[4816]: Notice: /Stage[main]/Keystone::Roles::Admin/Keystone_role[admin]/ensure: created Sep 09 17:51:01 overcloud-controller-0.localdomain os-collect-config[4816]: Notice: /Stage[main]/Keystone::Roles::Admin/Keystone_user[admin]/ensure: 
created Sep 09 17:51:01 overcloud-controller-0.localdomain os-collect-config[4816]: Notice: /Stage[main]/Keystone::Roles::Admin/Keystone_user_role[admin@admin]/roles: roles changed ['_member_'] to 'admin' Sep 09 17:51:01 overcloud-controller-0.localdomain os-collect-config[4816]: Notice: /Stage[main]/Heat::Keystone::Domain/Exec[heat_domain_create]/returns: executed successfully Sep 09 17:51:01 overcloud-controller-0.localdomain os-collect-config[4816]: Notice: /Stage[main]/Pacemaker::Corosync/Exec[enable-not-start-tripleo_cluster]/returns: executed successfully Sep 09 17:51:01 overcloud-controller-0.localdomain os-collect-config[4816]: Notice: /Stage[main]/Pacemaker::Corosync/Exec[Set password for hacluster user on tripleo_cluster]/returns: executed succe Sep 09 17:51:01 overcloud-controller-0.localdomain os-collect-config[4816]: Notice: /Stage[main]/Pacemaker::Corosync/Exec[auth-successful-across-all-nodes]/returns: executed successfully Sep 09 17:51:01 overcloud-controller-0.localdomain os-collect-config[4816]: Notice: Pacemaker has reported quorum achieved Sep 09 17:51:01 overcloud-controller-0.localdomain os-collect-config[4816]: Notice: /Stage[main]/Pacemaker::Corosync/Notify[pacemaker settled]/message: defined 'message' as 'Pacemaker has reported Sep 09 17:51:01 overcloud-controller-0.localdomain os-collect-config[4816]: Notice: /Stage[main]/Heat::Api_cfn/Service[heat-api-cfn]: Triggered 'refresh' from 2 events Sep 09 17:51:01 overcloud-controller-0.localdomain os-collect-config[4816]: Notice: /Stage[main]/Heat::Engine/Service[heat-engine]: Triggered 'refresh' from 2 events Sep 09 17:51:01 overcloud-controller-0.localdomain os-collect-config[4816]: Notice: /Stage[main]/Heat::Api/Service[heat-api]: Triggered 'refresh' from 2 events Sep 09 17:51:01 overcloud-controller-0.localdomain os-collect-config[4816]: Notice: /Stage[main]/Heat::Api_cloudwatch/Service[heat-api-cloudwatch]: Triggered 'refresh' from 2 events Sep 09 17:51:01 overcloud-controller-0.localdomain 
os-collect-config[4816]: Notice: Finished catalog run in 63.72 seconds Sep 09 17:51:01 overcloud-controller-0.localdomain os-collect-config[4816]: [2015-09-09 17:51:01,124] (heat-config) [INFO] Warning: Scope(Class[Keystone]): Execution of db_sync does not depend on $ Sep 09 17:51:01 overcloud-controller-0.localdomain os-collect-config[4816]: Warning: Scope(Class[Glance::Registry]): Execution of db_sync does not depend on $manage_service or $enabled anymore. Ple Sep 09 17:51:01 overcloud-controller-0.localdomain os-collect-config[4816]: Warning: Scope(Class[Nova::Vncproxy::Common]): Could not look up qualified variable '::nova::compute::vncproxy_host'; cla Sep 09 17:51:01 overcloud-controller-0.localdomain os-collect-config[4816]: Warning: Scope(Class[Nova::Vncproxy::Common]): Could not look up qualified variable '::nova::compute::vncproxy_protocol'; Sep 09 17:51:01 overcloud-controller-0.localdomain os-collect-config[4816]: Warning: Scope(Class[Nova::Vncproxy::Common]): Could not look up qualified variable '::nova::compute::vncproxy_port'; cla Sep 09 17:51:01 overcloud-controller-0.localdomain os-collect-config[4816]: Warning: Scope(Class[Nova::Vncproxy::Common]): Could not look up qualified variable '::nova::compute::vncproxy_path'; cla Sep 09 17:51:01 overcloud-controller-0.localdomain os-collect-config[4816]: Warning: Scope(Class[Concat::Setup]): concat::setup is deprecated as a public API of the concat module and should no long Sep 09 17:51:01 overcloud-controller-0.localdomain os-collect-config[4816]: Warning: Scope(Swift::Storage::Server[6002]): The default incoming_chmod set to 0644 may yield in error prone directories Sep 09 17:51:01 overcloud-controller-0.localdomain os-collect-config[4816]: Warning: Scope(Swift::Storage::Server[6002]): The default outgoing_chmod set to 0644 may yield in error prone directories Sep 09 17:51:01 overcloud-controller-0.localdomain os-collect-config[4816]: Warning: Scope(Swift::Storage::Server[6001]): The default 
incoming_chmod set to 0644 may yield in error prone directories Sep 09 17:51:01 overcloud-controller-0.localdomain os-collect-config[4816]: Warning: Scope(Swift::Storage::Server[6001]): The default outgoing_chmod set to 0644 may yield in error prone directories Sep 09 17:51:01 overcloud-controller-0.localdomain os-collect-config[4816]: Warning: Scope(Swift::Storage::Server[6000]): The default incoming_chmod set to 0644 may yield in error prone directories Sep 09 17:51:01 overcloud-controller-0.localdomain os-collect-config[4816]: Warning: Scope(Swift::Storage::Server[6000]): The default outgoing_chmod set to 0644 may yield in error prone directories Sep 09 17:51:01 overcloud-controller-0.localdomain os-collect-config[4816]: Warning: The package type's allow_virtual parameter will be changing its default value from false to true in a future rel Sep 09 17:51:01 overcloud-controller-0.localdomain os-collect-config[4816]: (at /usr/share/ruby/vendor_ruby/puppet/type.rb:816:in `set_default') Sep 09 17:51:01 overcloud-controller-0.localdomain os-collect-config[4816]: Error: Received error response from Keystone server at http://192.168.3.10:35357/v3/domains: Unauthorized Sep 09 17:51:01 overcloud-controller-0.localdomain os-collect-config[4816]: Error: /Stage[main]/Heat::Keystone::Domain/Heat_domain_id_setter[heat_domain_id]/ensure: change from absent to present fa Sep 09 17:51:01 overcloud-controller-0.localdomain os-collect-config[4816]: [2015-09-09 17:51:01,124] (heat-config) [ERROR] Error running /var/lib/heat-config/heat-config-puppet/29a75921-42bb-4813- Sep 09 17:51:01 overcloud-controller-0.localdomain os-collect-config[4816]: [2015-09-09 17:51:01,130] (heat-config) [INFO] Completed /var/lib/heat-config/hooks/puppet Sep 09 17:51:01 overcloud-controller-0.localdomain os-collect-config[4816]: [2015-09-09 17:51:01,131] (heat-config) [DEBUG] Running heat-config-notify /var/run/heat-config/deployed/29a75921-42bb-48 Sep 09 17:51:01 overcloud-controller-0.localdomain 
os-collect-config[4816]: [2015-09-09 17:51:01,719] (heat-config) [INFO] Sep 09 17:51:01 overcloud-controller-0.localdomain os-collect-config[4816]: [2015-09-09 17:51:01,719] (heat-config) [DEBUG] [2015-09-09 17:51:01,539] (heat-config-notify) [DEBUG] Signaling to http: Sep 09 17:51:01 overcloud-controller-0.localdomain os-collect-config[4816]: [2015-09-09 17:51:01,701] (heat-config-notify) [DEBUG] Response <Response [200]> Sep 09 17:51:01 overcloud-controller-0.localdomain os-collect-config[4816]: [2015-09-09 17:51:01,719] (heat-config) [WARNING] Skipping config 58f1ed83-8258-4f00-868d-5cdd2b8ee2ad, already deployed Sep 09 17:51:01 overcloud-controller-0.localdomain os-collect-config[4816]: [2015-09-09 17:51:01,719] (heat-config) [WARNING] To force-deploy, rm /var/run/heat-config/deployed/58f1ed83-8258-4f00-86 Sep 09 17:51:01 overcloud-controller-0.localdomain os-collect-config[4816]: [2015-09-09 17:51:01,719] (heat-config) [WARNING] Skipping group os-apply-config with no hook script /var/lib/heat-config Sep 09 17:51:01 overcloud-controller-0.localdomain os-collect-config[4816]: [2015-09-09 17:51:01,720] (heat-config) [WARNING] Skipping group os-apply-config with no hook script /var/lib/heat-config Sep 09 17:51:01 overcloud-controller-0.localdomain os-collect-config[4816]: [2015-09-09 17:51:01,720] (heat-config) [WARNING] Skipping group Heat::Ungrouped with no hook script /var/lib/heat-config Sep 09 17:51:01 overcloud-controller-0.localdomain os-collect-config[4816]: dib-run-parts Wed Sep 9 17:51:01 EDT 2015 55-heat-config completed
Looking at the log, I see that Exec[heat_domain_create] executed successfully and Keystone_user[admin] was created, so from a plain Puppet point of view there should be no reason for Heat_domain_id_setter[heat_domain_id] to fail.
Will this fix be present in Y1? I don't see [1] (https://review.openstack.org/#/c/204541/13) in the interim build.
Raising the severity to urgent, as we think this bug could be contributing to deployment failures seen on multiple setups. We view this as a must-fix for Y1, and I have updated the target milestone from Y2 to Y1. Do we need both fixes mentioned above? If so, it looks like the first fix is missing from the initial pre-release build for Y1, as pointed out by Shiv.
Can we confirm that the patch in question addresses the situation you are seeing? Martin was never able to reliably reproduce the reported issue so this patch has not been merged yet.
Created attachment 1078025 [details] crm_report
An excerpt from the keystone.log: 2015-09-28 10:18:23.900 6504 INFO eventlet.wsgi.server [-] 192.0.2.35 - - [28/Sep/2015 10:18:23] "GET /v2.0/users HTTP/1.1" 200 404 0.005013 2015-09-28 10:18:23.902 6504 DEBUG keystone.middleware.core [-] RBAC: auth_context: {} process_request /usr/lib/python2.7/site-packages/keystone/middleware/core.py:239 2015-09-28 10:18:23.904 6504 INFO keystone.common.wsgi [-] PUT /tenants/b04fbc1f1cb64787839368862aa1abd2/users/20097b05355f4745820bb2a905319294/roles/OS-KSADM/a59a16fa894f4664b6c42419324791db? 2015-09-28 10:18:23.930 6504 INFO eventlet.wsgi.server [-] 192.0.2.35 - - [28/Sep/2015 10:18:23] "PUT /v2.0/tenants/b04fbc1f1cb64787839368862aa1abd2/users/20097b05355f4745820bb2a905319294/roles/OS-KSADM/a59a16fa894f4664b6c42419324791db HTTP/1.1" 200 287 0.028553 2015-09-28 10:18:24.505 6498 DEBUG keystone.middleware.core [-] RBAC: auth_context: {} process_request /usr/lib/python2.7/site-packages/keystone/middleware/core.py:239 2015-09-28 10:18:24.532 6498 INFO keystone.common.wsgi [-] GET /OS-KSADM/roles/_member_? 2015-09-28 10:18:24.572 6498 DEBUG oslo_db.sqlalchemy.session [-] MySQL server mode set to STRICT_TRANS_TABLES,STRICT_ALL_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,TRADITIONAL,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION _check_effective_sql_mode /usr/lib/python2.7/site-packages/oslo_db/sqlalchemy/session.py:513 2015-09-28 10:18:24.602 6498 WARNING keystone.common.wsgi [-] Could not find role: _member_ 2015-09-28 10:18:24.603 6498 INFO eventlet.wsgi.server [-] 192.0.2.35 - - [28/Sep/2015 10:18:24] "GET /v2.0/OS-KSADM/roles/_member_ HTTP/1.1" 404 315 0.100151 2015-09-28 10:18:24.605 6498 DEBUG keystone.middleware.core [-] RBAC: auth_context: {} process_request /usr/lib/python2.7/site-packages/keystone/middleware/core.py:239 2015-09-28 10:18:24.607 6498 INFO keystone.common.wsgi [-] GET /OS-KSADM/roles? 
2015-09-28 10:18:24.610 6498 INFO eventlet.wsgi.server [-] 192.0.2.35 - - [28/Sep/2015 10:18:24] "GET /v2.0/OS-KSADM/roles HTTP/1.1" 200 355 0.005075 2015-09-28 10:18:24.619 6498 DEBUG keystone.middleware.core [-] RBAC: auth_context: {} process_request /usr/lib/python2.7/site-packages/keystone/middleware/core.py:239 2015-09-28 10:18:24.625 6498 INFO keystone.common.wsgi [-] GET /tenants/admin? 2015-09-28 10:18:24.631 6498 WARNING keystone.common.wsgi [-] Could not find project: admin 2015-09-28 10:18:24.632 6498 INFO eventlet.wsgi.server [-] 192.0.2.35 - - [28/Sep/2015 10:18:24] "GET /v2.0/tenants/admin HTTP/1.1" 404 315 0.013011 2015-09-28 10:18:24.634 6498 DEBUG keystone.middleware.core [-] RBAC: auth_context: {} process_request /usr/lib/python2.7/site-packages/keystone/middleware/core.py:239 2015-09-28 10:18:24.635 6498 INFO keystone.common.wsgi [-] GET /tenants? 2015-09-28 10:18:24.642 6498 INFO eventlet.wsgi.server [-] 192.0.2.35 - - [28/Sep/2015 10:18:24] "GET /v2.0/tenants HTTP/1.1" 200 495 0.008189 2015-09-28 10:18:24.643 6498 DEBUG keystone.middleware.core [-] RBAC: auth_context: {} process_request /usr/lib/python2.7/site-packages/keystone/middleware/core.py:239 2015-09-28 10:18:24.645 6498 INFO keystone.common.wsgi [-] GET /users/admin? 2015-09-28 10:18:24.649 6498 WARNING keystone.common.wsgi [-] Could not find user: admin 2015-09-28 10:18:24.650 6498 INFO eventlet.wsgi.server [-] 192.0.2.35 - - [28/Sep/2015 10:18:24] "GET /v2.0/users/admin HTTP/1.1" 404 312 0.006742 2015-09-28 10:18:24.652 6498 DEBUG keystone.middleware.core [-] RBAC: auth_context: {} process_request /usr/lib/python2.7/site-packages/keystone/middleware/core.py:239 2015-09-28 10:18:24.653 6498 INFO keystone.common.wsgi [-] GET /users? 
2015-09-28 10:18:24.657 6498 INFO eventlet.wsgi.server [-] 192.0.2.35 - - [28/Sep/2015 10:18:24] "GET /v2.0/users HTTP/1.1" 200 404 0.005165 2015-09-28 10:18:24.659 6498 DEBUG keystone.middleware.core [-] RBAC: auth_context: {} process_request /usr/lib/python2.7/site-packages/keystone/middleware/core.py:239 2015-09-28 10:18:24.661 6498 INFO keystone.common.wsgi [-] DELETE /tenants/b04fbc1f1cb64787839368862aa1abd2/users/20097b05355f4745820bb2a905319294/roles/OS-KSADM/9fe2ff9ee4384b1894a90878d3e92bab? 2015-09-28 10:18:24.674 6498 DEBUG keystone.notifications [-] Invoking callback _user_callback for event identity invalidate_user_tokens internal for{'resource_info': u'20097b05355f4745820bb2a905319294'} notify_event_callbacks /usr/lib/python2.7/site-packages/keystone/notifications.py:307 2015-09-28 10:18:24.692 6498 DEBUG keystone.notifications [-] Invoking callback _delete_user_tokens_callback for event identity invalidate_user_tokens internal for{'resource_info': u'20097b05355f4745820bb2a905319294'} notify_event_callbacks /usr/lib/python2.7/site-packages/keystone/notifications.py:307 2015-09-28 10:18:24.789 6498 INFO eventlet.wsgi.server [-] 192.0.2.35 - - [28/Sep/2015 10:18:24] "DELETE /v2.0/tenants/b04fbc1f1cb64787839368862aa1abd2/users/20097b05355f4745820bb2a905319294/roles/OS-KSADM/9fe2ff9ee4384b1894a90878d3e92bab HTTP/1.1" 204 193 0.130007 2015-09-28 10:18:25.632 6504 WARNING keystone.middleware.core [-] RBAC: Invalid token 2015-09-28 10:18:25.633 6504 WARNING keystone.common.wsgi [-] The request you have made requires authentication. (Disable debug mode to suppress these details.) 2015-09-28 10:18:25.634 6504 INFO eventlet.wsgi.server [-] 192.0.2.30 - - [28/Sep/2015 10:18:25] "POST /v3/domains HTTP/1.1" 401 449 0.021350 2015-09-28 10:18:48.811 6567 DEBUG keystone.middleware.core [-] Auth token not in the request header. Will not build auth context. process_request /usr/lib/python2.7/site-packages/keystone/middleware/core.py:229
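For anyone triaging a longer keystone.log, here is a rough Python sketch that pulls out only the eventlet.wsgi.server access entries with a 4xx/5xx status. The regex is an assumption based on the access-line layout seen in this excerpt, and the name failed_requests is made up for illustration. On the excerpt above it would separate the 404 lookups coming from 192.0.2.35 from the 401 on POST /v3/domains coming from 192.0.2.30.

```python
import re

# Assumed layout of an eventlet.wsgi.server access entry, taken from the excerpt:
#   192.0.2.30 - - [28/Sep/2015 10:18:25] "POST /v3/domains HTTP/1.1" 401 449 0.021350
ACCESS_RE = re.compile(
    r'(?P<client>\d+\.\d+\.\d+\.\d+) - - \[[^\]]+\] '
    r'"(?P<method>[A-Z]+) (?P<path>\S+) HTTP/[\d.]+" '
    r'(?P<status>\d{3}) '
)


def failed_requests(log_text):
    """Return (client, method, path, status) for every 4xx/5xx access entry.

    finditer is used instead of line splitting so that the function also
    works when several log entries were pasted onto one line.
    """
    failures = []
    for match in ACCESS_RE.finditer(log_text):
        status = int(match.group("status"))
        if status >= 400:
            failures.append(
                (match.group("client"), match.group("method"),
                 match.group("path"), status)
            )
    return failures


if __name__ == "__main__":
    import sys

    # Usage: python this_script.py /var/log/keystone/keystone.log
    # (the log path is an assumption; adjust to your node)
    with open(sys.argv[1]) as handle:
        for entry in failed_requests(handle.read()):
            print(entry)
```

Grouping the result by client address should show quickly whether the failing requests come from the same host as the lookups that succeed.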
I hit the Keystone domain ID issue during an openstack overcloud deploy. A few logs are below; the complete CRM_REPORT is attached. I was trying to deploy 3 controllers in HA and 1 compute node. The deploy command was:

openstack overcloud deploy --templates --ceph-storage-scale 0 --control-scale 3 --control-flavor control --compute-flavor compute -e /usr/share/openstack-tripleo-heat-templates/overcloud-resource-registry-puppet.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/cisco-n1kv-config.yaml --neutron-network-type vlan --neutron-network-vlan-ranges datacentre:30:100 --neutron-tunnel-types vlan --swift-storage-scale 0 --swift-storage-flavor compute --ceph-storage-flavor compute --block-storage-flavor compute --compute-scale 1 --ntp-server 8.8.8.8

One point to note: the NTP server IP I provided is not a valid IP. I will correct it and redeploy to confirm whether that solves the issue.

######################## LOGS #################################################

###### Undercloud: ‘heat resource-show overcloud ControllerNodesPostDeployment’
| resource_name | ControllerNodesPostDeployment |
| resource_status | CREATE_FAILED |
| resource_status_reason | Error: resources.ControllerNodesPostDeployment.resources.ControllerOvercloudServicesDeployment_Step6.resources[0]: Deployment to server failed: deploy_status_code: Deployment exited with non-zero status code: 6 |

########### Controller-0 /var/log/messages
Sep 28 10:18:43 localhost os-collect-config: ^[[1;31mError: heat-keystone-setup-domain returned 1 instead of one of [0]^[[0m
Sep 28 10:18:43 localhost os-collect-config: ^[[1;31mError: /Stage[main]/Heat::Keystone::Domain/Exec[heat_domain_create]/returns: change from notrun to 0 failed: heat-keystone-setup-domain returned 1 instead of one of [0]^[[0m
Sep 28 10:18:43 localhost os-collect-config: ^[[1;31mWarning: /Stage[main]/Heat::Keystone::Domain/Heat_domain_id_setter[heat_domain_id]: Skipping because of failed dependencies^[[0m
Sep 28 10:18:43 localhost os-collect-config: [2015-09-28 10:18:43,849] (heat-config) [ERROR] Error running /var/lib/heat-config/heat-config-puppet/063674a3-898f-4094-bb21-d479b099a429.pp. [6]

########## Controller-0 ‘pcs status’ doesn’t return any valid errors
If the Keystone log above is accurate, entries are missing from the database for such basic things as roles and users. Is it possible that the values were written to one server and Galera did not properly replicate them to the other servers? If the query then ran against the wrong MariaDB instance in the Galera cluster, it would fail to find the roles etc.
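One way to test the replication theory is to compare the Galera wsrep status values across the three controllers. Below is a rough Python sketch of that comparison. It assumes you have already collected the output of SHOW GLOBAL STATUS LIKE 'wsrep_%' from each node into a dict; the function name suspicious_nodes and the node names in the usage note are hypothetical. It only flags obvious signs (node not in the Primary component, not Synced, or reporting an unexpected cluster size) and does not prove that a given write was or was not replicated.

```python
def suspicious_nodes(status_by_node, expected_size):
    """Return {node: [reasons]} for nodes whose wsrep status looks unhealthy.

    status_by_node maps a node name to a dict of wsrep status variables,
    e.g. as collected by hand from SHOW GLOBAL STATUS LIKE 'wsrep_%'.
    expected_size is the number of nodes the cluster should have.
    """
    problems = {}
    for node, status in status_by_node.items():
        reasons = []
        if status.get("wsrep_cluster_status") != "Primary":
            reasons.append("not in primary component")
        if status.get("wsrep_local_state_comment") != "Synced":
            reasons.append("not synced")
        if int(status.get("wsrep_cluster_size", 0)) != expected_size:
            reasons.append("unexpected cluster size")
        if reasons:
            problems[node] = reasons
    return problems
```

Calling suspicious_nodes({"controller-0": {...}, "controller-1": {...}, "controller-2": {...}}, expected_size=3) returns an empty dict when every node reports Primary, Synced, and a cluster size of 3.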
Is it possible that you were testing with only the THT change and not the related openstack-puppet-modules change from bug 1258614?
*** Bug 1266848 has been marked as a duplicate of this bug. ***
Ok, I just brought upstream patch [1] to a workable state and CI is "passing". I believe that the patch together with puppet-heat update from bz#1258614 will fix this issue even though I was not able to reproduce it. [1] https://review.openstack.org/#/c/180566/
Per IRC, this shouldn't be needed anymore.
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days