Description of problem:
All the overcloud keystone endpoints use the public IP when deploying with isolated networks.

Version-Release number of selected component (if applicable):

Steps to Reproduce:
1. Deploy the overcloud with at least an external and an internal API network.
2. Check the keystone endpoints of the deployed overcloud.

Actual results:
All the endpoint URLs contain the public IP address.

Expected results:
I expect the internal and admin endpoints to contain an IP address from the internal API network.

Additional info:
https://github.com/openstack/os-cloud-config/blob/master/os_cloud_config/keystone.py#L320
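For context, the pattern behind the bug can be sketched in a few lines: os-cloud-config builds all three endpoint URLs from a single host argument, so public, internal, and admin all end up identical. This is a simplified stand-in, not the actual keystone.py code; `build_endpoint_urls` is a hypothetical name:

```python
def build_endpoint_urls(host, port, path=""):
    """Sketch of the single-host endpoint registration pattern:
    one host value is used to build publicurl, internalurl, and
    adminurl, so all three point at the same network."""
    url = "http://%s:%s%s" % (host, port, path)
    return {"publicurl": url, "internalurl": url, "adminurl": url}

# With isolated networks, passing the public VIP here means even the
# internal and admin endpoints land on the public network:
urls = build_endpoint_urls("10.1.244.10", 8774, "/v2")
```

The fix discussed below is to make the internal and admin hosts independently configurable instead of deriving all three from one value.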
We've run into similar issues with the default os-cloud-config behavior on the undercloud. Fortunately there's a way to override the service parameters, which I believe would need to be done here: https://github.com/rdo-management/python-rdomanager-oscplugin/blob/master/rdomanager_oscplugin/v1/overcloud_deploy.py#L516 We'll need a way to query the appropriate address for each service too, which I'll leave to someone who knows more about it than me. :-)
What is the observed behavior with keystone on the public network?
(In reply to chris alfonso from comment #4) > What is the observed behavior with keystone on the public network? It relies on factors external to the cloud to work. The undercloud is not connected directly to the public network, so for it to reach the public VIP it has to go out the uplink and be routed somehow back to the public VIP. This is going to fail if there are any routing or firewall issues between the undercloud and the public VIP. It will also fail if the undercloud does not have a direct connection but instead uses a proxy to access the Internet. This is likely to fail in CI, where we can't guarantee that the public VIP is routable. We are much safer using the control plane VIP, because the undercloud is directly attached.
It's not clear to me that the undercloud should be connected to the internal API network *either*. In a really secure/isolated setup, I don't think you want the undercloud connected to anything but the provisioning network and the BMC. (Basil should probably weigh in here). That would mean that keystone needs to be listening on the provisioning network if we want the undercloud to be able to reach it and do configuration, *or* the configuration needs to be accomplished some other way entirely.
I think this BZ is only about the endpoints misconfiguration; there is another BZ, 1236378, reporting the postconfig step failure.
(In reply to Hugh Brock from comment #6) > It's not clear to me that the undercloud should be connected to the internal > API network *either*. In a really secure/isolated setup, I don't think you > want the undercloud connected to anything but the provisioning network and > the BMC. (Basil should probably weigh in here). That would mean that > keystone needs to be listening on the provisioning network if we want the > undercloud to be able to reach it and do configuration, *or* the > configuration needs to be accomplished some other way entirely. No, we DO NOT want the undercloud talking to the Internal API. The undercloud is not attached to the Internal API network, and there is no way to route to it. The Control Plane VIP I'm referring to is the VIP on the undercloud provisioning network called "ctlplane". That is the network that the undercloud shares with the overcloud, and that is where all the control traffic between Heat on the undercloud and the overcloud should occur.
OK, after a discussion with Dan Prince, Hugh Brock, and others, we decided that there are tradeoffs associated with putting the service on the control plane, and that requiring a route to the external public network is acceptable. The behavior is fine as-is, and we will document the requirement to have a route between the undercloud and the external network.
Please ignore my last two comments. There was some confusion between this and another bug. The endpoint configuration is still an issue.
Is this a real issue? Or is this not a concern? Are we ok with requiring that the undercloud have access to the public network?
Created attachment 1045005 [details]
keystone.endpoints

IMHO this is a valid bug and unrelated to BZ 1236378. Assuming the undercloud will run postconfig against the overcloud public IP, we still want the three URLs of each keystone endpoint (internal/admin/public) configured appropriately. All three should not be configured with the same IP, which is what this bug is about. See output attached to bug.
I think this is going to need to be fixed in the puppet manifests. It's possible that we need some additional data via Heat, but I'm not sure. I'd like to work with someone who knows the puppet side very well on this.
Assigning to Emilien, but also cc'ing Jirka
I don't think this is controlled by puppet. The Keystone endpoints are configured by the ucli: https://github.com/rdo-management/python-rdomanager-oscplugin/blob/master/rdomanager_oscplugin/v1/overcloud_deploy.py#L538 FWIW, this is fixed on the undercloud in https://review.gerrithub.io/#/c/229968/22/elements/undercloud-post-config/os-refresh-config/post-configure.d/98-undercloud-setup That's using the CLI for os-cloud-config instead of the Python API, but the basic concept should be the same.
Moving to Ana, then
It is indeed not puppet related, because endpoint creation is not managed by puppet. The following comments relate to upstream (I am not sure to what extent, or which files are actually impacted (mid|down)stream). Everything happens in init-keystone (in devtest_overcloud.sh). The way it is called, init-keystone receives the overcloud IP as the --host/-o parameter [1]. This host parameter is then used to create the endpoint for all three interfaces [2] (public, admin, internal). I see two solutions to this issue:
1. Manage the endpoint creation directly in puppet. The code has been upstream for a while and has been proven to work.
2. Patch init-keystone to support network partitioning.
[1] https://github.com/openstack/tripleo-incubator/blob/master/scripts/devtest_overcloud.sh#L582
[2] https://github.com/openstack/os-cloud-config/blob/master/os_cloud_config/keystone.py#L470-L472
Hope this helps,
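Solution 2 could be shaped roughly like this: keep the existing single-host behavior as a fallback, but allow separate hosts per interface. This is a hedged sketch with a hypothetical signature, not the actual init-keystone/os-cloud-config patch:

```python
def build_endpoint_urls(port, public_host, internal_host=None,
                        admin_host=None, path=""):
    """Per-network variant of endpoint URL construction.

    If no internal/admin host is supplied, fall back to the public
    host, preserving the pre-network-isolation behavior."""
    internal_host = internal_host or public_host
    admin_host = admin_host or public_host
    fmt = "http://%s:%s%s"
    return {
        "publicurl": fmt % (public_host, port, path),
        "internalurl": fmt % (internal_host, port, path),
        "adminurl": fmt % (admin_host, port, path),
    }
```

With this shape, a deployment using network isolation can pass the internal API VIP for the internal and admin URLs while the public URL stays on the external VIP.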
openstack overcloud deploy currently assumes that there is only one network. It is quite possible to add support to overcloud deploy to support both an external and an internal network, but there needs to be some discussion on behaviour here.
After discussion with Jiri Stransky it was agreed that the network information will be added to the stack outputs. The CLI can then pick that up to configure the endpoints correctly. Jiri will provide a patch to add that information, and I will provide a patch for the CLI that uses it.
I've added a work in progress patch for the CLI. The _get_service_ips() function needs to be modified to use the information that will actually exist in the "output" data instead of the hardcoded data currently used.
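For illustration, the lookup in `_get_service_ips()` could be shaped like this once the stack outputs exist. The output-key naming convention here (keys ending in "Vip") is an assumption for the sketch, not the actual contract of the t-h-t patch:

```python
def get_service_ips(outputs):
    """Collect service VIPs from heat stack outputs.

    `outputs` is the usual heat format: a list of dicts like
    {'output_key': ..., 'output_value': ...}. Keys ending in 'Vip'
    are assumed (for this sketch) to identify per-service VIPs."""
    return {o["output_key"]: o["output_value"]
            for o in outputs
            if o["output_key"].endswith("Vip")}
```

The CLI would then pass e.g. `service_ips.get('KeystoneInternalVip')` as the host when registering the internal endpoint, instead of reusing the public VIP.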
Patch posted for additional t-h-t outputs: https://review.openstack.org/#/c/199554 In the end I went with the solution of outputting the internal VIPs for all services separately, because in the future we might switch from a VIP-per-network approach to a VIP-per-service approach. The concept of "the internal network VIP for all things" might then disappear entirely, so I wanted to make the solution future-proof. Is this approach OK for you, or do we need to output the per-network VIPs instead?
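For readers unfamiliar with the template side, a per-service VIP stack output in overcloud-without-mergepy.yaml would look roughly like this. The exact output name and attribute path are assumptions based on snippets quoted later in this thread, not copied from the merged review:

```yaml
outputs:
  KeystoneInternalVip:
    description: VIP for Keystone API internal endpoint
    value: {get_attr: [VipMap, net_ip_map, {get_param: [ServiceNetMap, KeystonePublicApiNetwork]}]}
```

One such output per service is what lets the CLI pick a different network per service later without a template change.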
I think the patch you posted solves this issue in the most flexible way I can think of, and the approach matches the patch I posted perfectly.
Tripleo-heat-templates patch merged upstream, backport posted to internal gerrit.
The fix for this is causing me issues. While I do see that the internal and admin endpoints are now on the internal API network, the public endpoints are also on the internal API, when they should actually be set to the public VIP instead. When using network isolation this causes the undercloud to be unable to talk to any of the overcloud API services, since the undercloud has no NIC configured on the overcloud internal API network. Here's the output of keystone endpoint-list against my overcloud:

| id | region | publicurl | internalurl | adminurl | service_id |
| 007326d7e1bd4b01a7b28edb95db90ad | regionOne | http://172.17.0.10:8774/v2/$(tenant_id)s | http://172.17.0.10:8774/v2/$(tenant_id)s | http://172.17.0.10:8774/v2/$(tenant_id)s | 78ca75d9c6fb4d32a1d9ea1e296d314e |
| 0bfabe6d981d4e6eac54e9e38c7b7187 | regionOne | http://172.17.0.10:9696/ | http://172.17.0.10:9696/ | http://172.17.0.10:9696/ | bf5a33c1805446f1a440f851af6f7697 |
| 114b2d7bf7b547abb986ca6b715d9c39 | regionOne | http://10.1.244.10:5000/v2.0 | http://10.1.244.10:5000/v2.0 | http://10.1.244.10:35357/v2.0 | ed6fc6129e4e4afaabe8548e7c37e0d4 |
| 262e7086d5ed49f9ae183cf895ed16a6 | regionOne | http://172.18.0.10:9292/ | http://172.18.0.10:9292/ | http://172.18.0.10:9292/ | 1480329765fa4f058248c656b9154415 |
| 3d7fcfd41bb24e75a7b67a1be0743156 | regionOne | http://172.17.0.10:8776/v2/%(tenant_id)s | http://172.17.0.10:8776/v2/%(tenant_id)s | http://172.17.0.10:8776/v2/%(tenant_id)s | 06e09f13386e48f6b33576479878b57e |
| 707df25cf0bf4d3bac6e4c233427b4bf | regionOne | http://172.18.0.10:8080/v1/AUTH_%(tenant_id)s | http://172.18.0.10:8080/v1/AUTH_%(tenant_id)s | http://172.18.0.10:8080/v1 | def39e3d12ec464d96f0f62aafb71dbe |
| 8087539b19154f8bb770222cad0c1a60 | regionOne | http://172.17.0.10:8774/v3 | http://172.17.0.10:8774/v3 | http://172.17.0.10:8774/v3 | e55dccdbdace49f68d7b7a7c3d07d571 |
| 8f64f1975fb745ae9b10db1e6fa05d8f | regionOne | http://10.1.244.10:8773/services/Cloud | http://10.1.244.10:8773/services/Cloud | http://10.1.244.10:8773/services/Admin | 4ce7af28e76d4e8480762722635ccbb0 |
| 90fddd45009e474f8ff4f7619cc498df | regionOne | http://172.17.0.10:8776/v1/%(tenant_id)s | http://172.17.0.10:8776/v1/%(tenant_id)s | http://172.17.0.10:8776/v1/%(tenant_id)s | 4f9669fabdae48b1b9b4c5c7d6bc420a |
| d1a4425a4c194397abfedb1f49ef8993 | regionOne | http://10.1.244.10:None/ | http://10.1.244.10:None/ | http://10.1.244.10:None/admin | e0ca493b70934f02bbbd2403f6b3e92e |
| d977154e2425485f9fc0675a05e77654 | regionOne | http://172.17.0.10:8777/ | http://172.17.0.10:8777/ | http://172.17.0.10:8777/ | a73a05f6a57040d0b5a4f77f17fc5165 |
| e3df889d64754ed5876d809d9e7d83fe | regionOne | http://172.17.0.10:8004/v1/%(tenant_id)s | http://172.17.0.10:8004/v1/%(tenant_id)s | http://172.17.0.10:8004/v1/%(tenant_id)s | a0b76859b8234e06b490dbff63e229b1 |

Note the publicurls are on 172.17.0.0/24, which is my internal API. Those should actually be on my public VIP of 10.1.244.10.
There's another new issue as well: the horizon endpoint is configured with a port of "None":

| d1a4425a4c194397abfedb1f49ef8993 | regionOne | http://10.1.244.10:None/ | http://10.1.244.10:None/ | http://10.1.244.10:None/admin | e0ca493b70934f02bbbd2403f6b3e92e |
I poked at this a bit today... deployed (vms) like:

-e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/net-single-nic-with-vlans.yaml --control-scale 3 --ceph-storage-scale 1

My env (yesterday's poodle) didn't have 2 of the changes mentioned here, so I applied https://review.gerrithub.io/#/c/240095/1/rdomanager_oscplugin/v1/overcloud_deploy.py and https://review.gerrithub.io/#/c/240096/2/rdomanager_oscplugin/v1/overcloud_deploy.py

After the stack was created, the deploy hangs at keystone init. I found out it was because the undercloud has no way of reaching the overcloud VIP @ 10.0.0.x, so I added a route via br-ctlplane like:

sudo ip route add 10.0.0.0/24 via 192.0.2.1 dev br-ctlplane

I was also pointed at https://bugzilla.redhat.com/show_bug.cgi?id=1236378 by slagle, where this is discussed. I could then source overcloudrc and talk to services (and the deploy, which was hanging until then, completes OK). keystone endpoint-list is like (public urls are all on the 10.0.0.4):

[stack@instack ~]$ keystone endpoint-list
/usr/lib/python2.7/site-packages/keystoneclient/shell.py:65: DeprecationWarning: The keystone CLI is deprecated in favor of python-openstackclient. For a Python library, continue using python-keystoneclient. 'python-keystoneclient.', DeprecationWarning)
| id | region | publicurl | internalurl | adminurl | service_id |
| 0494a943ca224c278343c8b2eca92182 | regionOne | http://10.0.0.4:8773/services/Cloud | http://10.0.0.4:8773/services/Cloud | http://10.0.0.4:8773/services/Admin | d4c93ca15d0d42a7b5cc7d579e813881 |
| 04e46cee14044dcc9f67fa33b6e75616 | regionOne | http://10.0.0.4:80/dashboard/ | http://10.0.0.4:80/dashboard/ | http://10.0.0.4:80/dashboard/admin | 72ae6a8c9a77404584ce2f8bfe636f50 |
| 0d42f1cc9f2a49cb96b13bcd2303069a | regionOne | http://10.0.0.4:8774/v3 | http://172.16.2.5:8774/v3 | http://172.16.2.5:8774/v3 | ba0b71e12d9b48f1b97bc92127ac5c39 |
| 1261c1a156e744babb42619f946304be | regionOne | http://10.0.0.4:8776/v2/%(tenant_id)s | http://172.16.2.5:8776/v2/%(tenant_id)s | http://172.16.2.5:8776/v2/%(tenant_id)s | d7084a118167404b838c643354c04051 |
| 61523c7164c441a1b6de3c4141c26035 | regionOne | http://10.0.0.4:9696/ | http://172.16.2.5:9696/ | http://172.16.2.5:9696/ | 0906700787d04c198b77c5b2e5914a2f |
| 6e1c590da8a647baad84a79bd1f064ea | regionOne | http://10.0.0.4:8004/v1/%(tenant_id)s | http://172.16.2.5:8004/v1/%(tenant_id)s | http://172.16.2.5:8004/v1/%(tenant_id)s | 043af0f3af8e4ddbadaedb9d1fb77cac |
| 6fa7d1037d244b4080df79d037ca3a3f | regionOne | http://10.0.0.4:8774/v2/$(tenant_id)s | http://172.16.2.5:8774/v2/$(tenant_id)s | http://172.16.2.5:8774/v2/$(tenant_id)s | 4bc61150fa0646bd9a566801c38f5bb6 |
| 786d2c95eae84f0d9d85354ba7a0dd97 | regionOne | http://10.0.0.4:9292/ | http://172.16.1.4:9292/ | http://172.16.1.4:9292/ | cc985e35655041eeb00a1efc9a7d918d |
| 936808e7f700495fb118a0b3a842bf3d | regionOne | http://10.0.0.4:8777/ | http://172.16.2.5:8777/ | http://172.16.2.5:8777/ | 05527b3fd435481f859c74e35019253d |
| 965e9f96a6324a0eaf49fff904aec73a | regionOne | http://10.0.0.4:5000/v2.0 | http://10.0.0.4:5000/v2.0 | http://10.0.0.4:35357/v2.0 | 0970fc01e2574423ab2b1b010b686b21 |
| 9fc24d8bd0a44a34ba2089f3d1f9bdbf | regionOne | http://10.0.0.4:8080/v1/AUTH_%(tenant_id)s | http://172.16.1.4:8080/v1/AUTH_%(tenant_id)s | http://172.16.1.4:8080/v1 | 2b3f71b49c894dbaab5b9b716b69ed04 |
| ff8e8cc6fd504854aee495b5d877d42c | regionOne | http://10.0.0.4:8776/v1/%(tenant_id)s | http://172.16.2.5:8776/v1/%(tenant_id)s | http://172.16.2.5:8776/v1/%(tenant_id)s | 3d9723423dd941099495359baf96e044 |
1. Swift internalurl and adminurl run on the storage network.
2. Glance internalurl and adminurl run on the storage network.
3. Nova EC2 internalurl and adminurl run on the external network.
4. There is an entry with the Horizon URL (http://172.16.23.10:80/dashboard/) and I'm not sure it is used.
5. All Keystone endpoints are using the external network.

[stack@instack ~]$ cat network-environment.yaml
parameters:
  Controller-1::NeutronExternalNetworkBridge: "''"
parameter_defaults:
  InternalApiNetCidr: 172.16.20.0/24
  StorageNetCidr: 172.16.21.0/24
  StorageMgmtNetCidr: 172.16.19.0/24
  TenantNetCidr: 172.16.22.0/24
  ExternalNetCidr: 172.16.23.0/24
  InternalApiAllocationPools: [{'start': '172.16.20.10', 'end': '172.16.20.100'}]
  StorageAllocationPools: [{'start': '172.16.21.10', 'end': '172.16.21.100'}]
  StorageMgmtAllocationPools: [{'start': '172.16.19.10', 'end': '172.16.19.100'}]
  TenantAllocationPools: [{'start': '172.16.22.10', 'end': '172.16.22.100'}]
  ExternalAllocationPools: [{'start': '172.16.23.10', 'end': '172.16.23.100'}]
  ExternalInterfaceDefaultRoute: 172.16.23.251

[stack@instack ~]$ . overcloudrc
[stack@instack ~]$ keystone endpoint-list
/usr/lib/python2.7/site-packages/keystoneclient/shell.py:65: DeprecationWarning: The keystone CLI is deprecated in favor of python-openstackclient. For a Python library, continue using python-keystoneclient. 'python-keystoneclient.', DeprecationWarning)
| id | region | publicurl | internalurl | adminurl | service_id |
| 26c5d2b74b964d40bf60fb0bd6806645 | regionOne | http://172.16.23.10:8774/v2/$(tenant_id)s | http://172.16.20.11:8774/v2/$(tenant_id)s | http://172.16.20.11:8774/v2/$(tenant_id)s | ed20e1e9da34423c823aba52ea87fc55 |
| 379903b8e5234087ad32c8bce1514523 | regionOne | http://172.16.23.10:8004/v1/%(tenant_id)s | http://172.16.20.11:8004/v1/%(tenant_id)s | http://172.16.20.11:8004/v1/%(tenant_id)s | 2103dc5c2882471e8c88091b09823b09 |
| 57d256afb1ba4ec3907f2dbbce04d979 | regionOne | http://172.16.23.10:8776/v1/%(tenant_id)s | http://172.16.20.11:8776/v1/%(tenant_id)s | http://172.16.20.11:8776/v1/%(tenant_id)s | d73bab42eeac471786fb0c36cb8a1743 |
| 8ba9d61321824fe3b9200a42a0f66284 | regionOne | http://172.16.23.10:8080/v1/AUTH_%(tenant_id)s | http://172.16.21.10:8080/v1/AUTH_%(tenant_id)s | http://172.16.21.10:8080/v1 | b482b44844124f769fa95bdbec4bfe31 |
| 8e2cc98a2a7046e798d8f4168ddb5c10 | regionOne | http://172.16.23.10:9292/ | http://172.16.21.10:9292/ | http://172.16.21.10:9292/ | ecb4e466760f42e6895b7cac80c7d832 |
| 98b00e22f6474d2d9996b8cdbacf2694 | regionOne | http://172.16.23.10:8776/v2/%(tenant_id)s | http://172.16.20.11:8776/v2/%(tenant_id)s | http://172.16.20.11:8776/v2/%(tenant_id)s | 725ce263e7aa4b928aed862972afc4df |
| ab0b3839d86e40d9a3c26cb29d0215da | regionOne | http://172.16.23.10:8773/services/Cloud | http://172.16.23.10:8773/services/Cloud | http://172.16.23.10:8773/services/Admin | c54d345bac664acfae516db997488509 |
| bebe1f83956f4f0d881dad9d31fa5cce | regionOne | http://172.16.23.10:80/dashboard/ | http://172.16.23.10:80/dashboard/ | http://172.16.23.10:80/dashboard/admin | c1fbc9361c424a468a334c9f8302b23c |
| cd40941227f541b48acd55b9ed16e9c4 | regionOne | http://172.16.23.10:5000/v2.0 | http://172.16.23.10:5000/v2.0 | http://172.16.23.10:35357/v2.0 | 955f99682d9c4941aea7686198f1f472 |
| e46d9d933c424a02965f506cff70b0e7 | regionOne | http://172.16.23.10:9696/ | http://172.16.20.11:9696/ | http://172.16.20.11:9696/ | 7217ecd7cada497497a1a1ee1db5fc9c |
| eedfce3fb47c47b88b88b8c5d8e63690 | regionOne | http://172.16.23.10:8774/v3 | http://172.16.20.11:8774/v3 | http://172.16.20.11:8774/v3 | 5aa4fdbdd2cd4d06bfaa21c58658a9d9 |
| fb1f05e0beab4b359ee8dbd1ed8bbb42 | regionOne | http://172.16.23.10:8777/ | http://172.16.20.11:8777/ | http://172.16.20.11:8777/ | 3cf5c4581d544607b011639ed1cfff03 |
1, 2, 3: The overcloud setup uses the IPs that it's given from the outputs of the templates, so if these are wrong, it's likely the templates that are wrong.
4. The undercloud Horizon uses that to get the overcloud Horizon URL.
5. That seems to be a bug; I'll take a look at that.
Confirmed with gfidente that 1 and 2 are actually set properly and it's expected behavior to set the endpoints for glance and swift on the storage network.
Found the following error in the latest poodle:

CalledProcessError: Command '['ssh', '-oStrictHostKeyChecking=no', '-t', '-l', 'heat-admin', u'172.16.20.11', 'sudo', 'keystone-manage', 'pki_setup', '--keystone-user', "$(getent passwd | grep '^keystone' | cut -d: -f1)", '--keystone-group', "$(getent group | grep '^keystone' | cut -d: -f1)"]' returned non-zero exit status 255
[2015-07-22 13:33:05,888] DEBUG openstackclient.shell clean_up DeployOvercloud
[2015-07-22 13:33:05,888] DEBUG openstackclient.shell got an error: Command '['ssh', '-oStrictHostKeyChecking=no', '-t', '-l', 'heat-admin', u'172.16.20.11', 'sudo', 'keystone-manage', 'pki_setup', '--keystone-user', "$(getent passwd | grep '^keystone' | cut -d: -f1)", '--keystone-group', "$(getent group | grep '^keystone' | cut -d: -f1)"]' returned non-zero exit status 255
[2015-07-22 13:33:05,903] ERROR openstackclient.shell Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/openstackclient/shell.py", line 176, in run
    return super(OpenStackShell, self).run(argv)
  File "/usr/lib/python2.7/site-packages/cliff/app.py", line 230, in run
    result = self.run_subcommand(remainder)
  File "/usr/lib/python2.7/site-packages/cliff/app.py", line 295, in run_subcommand
    result = cmd.run(parsed_args)
  File "/usr/lib/python2.7/site-packages/cliff/command.py", line 53, in run
    self.take_action(parsed_args)
  File "/usr/lib/python2.7/site-packages/rdomanager_oscplugin/v1/overcloud_deploy.py", line 882, in take_action
    self._deploy_postconfig(stack, parsed_args)
  File "/usr/lib/python2.7/site-packages/rdomanager_oscplugin/v1/overcloud_deploy.py", line 680, in _deploy_postconfig
    user='heat-admin')
  File "/usr/lib/python2.7/site-packages/os_cloud_config/keystone.py", line 149, in initialize
    _perform_pki_initialization(host, user)
  File "/usr/lib/python2.7/site-packages/os_cloud_config/keystone.py", line 485, in _perform_pki_initialization
    "$(getent group | grep '^keystone' | cut -d: -f1)"])
  File "/usr/lib64/python2.7/subprocess.py", line 542, in check_call
    raise CalledProcessError(retcode, cmd)
CalledProcessError: Command '['ssh', '-oStrictHostKeyChecking=no', '-t', '-l', 'heat-admin', u'172.16.20.11', 'sudo', 'keystone-manage', 'pki_setup', '--keystone-user', "$(getent passwd | grep '^keystone' | cut -d: -f1)", '--keystone-group', "$(getent group | grep '^keystone' | cut -d: -f1)"]' returned non-zero exit status 255
I am pretty sure this bug is squashed. I just tested a bare metal deployment on the 2015-07-17.2 puddle, and my dashboard endpoints are on the External network, Swift Proxy and Glance API are on Storage, and Cinder API is on Internal API (as expected). The following is true of my environment:

InternalApiNetCidr: 172.17.0.0/24
StorageNetCidr: 172.18.0.0/24
ExternalNetCidr: 10.8.148.0/24

Here is the Keystone endpoint list:

[stack@host01 ~]$ keystone endpoint-list
/usr/lib/python2.7/site-packages/keystoneclient/shell.py:65: DeprecationWarning: The keystone CLI is deprecated in favor of python-openstackclient. For a Python library, continue using python-keystoneclient. 'python-keystoneclient.', DeprecationWarning)
| id | region | publicurl | internalurl | adminurl | service_id |
| 1d3771c9823e4ac1a4bdcae494b38b02 | regionOne | http://10.8.148.10:8004/v1/%(tenant_id)s | http://172.17.0.11:8004/v1/%(tenant_id)s | http://172.17.0.11:8004/v1/%(tenant_id)s | b16466e05e4c45e7968ad01155e790f1 |
| 35abef313f05417c89fd268c57f02af5 | regionOne | http://10.8.148.10:8080/v1/AUTH_%(tenant_id)s | http://172.18.0.10:8080/v1/AUTH_%(tenant_id)s | http://172.18.0.10:8080/v1 | 457dee06548840d4ac855eb29067f42c |
| 4855fadca2cf4856bd38c9c25f9f01a5 | regionOne | http://10.8.148.10:9696/ | http://172.17.0.11:9696/ | http://172.17.0.11:9696/ | 8eed9dfa80d0466daad7540de7bb6e14 |
| 545377bc47a74656b3d959c925d6b480 | regionOne | http://10.8.148.10:8776/v2/%(tenant_id)s | http://172.17.0.11:8776/v2/%(tenant_id)s | http://172.17.0.11:8776/v2/%(tenant_id)s | d8d5f6e78ecf4a2eb1aed056631d324a |
| 59387c15fa1f4ba4a7dcfa76a5877f91 | regionOne | http://10.8.148.10:80/dashboard/ | http://10.8.148.10:80/dashboard/ | http://10.8.148.10:80/dashboard/admin | ada23a909188491fb4fff7550b0d1123 |
| 6186d106f7084ad8afb2e7b99339a53f | regionOne | http://10.8.148.10:8774/v3 | http://172.17.0.11:8774/v3 | http://172.17.0.11:8774/v3 | 9590901356234258863e0da3287eb0ff |
| 8670254e57e84c09abad55d357342389 | regionOne | http://10.8.148.10:9292/ | http://172.18.0.10:9292/ | http://172.18.0.10:9292/ | 186f81378718405ab1500b4f46da5548 |
| 8a0bf595960341bf901c982b504f07f3 | regionOne | http://10.8.148.10:8773/services/Cloud | http://10.8.148.10:8773/services/Cloud | http://10.8.148.10:8773/services/Admin | 51182d84805a4aef9e960bf6a6cf7728 |
| b04bd718402340de9ed3c5d36bcd27ef | regionOne | http://10.8.148.10:8774/v2/$(tenant_id)s | http://172.17.0.11:8774/v2/$(tenant_id)s | http://172.17.0.11:8774/v2/$(tenant_id)s | 0caacacc52624aa89f2bc3053da243c8 |
| d10eeb86a8fa4fd89c20a5f8093d4606 | regionOne | http://10.8.148.10:8776/v1/%(tenant_id)s | http://172.17.0.11:8776/v1/%(tenant_id)s | http://172.17.0.11:8776/v1/%(tenant_id)s | 6984efbc35a84fb4a3a4c916a3403305 |
| e690627ca206409a8744697f0e573965 | regionOne | http://10.8.148.10:8777/ | http://172.17.0.11:8777/ | http://172.17.0.11:8777/ | 3b75533053ac4d7e99da19178257fe11 |
| fd16dd42afa14cfdbce44bb465b6e4c6 | regionOne | http://10.8.148.10:5000/v2.0 | http://10.8.148.10:5000/v2.0 | http://10.8.148.10:35357/v2.0 | 6ffd8d3371e5480da7fe048bfd7e0ba7 |
(In reply to Dan Sneddon from comment #33) It turns out that Keystone is still pointing to the public VIP on all endpoints; the internal and admin endpoints should be on the Internal API network:

Service: identity
+-------------+----------------------------------+
|   Property  |              Value               |
+-------------+----------------------------------+
|   adminURL  |  http://10.8.148.10:35357/v2.0   |
|      id     | 5a8845393e7542c49a7f55197549787a |
| internalURL |   http://10.8.148.10:5000/v2.0   |
|  publicURL  |   http://10.8.148.10:5000/v2.0   |
|    region   |            regionOne             |
+-------------+----------------------------------+
We reverted the patch that caused os-cloud-config to try to ssh via the internal IP, which is not reachable from the undercloud: https://code.engineering.redhat.com/gerrit/#/c/53557/ That should hopefully get us a passing poodle CI again. Moving forward with this one: if we want to fix it so that the keystone endpoints are set up correctly, with the internal and admin endpoints on the internal API network, then we need this patch for os-cloud-config: https://review.openstack.org/#/c/204760/1 Once that lands, we can propose a revert of the revert. It's not clear to me whether the fix for the keystone endpoints is a blocker or not; we should discuss this one in tomorrow's triage. However, if we're going to respin again after tomorrow, it's probably worthwhile to get it in.
So it was discovered that this patch was causing the errors reported in comment 32: https://code.engineering.redhat.com/gerrit/gitweb?p=python-rdomanager-oscplugin.git;a=commitdiff;h=df619be2a209480bfc7b9264a7b1bc9d40b96aee

The patch was reverted, but until a new, working patch is made, the Keystone endpoints will all be on the external VIP. To change this, you can modify the SQL directly:

1) sudo mysql
2) use keystone;
3) select * from endpoint;
4) Identify the "internal" endpoint using :5000 (ignore the one with :5000 that is listed as "public").
5) Run this SQL to update the keystone internal endpoint:
   UPDATE endpoint SET url='http://<IP>:5000/v2.0' WHERE id='<UUID>';
   [Note: <IP> is the internal VIP (like most other endpoints), and <UUID> is the id of the internal endpoint]
6) Identify the "admin" endpoint using :35357.
7) Run this SQL to update the keystone admin endpoint:
   UPDATE endpoint SET url='http://<IP>:35357/v2.0' WHERE id='<UUID>';
   [Note: where <IP> is the internal VIP (like most other endpoints), and UUID is the id of the internal endpoint]

None of this should be needed after the patch is merged to os_cloud_config; it should be happening early Jul 23rd.
(In reply to Dan Sneddon from comment #36) It should be obvious, but in the comment above, I made a typo: > [Note: where <IP> is the internal VIP (like most other endpoints), and UUID > is the id of the internal endpoint] Should read that UUID is the id of the admin network in the second command.
I've posted the os-cloud-config patch downstream: https://code.engineering.redhat.com/gerrit/#/c/53564/ Note that this needs to pass CI and get built into a poodle; then we can propose the patch again to set the keystone internal/admin endpoints correctly to the internal API network (which should be tested with network isolation prior to merging). Given the time to make that happen, and the workaround above, that's why I'm saying we may not need to block on this one any further.
OK, I went ahead and built a new python-rdomanager-oscplugin with the revert applied. When testing my os-cloud-config patch, I saw a new issue. I got past the PKI initialization where it was failing before, but when it went to create the keystone endpoints, it got a token from the publicURL for keystone ($OS_AUTH_URL), and then keystoneclient switches over to use the adminURL returned in the catalog with the token response. So we were back to the same issue: since the adminURL is on the internal API network, the requests to create the endpoints eventually fail. The revert is built into: python-rdomanager-oscplugin-0.0.8-43.el7ost
The problem with the patch that was tried ( https://code.engineering.redhat.com/gerrit/gitweb?p=python-rdomanager-oscplugin.git;a=commitdiff;h=df619be2a209480bfc7b9264a7b1bc9d40b96aee ) is that we can't use the Internal API network, because the undercloud can't reach it. If we could pass the ctlplane IP to the same method called in that patch, then we could configure the endpoints on the ctlplane, which would close this bug. Unfortunately, we can't use the same lookup method that we used in that patch (keystone_ip = service_ips.get('KeystoneInternalVip')), because we aren't including the ControlPlaneVip in that list of IPs. We need a way to pull out the control plane VIP and pass that as the host to the keystone.initialize method (as seen in the above patch). Ideas?
(In reply to Dan Sneddon from comment #36) The workaround presented does not fix the issue; you would only want to move the Internal API (not the Admin API). One possible fix is a one-line change to overcloud-without-mergepy.yaml:

  KeystoneInternalVip:
    description: VIP for Keystone API internal endpoint
-   value: {get_attr: [VipMap, net_ip_map, {get_param: [ServiceNetMap, KeystonePublicApiNetwork]}]}
+   value: {get_attr: [ControlVirtualIP, fixed_ips, 0, ip_address]}

Then we could actually use the patch that was reverted, because it would put Keystone's adminURL and internalURL on the ctlplane network.
(In reply to Dan Sneddon from comment #42) Dan Prince has patches on review to add KeystoneAdminVip stack output: https://review.openstack.org/205278 https://review.openstack.org/205349
What is the current expected setup for the Keystone endpoints? Public endpoint on the external network and the internal and admin endpoints on the ctlplane network?
(In reply to Marius Cornea from comment #46) > What is the current expected setup for the Keystone endpoints? Public > endpoint on the external network and the internal and admin endpoints on the > ctlplane network? The Public endpoint will listen on the Internal API network, but HAProxy will be listening on the external network. The Admin API is now on the ctlplane network.
[stack@instack ~]$ rpm -qa | grep openstack-tripleo-heat-templates
openstack-tripleo-heat-templates-0.8.6-64.el7ost.noarch
[stack@instack ~]$ . stackrc
[stack@instack ~]$ neutron net-list
| id | name | subnets |
| 109fd936-fd84-4205-9a7d-0ab5419c912a | external | b0755c0f-fb92-41f0-a328-1bf0cfcd6c92 172.16.23.0/24 |
| 53d6c110-8921-4367-99ce-d16d833a6cd3 | tenant | ac29a2c3-dde5-4ddb-9153-fe7f4089e8c3 172.16.22.0/24 |
| 1af0de7e-a38e-4ec1-a16a-54d9f394dad3 | internal_api | 060f9b60-0e5c-42b7-bf34-b19f2e9b182c 172.16.20.0/24 |
| 819e2ec0-fb1e-41a6-8593-57f6eb5d30d1 | storage_mgmt | 8a51ccd9-b6c6-4382-8b2b-c69b5a7ec498 172.16.19.0/24 |
| 4a50104a-5a4a-4f2d-ad55-9dc887c670b0 | storage | 57e734ae-eb26-4df1-ad6f-46be165f4495 172.16.21.0/24 |
| b0d073be-0ed0-4497-9045-76b2d0d923ca | ctlplane | 51c59420-fc93-4ed0-8086-c27056b4c161 192.0.2.0/24 |
[stack@instack ~]$ . overcloudrc
[stack@instack ~]$ keystone catalog --service identity
/usr/lib/python2.7/site-packages/keystoneclient/shell.py:65: DeprecationWarning: The keystone CLI is deprecated in favor of python-openstackclient. For a Python library, continue using python-keystoneclient. 'python-keystoneclient.', DeprecationWarning)
Service: identity
+-------------+----------------------------------+
|   Property  |              Value               |
+-------------+----------------------------------+
|   adminURL  |   http://192.0.2.17:35357/v2.0   |
|      id     | 4ba8351937d54a79b55f8109fcabfacd |
| internalURL |   http://192.0.2.17:5000/v2.0    |
|  publicURL  |  http://172.16.23.10:5000/v2.0   |
|    region   |            regionOne             |
+-------------+----------------------------------+
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2015:1862