Bug 1623335
Summary: | ose-plane pod names are taken from OpenStack's metadata although the cloud provider is not configured
---|---
Product: | OpenShift Container Platform
Component: | Installer
Status: | CLOSED CURRENTRELEASE
Severity: | medium
Priority: | unspecified
Version: | 3.11.0
Target Milestone: | ---
Target Release: | 3.11.0
Hardware: | Unspecified
OS: | Unspecified
Reporter: | Lukas Bednar <lbednar>
Assignee: | Vadim Rutkovsky <vrutkovs>
QA Contact: | Johnny Liu <jialiu>
CC: | aos-bugs, eedri, jialiu, jokerman, mifiedle, mmccomas, ncredi, sdodson, tsedovic, tzumainn, wsun
Keywords: | Regression
Doc Type: | Bug Fix
Doc Text: |
Cause: openshift-ansible incorrectly assumed that an install on OpenStack would use OpenStack as the cloud provider.
Consequence: incorrect pod names were expected.
Fix: the node hostname is used if the cloud provider is not set.
Result: OpenShift can be installed on OpenStack without the cloud provider set.
|
Story Points: | ---
Last Closed: | 2018-12-21 15:23:10 UTC
Type: | Bug
Bug Blocks: | 1629726 (view as bug list)
Description (Lukas Bednar, 2018-08-29 06:19:09 UTC)
marking as regression
this used to work with ocp3.10

(In reply to Nelly Credi from comment #1)
> marking as regression
> this used to work with ocp3.10

with openshift-ansible-3.10.27-1

(In reply to Lukas Bednar from comment #0)
> Additional info:
>
> Based on discussion with Jianlin:
> [root@cnv-executor-lbednar-master1 ~]# curl
> http://169.254.169.254/openstack/latest/meta_data.json
> {"random_seed":
> "6PBttx5YYSqsbl+srAQGJrQPXdrO0R6QM7nGxwJ/
> P2bGKmUbW4PXXNJXuz4E4gNp8jw1qU5APF4ZHe1D0w2M1hyyCv1UxDyEuzfecVfnyZYMGywWa/
> P1JyMlzr/opPEGXPuE5FVtyZ+H3Jl1XeKl2uYNIhmPDGjYkLKLmKnyU/
> xEJezkVZpm07Y4CukXw9T8rB+34+R7uMirqm0frH+bZFzVhsW9Z7b/
> WgSd5vo4d7Kr1u1tkPYpCrQQGoEEFTXfpEZ+jwTgXVuJ8in6yklyObRcwPIV8h49GfgQ7+Bqrd7ks
> N/
> 8wDMePqaJi9l35tyGHveJxy8effJ3G3QbEBg+9ov5yyAACivkHUOgjjlI2OWJp1DUxVg8rF6VRy0m
> yzOa7Db0X3U9BCgAH5JFXaKpa8UMSlbN5h7wJnZBRTZN2sQs7LqGIeyig+Mx9WnzpWHUPIzz6WCnM
> IlxAbtnV4E1SPP75CZ4sGHAI+HSmFCjfpGb5rYXRbXl31I1F4yJMp9Ek2QyeqacYYqTcuLgPbCiH3
> lcUGzRchEZm6wIjhkBl0bO40VCof5+2aTSb3ymVp7OFuB8L04HWkHBYo/
> sX2kWlzQJ9wemYRr8W3icRobo8FrilTuYTiiVNTuKHY4qMTbQ180m0s1LuvmFoHpTTVIFiX0Xd+I0
> uc3w9ej46tG6B4w=", "uuid": "2f052601-6a2c-4ea5-a58b-d26974974a02",
> "availability_zone": "nova", "keys": [{"data": "ssh-rsa
> AAAAB3NzaC1yc2EAAAADAQABAAABAQCj47ubVnxR16JU7ZfDli3N5QVBAwJBRh2xMryyjk5dtfugo
> 5JIPGB2cyXTqEDdzuRmI+Vkb/A5duJyBRlA+9RndGGmhhMnj8and3wu5/
> cEb7DkF6ZJ25QV4LQx3K/
> i57LStUHXRTvruHOZ2nCuVXWqi7wSvz5YcvEv7O8pNF5uGmqHlShBdxQxcjurXACZ1YY0YDJDr3AJ
> ai1KF9zehVJODuSbrnOYpThVWGjFuFAnNxbtuZ8EOSougN2aYTf2qr/
> KFGDHtewIkzZmP6cjzKO5bN3pVbXxmb2Gces/BYHntY4MXBTUqwsmsCRC5SAz14bEP/
> vsLtrNhjq9vCS+BjMT root", "type": "ssh", "name":
> "cnv-executor-lbednar-net-key"}], "hostname":
> "cnv-executor-lbednar-master1", "launch_index": 0, "devices": [],
> "public_keys": {"cnv-executor-lbednar-net-key": "ssh-rsa
> AAAAB3NzaC1yc2EAAAADAQABAAABAQCj47ubVnxR16JU7ZfDli3N5QVBAwJBRh2xMryyjk5dtfugo
> 5JIPGB2cyXTqEDdzuRmI+Vkb/A5duJyBRlA+9RndGGmhhMnj8and3wu5/
> cEb7DkF6ZJ25QV4LQx3K/
> i57LStUHXRTvruHOZ2nCuVXWqi7wSvz5YcvEv7O8pNF5uGmqHlShBdxQxcjurXACZ1YY0YDJDr3AJ
> ai1KF9zehVJODuSbrnOYpThVWGjFuFAnNxbtuZ8EOSougN2aYTf2qr/
> KFGDHtewIkzZmP6cjzKO5bN3pVbXxmb2Gces/BYHntY4MXBTUqwsmsCRC5SAz14bEP/
> vsLtrNhjq9vCS+BjMT root"}, "project_id":
> "86a72fb2d59f4deabcef608d3c6e1a47", "name": "cnv-executor-lbednar-master1"}
>
> The only suspicion for me your instance name is
> "cnv-executor-lbednar-master1",
> not "cnv-executor-lbednar-master1.example.com".

The additional info is just a guess at what might be wrong there; it wasn't confirmed.

What names do your nodes get registered as? If the control plane is up and running `oc get nodes`

Tomas, Tzu-Mainn,

This is likely happening because openshift_facts tries to introspect the metadata API based on the BIOS type of a machine. This seemed to work for many years, but now we no longer allow people to override the names nodes use and this is becoming a problem. This was done to ensure that the name the kubelet registers with matches the instance name as required by Kubernetes. However, that's only relevant if you've enabled the cloud provider and it's managing cloud resources.

If they're not configuring the OpenStack provider, is there any need to be this restrictive? I'm thinking we should only enable that metadata inspection when the cloud provider is actually configured.
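(Editorial note: the gist of the suggestion above can be illustrated with a minimal sketch; this is not the actual openshift_facts code. The metadata URL and the `name`/`hostname` fields come from the curl output quoted earlier, and the `cloudprovider_kind` parameter is a hypothetical stand-in for however the installer records the configured cloud provider.)

```python
import json
import socket
import urllib.request

# OpenStack's link-local metadata endpoint, as queried with curl above.
METADATA_URL = "http://169.254.169.254/openstack/latest/meta_data.json"


def node_name(cloudprovider_kind=None):
    """Pick the name a node should register with.

    Only consult the metadata service when the OpenStack cloud provider is
    actually configured; otherwise trust the host's own hostname.
    `cloudprovider_kind` is a hypothetical stand-in, not a real
    openshift-ansible variable name.
    """
    if cloudprovider_kind == "openstack":
        with urllib.request.urlopen(METADATA_URL, timeout=5) as resp:
            meta = json.load(resp)
        # The instance name ("cnv-executor-lbednar-master1" in this report)
        # is what the kubelet must register as when the OpenStack cloud
        # provider manages the node.
        return meta.get("name") or meta.get("hostname")
    # No cloud provider configured: fall back to the node's own FQDN,
    # e.g. "cnv-executor-lbednar-master1.example.com".
    return socket.getfqdn()
```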
(In reply to Scott Dodson from comment #4)
> What names do your nodes get registered as?
> If the control plane is up and running `oc get nodes`

At this point of the deployment, only the master node is available; compute nodes are added later once the ose-plane is ready.

[cloud-user@cnv-executor-lbednar-master1 ~]$ oc get nodes
NAME                                       STATUS    ROLES     AGE       VERSION
cnv-executor-lbednar-master1.example.com   Ready     <none>    5h        v1.11.0+d4cacc0

(In reply to Scott Dodson from comment #5)
> Tomas, Tzu-Mainn,
>
> This is likely happening because openshift_facts tries to introspect
> metadata api based on the BIOS type of a machine. This seemed to work for
> many years but now we no longer allow people to override the names nodes use
> and this is becoming a problem. This was done to ensure that the name the
> kubelet registers with matches the instance name as required by kubernetes.
> However that's only relevant if you've enabled the cloud provider and it's
> managing cloud resources.
>
> If they're not configuring the openstack provider, is there any need to be
> this restrictive? I'm thinking we should only enable that metadata
> inspection when the cloud provider is actually configured.

Hi Scott! I think you're right, there's probably no need for that restriction.

Here is some PR history: https://github.com/openshift/openshift-ansible/pull/9876 was submitted first to fix this issue, but it introduced a critical regression (BZ#1626812). A new PR was then submitted, https://github.com/openshift/openshift-ansible/pull/10024, but it has not been backported to 3.11 yet.

Johnny,

Do you have the ability to test that this unmerged PR doesn't regress in the same way observed in bug 1626812? This is a pretty significant change and we're worried about accepting it this late in the cycle, even though the change makes sense and seems like the right thing to do.

https://github.com/openshift/openshift-ansible/pull/10044

When https://github.com/openshift/openshift-ansible/pull/10024 was submitted, I already did some testing:
- cluster on OpenStack + cloud provider enabled
- cluster on GCE + cloud provider enabled
- cluster on GCE + no cloud provider enabled
All are working well. Seems good to merge the backport.

PR to fix this in 3.10 - https://github.com/openshift/openshift-ansible/pull/10036

(In reply to Vadim Rutkovsky from comment #13)
> PR to fix this in 3.10 -
> https://github.com/openshift/openshift-ansible/pull/10036

We need to make sure to clone bugs when backporting. One bug per release we're making the change in.

PR 10024 has been merged into openshift-ansible-3.11.7-1, please check the bug.

Verified this bug with openshift-ansible-3.11.8-1.git.0.ee46b6b.el7_5.noarch, and it PASSED.

1. Cluster running on OSP without cloud provider enabled + a short hostname:
[root@qe-jialiu1-mrre-1 ~]# oc get node
NAME                STATUS    ROLES            AGE       VERSION
qe-jialiu1-mrre-1   Ready     compute,master   2h        v1.11.0+d4cacc0

[root@qe-jialiu1-mrre-1 ~]# oc get po -n kube-system
NAME                                   READY     STATUS    RESTARTS   AGE
master-api-qe-jialiu1-mrre-1           1/1       Running   0          1h
master-controllers-qe-jialiu1-mrre-1   1/1       Running   0          1h
master-etcd-qe-jialiu1-mrre-1          1/1       Running   0          1h

[root@qe-jialiu1-mrre-1 ~]# hostname -f
qe-jialiu1-mrre-1.int.0919-1l1.qe.rhcloud.com
[root@qe-jialiu1-mrre-1 ~]# hostname
qe-jialiu1-mrre-1

[root@qe-jialiu1-mrre-1 ~]# oc get po -n install-test
NAME                             READY     STATUS      RESTARTS   AGE
mongodb-1-92gqc                  1/1       Running     0          1h
nodejs-mongodb-example-1-build   0/1       Completed   0          1h
nodejs-mongodb-example-1-wxvmd   1/1       Running     0          1h

TASK [Gather Cluster facts] ****************************************************
Wednesday 19 September 2018  11:13:23 +0800 (0:00:00.055)       0:00:07.779 ***
changed: [host-8-250-60.host.centralci.eng.rdu2.redhat.com] => {"ansible_facts": {"openshift": {"common": {"all_hostnames": ["172.16.122.81", "host-8-250-60.host.centralci.eng.rdu2.redhat.com", "qe-jialiu1-mrre-1.int.0919-1l1.qe.rhcloud.com"], "config_base": "/etc/origin", "dns_domain": "cluster.local", "generate_no_proxy_hosts": true, "hostname": "qe-jialiu1-mrre-1.int.0919-1l1.qe.rhcloud.com", "internal_hostnames": ["172.16.122.81", "qe-jialiu1-mrre-1.int.0919-1l1.qe.rhcloud.com"], "ip": "172.16.122.81", "kube_svc_ip": "172.30.0.1", "portal_net": "172.30.0.0/16", "public_hostname": "host-8-250-60.host.centralci.eng.rdu2.redhat.com", "public_ip": "172.16.122.81", "raw_hostname": "qe-jialiu1-mrre-1"}, "current_config": {}}}, "changed": true}

2. Cluster running on OSP without cloud provider enabled + an FQDN hostname:

[root@qe-jialiu3-mrre-1 ~]# oc get node
NAME                                            STATUS    ROLES            AGE       VERSION
qe-jialiu3-mrre-1.int.0919-2k8.qe.rhcloud.com   Ready     compute,master   9m        v1.11.0+d4cacc0

[root@qe-jialiu3-mrre-1 ~]# oc get po -n kube-system
NAME                                                               READY     STATUS    RESTARTS   AGE
master-api-qe-jialiu3-mrre-1.int.0919-2k8.qe.rhcloud.com           1/1       Running   0          9m
master-controllers-qe-jialiu3-mrre-1.int.0919-2k8.qe.rhcloud.com   1/1       Running   0          9m
master-etcd-qe-jialiu3-mrre-1.int.0919-2k8.qe.rhcloud.com          1/1       Running   0          9m

[root@qe-jialiu3-mrre-1 ~]# hostname -f
qe-jialiu3-mrre-1.int.0919-2k8.qe.rhcloud.com
[root@qe-jialiu3-mrre-1 ~]# hostname
qe-jialiu3-mrre-1.int.0919-2k8.qe.rhcloud.com

[root@qe-jialiu3-mrre-1 ~]# oc get po -n install-test
NAME                             READY     STATUS      RESTARTS   AGE
mongodb-1-8ppdr                  1/1       Running     0          5m
nodejs-mongodb-example-1-build   0/1       Completed   0          5m
nodejs-mongodb-example-1-q5fgm   1/1       Running     0          4m

TASK [Gather Cluster facts] ****************************************************
Wednesday 19 September 2018  15:42:41 +0800 (0:00:00.074)       0:00:09.223 ***
changed: [host-8-252-157.host.centralci.eng.rdu2.redhat.com] => {"ansible_facts": {"openshift": {"common": {"all_hostnames": ["qe-jialiu3-mrre-1.int.0919-2k8.qe.rhcloud.com", "172.16.122.54", "host-8-252-157.host.centralci.eng.rdu2.redhat.com"], "config_base": "/etc/origin", "dns_domain": "cluster.local", "generate_no_proxy_hosts": true, "hostname": "qe-jialiu3-mrre-1.int.0919-2k8.qe.rhcloud.com", "internal_hostnames": ["qe-jialiu3-mrre-1.int.0919-2k8.qe.rhcloud.com", "172.16.122.54"], "ip": "172.16.122.54", "kube_svc_ip": "172.30.0.1", "portal_net": "172.30.0.0/16", "public_hostname": "host-8-252-157.host.centralci.eng.rdu2.redhat.com", "public_ip": "172.16.122.54", "raw_hostname": "qe-jialiu3-mrre-1.int.0919-2k8.qe.rhcloud.com"}, "current_config": {}}}, "changed": true}
3. Cluster running on OSP with cloud provider enabled + a short hostname:

[root@qe-jialiu2-mrre-1 ~]# oc get node
NAME                STATUS    ROLES            AGE       VERSION
qe-jialiu2-mrre-1   Ready     compute,master   1h        v1.11.0+d4cacc0

[root@qe-jialiu2-mrre-1 ~]# oc get po -n kube-system
NAME                                   READY     STATUS    RESTARTS   AGE
master-api-qe-jialiu2-mrre-1           1/1       Running   0          1h
master-controllers-qe-jialiu2-mrre-1   1/1       Running   0          1h
master-etcd-qe-jialiu2-mrre-1          1/1       Running   0          1h

[root@qe-jialiu2-mrre-1 ~]# hostname -f
qe-jialiu2-mrre-1.int.0919-l31.qe.rhcloud.com
[root@qe-jialiu2-mrre-1 ~]# hostname
qe-jialiu2-mrre-1

[root@qe-jialiu2-mrre-1 ~]# oc get po -n install-test
NAME                             READY     STATUS      RESTARTS   AGE
mongodb-1-l28zr                  1/1       Running     0          1h
nodejs-mongodb-example-1-build   0/1       Completed   0          1h
nodejs-mongodb-example-1-v5lps   1/1       Running     0          1h

TASK [Gather Cluster facts] ****************************************************
Wednesday 19 September 2018  11:14:32 +0800 (0:00:00.074)       0:00:09.220 ***
changed: [host-8-241-188.host.centralci.eng.rdu2.redhat.com] => {"ansible_facts": {"openshift": {"common": {"all_hostnames": ["10.8.241.188", "qe-jialiu2-mrre-1", "172.16.122.83", "host-8-241-188.host.centralci.eng.rdu2.redhat.com"], "cloudprovider": "openstack", "config_base": "/etc/origin", "dns_domain": "cluster.local", "generate_no_proxy_hosts": true, "hostname": "qe-jialiu2-mrre-1", "internal_hostnames": ["qe-jialiu2-mrre-1", "172.16.122.83"], "ip": "172.16.122.83", "kube_svc_ip": "172.30.0.1", "portal_net": "172.30.0.0/16", "public_hostname": "host-8-241-188.host.centralci.eng.rdu2.redhat.com", "public_ip": "10.8.241.188", "raw_hostname": "qe-jialiu2-mrre-1"}, "current_config": {}, "provider": {"metadata": {"availability_zone": "nova", "devices": [], "ec2_compat": {"ami-id": "ami-0000abce", "ami-launch-index": "0", "ami-manifest-path": "FIXME", "block-device-mapping": {"ami": "vda", "root": "/dev/vda"}, "hostname": "qe-jialiu2-mrre-1", "instance-action": "none", "instance-id": "i-009e5826", "instance-type": "m1.medium", "local-hostname": "qe-jialiu2-mrre-1", "local-ipv4": "172.16.122.83", "placement": {"availability-zone": "nova"}, "public-hostname": "qe-jialiu2-mrre-1", "public-ipv4": "10.8.241.188", "public-keys/": "0=libra", "reservation-id": "r-esll8q2z", "security-groups": "default"}, "hostname": "qe-jialiu2-mrre-1", "keys": [{"data": "ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDUq7W38xCZ9WGSWCvustaMGMT04tRohw6AKGzI7P7xql5lhCAReyt72n9qWQRZsE1YiCSQuTfXI1oc8NpSM7+lMLwj12G8z3I1YT31JHr9LLYg/XIcExkzfBI920CaS82VqmKOpI9+ARHSJBdIbKRI0f5Y+u4xbc5UzKCJX8jcKGG7nEiw8zm+cvAlfOgssMK+qJppIbVcb2iZNTsw5i2aX6FDMyC+b17DQHzBGpNbhZYxuoERZVRcnYctgIzuo6fD60gniX0fVvrchlOnubB1sRYbloP2r6UE22w/dpLKOFE5i7CA0ZzNBERZ94cIKumIH9MiJs1a6bMe89VOjjNV", "name": "libra", "type": "ssh"}], "launch_index": 0, "name": "qe-jialiu2-mrre-1", "project_id": "e0fa85b6a06443959d2d3b497174bed6", "uuid": "30cb431f-4cf3-43e8-9f1a-4947d51252ce"}, "name": "openstack", "network": {"hostname": "qe-jialiu2-mrre-1", "interfaces": [], "ip": "172.16.122.83", "ipv6_enabled": false, "public_hostname": "10.8.241.188", "public_ip": "10.8.241.188"}, "zone": "nova"}}}, "changed": true}
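(Editorial note: the check performed by eye in the three scenarios above, that every master-api/master-controllers/master-etcd static pod in kube-system is suffixed with a registered node name, can be scripted. A minimal sketch, assuming `oc` is on PATH and already logged in with permission to read nodes and the kube-system namespace:)

```python
import json
import subprocess


def control_plane_pods_match_node_names():
    """Return True if every master-* static pod name ends in a node name."""
    # Node names as shown by `oc get nodes`.
    nodes = json.loads(subprocess.check_output(["oc", "get", "nodes", "-o", "json"]))
    node_names = {item["metadata"]["name"] for item in nodes["items"]}

    # Static control-plane pods, e.g. master-api-qe-jialiu1-mrre-1.
    pods = json.loads(
        subprocess.check_output(["oc", "get", "pods", "-n", "kube-system", "-o", "json"])
    )
    ok = True
    for pod in pods["items"]:
        name = pod["metadata"]["name"]
        for prefix in ("master-api-", "master-controllers-", "master-etcd-"):
            if name.startswith(prefix) and name[len(prefix):] not in node_names:
                print(f"mismatch: {name} does not match any of {sorted(node_names)}")
                ok = False
    return ok


if __name__ == "__main__":
    raise SystemExit(0 if control_plane_pods_match_node_names() else 1)
```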
Created attachment 1485109 [details]: installation log with inventory file embedded for qeos10

Created attachment 1485110 [details]: installation log with inventory file embedded for snvl2
Closing bugs that were verified and targeted for GA but for some reason were not picked up by errata. This bug fix should be present in current 3.11 release content.