Created attachment 1614128 [details]
install logs

Description of problem:
OVN install fails on bare metal. There is not much information beyond the install logs. Bootstrapping does not complete. The API is up but refusing connections:

$ oc login -u kubeadmin -p password
error: dial tcp x.x.x.x:6443: connect: connection refused - verify you have provided the correct host and port and that the server is currently running.

I couldn't find a way to get inside the cluster; it fails in the early stages.

Version-Release number of selected component (if applicable): 4.2.0-0.nightly-2019-09-11-074500

How reproducible: Always

Steps to Reproduce:
1. Install a bare metal cluster with networkType OVNKubernetes
2.
3.

Actual results: Installation fails

Expected results: Successful installation

Additional info:
> I couldn't find a way to get inside the cluster; it fails in the early stages.

It's bare metal... just make sure the machine is set up to let you ssh in, then run "journalctl -u bootkube".
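For reference, a minimal sketch of how to collect those logs; the "core" user and bootstrap hostname below are placeholders for a typical UPI bare metal install, not values from this environment:

$ ssh core@<bootstrap-host>
$ journalctl -b -u bootkube.service > bootkube.log
$ journalctl -b -u kubelet.service > kubelet.log   # kubelet output can also be useful at this stage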
I will keep looking at it. Currently our automation jobs do not expose the actual bare metal hostnames/IPs, only the api.x.x.x load balancer hostname. Gathering more info on that part.
Created attachment 1614565 [details]
bootkube logs
Some more info along with the attachment:

$ oc get pods -n openshift-ovn-kubernetes
NAME                             READY   STATUS    RESTARTS   AGE
ovnkube-master-76c57ddbd-jnzqx   4/4     Running   0          39m
ovnkube-node-9zfnx               2/3     Running   8          39m
ovnkube-node-bwmgc               3/3     Running   0          39m
ovnkube-node-t66pt               2/3     Running   8          39m

$ oc get co
NAME                                       VERSION                             AVAILABLE   PROGRESSING   DEGRADED   SINCE
cloud-credential                           4.2.0-0.nightly-2019-09-11-074500   True        False         False      39m
dns                                        4.2.0-0.nightly-2019-09-11-074500   True        True          True       38m
insights                                   4.2.0-0.nightly-2019-09-11-074500   True        False         False      39m
kube-apiserver                                                                 False       True          True       39m
kube-controller-manager                                                        False       True          True       39m
kube-scheduler                             4.2.0-0.nightly-2019-09-11-074500   False       True          True       39m
machine-api                                4.2.0-0.nightly-2019-09-11-074500   True        False         False      39m
machine-config                             4.2.0-0.nightly-2019-09-11-074500   False       True          True       39m
network                                                                        False       True          False      40m
openshift-apiserver                        4.2.0-0.nightly-2019-09-11-074500   Unknown     Unknown       True       39m
openshift-controller-manager               4.2.0-0.nightly-2019-09-11-074500   False       False         False      33m
operator-lifecycle-manager                 4.2.0-0.nightly-2019-09-11-074500   True        False         False      38m
operator-lifecycle-manager-catalog         4.2.0-0.nightly-2019-09-11-074500   True        False         False      38m
operator-lifecycle-manager-packageserver                                       False       True          False      38m
service-ca                                 4.2.0-0.nightly-2019-09-11-074500   True        False         False      39m
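Since two of the ovnkube-node pods keep restarting (2/3 Ready, 8 restarts), the next step is probably to pull their events and container logs; a rough sketch using one of the pod names from the output above:

$ oc -n openshift-ovn-kubernetes describe pod ovnkube-node-9zfnx
$ oc -n openshift-ovn-kubernetes logs ovnkube-node-9zfnx --all-containers=true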
Anurag, were you able to get a must-gather? It will be tough to diagnose without the logs.
(In reply to Casey Callendrello from comment #5)
> Anurag, were you able to get a must-gather? It will be tough to diagnose without the logs.

Hi Casey, I attached the "journalctl -u bootkube" output as requested by Dan. Let me see again whether a must-gather is obtainable.
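For completeness, the generic must-gather invocation (nothing specific to this cluster assumed) is:

$ oc adm must-gather --dest-dir=./must-gather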
I believe this is fixed with the 4.3 work.
Created attachment 1637836 [details]
bootkube logs 11/19
Please provide the RHCOS version used. Also, the output of "oc get clusterversion" and "oc -n openshift-ovn-kubernetes get kube-apiserver -oyaml". I'm working on https://bugzilla.redhat.com/show_bug.cgi?id=1750606, and I think they may be the same issue.
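Something along these lines should capture that information (generic commands, nothing environment-specific assumed):

$ oc get clusterversion
$ oc get nodes -o wide   # the OS-IMAGE and KERNEL-VERSION columns show the RHCOS build on each node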
This is still failing on recent 4.3 builds with the same symptoms mentioned in comment 14.
Just to add a note about deploying OpenShift with OVN on a bare metal environment: we have a bare metal CI job which runs a UPI deployment of the latest 4.4 OpenShift, and it passes with version 4.4.0-0.ci-2020-01-15-133915:
https://openshift-qe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/OCP-UPI-Install-4.3/10/console

A periodic run of the 4.3 deployment also passed, with version 4.3.0-0.nightly-2020-01-07-212456:
https://openshift-qe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/OVN-UPI-Install-4.3/19/console

@Anurag, would you please try with the latest 4.3 or 4.4 and see if it gives a different result?
Syncing with Phil on forum-sdn. FYI: I manually approved the pending CSRs but didn't see any progress on the cluster after that. Attaching bootkube logs here as well.
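For the record, the approval was along these lines (a generic sketch, not the exact command that was run):

$ oc get csr
$ oc get csr -o name | xargs oc adm certificate approve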
Created attachment 1654914 [details]
bootkube_logs Jan 23
Continuing to track this in https://bugzilla.redhat.com/show_bug.cgi?id=1794775. Closing this one.

*** This bug has been marked as a duplicate of bug 1794775 ***