Bug 1824426 - [osp] Machines stuck in Provisioned status and have no nodeRef after installation
Summary: [osp] Machines stuck in Provisioned status and have no nodeRef after installation
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Cloud Compute
Version: 4.5
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.5.0
Assignee: Mike Fedosin
QA Contact: David Sanz
URL:
Whiteboard:
Duplicates: 1824425
Depends On:
Blocks: 1839012
Reported: 2020-04-16 08:33 UTC by sunzhaohua
Modified: 2020-08-24 22:28 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-07-13 17:27:56 UTC
Target Upstream Version:
Embargoed:


Links
System                  ID                                                Private  Priority  Status  Summary                                                       Last Updated
Github                  openshift cluster-api-provider-openstack pull 88  0        None      closed  Bug 1824426: Allow to define primary ip address for machines  2021-02-16 11:49:02 UTC
Github                  openshift installer pull 3483                     0        None      closed  Bug 1824426: tag primary OpenStack networks                   2021-02-16 11:49:02 UTC
Red Hat Product Errata  RHBA-2020:2409                                    0        None      None    None                                                          2020-07-13 17:28:28 UTC

Description sunzhaohua 2020-04-16 08:33:36 UTC
Description of problem:
Workers were attached to an additional network by setting additionalNetworkIDs in install-config.yaml. After installation, the machines are stuck in Provisioned status and have no nodeRef.

Version-Release number of selected component (if applicable):
4.5.0-0.nightly-2020-04-15-223247

How reproducible:
Always

Steps to Reproduce:
1. Set up an environment with additional networks for the machines
2. Check the machines and nodes
3. Check the logs

Actual results:
Machines are stuck in Provisioned status and have no nodeRef after installation
$ oc get node
NAME                             STATUS   ROLES    AGE     VERSION
zhsun416osp-lhbg5-master-0       Ready    master   4h27m   v1.18.0-rc.1
zhsun416osp-lhbg5-master-1       Ready    master   4h27m   v1.18.0-rc.1
zhsun416osp-lhbg5-master-2       Ready    master   4h27m   v1.18.0-rc.1
zhsun416osp-lhbg5-worker-8242k   Ready    worker   4h12m   v1.18.0-rc.1
zhsun416osp-lhbg5-worker-9hc8x   Ready    worker   4h15m   v1.18.0-rc.1
zhsun416osp-lhbg5-worker-kb7ws   Ready    worker   4h12m   v1.18.0-rc.1
$ oc get machineset
NAME                       DESIRED   CURRENT   READY   AVAILABLE   AGE
zhsun416osp-lhbg5-worker   3         3                             4h28m
$ oc get machine
NAME                             PHASE         TYPE        REGION      ZONE   AGE
zhsun416osp-lhbg5-master-0       Provisioned   m1.xlarge   regionOne   nova   4h28m
zhsun416osp-lhbg5-master-1       Provisioned   m1.xlarge   regionOne   nova   4h28m
zhsun416osp-lhbg5-master-2       Running       m1.xlarge   regionOne   nova   4h28m
zhsun416osp-lhbg5-worker-8242k   Provisioned   m1.xlarge   regionOne   nova   4h19m
zhsun416osp-lhbg5-worker-9hc8x   Provisioned   m1.xlarge   regionOne   nova   4h19m
zhsun416osp-lhbg5-worker-kb7ws   Provisioned   m1.xlarge   regionOne   nova   4h19m
$ oc get machine -o yaml | grep "noAllowedAddressPairs: true"
          noAllowedAddressPairs: true
          noAllowedAddressPairs: true
          noAllowedAddressPairs: true
          noAllowedAddressPairs: true
          noAllowedAddressPairs: true
          noAllowedAddressPairs: true

$ oc get machine zhsun416osp-lhbg5-worker-kb7ws -o yaml
status:
  addresses:
  - address: 172.16.34.33
    type: InternalIP
  - address: zhsun416osp-lhbg5-worker-kb7ws
    type: Hostname
  - address: zhsun416osp-lhbg5-worker-kb7ws
    type: InternalDNS
  lastUpdated: "2020-04-16T03:00:12Z"
  phase: Provisioned

I0416 03:04:19.174776       1 machineservice.go:230] Cloud provider CA cert not provided, using system trust bundle
I0416 03:04:19.793463       1 controller.go:284] Reconciling machine "zhsun416osp-lhbg5-worker-9hc8x" triggers idempotent update
I0416 03:04:19.793868       1 controller.go:164] Reconciling Machine "zhsun416osp-lhbg5-worker-kb7ws"
I0416 03:04:19.793905       1 controller.go:376] Machine "zhsun416osp-lhbg5-worker-kb7ws" in namespace "openshift-machine-api" doesn't specify "cluster.k8s.io/cluster-name" label, assuming nil cluster
I0416 03:04:19.802110       1 machineservice.go:230] Cloud provider CA cert not provided, using system trust bundle
I0416 03:04:20.329441       1 controller.go:284] Reconciling machine "zhsun416osp-lhbg5-worker-kb7ws" triggers idempotent update
I0416 03:04:20.329700       1 controller.go:164] Reconciling Machine "zhsun416osp-lhbg5-master-0"
I0416 03:04:20.329708       1 controller.go:376] Machine "zhsun416osp-lhbg5-master-0" in namespace "openshift-machine-api" doesn't specify "cluster.k8s.io/cluster-name" label, assuming nil cluster
I0416 03:04:20.339747       1 machineservice.go:230] Cloud provider CA cert not provided, using system trust bundle
I0416 03:04:20.843770       1 controller.go:284] Reconciling machine "zhsun416osp-lhbg5-master-0" triggers idempotent update
I0416 03:04:20.843977       1 controller.go:164] Reconciling Machine "zhsun416osp-lhbg5-master-1"
I0416 03:04:20.843990       1 controller.go:376] Machine "zhsun416osp-lhbg5-master-1" in namespace "openshift-machine-api" doesn't specify "cluster.k8s.io/cluster-name" label, assuming nil cluster
I0416 03:04:20.850918       1 machineservice.go:230] Cloud provider CA cert not provided, using system trust bundle
I0416 03:04:21.688438       1 controller.go:284] Reconciling machine "zhsun416osp-lhbg5-master-1" triggers idempotent update
I0416 03:04:21.688711       1 controller.go:164] Reconciling Machine "zhsun416osp-lhbg5-master-2"
I0416 03:04:21.688725       1 controller.go:376] Machine "zhsun416osp-lhbg5-master-2" in namespace "openshift-machine-api" doesn't specify "cluster.k8s.io/cluster-name" label, assuming nil cluster
I0416 03:04:21.700064       1 machineservice.go:230] Cloud provider CA cert not provided, using system trust bundle
I0416 03:04:22.412165       1 controller.go:284] Reconciling machine "zhsun416osp-lhbg5-master-2" triggers idempotent update

Expected results:
Machines should be in Running status and have a nodeRef
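
For anyone who wants to check this condition without reading the full YAML, here is a minimal sketch in Python. It assumes the JSON comes from something like `oc get machine -n openshift-machine-api -o json` and that each Machine reports `status.phase` and, once linked to a node, `status.nodeRef`. The function names are hypothetical and not part of any OpenShift tooling.

```python
import json


def stuck_machines(machine_list):
    """Return names of Machines that are in phase Provisioned and have no nodeRef.

    `machine_list` is the parsed JSON of a Machine list (a dict with "items").
    """
    stuck = []
    for item in machine_list.get("items", []):
        status = item.get("status") or {}
        # A healthy Machine moves to Running and gets a nodeRef once its node registers.
        if status.get("phase") == "Provisioned" and not status.get("nodeRef"):
            stuck.append(item["metadata"]["name"])
    return stuck


def stuck_machines_from_json(text):
    """Same check, starting from the raw JSON text of the Machine list."""
    return stuck_machines(json.loads(text))
```

With the output shown under "Actual results", this check would be expected to list every machine except zhsun416osp-lhbg5-master-2, provided the Provisioned machines really lack a nodeRef as reported.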

Additional info:

Comment 1 Mike Fedosin 2020-04-20 20:05:35 UTC
Important: https://github.com/openshift/installer/pull/3483 is just a part of the fix. The second part will be in CAPO.

Comment 4 David Sanz 2020-05-12 11:25:10 UTC
Verified on 4.5.0-0.nightly-2020-05-12-083345

[morenod@morenod-laptop ~]$ oc get nodes
NAME                            STATUS   ROLES    AGE     VERSION
mrnd-6nics-klfsc-master-0       Ready    master   14m     v1.18.2
mrnd-6nics-klfsc-master-1       Ready    master   15m     v1.18.2
mrnd-6nics-klfsc-master-2       Ready    master   14m     v1.18.2
mrnd-6nics-klfsc-worker-6846c   Ready    worker   60s     v1.18.2
mrnd-6nics-klfsc-worker-hzpp2   Ready    worker   3m16s   v1.18.2
[morenod@morenod-laptop ~]$ oc get machines
NAME                            PHASE     TYPE           REGION      ZONE   AGE
mrnd-6nics-klfsc-master-0       Running   ci.m1.xlarge   regionOne   nova   16m
mrnd-6nics-klfsc-master-1       Running   ci.m1.xlarge   regionOne   nova   16m
mrnd-6nics-klfsc-master-2       Running   ci.m1.xlarge   regionOne   nova   16m
mrnd-6nics-klfsc-worker-6846c   Running   ci.m1.xlarge   regionOne   nova   10m
mrnd-6nics-klfsc-worker-hzpp2   Running   ci.m1.xlarge   regionOne   nova   10m

Comment 5 Mike Fedosin 2020-05-14 14:26:00 UTC
*** Bug 1824425 has been marked as a duplicate of this bug. ***

Comment 6 errata-xmlrpc 2020-07-13 17:27:56 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2409

