Bug 1952448 - Switch from Managed to Disabled mode: no IP removed from configuration and no container metal3-static-ip-manager stopped
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Bare Metal Hardware Provisioning
Version: 4.8
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.8.0
Assignee: sdasu
QA Contact: Aleksandra Malykhin
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-04-22 09:45 UTC by Oleg Sher
Modified: 2021-07-27 23:03 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-07-27 23:02:52 UTC
Target Upstream Version:
Embargoed:


Links
System                  ID                                             Private  Priority  Status  Summary                                                                         Last Updated
Github                  openshift cluster-baremetal-operator pull 140  0        None      open    Bug 1952448: Update condition for starting the metal3StaticIpManager container  2021-04-30 21:34:05 UTC
Github                  openshift cluster-baremetal-operator pull 146  0        None      open    Bug 1952448: Fix condition for starting the metal3StaticIpiManager container    2021-05-25 15:45:19 UTC
Red Hat Product Errata  RHSA-2021:2438                                 0        None      None    None                                                                            2021-07-27 23:03:13 UTC

Description Oleg Sher 2021-04-22 09:45:41 UTC
Version:

$ openshift-install version
12:25:44 workspace > oc get clusterversion
NAME      VERSION      AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.8.0-fc.0   True        False         42h     Cluster version is 4.8.0-fc.0

Platform:
libvirt

Please specify:
* IPI

What happened (steps to reproduce)?
1. Deploy a cluster in Managed mode, in a disconnected environment (in this case: 3 workers provisioned but 0 deployed)
2. Switch from Managed to Disabled mode:
2.0 Get the current configuration: oc get provisioning -o yaml >> original.yaml
2.1 Create set_disabled_mode.yaml from original.yaml with the following values:
    name: provisioning-configuration
    resourceVersion: "392077"
    uid: 373cb2fd-af6c-4cd1-ba07-c72a0258a5a7
  spec:
    provisioningNetwork: Disabled
    provisioningOSDownloadURL: http://registry.ocp-edge-cluster-0.qe.lab.redhat.com:8080/images/rhcos-48.83.202103221318-0-openstack.x86_64.qcow2.gz?sha256=10f55ea6f71d4dc382183597f9360aad6c6551fcc94aa100bbdadaecfe888452
Note the lines removed from the original spec:
    provisioningDHCPRange: fd00:1101:0:1::a,fd00:1101:0:1:ffff:ffff:ffff:fffe
    provisioningIP: fd00:1101:0:1::3
    provisioningInterface: enp4s0
    provisioningNetwork: Managed
    provisioningNetworkCIDR: fd00:1101:0:1::/64

2.3 oc apply -f set_disabled_mode.yaml
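
Steps 2.0 to 2.3 could also be scripted. The following is only a minimal sketch, not the procedure the reporter used: make_disabled_spec is a hypothetical helper name, and it assumes the saved YAML keeps each spec field on its own line, as in the excerpt above.

```shell
#!/bin/sh
# Hypothetical helper (not part of oc): derive a Disabled-mode manifest
# from a saved Provisioning CR.
# Usage: make_disabled_spec original.yaml set_disabled_mode.yaml
make_disabled_spec() {
    # Drop the four fields removed in the report, then force
    # provisioningNetwork to Disabled. Other lines pass through unchanged.
    sed -E \
        -e '/^[[:space:]]*provisioning(DHCPRange|IP|Interface|NetworkCIDR):/d' \
        -e 's/^([[:space:]]*provisioningNetwork:).*/\1 Disabled/' \
        "$1" > "$2"
}

# Assumed usage against a live cluster (commands taken from the steps above):
#   oc get provisioning -o yaml > original.yaml
#   make_disabled_spec original.yaml set_disabled_mode.yaml
#   oc apply -f set_disabled_mode.yaml
```

Anchoring the patterns at the start of the line keeps the sketch from touching similarly named keys that appear elsewhere in the saved object.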

What did you expect to happen?
1. The system should stop two containers: metal3-static-ip-manager and metal3-dnsmasq.
2. The provisioning network configuration should be removed.

Actual Result:

The metal3-static-ip-manager container was only restarted and is still running, and the provisioning network configuration was not removed.
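
One way to check the expected result is to list the container names in the metal3 pod and look for the two that should be gone. This is a sketch under assumptions: has_container is a hypothetical helper, and POD stands for the current metal3 pod name taken from oc get pods.

```shell
# Hypothetical helper: succeed if container NAME appears in a
# space-separated LIST of container names.
# Usage: has_container NAME "LIST"
has_container() {
    case " $2 " in
        *" $1 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Assumed usage; POD is the current metal3 pod name from `oc get pods`:
#   names=$(oc get pod "$POD" -n openshift-machine-api \
#             -o jsonpath='{.spec.containers[*].name}')
#   for c in metal3-static-ip-manager metal3-dnsmasq; do
#       has_container "$c" "$names" && echo "$c is still present"
#   done
```

Padding both the list and the name with spaces makes the match exact, so a name that is only a prefix of another container name does not count.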

Comment 1 sdasu 2021-05-03 19:15:34 UTC
When the Provisioning CR contains just the Provisioning Network and Provisioning OS Download URL, the metal3 pod ends up with just 8 containers, and that is the expected behavior.

Spec:
  Provisioning Network:          Disabled
  Provisioning OS Download URL:  http://192.168.111.1/images/rhcos-48.83.202103221318-0-openstack.x86_64.qcow2.gz?sha256=323e7ba4ba3448e340946543c963823136e1367ed0b229d2a05e1cf537642bb8

[stack@localhost dev-scripts]$ oc get pods -n openshift-machine-api
NAME                                           READY   STATUS    RESTARTS   AGE
cluster-autoscaler-operator-68ff977bd5-q5k6l   2/2     Running   0          60m
cluster-baremetal-operator-846d767c44-lph69    2/2     Running   0          60m
machine-api-controllers-c6fb94c57-8lnlp        7/7     Running   1          54m
machine-api-operator-868d49f997-llzhc          2/2     Running   0          60m
metal3-5f476b595b-tj872                        8/8     Running   0          3m34s
metal3-image-cache-5xzjw                       1/1     Running   0          52m
metal3-image-cache-hlx8s                       1/1     Running   0          52m
metal3-image-cache-kslhq                       1/1     Running   0          52m

However, when the Provisioning CR is edited only to change the Provisioning Network from Managed to Disabled (all other fields are left intact), 9 containers are active after the metal3 pod terminates and restarts. So the conditions under which this error is seen are not listed accurately in the description.
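
The container counts compared in this comment come from the READY column. A small sketch for pulling that value out, assuming the oc get pods column layout shown above; metal3_ready is a hypothetical name, and the filter skips the metal3-image-cache pods.

```shell
# Hypothetical helper: read `oc get pods` output on stdin and print the
# READY value (ready/total containers) of the metal3 pod.
metal3_ready() {
    awk '$1 ~ /^metal3-/ && $1 !~ /^metal3-image-cache-/ { print $2 }'
}

# Assumed usage:
#   oc get pods -n openshift-machine-api | metal3_ready
```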

Comment 6 Aleksandra Malykhin 2021-06-29 06:32:34 UTC
Verified on OCP version 4.8.0-rc.1.

1. Verify that the metal3 pod has 10/10 containers running
[kni@provisionhost-0-0 ~]$ oc get pods -n openshift-machine-api
...
metal3-64fdf54f4d-26tkn                       10/10   Running   0          50m


2. Save the config file
[kni@provisionhost-0-0 ~]$ oc get provisioning -o yaml > new_disabled_mode.yaml

3. Remove the provisioningDHCPRange, provisioningIP, provisioningInterface, and provisioningNetworkCIDR lines from the config file and change provisioningNetwork to Disabled.
 The spec should look like:
  spec:
    provisioningNetwork: Disabled
    provisioningOSDownloadURL: http://registry.ocp-edge-cluster-0.qe.lab.redhat.com:8080/images/rhcos-48.84.202106091622-0-openstack.x86_64.qcow2.gz?sha256=2efc7539f200ffea150272523a9526ba393a9a0b8312b40031b13bfdeda36fde

4. Apply the new config file
[kni@provisionhost-0-0 ~]$ oc apply -f set_disabled_mode.yaml
provisioning.metal3.io/provisioning-configuration configured

5. Check the pod status (the metal3 pod now has only 8/8 containers running)
[kni@provisionhost-0-0 ~]$ oc get pods -n openshift-machine-api
NAME                                          READY   STATUS    RESTARTS   AGE
...
metal3-76c6758645-5l5zc                       8/8     Running   0          81s

6. Verify the config file
[kni@provisionhost-0-0 ~]$ oc get provisioning -o yaml
...
  spec:
    provisioningNetwork: Disabled
    provisioningOSDownloadURL: http://registry.ocp-edge-cluster-0.qe.lab.redhat.com:8080/images/rhcos-48.84.202106091622-0-openstack.x86_64.qcow2.gz?sha256=2efc7539f200ffea150272523a9526ba393a9a0b8312b40031b13bfdeda36fde
...
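
The check in step 6 can also be automated. A sketch, assuming the CR is printed with one spec field per line as above; leftover_provisioning_fields is a hypothetical helper name.

```shell
# Hypothetical helper: read a Provisioning CR in YAML on stdin and print any
# of the removed provisioning-network fields that are still present.
# Exit status is 0 when none are left, 1 otherwise.
leftover_provisioning_fields() {
    if grep -E '^[[:space:]]*provisioning(DHCPRange|IP|Interface|NetworkCIDR):'; then
        return 1
    fi
    return 0
}

# Assumed usage:
#   oc get provisioning -o yaml | leftover_provisioning_fields && echo "clean"
```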

Comment 8 errata-xmlrpc 2021-07-27 23:02:52 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438

