Description of problem:

Version-Release number of the following components:
rpm -q openshift-ansible
rpm -q ansible
ansible --version

How reproducible:
Always

Steps to Reproduce:
1. Set up a cluster with the 4.0.0-0.nightly-2019-03-19-004004 payload.
2. Log into a machine and check the RHCOS version.

[core@ip-10-0-136-62 ~]$ rpm-ostree status
State: idle
AutomaticUpdates: disabled
Deployments:
● pivot://registry.svc.ci.openshift.org/ocp/4.0-art-latest-2019-03-19-004004@sha256:65406dd82ead5a7cc6bd34f9c8e49b6212a7ab1db9cc9d33ba14613719e3771f
              CustomOrigin: Managed by pivot tool
                   Version: 410.8.20190315.0 (2019-03-15T13:32:33Z)

  pivot://docker-registry-default.cloud.registry.upshift.redhat.com/redhat-coreos/maipo@sha256:c09f455cc09673a1a13ae7b54cc4348cda0411e06dfa79ecd0130b35d62e8670
              CustomOrigin: Provisioned from oscontainer
                   Version: 400.7.20190306.0 (2019-03-06T22:16:26Z)

3. Change the default OSImageURL via a user-customized machineconfig.

# cat ~/master-os-update.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: master
  name: 0-master-os-01
spec:
  config:
    ignition:
      config: {}
      security:
        tls: {}
      timeouts: {}
      version: 2.2.0
    networkd: {}
    passwd: {}
    storage: {}
    systemd: {}
  osImageURL: "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:399582f711226ab1a0e76d8928ec55436dea9f8dc60976c10790d308b9d92181"

# oc create -f ~/master-os-update.yaml

# oc get machineconfig
NAME                                                       GENERATEDBYCONTROLLER      IGNITIONVERSION  CREATED
0-master-os-01                                                                        2.2.0            3m19s
00-master                                                  4.0.22-201903181722-dirty  2.2.0            44h
00-master-ssh                                              4.0.22-201903181722-dirty  2.2.0            44h
00-worker                                                  4.0.22-201903181722-dirty  2.2.0            44h
00-worker-ssh                                              4.0.22-201903181722-dirty  2.2.0            44h
01-master-container-runtime                                4.0.22-201903181722-dirty  2.2.0            44h
01-master-kubelet                                          4.0.22-201903181722-dirty  2.2.0            44h
01-worker-container-runtime                                4.0.22-201903181722-dirty  2.2.0            44h
01-worker-kubelet                                          4.0.22-201903181722-dirty  2.2.0            44h
99-master-4f75c9ab-4ae1-11e9-91fa-06b0504a45fe-registries  4.0.22-201903181722-dirty  2.2.0            44h
99-worker-4f76d90c-4ae1-11e9-91fa-06b0504a45fe-registries  4.0.22-201903181722-dirty  2.2.0            44h
master-419a0d921d5f348740605c2f198fe4d4                    4.0.22-201903181722-dirty  2.2.0            44h
master-af21f7284bb0dfd003ef17cbeabd95bc                    4.0.22-201903181722-dirty  2.2.0            3m14s
worker-7a222c854cc1d2ecc25d9cdcd80537c0                    4.0.22-201903181722-dirty  2.2.0            44h

4. After the new machineconfig is applied, log into the machine and check the RHCOS version again.

[core@ip-10-0-136-62 ~]$ rpm-ostree status
State: idle
AutomaticUpdates: disabled
Deployments:
● pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:399582f711226ab1a0e76d8928ec55436dea9f8dc60976c10790d308b9d92181
              CustomOrigin: Managed by pivot tool
                   Version: 47.330 (2019-02-23T04:17:13Z)

  pivot://registry.svc.ci.openshift.org/ocp/4.0-art-latest-2019-03-19-004004@sha256:65406dd82ead5a7cc6bd34f9c8e49b6212a7ab1db9cc9d33ba14613719e3771f
              CustomOrigin: Managed by pivot tool
                   Version: 410.8.20190315.0 (2019-03-15T13:32:33Z)

5. Trigger an upgrade to '4.0.0-0.nightly-2019-03-20-153904'; it succeeds.
6. Check the machineconfigs and the machine's RHCOS version again.

Actual results:
Two new machineconfigs (master-c3fcb0712f17ba7b1c94e3bd1e0d0443 and worker-7750b88f2147f3d8325b44403417a5df) are listed.

# oc get machineconfig
NAME                                                       GENERATEDBYCONTROLLER      IGNITIONVERSION  CREATED
0-master-os-01                                                                        2.2.0            165m
00-master                                                  4.0.22-201903191645-dirty  2.2.0            47h
00-master-ssh                                              4.0.22-201903191645-dirty  2.2.0            47h
00-worker                                                  4.0.22-201903191645-dirty  2.2.0            47h
00-worker-ssh                                              4.0.22-201903191645-dirty  2.2.0            47h
01-master-container-runtime                                4.0.22-201903191645-dirty  2.2.0            47h
01-master-kubelet                                          4.0.22-201903191645-dirty  2.2.0            47h
01-worker-container-runtime                                4.0.22-201903191645-dirty  2.2.0            47h
01-worker-kubelet                                          4.0.22-201903191645-dirty  2.2.0            47h
99-master-4f75c9ab-4ae1-11e9-91fa-06b0504a45fe-registries  4.0.22-201903191645-dirty  2.2.0            47h
99-worker-4f76d90c-4ae1-11e9-91fa-06b0504a45fe-registries  4.0.22-201903191645-dirty  2.2.0            47h
master-419a0d921d5f348740605c2f198fe4d4                    4.0.22-201903181722-dirty  2.2.0            47h
master-af21f7284bb0dfd003ef17cbeabd95bc                    4.0.22-201903181722-dirty  2.2.0            165m
master-c3fcb0712f17ba7b1c94e3bd1e0d0443                    4.0.22-201903191645-dirty  2.2.0            59m
worker-7750b88f2147f3d8325b44403417a5df                    4.0.22-201903191645-dirty  2.2.0            59m
worker-7a222c854cc1d2ecc25d9cdcd80537c0                    4.0.22-201903181722-dirty  2.2.0            47h

But the machine is still running the RHCOS version from the user-customized setting:

[core@ip-10-0-136-62 ~]$ rpm-ostree status
State: idle
AutomaticUpdates: disabled
Deployments:
● pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:399582f711226ab1a0e76d8928ec55436dea9f8dc60976c10790d308b9d92181
              CustomOrigin: Managed by pivot tool
                   Version: 47.330 (2019-02-23T04:17:13Z)

  pivot://registry.svc.ci.openshift.org/ocp/4.0-art-latest-2019-03-19-004004@sha256:65406dd82ead5a7cc6bd34f9c8e49b6212a7ab1db9cc9d33ba14613719e3771f
              CustomOrigin: Managed by pivot tool
                   Version: 410.8.20190315.0 (2019-03-15T13:32:33Z)

# oc get machineconfig master-c3fcb0712f17ba7b1c94e3bd1e0d0443 -o yaml|grep osImageURL
  osImageURL: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:399582f711226ab1a0e76d8928ec55436dea9f8dc60976c10790d308b9d92181

Expected results:
OSImageURL customization should be blocked or restricted for customers.

Additional info:
1. Allowing users to set OSImageURL via a customized machineconfig is really convenient for QE and dev when testing OS updates.
2. We do not expect customers to do that, but there seems to be no obvious warning to stop it, so a customer can still set it.
3. Once a customer sets it, the machine's RHCOS version is no longer controlled by CVO upgrades.
4. It seems that https://github.com/openshift/machine-config-operator/issues/465 discusses how to prevent this. If that prevention is implemented, will there still be another way for QE or dev to set OSImageURL for OS update testing?
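A user-created MachineConfig that carries osImageURL, as in step 3, could be spotted with a small script along these lines. This is only a sketch: it assumes the input has the shape of `oc get machineconfig -o json`, and it assumes that controller-generated MachineConfigs carry the annotation `machineconfiguration.openshift.io/generated-by-controller-version` (the function name and the trimmed sample data are hypothetical).

```python
import json
from typing import List, Tuple

# Assumption: the machine-config controller stamps this annotation on the
# MachineConfigs it generates; user-created ones lack it.
GENERATED_BY = "machineconfiguration.openshift.io/generated-by-controller-version"


def user_os_image_overrides(machineconfig_list_json: str) -> List[Tuple[str, str]]:
    """Return (name, osImageURL) for MachineConfigs that set osImageURL
    but have no generated-by-controller annotation."""
    doc = json.loads(machineconfig_list_json)
    overrides = []
    for item in doc.get("items", []):
        meta = item.get("metadata") or {}
        annotations = meta.get("annotations") or {}
        url = (item.get("spec") or {}).get("osImageURL", "")
        if url and GENERATED_BY not in annotations:
            overrides.append((meta.get("name", ""), url))
    return overrides


# Hypothetical, heavily trimmed sample modeled on the listing in this report.
CUSTOM_URL = ("quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:"
              "399582f711226ab1a0e76d8928ec55436dea9f8dc60976c10790d308b9d92181")
sample = json.dumps({
    "items": [
        {"metadata": {"name": "0-master-os-01"},
         "spec": {"osImageURL": CUSTOM_URL}},
        {"metadata": {"name": "master-af21f7284bb0dfd003ef17cbeabd95bc",
                      "annotations": {GENERATED_BY: "4.0.22-201903181722-dirty"}},
         "spec": {"osImageURL": CUSTOM_URL}},
        {"metadata": {"name": "01-master-kubelet",
                      "annotations": {GENERATED_BY: "4.0.22-201903181722-dirty"}},
         "spec": {}},
    ]
})

for name, url in user_os_image_overrides(sample):
    print(f"{name} overrides osImageURL: {url}")
```

The rendered master-* config is deliberately not reported, because it only inherits the value; the user-supplied object is the one that pins the OS image.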
This is the PR to fix this: https://github.com/openshift/machine-config-operator/pull/475

I do not feel QE or anyone else should test os upgrades through osImageURL. The expected way to do this is always through the payload. If we start testing in another, not supported way, what's the point of the test? The upgrade testing should be exercised through only machine-os-content in the payload.

The PR I linked should take care of this BZ by dropping the ability to use osImageURL for testing as well. To reiterate, QE should test os upgrades through a payload which overrides machine-os-content, not by creating a machine-config with osImageURL.
(In reply to Antonio Murdaca from comment #1)
> This is the PR to fix this
> https://github.com/openshift/machine-config-operator/pull/475
>
> I do not feel QE or anyone else should test os upgrades through osImageURL.
> The expected way to do this is always through the payload. If we start
> testing in another, not supported way, what's the point of the test? The
> upgrade testing should be exercised through only machine-os-content in the
> payload.
>
> The PR I linked should take care of this BZ by dropping the ability to use
> osImageURL for testing as well - to reiterate, QE should test os upgrades
> through a payload which overrides machine-os-content, not by creating a
> machine-config with osImageURL

Sometimes QE is asked to do exploratory testing against an RHCOS version that may not be included in any payload yet. In those cases we customized osImageURL via a machineconfig to perform the OS upgrade manually. I totally agree with dropping the ability to use osImageURL; that would leave a single entry point for OS upgrades for every audience, whether customer, dev, or QE.

One more question: once the PR has landed, could QE follow [1] to override machine-os-content in the payload for an OS upgrade (not for a whole-cluster upgrade)?

[1]: https://github.com/openshift/cluster-version-operator/blob/master/docs/dev/clusterversion.md#setting-objects-unmanaged
(In reply to Johnny Liu from comment #2)
> (In reply to Antonio Murdaca from comment #1)
> > This is the PR to fix this
> > https://github.com/openshift/machine-config-operator/pull/475
> >
> > I do not feel QE or anyone else should test os upgrades through osImageURL.
> > The expected way to do this is always through the payload. If we start
> > testing in another, not supported way, what's the point of the test? The
> > upgrade testing should be exercised through only machine-os-content in the
> > payload.
> >
> > The PR I linked should take care of this BZ by dropping the ability to use
> > osImageURL for testing as well - to reiterate, QE should test os upgrades
> > through a payload which overrides machine-os-content, not by creating a
> > machine-config with osImageURL
>
> Sometimes QE was requested to do some exploration testing against RHCOS
> version, maybe the version was not included in any payload yet. We
> customized osImageURL via machineconfig to achieve the os upgrade manually.
> I totally agree to drop the ability to use osImageURL, that would keep only
> one entry for os upgrade for all audience, whatever customer, dev or QE.
>
> One more question, once the PR is landed, QE could follow [1] to override
> machine-os-content in the payload for os upgrade (not for the whole cluster
> upgrade).
>
> [1]:
> https://github.com/openshift/cluster-version-operator/blob/master/docs/dev/clusterversion.md#setting-objects-unmanaged

Once that PR merges, I believe the correct way to test machine-os-content is to build a payload overriding just machine-os-content. You can follow this guide: https://github.com/openshift/machine-config-operator/blob/master/docs/HACKING.md#build-a-custom-release-payload and just override "machine-os-content". Ping me if issues arise following that.
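As a rough illustration of "build a payload overriding just machine-os-content", the helper below only assembles an `oc adm release new` command line and prints it; nothing is executed. It is a sketch from memory of the `--from-release` and `--to-image` options plus a `NAME=IMAGE` override argument, so treat the linked HACKING.md as authoritative. All image references are hypothetical placeholders.

```python
import shlex
from typing import List


def release_new_command(from_release: str, to_image: str,
                        os_content_image: str) -> List[str]:
    """Build (but do not run) a command that creates a release payload
    identical to from_release except for machine-os-content."""
    return [
        "oc", "adm", "release", "new",
        f"--from-release={from_release}",
        f"--to-image={to_image}",
        # Only this one component is overridden; every other image in the
        # payload is taken from the base release.
        f"machine-os-content={os_content_image}",
    ]


# Hypothetical placeholders, not real images.
cmd = release_new_command(
    from_release="registry.example.com/ocp/release:base",
    to_image="registry.example.com/qe/release:custom-os",
    os_content_image="registry.example.com/qe/machine-os-content@sha256:<digest>",
)
print(" ".join(shlex.quote(arg) for arg in cmd))
```

Upgrading the cluster to the resulting image then exercises the OS update through the payload rather than through a hand-written MachineConfig.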
PR has been merged, moving to MODIFIED for QE
The PR has not landed in an OCP nightly build yet.

# oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.0.0-0.nightly-2019-03-25-180911   True        False         174m    Cluster version is 4.0.0-0.nightly-2019-03-25-180911

# oc adm release info --commits|grep machine-config-operator
  machine-config-controller  https://github.com/openshift/machine-config-operator  72a74aa98c29ee3a71460dc8795dd7181ce20ab0
  machine-config-daemon      https://github.com/openshift/machine-config-operator  72a74aa98c29ee3a71460dc8795dd7181ce20ab0
  machine-config-operator    https://github.com/openshift/machine-config-operator  72a74aa98c29ee3a71460dc8795dd7181ce20ab0
  machine-config-server      https://github.com/openshift/machine-config-operator  72a74aa98c29ee3a71460dc8795dd7181ce20ab0
  setup-etcd-environment     https://github.com/openshift/machine-config-operator  72a74aa98c29ee3a71460dc8795dd7181ce20ab0

[jialiu@dhcp-141-223 machine-config-operator]$ git log --first-parent --format='%ad %h %d %s' --date=iso 72a74aa98c29ee3a71460dc8795dd7181ce20ab0^..origin/master | cat
2019-03-25 17:37:43 -0700 dc9b354  (HEAD -> master, origin/release-4.0, origin/master, origin/HEAD) Merge pull request #573 from runcom/add-retrying
2019-03-25 15:35:06 -0700 c83a2df  Merge pull request #575 from cgwalters/pool-subsumes
2019-03-25 14:04:07 -0700 4b62b08  Merge pull request #574 from rphillips/fixes/add_feature_gates_permissions
2019-03-25 11:20:56 -0700 7add825  Merge pull request #490 from runcom/crc-race
2019-03-25 09:41:57 -0700 7952b20  Merge pull request #572 from runcom/get-cc-directly
2019-03-25 05:23:18 -0700 31f4139  Merge pull request #475 from runcom/no-override-osimageurl
2019-03-22 19:18:38 -0700 72a74aa  Merge pull request #553 from rphillips/feat/kubelet_config_features_fixed
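The check done by eye above (is the merge of PR #475 at or below the commit the payload was built from?) can be sketched as a small parser. It assumes the log was produced with the same `--first-parent --format='%ad %h %d %s' --date=iso` options, so the abbreviated hash is the fourth whitespace-separated field and entries are newest first; the function name is hypothetical.

```python
import re
from typing import Optional


def pr_in_payload(first_parent_log: str, payload_commit: str,
                  pr_number: int) -> Optional[bool]:
    """True if the PR merge is the payload commit or older, False if it is
    newer, None if either the PR or the commit is missing from the log."""
    pr_idx = commit_idx = None
    for idx, line in enumerate(first_parent_log.strip().splitlines()):
        fields = line.split()
        # fields: date, time, timezone, abbreviated hash, decoration/subject...
        if len(fields) >= 4 and payload_commit.startswith(fields[3]):
            commit_idx = idx
        if re.search(rf"Merge pull request #{pr_number}\b", line):
            pr_idx = idx
    if pr_idx is None or commit_idx is None:
        return None
    # The log is newest first, so a larger index means an older commit.
    return pr_idx >= commit_idx


log = """\
2019-03-25 17:37:43 -0700 dc9b354  (HEAD -> master, origin/master) Merge pull request #573 from runcom/add-retrying
2019-03-25 05:23:18 -0700 31f4139  Merge pull request #475 from runcom/no-override-osimageurl
2019-03-22 19:18:38 -0700 72a74aa  Merge pull request #553 from rphillips/feat/kubelet_config_features_fixed
"""
payload = "72a74aa98c29ee3a71460dc8795dd7181ce20ab0"
print(pr_in_payload(log, payload, 475))
```

A `None` result only means the log range did not cover the PR or the commit; a PR merged before the start of the range would need a wider `git log` range to be judged.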
Hi Siva, could this be verified now? If the PR is still not in the latest payload, please move the bug to MODIFIED status. Thanks!
The osImageURL can no longer be set by a user through a machineconfig. Even after the osImageURL is changed via the configmap, upgrading the cluster with a payload that overrides machine-os-content still updates the OS image version. Hence moving this to VERIFIED.

Versions used: upgrade from 4.0.0-0.9 to 4.0.0-0.10

machine-os-content used:
Version: 410.8.20190322.0 (quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:41973eb774db51c505f91d9a9428de4a578ffe5b8d9a7a48333300862f11af7f)
Version: 410.8.20190329.0 (quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:d762ceee9f46a141f54cc4dc9689fa19048c1df9b26aae5d8016d6d44995a08d)

Steps to reproduce:
1. Install a cluster.
2. Try to update the machine-os-content by creating a machine config.
3. Note that no new rendered machineconfig is created and the update is not picked up.
4. Disable the CVO, then update the config map for the OS image URL in the openshift-machine-config-operator namespace.
5. Note that the change is picked up and the machine-os-content is upgraded on all machines.
6. Re-enable the CVO and upgrade the cluster to a newer payload through the CVO.
7. Note that the machines are upgraded to the machine-os-content in the payload.
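For steps 5 and 7, confirming what a machine actually booted can be automated by reading the `rpm-ostree status` text. The sketch below assumes the plain-text layout shown earlier in this report (the booted deployment is marked with `●` and followed by indented `Key: value` lines); the function name is hypothetical, and a real check would run once per node.

```python
from typing import Dict, Optional


def booted_deployment(status_text: str) -> Optional[Dict[str, str]]:
    """Return the image reference and version of the booted deployment,
    or None if no line is marked as booted."""
    lines = status_text.splitlines()
    for i, line in enumerate(lines):
        stripped = line.strip()
        if not stripped.startswith("●"):
            continue
        info = {"image": stripped.lstrip("●").strip()}
        for follow in lines[i + 1:]:
            entry = follow.strip()
            if ": " not in entry:
                # Blank line or the next deployment: stop scanning.
                break
            key, value = entry.split(": ", 1)
            if key == "Version":
                parts = value.split()
                if parts:
                    info["version"] = parts[0]
        return info
    return None


status = """\
State: idle
AutomaticUpdates: disabled
Deployments:
● pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:399582f711226ab1a0e76d8928ec55436dea9f8dc60976c10790d308b9d92181
              CustomOrigin: Managed by pivot tool
                   Version: 47.330 (2019-02-23T04:17:13Z)

  pivot://registry.svc.ci.openshift.org/ocp/4.0-art-latest-2019-03-19-004004@sha256:65406dd82ead5a7cc6bd34f9c8e49b6212a7ab1db9cc9d33ba14613719e3771f
              CustomOrigin: Managed by pivot tool
                   Version: 410.8.20190315.0 (2019-03-15T13:32:33Z)
"""
print(booted_deployment(status))
```

Comparing the returned image digest with the machine-os-content digest of the target payload shows whether the payload, rather than a leftover override, decided the OS version.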
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:0758