Description of problem:

We are testing upgrades from 4.9.41 -> 4.10.21 -> 4.11.0-rc.0 -> 4.11.0-rc.1. The failure is seen during the upgrade from 4.11.0-rc.0 to 4.11.0-rc.1; the 4.9 -> 4.10 -> 4.11.0-rc.0 upgrades had no issues.

Must-gather logs: https://drive.google.com/file/d/1vLMStKyA6z1yyRYwOV9eUaPU53rVo8Pw/view?usp=sharing

The error we are seeing in the mcp is:

    message: 'Node master-0 is reporting: "unexpected on-disk state validating against rendered-master-c75d108f5b1bd9301a67a1fbd8be19cc: expected target osImageURL \"quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:2e657044831a9bd296b20d37b693698adb1d7eb2d3cd9090db7724779ecbf608\", have \"quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:74b8a768d5b354263cab962106cc7faa391014acfd4897356de625667ebac5ba\""'
    reason: 1 nodes are reporting degraded status on sync

    message: 'Node worker-1 is reporting: "unexpected on-disk state validating against rendered-worker-44da19ca71d1d08ed7aa28e94b7421e1: expected target osImageURL \"quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:2e657044831a9bd296b20d37b693698adb1d7eb2d3cd9090db7724779ecbf608\", have \"quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:74b8a768d5b354263cab962106cc7faa391014acfd4897356de625667ebac5ba\""'
    reason: 1 nodes are reporting degraded status on sync
    status: "True"
    type: NodeDegraded

The `rpm-ostree status` command shows "No space left on device". The node disk is 120GB, the same size we have always used, and no issues have been seen previously.
sh-4.4# rpm-ostree status
State: idle
Warning: failed to finalize previous deployment
         error: Installing kernel: regfile copy: No space left on device
         check `journalctl -b -1 -u ostree-finalize-staged.service`
Deployments:
* pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:74b8a768d5b354263cab962106cc7faa391014acfd4897356de625667ebac5ba
              CustomOrigin: Managed by machine-config-operator
                   Version: 411.86.202206280018-0 (2022-06-28T00:25:24Z)

  pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:737dda6647e7925bbe4fb5e4b8cddebedcf83c8bc233fee38bb4c5cdd97a1cd5
              CustomOrigin: Managed by machine-config-operator
                   Version: 410.84.202206240419-0 (2022-06-24T04:25:30Z)

Version-Release number of selected component (if applicable):
# oc version
Client Version: 4.9.41
Server Version: 4.11.0-rc.1
Kubernetes Version: v1.24.0+2dd8bb1

Steps to Reproduce:
1. Perform upgrades from 4.9.41 -> 4.10.21 -> 4.11.0-rc.0 -> 4.11.0-rc.1.
   4.9.41 -> 4.10.21 -> 4.11.0-rc.0 was successful, but the failure is seen when upgrading to 4.11.0-rc.1.

Actual results:
ClusterVersion: Updating to "4.11.0-rc.1" from "4.11.0-rc.0" for 11 hours: Unable to apply 4.11.0-rc.1: an unknown error has occurred: MultipleErrors

Expected results:
The upgrade should be successful.

Additional info:
# oc get clusterversion
NAME      VERSION       AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.11.0-rc.0   True        True          10h     Unable to apply 4.11.0-rc.1: an unknown error has occurred: MultipleErrors

# oc get nodes
NAME       STATUS                     ROLES    AGE    VERSION
master-0   Ready,SchedulingDisabled   master   2d8h   v1.24.0+9ddc8b1
master-1   Ready                      master   2d8h   v1.24.0+9ddc8b1
master-2   Ready                      master   2d8h   v1.24.0+9ddc8b1
worker-0   Ready                      worker   2d8h   v1.24.0+9ddc8b1
worker-1   Ready,SchedulingDisabled   worker   2d8h   v1.24.0+9ddc8b1

# oc get co
NAME                                       VERSION       AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
authentication                             4.11.0-rc.1   True        False         True       130m    APIServerDeploymentDegraded: 1 of 3 requested instances are unavailable for apiserver.openshift-oauth-apiserver ()...
baremetal                                  4.11.0-rc.1   True        False         False      2d8h
cloud-controller-manager                   4.11.0-rc.1   True        False         False      2d8h
cloud-credential                           4.11.0-rc.1   True        False         False      2d8h
cluster-autoscaler                         4.11.0-rc.1   True        False         False      2d8h
config-operator                            4.11.0-rc.1   True        False         False      2d8h
console                                    4.11.0-rc.1   True        False         False      9h
csi-snapshot-controller                    4.11.0-rc.1   True        False         False      2d2h
dns                                        4.11.0-rc.1   True        False         False      2d8h
etcd                                       4.11.0-rc.1   True        False         False      2d8h
image-registry                             4.11.0-rc.1   True        False         False      10h
ingress                                    4.11.0-rc.1   True        False         True       29h     The "default" ingress controller reports Degraded=True: DegradedConditions: One or more other status conditions indicate a degraded state: PodsScheduled=False (PodsNotScheduled: Some pods are not scheduled: Pod "router-default-56d77b4f4d-b9mvv" cannot be scheduled: 0/5 nodes are available: 1 node(s) didn't have free ports for the requested pod ports, 1 node(s) had untolerated taint {node.kubernetes.io/unschedulable: }, 2 node(s) were unschedulable, 3 node(s) didn't match Pod's node affinity/selector, 3 node(s) had untolerated taint {node-role.kubernetes.io/master: }. preemption: 0/5 nodes are available: 1 node(s) didn't have free ports for the requested pod ports, 4 Preemption is not helpful for scheduling. Make sure you have sufficient worker nodes.), DeploymentReplicasAllAvailable=False (DeploymentReplicasNotAvailable: 1/2 of replicas are available)
insights                                   4.11.0-rc.1   True        False         False      2d8h
kube-apiserver                             4.11.0-rc.1   True        False         False      2d8h
kube-controller-manager                    4.11.0-rc.1   True        False         False      2d8h
kube-scheduler                             4.11.0-rc.1   True        False         False      2d8h
kube-storage-version-migrator              4.11.0-rc.1   True        False         False      10h
machine-api                                4.11.0-rc.1   True        False         False      2d8h
machine-approver                           4.11.0-rc.1   True        False         False      2d8h
machine-config                             4.11.0-rc.0   True        True          True       9h      Unable to apply 4.11.0-rc.1: error during syncRequiredMachineConfigPools: [timed out waiting for the condition, error pool worker is not ready, retrying.
Status: (pool degraded: true total: 2, ready 0, updated: 0, unavailable: 1)]
marketplace                                4.11.0-rc.1   True        False         False      2d8h
monitoring                                 4.11.0-rc.1   False       True          True       9h      Rollout of the monitoring stack failed and is degraded. Please investigate the degraded status error.
network                                    4.11.0-rc.1   True        False         False      2d8h
node-tuning                                4.11.0-rc.1   True        False         False      10h
openshift-apiserver                        4.11.0-rc.1   True        False         True       2d8h    APIServerDeploymentDegraded: 1 of 3 requested instances are unavailable for apiserver.openshift-apiserver ()
openshift-controller-manager               4.11.0-rc.1   True        False         False      2d8h
openshift-samples                          4.11.0-rc.1   True        False         False      10h
operator-lifecycle-manager                 4.11.0-rc.1   True        False         False      2d8h
operator-lifecycle-manager-catalog         4.11.0-rc.1   True        False         False      2d8h
operator-lifecycle-manager-packageserver   4.11.0-rc.1   True        False         False      30h
service-ca                                 4.11.0-rc.1   True        False         False      2d8h
storage                                    4.11.0-rc.1   True        False         False

# oc get mcp
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
master   rendered-master-679e46b83af12a9f90a61c64f086cbc6   False     True       True       3              0                   0                     1                      2d8h
worker   rendered-worker-ce0f663b8f5c37a8bde873ecbb8311e8   False     True       True       2              0                   0                     1                      2d8h

# oc get mcp master -oyaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfigPool
metadata:
  creationTimestamp: "2022-07-04T08:45:58Z"
  generation: 7
  labels:
    machineconfiguration.openshift.io/mco-built-in: ""
    operator.machineconfiguration.openshift.io/required-for-upgrade: ""
    pools.operator.machineconfiguration.openshift.io/master: ""
  name: master
  resourceVersion: "1360931"
  uid: 7ffe8daa-08de-4abb-b38d-55db9a5adee8
spec:
  configuration:
    name: rendered-master-c75d108f5b1bd9301a67a1fbd8be19cc
    source:
    - apiVersion: machineconfiguration.openshift.io/v1
      kind: MachineConfig
      name: 00-master
    - apiVersion: machineconfiguration.openshift.io/v1
      kind: MachineConfig
      name: 01-master-container-runtime
    - apiVersion: machineconfiguration.openshift.io/v1
      kind: MachineConfig
      name: 01-master-kubelet
    - apiVersion: machineconfiguration.openshift.io/v1
      kind: MachineConfig
      name: 99-master-chrony-configuration
    - apiVersion: machineconfiguration.openshift.io/v1
      kind: MachineConfig
      name: 99-master-generated-crio-seccomp-use-default
    - apiVersion: machineconfiguration.openshift.io/v1
      kind: MachineConfig
      name: 99-master-generated-registries
    - apiVersion: machineconfiguration.openshift.io/v1
      kind: MachineConfig
      name: 99-master-ssh
  machineConfigSelector:
    matchLabels:
      machineconfiguration.openshift.io/role: master
  nodeSelector:
    matchLabels:
      node-role.kubernetes.io/master: ""
  paused: false
status:
  conditions:
  - lastTransitionTime: "2022-07-04T08:47:31Z"
    message: ""
    reason: ""
    status: "False"
    type: RenderDegraded
  - lastTransitionTime: "2022-07-06T07:09:12Z"
    message: ""
    reason: ""
    status: "False"
    type: Updated
  - lastTransitionTime: "2022-07-06T07:09:12Z"
    message: All nodes are updating to rendered-master-c75d108f5b1bd9301a67a1fbd8be19cc
    reason: ""
    status: "True"
    type: Updating
  - lastTransitionTime: "2022-07-06T07:16:11Z"
    message: 'Node master-0 is reporting: "unexpected on-disk state validating against rendered-master-c75d108f5b1bd9301a67a1fbd8be19cc: expected target osImageURL \"quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:2e657044831a9bd296b20d37b693698adb1d7eb2d3cd9090db7724779ecbf608\", have \"quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:74b8a768d5b354263cab962106cc7faa391014acfd4897356de625667ebac5ba\""'
    reason: 1 nodes are reporting degraded status on sync
    status: "True"
    type: NodeDegraded
  - lastTransitionTime: "2022-07-06T07:16:11Z"
    message: ""
    reason: ""
    status: "True"
    type: Degraded
  configuration:
    name: rendered-master-679e46b83af12a9f90a61c64f086cbc6
    source:
    - apiVersion: machineconfiguration.openshift.io/v1
      kind: MachineConfig
      name: 00-master
    - apiVersion: machineconfiguration.openshift.io/v1
      kind: MachineConfig
      name: 01-master-container-runtime
    - apiVersion: machineconfiguration.openshift.io/v1
      kind: MachineConfig
      name: 01-master-kubelet
    - apiVersion: machineconfiguration.openshift.io/v1
      kind: MachineConfig
      name: 99-master-chrony-configuration
    - apiVersion: machineconfiguration.openshift.io/v1
      kind: MachineConfig
      name: 99-master-generated-crio-seccomp-use-default
    - apiVersion: machineconfiguration.openshift.io/v1
      kind: MachineConfig
      name: 99-master-generated-registries
    - apiVersion: machineconfiguration.openshift.io/v1
      kind: MachineConfig
      name: 99-master-ssh
  degradedMachineCount: 1
  machineCount: 3
  observedGeneration: 7
  readyMachineCount: 0
  unavailableMachineCount: 1
  updatedMachineCount: 0

On the master node which shows SchedulingDisabled, rpm-ostree shows "No space left on device":

sh-4.4# rpm-ostree status
State: idle
Warning: failed to finalize previous deployment
         error: Installing kernel: regfile copy: No space left on device
         check `journalctl -b -1 -u ostree-finalize-staged.service`
Deployments:
* pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:74b8a768d5b354263cab962106cc7faa391014acfd4897356de625667ebac5ba
              CustomOrigin: Managed by machine-config-operator
                   Version: 411.86.202206280018-0 (2022-06-28T00:25:24Z)

  pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:737dda6647e7925bbe4fb5e4b8cddebedcf83c8bc233fee38bb4c5cdd97a1cd5
              CustomOrigin: Managed by machine-config-operator
                   Version: 410.84.202206240419-0 (2022-06-24T04:25:30Z)

# df -h
Filesystem   Size  Used  Avail  Use%  Mounted on
/dev/sda4    120G   29G   92G   24%   /
tmpfs        8.0G     0  8.0G    0%   /sys/fs/cgroup
devtmpfs     7.9G     0  7.9G    0%   /dev
tmpfs        8.0G  128K  8.0G    1%   /dev/shm
tmpfs        8.0G   93M  7.9G    2%   /run
shm           64M     0   64M    0%   /run/containers/storage/overlay-containers/7649fea8a9f22c2fa4d7d454380c261bca15a34b3b2c75e8503919cdb65755ae/userdata/shm
shm           64M     0   64M    0%   /run/containers/storage/overlay-containers/7dbcf0b1e7c0725f91eed644fbfa3f30c4d41165ccb43d560faf995da79b1451/userdata/shm
shm           64M     0   64M    0%   /run/containers/storage/overlay-containers/9b39f3ef0687628c5f7d6639a18b38147f31013e455ab22b68043418c2e953c3/userdata/shm
shm           64M     0   64M    0%
/run/containers/storage/overlay-containers/c7b6f9393e55d8270550ab8a4ccb2e42a4784623e1d2a0901f30d662a8291a5f/userdata/shm
shm           64M     0   64M    0%   /run/containers/storage/overlay-containers/23dcaaa0cf8459be2034ef76a658782b51da69d2979c6e06f96e534dc7fce55d/userdata/shm
shm           64M     0   64M    0%   /run/containers/storage/overlay-containers/d106800bf4ea805fe5df5d2dc6a1416f17b983d1eac08d6ca41addba72a5f2b6/userdata/shm
shm           64M     0   64M    0%   /run/containers/storage/overlay-containers/6fd657ea28e3335de406979dd4e052f3ceef5bf62e50073fb51331dc0cd9f858/userdata/shm
shm           64M     0   64M    0%   /run/containers/storage/overlay-containers/c0127e0d16e86b97f802921c336b2549246f967b03ce642b62e00e14dfbdcddb/userdata/shm
shm           64M     0   64M    0%   /run/containers/storage/overlay-containers/13248f323be7115007c5797fc79374d8383fb2c824bee4e10879ed89979ae325/userdata/shm
shm           64M     0   64M    0%   /run/containers/storage/overlay-containers/ae24555c8ac30a5689a6b2a12e2a51a69b40bddb7b69267747c194cffcc42104/userdata/shm
shm           64M     0   64M    0%   /run/containers/storage/overlay-containers/072999891e0af7625c4a69456ad28003d94214a8929c6d631cd09062595e98e4/userdata/shm
shm           64M     0   64M    0%   /run/containers/storage/overlay-containers/6c502811e9dc0ed12c3768964a5d5bd5a06d673265dccfc972ecc22dea816047/userdata/shm
shm           64M     0   64M    0%   /run/containers/storage/overlay-containers/9c875c2647fb15f438654895d77b297e40a4baa4f53278499c8a12de9e5d0071/userdata/shm
shm           64M     0   64M    0%   /run/containers/storage/overlay-containers/2067db47b62476ce457bfcb0f2b36b944d9191991fed5efa225bf2581c5e92b1/userdata/shm
shm           64M     0   64M    0%   /run/containers/storage/overlay-containers/3f33deb9494424a0d0653dbd171243785a65a535a36960ee92924c27813a670b/userdata/shm
shm           64M     0   64M    0%   /run/containers/storage/overlay-containers/6d68b8669f7c956a1c399e315b931087c0a306d6875d4a3f54eeece93dee90b8/userdata/shm
shm           64M     0   64M    0%   /run/containers/storage/overlay-containers/d613e523b5ae6ce901224f4c018d8691e1a0b196f097d078d96dc5c611f64dac/userdata/shm
shm           64M     0   64M    0%   /run/containers/storage/overlay-containers/a9db656e85403abf52c9c8b940621d92ae1c0291355cea50beb9762dc56f212e/userdata/shm
shm           64M     0   64M    0%   /run/containers/storage/overlay-containers/7849996a3ba2e0c582f1d561638bb6e79d4d7b04a109ef7893d4653a51d5ab9d/userdata/shm
/var/lib/containers/storage/overlay/58c70b88804fe53bb3f6abd09068d797d60b1de50718c8c9a7317336080f4cd9/merged/dev/shm
tmpfs        8.0G   64K  8.0G    1%   /tmp
overlay      120G   29G   92G   24%   /var/lib/containers/storage/overlay/74cbd66f09a019daa6b380fad95e1cf137588f9427fa7a09e609c17cb4913d5a/merged
overlay      120G   29G   92G   24%   /var/lib/containers/storage/overlay/a2bf51318544322cdab4dbdec432c6a44fdbfe1941521ded6e42de8f20dcc9a5/merged
overlay      120G   29G   92G   24%   /var/lib/containers/storage/overlay/0d6a12c4cedd397ba548756a3394ea3fa0a8d92e293c74ac894a69bd255bfd8a/merged
overlay      120G   29G   92G   24%   /var/lib/containers/storage/overlay/207d6350ff9cddc1bc927d7246ccf1cdf8d4d2a5da8d4e7fda40fc7f5c586eb1/merged
overlay      120G   29G   92G   24%   /var/lib/containers/storage/overlay/35fa96660585c22869604bc2adfe7ba5dae5e98264248a22618ba3a77b9b01c8/merged
overlay      120G   29G   92G   24%   /var/lib/containers/storage/overlay/c78ba99d0466bbb0fdc527097516bc3afa394b976d1af2bf98f6a1a00f0d36d5/merged
overlay      120G   29G   92G   24%   /var/lib/containers/storage/overlay/44e54c4bdc8256882dc6c552ff9bcd13b14c8c9137d5735685c877bae0b16898/merged
overlay      120G   29G   92G   24%   /var/lib/containers/storage/overlay/9246569565c2f6853fbae1459e0bdfde24dc70ee705f2b7959ca3a4d580f8614/merged
overlay      120G   29G   92G   24%   /var/lib/containers/storage/overlay/d1784f9508fd316e038f3de86a3a0d94671710978b442f7320a6dba35c59d097/merged
overlay      120G   29G   92G   24%   /var/lib/containers/storage/overlay/7d49aa9d690330efbca62b29382d8b98ac37c212ba246991f919a8f7122a8f9a/merged
overlay      120G   29G   92G   24%   /var/lib/containers/storage/overlay/810a309f9c28435dccf28b6bbf2e0b478987da168f302d5c38ffc63cec0cca1d/merged
overlay      120G   29G   92G   24%   /var/lib/containers/storage/overlay/4ebd116b4fd0c1b6dcfee481ee99e79e6f58363a3339ba8a6e9856afc91ada1d/merged
overlay      120G   29G   92G   24%   /var/lib/containers/storage/overlay/87c0507e33fdf3aee4bf360e1dfd6e8cac8d47c4366c415c8cb5223e15818cda/merged
overlay      120G   29G   92G   24%   /var/lib/containers/storage/overlay/36e44bf7bd05c1f29d09f6b5631813113d5d7c7ba9b1974fd1babf18682e0ce7/merged
overlay      120G   29G   92G   24%   /var/lib/containers/storage/overlay/6d64637eaa711cb4adccd09d798176088e2b5c5c350216bdb81913ef28c5a26a/merged
overlay      120G   29G   92G   24%   /var/lib/containers/storage/overlay/1e42f32a853da1d1aa69064d87e1ba80a30512d101b60c8032e954809a50e2e3/merged
overlay      120G   29G   92G   24%   /var/lib/containers/storage/overlay/38bcadb922b144e48d3c2468d4d907b1003271d7827ecde6d0152642a7aff9dd/merged
overlay      120G   29G   92G   24%   /var/lib/containers/storage/overlay/8860a2acce06c63616753f70297dc6f7a535dc4b057619b2f9c3d54306e28f11/merged
overlay      120G   29G   92G   24%   /var/lib/containers/storage/overlay/219f50bd78b55073f67c07d5c42f61504b694baae4c4b2a71729a561ae7dc754/merged
overlay      120G   29G   92G   24%   /var/lib/containers/storage/overlay/76a57b9167e99079d28d8cbd1d4b6c272dbc739dec5d14aa35d4f9595eb56fe7/merged
overlay      120G   29G   92G   24%   /var/lib/containers/storage/overlay/db5c1db06cacb60a21d7a837b1b2dd95ce649b01c0fa815a74784d7a2770ab4b/merged
overlay      120G   29G   92G   24%   /var/lib/containers/storage/overlay/617ca0bf2b66921ab96c449b36cd5795993f699148b39bfc21c8c8b0d7182463/merged
overlay      120G   29G   92G   24%   /var/lib/containers/storage/overlay/d7825f691375e46d90e8a4f8f274d991714851f1b28de8adda3509d6cae5582e/merged
overlay      120G   29G   92G   24%   /var/lib/containers/storage/overlay/7fcad1d9aedd38ce29648ed42ceb2700af56269f00f48a1e1075fb5a6ae791d4/merged
overlay      120G   29G   92G   24%   /var/lib/containers/storage/overlay/a5bba405f1ade715b1c1c262d8a9fc1996260bb2a57f72d591105195ff053ff0/merged
overlay      120G   29G   92G   24%   /var/lib/containers/storage/overlay/b597c10fa290a32262f1a234f0c3459006db360dd59fffcb8ad018e1e13481d1/merged
overlay      120G   29G   92G   24%   /var/lib/containers/storage/overlay/5cd0896a0cea72fe0d9d8f5e4541ac85d04c0bc7223458c6e0ecd28b429af92b/merged
overlay      120G   29G   92G   24%   /var/lib/containers/storage/overlay/e113aef1a3925c4addd4f4dee2c9c4c869d13d1f4a9254abdbe9f75d73200761/merged
overlay      120G   29G   92G   24%   /var/lib/containers/storage/overlay/cccf8df999427e00b0cfe9251eb42c480c82934e5c9a067fca7fc598a5fcc65a/merged
overlay      120G   29G   92G   24%   /var/lib/containers/storage/overlay/6a09208e93c9640981d81df25adf39e3e3aa1442c40829bb436a0d449b8e6789/merged
overlay      120G   29G   92G   24%   /var/lib/containers/storage/overlay/c28a84fa2dd2d7e1b316088575a9ba95091106b3ae2fec81f1011b42b4988d76/merged
overlay      120G   29G   92G   24%   /var/lib/containers/storage/overlay/d49a521d159caae9e35c5cd61027c37cedbe31d90773c353444023964fccbddc/merged
overlay      120G   29G   92G   24%   /var/lib/containers/storage/overlay/2628c2ddb7cada7f45414ceec1ac7f6d838533988bd84d97bc06b280677ea90a/merged
overlay      120G   29G   92G   24%   /var/lib/containers/storage/overlay/2381711c2666ebce5e0d843d048b4e72451405b9938666f21f51eb782a62dd9b/merged
overlay      120G   29G   92G   24%   /var/lib/containers/storage/overlay/1651cd13c13c71211bd8e1bd00b03e524184f7cd70a342d11842546c7b41a316/merged
overlay      120G   29G   92G   24%   /var/lib/containers/storage/overlay/008b9c6a387186aefacb6e66efb0b62ad2522ee3e82fff13f1154743589b17ec/merged
overlay      120G   29G   92G   24%   /var/lib/containers/storage/overlay/6fcf532d999fe7a8c5f650fb1732f49c1fd05d32860dbb863dc238b5efdb71df/merged
overlay      120G   29G   92G   24%   /var/lib/containers/storage/overlay/0a60f82a8e4302128e9649673f5a2960bf096e4e9c1ba75684d0decefa17474e/merged
overlay      120G   29G   92G   24%   /var/lib/containers/storage/overlay/1d39309225432518d77f9d88350c386a9a588f883aaefd53d82a6f47af23f3cc/merged
overlay      120G   29G   92G   24%   /var/lib/containers/storage/overlay/862d52a13a97bda79f6b0034466d6fd57860dfea9491813c86faef825d5ecd6e/merged
overlay      120G   29G   92G   24%   /var/lib/containers/storage/overlay/a61929c80480b2b73cd9ec4473ebf2a7bbc1679f0210b38e1c73e8156460cd9a/merged
overlay      120G   29G   92G   24%   /var/lib/containers/storage/overlay/5930d82feba22075c77a790431fef5e6b328d505b5dcf4722203e61cb9c08c7e/merged
overlay      120G   29G   92G   24%   /var/lib/containers/storage/overlay/699a26ddb9cc963785865db12ec43df8300d9fa163de5eee2e0acfa7ef01c757/merged
overlay      120G   29G   92G   24%   /var/lib/containers/storage/overlay/e4ccccb65ad4582606f0f2f3c47a0e171cad51f635aef7be1ff7d9ec1607efde/merged
overlay      120G   29G   92G   24%   /var/lib/containers/storage/overlay/0f7459e495f8a413bdeca09c4f0b308c027e99233b65ff2ca1c4cc00173d4d42/merged
overlay      120G   29G   92G   24%   /var/lib/containers/storage/overlay/2e98e461a51e752ea04fa6a4a05fb90f83a3a3b1b0d098138d5716890d4b8c9e/merged
overlay      120G   29G   92G   24%   /var/lib/containers/storage/overlay/6e802c011f43f0365f05bdccfa3ee63f1be9aa18bcb6cf8e72e820d7be26e53c/merged
overlay      120G   29G   92G   24%   /var/lib/containers/storage/overlay/570e86287b39f0243fc3e2f4f5d5de2ae9928f980690c00b7063d738bc5feb7b/merged
overlay      120G   29G   92G   24%   /var/lib/containers/storage/overlay/ed6379bc5468fc0e5cd99c5d2a66d9db90d54a5d0d30627aca51845be950a6b8/merged
overlay      120G   29G   92G   24%   /var/lib/containers/storage/overlay/390c28bc760426445be7e1134fac7752cff70ac131b8a8174931051b6f3ee28c/merged
overlay      120G   29G   92G   24%   /var/lib/containers/storage/overlay/e39468566dc0728b058ff80692671c936636dc1d928bf1d2c2576365dcf8602e/merged
overlay      120G   29G   92G   24%   /var/lib/containers/storage/overlay/94d7dbf34119f65b24d8f79cb09218ba5a14ae03ddcd4609ea00c2f1f8ec34b3/merged
overlay      120G   29G   92G   24%   /var/lib/containers/storage/overlay/99671df43387654349867ad680bbd4e089430d2830fa568bb3d91d2c24deb645/merged
overlay      120G   29G   92G   24%   /var/lib/containers/storage/overlay/a19026519e1fe43740b0a3aa08bd927f9d85293ff33461a61d2e83573feb5e4c/merged
overlay      120G   29G   92G   24%   /var/lib/containers/storage/overlay/40e29e08a6714713ea91ab9bac89f16700d9a97dc4285b6d06e6e20e55bc9033/merged
overlay      120G   29G   92G   24%   /var/lib/containers/storage/overlay/748afa3ae2a928d646c3b8168f9e210f4db662b1a344f2998a49b6b0367d7c3f/merged
overlay      120G   29G   92G   24%   /var/lib/containers/storage/overlay/f12a7cbe53d3e3c439a4137e2307032c77108d527fd7affd4de38c109ac07b12/merged
overlay      120G   29G   92G   24%   /var/lib/containers/storage/overlay/58af3dbb8abd5aaaa9cce160da06a1193deb8cf7b0c01992255b1a301d348675/merged
overlay      120G   29G   92G   24%   /var/lib/containers/storage/overlay/bdfbd69d4745a66d8990598d97d90b8beee3bd22f90290514b3be036df5bbcf9/merged
overlay      120G   29G   92G   24%   /var/lib/containers/storage/overlay/b42ae7e7e33f5ddc9700806f4aa45870984ee60d3efe834f4744017b8b934234/merged
overlay      120G   29G   92G   24%   /var/lib/containers/storage/overlay/c3d0d89cea7e5429e9088a8cf4789b516841837771a1c08b7400204cd1ea0829/merged
overlay      120G   29G   92G   24%   /var/lib/containers/storage/overlay/a37d1b59abe259b5cfe29dc00d3af0bd4d256eb01fa917573c1210272551a299/merged
overlay      120G   29G   92G   24%   /var/lib/containers/storage/overlay/9e0601986463bdf98ed4c811c5021324330d18380e4f2b68f58e8cd2190e6adf/merged
overlay      120G   29G   92G   24%   /var/lib/containers/storage/overlay/db5bbea59e6a897a0d7afb9fdb1dee3d8a9444c82b731d43f5a9511528947921/merged
overlay      120G   29G   92G   24%   /var/lib/containers/storage/overlay/b75af4b99cd974864d0794f2c94f0068d0229836df659c38f36630187ad463a1/merged
overlay      120G   29G   92G   24%   /var/lib/containers/storage/overlay/2810ba50f210316dbf2ec97bbc4a1b1ccfd77c2b17c7a0f8b8637ffed365e582/merged
overlay      120G   29G   92G   24%   /var/lib/containers/storage/overlay/63bcdf65d9e066ea62a4068c739c715f4556a344cbb8e327670ac6470c867d2c/merged
overlay      120G   29G   92G   24%   /var/lib/containers/storage/overlay/9a056c95c186e8a86c5a1d54cf38d6ee82bf150cf71711398d9950ae63e965b2/merged
overlay      120G   29G   92G   24%   /var/lib/containers/storage/overlay/10216edeb11cd9086ce89ccff3b94e5c651b8ff7928b2bed06e2735e75e0d790/merged
overlay      120G   29G   92G   24%   /var/lib/containers/storage/overlay/af8804ab535ee457b8b2d97e085b3e6b65278c9fa896045ffed398047f4b647a/merged
overlay      120G   29G   92G   24%   /var/lib/containers/storage/overlay/7b581a408892eb681cbc05c8d480cdc78a8bffe628f30efa2ce8b66909b520e3/merged
overlay      120G   29G   92G   24%   /var/lib/containers/storage/overlay/d1792accc713b7f3c5e47aa77b13b4b385947a4349765cbd7f5afccd39b7d707/merged
overlay      120G   29G   92G   24%   /var/lib/containers/storage/overlay/33e2a26543733f65c7eb5b3e7f8c43dac6471a8976ea9a36d4de673ddf0b74ba/merged
overlay      120G   29G   92G   24%   /var/lib/containers/storage/overlay/555aab703ec997f75d71738cf30587b42fb99890d001d217fcefd3a8ff27fa80/merged
overlay      120G   29G   92G   24%   /var/lib/containers/storage/overlay/4543abaef27655d54e25b08d867075f3bfa164137d130f94e9593aa32b3a7833/merged
overlay      120G   29G   92G   24%   /var/lib/containers/storage/overlay/68eb567e7678693b66ff1d378bfc065fb8f50958b0158df9e06fa3526aed6d4a/merged
overlay      120G   29G   92G   24%   /var/lib/containers/storage/overlay/58c70b88804fe53bb3f6abd09068d797d60b1de50718c8c9a7317336080f4cd9/merged
tmpfs         64M     0   64M    0%   /var/lib/containers/storage/overlay/58c70b88804fe53bb3f6abd09068d797d60b1de50718c8c9a7317336080f4cd9/merged/dev
tmpfs        8.0G     0  8.0G    0%   /var/lib/containers/storage/overlay/58c70b88804fe53bb3f6abd09068d797d60b1de50718c8c9a7317336080f4cd9/merged/sys/fs/cgroup
tmpfs         15G  256K   15G    1%   /var/lib/kubelet/pods/45651ff9-3f73-4925-b696-8255c59bf8c2/volumes/kubernetes.io~secret/node-bootstrap-token
tmpfs         15G  128K   15G    1%   /var/lib/kubelet/pods/45651ff9-3f73-4925-b696-8255c59bf8c2/volumes/kubernetes.io~secret/certs
tmpfs         15G  256K   15G    1%   /var/lib/kubelet/pods/af9ccda7-a5f1-47fe-b359-3bab5d9a7b3f/volumes/kubernetes.io~projected/kube-api-access-92wl5
tmpfs         15G  256K   15G    1%   /var/lib/kubelet/pods/45651ff9-3f73-4925-b696-8255c59bf8c2/volumes/kubernetes.io~projected/kube-api-access-t5tk9
tmpfs         15G  256K   15G    1%   /var/lib/kubelet/pods/6145dbf0-605f-4380-9358-5f7bf971f25e/volumes/kubernetes.io~projected/kube-api-access-vclcj
tmpfs         15G   64K   15G    1%   /var/lib/kubelet/pods/6145dbf0-605f-4380-9358-5f7bf971f25e/volumes/kubernetes.io~secret/node-exporter-kube-rbac-proxy-config
tmpfs         15G  128K   15G    1%   /var/lib/kubelet/pods/6145dbf0-605f-4380-9358-5f7bf971f25e/volumes/kubernetes.io~secret/node-exporter-tls
tmpfs         15G  128K   15G    1%   /var/lib/kubelet/pods/0d0e1319-1f01-4423-9541-90d95791a8d0/volumes/kubernetes.io~secret/sdn-controller-metrics-certs
tmpfs         15G  128K   15G    1%   /var/lib/kubelet/pods/5a5b814d-7d1e-49e4-8007-5b655371fc1c/volumes/kubernetes.io~secret/metrics-tls
tmpfs         15G  128K   15G    1%   /var/lib/kubelet/pods/58cf6b33-d5f1-496e-bd8b-fbf49c4aac73/volumes/kubernetes.io~secret/serving-cert
tmpfs         15G   64K   15G    1%   /var/lib/kubelet/pods/c7581191-e489-488b-9e17-8e148a024820/volumes/kubernetes.io~secret/cookie-secret
tmpfs         15G  128K   15G    1%   /var/lib/kubelet/pods/4d231826-0fe8-4bab-91a4-9380094b92e0/volumes/kubernetes.io~secret/webhook-certs
tmpfs         15G  128K   15G    1%   /var/lib/kubelet/pods/02c26524-bbc7-44c2-bfb0-6187b4f2abe0/volumes/kubernetes.io~secret/metrics-certs
tmpfs         15G  128K   15G    1%   /var/lib/kubelet/pods/c7581191-e489-488b-9e17-8e148a024820/volumes/kubernetes.io~secret/proxy-tls
tmpfs         15G  128K   15G    1%   /var/lib/kubelet/pods/91517de1-83fd-4bc1-9329-91b436f18088/volumes/kubernetes.io~secret/sdn-metrics-certs
tmpfs         15G  256K   15G    1%   /var/lib/kubelet/pods/02c26524-bbc7-44c2-bfb0-6187b4f2abe0/volumes/kubernetes.io~projected/kube-api-access-v82nt
tmpfs         15G  256K   15G    1%   /var/lib/kubelet/pods/4d231826-0fe8-4bab-91a4-9380094b92e0/volumes/kubernetes.io~projected/kube-api-access-skhct
tmpfs         15G  256K   15G    1%   /var/lib/kubelet/pods/c7581191-e489-488b-9e17-8e148a024820/volumes/kubernetes.io~projected/kube-api-access-kg4xg
tmpfs         15G  256K   15G    1%   /var/lib/kubelet/pods/fbe012b0-d7d6-4647-a29d-17e832cdbcec/volumes/kubernetes.io~projected/kube-api-access-j5hbr
tmpfs        1.0G  256K  1.0G    1%   /var/lib/kubelet/pods/996dbdf7-9f61-4d1d-82ed-44514836a980/volumes/kubernetes.io~projected/kube-api-access-5rlll
tmpfs         15G  256K   15G    1%   /var/lib/kubelet/pods/aa36428d-9ec2-4601-a993-f7e34a335d7e/volumes/kubernetes.io~projected/kube-api-access-qjxxj
tmpfs         15G  256K   15G    1%   /var/lib/kubelet/pods/91517de1-83fd-4bc1-9329-91b436f18088/volumes/kubernetes.io~projected/kube-api-access-q54ch
tmpfs         15G  256K   15G    1%
/var/lib/kubelet/pods/a876c756-76ac-460b-915e-43b23f5fe01e/volumes/kubernetes.io~projected/kube-api-access-qw5qz
tmpfs         15G  256K   15G    1%   /var/lib/kubelet/pods/df98d7d5-1ee5-4a73-8f46-cb7851b59c6a/volumes/kubernetes.io~projected/kube-api-access-xmqbr
tmpfs         15G  256K   15G    1%   /var/lib/kubelet/pods/0d0e1319-1f01-4423-9541-90d95791a8d0/volumes/kubernetes.io~projected/kube-api-access-kbs75
tmpfs         15G  256K   15G    1%   /var/lib/kubelet/pods/58cf6b33-d5f1-496e-bd8b-fbf49c4aac73/volumes/kubernetes.io~projected/kube-api-access-4wdhz
tmpfs         15G  256K   15G    1%   /var/lib/kubelet/pods/e49617fc-8b23-40d3-9699-6a32e460d993/volumes/kubernetes.io~projected/kube-api-access-jn2bz
tmpfs         15G  256K   15G    1%   /var/lib/kubelet/pods/5a5b814d-7d1e-49e4-8007-5b655371fc1c/volumes/kubernetes.io~projected/kube-api-access-kpnkr
tmpfs         15G  256K   15G    1%   /var/lib/kubelet/pods/2a00c1c3-2818-4fd7-8040-7810d495bdf6/volumes/kubernetes.io~projected/kube-api-access-lkrmz
/dev/sda3    364M  277M   65M   82%   /boot

sh-4.4# ls -lart /boot/*
lrwxrwxrwx. 1 root root 1 Oct  8  2021 /boot/boot -> .
lrwxrwxrwx. 1 root root 8 Jul  5 11:40 /boot/loader -> loader.0

/boot/lost+found:
total 14
drwx------. 2 root root 12288 Oct  8  2021 .
drwxr-xr-x. 7 root root  1024 Jul  6 07:13 ..

/boot/grub2:
total 185
drwxr-xr-x. 2 root root   8192 Oct  8  2021 powerpc-ieee1275
drwxr-xr-x. 2 root root   1024 Oct  8  2021 locale
-rw-r--r--. 1 root root   1024 Oct  8  2021 grubenv
-rwxr-xr-x. 1 root root   2625 Oct  8  2021 grub.cfg
-rw-r--r--. 1 root root 164668 Oct  8  2021 grub
drwxr-xr-x. 2 root root   1024 Oct  8  2021 fonts
drwxr-xr-x. 5 root root   1024 Oct  8  2021 .
drwxr-xr-x. 7 root root   1024 Jul  6 07:13 ..

/boot/loader.0:
total 6
drwxr-xr-x. 3 root root 1024 Jul  5 11:40 .
drwxr-xr-x. 2 root root 1024 Jul  5 11:40 entries
drwxr-xr-x. 7 root root 1024 Jul  6 07:13 ..

/boot/ostree:
total 10
drwxr-xr-x. 2 root root 1024 Jul  4 14:44 rhcos-0bfcbbf7c37c79bff574f3829112a64ecee6c084597b1d93e90bc00f07954a73
drwxr-xr-x. 2 root root 1024 Jul  5 11:40 rhcos-952c9677e944c268239aec2429e421d2a4fd4c751550a1c5f74d83bd2c5ddb1f
drwxr-xr-x. 2 root root 1024 Jul  6 07:13 rhcos-1d101bb28fbd71daf2b6ffb8cf1261576cfd670cec4604a91b20a713523370a2
drwxr-xr-x. 7 root root 1024 Jul  6 07:13 ..
drwxr-xr-x. 5 root root 1024 Jul  6 07:13 .

/boot/loader.1:
total 6
drwxr-xr-x. 2 root root 1024 Jul  6 07:13 entries
drwxr-xr-x. 7 root root 1024 Jul  6 07:13 ..
drwxr-xr-x. 3 root root 1024 Jul  6 07:13 .

Must-gather logs: https://drive.google.com/file/d/1vLMStKyA6z1yyRYwOV9eUaPU53rVo8Pw/view?usp=sharing
Also adding the output of `journalctl -b -1 -u ostree-finalize-staged.service`:

sh-4.4# journalctl -b -1 -u ostree-finalize-staged.service
-- Logs begin at Mon 2022-07-04 08:40:42 UTC, end at Wed 2022-07-06 17:47:26 UTC. --
Jul 06 07:12:22 master-0 systemd[1]: Started OSTree Finalize Staged Deployment.
Jul 06 07:13:50 master-0 systemd[1]: Stopping OSTree Finalize Staged Deployment...
Jul 06 07:13:50 master-0 ostree[1256867]: Finalizing staged deployment
Jul 06 07:13:56 master-0 ostree[1256867]: Copying /etc changes: 15 modified, 0 removed, 1471 added
Jul 06 07:13:56 master-0 ostree[1256867]: Copying /etc changes: 15 modified, 0 removed, 1471 added
Jul 06 07:13:59 master-0 ostree[1256867]: error: Installing kernel: regfile copy: No space left on device
Jul 06 07:13:59 master-0 systemd[1]: ostree-finalize-staged.service: Control process exited, code=exited status=1
Jul 06 07:13:59 master-0 systemd[1]: ostree-finalize-staged.service: Failed with result 'exit-code'.
Jul 06 07:13:59 master-0 systemd[1]: Stopped OSTree Finalize Staged Deployment.
Jul 06 07:13:59 master-0 systemd[1]: ostree-finalize-staged.service: Consumed 2.164s CPU time
It looks like preparing the kernel and copying it into /boot/ostree errors out because /boot itself is nearly full:

/dev/sda3    364M  277M   65M   82%   /boot

Moving to the CoreOS team for further investigation.
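For anyone triaging similar reports, the condition can be confirmed by comparing /boot usage against a threshold before an upgrade. A minimal sketch, assuming GNU `df`; the `check_usage` helper and the thresholds are this sketch's own invention, not part of the MCO or ostree:

```shell
#!/bin/sh
# Sketch: warn when a filesystem is nearly full, as /boot was here (82%).
# check_usage is a hypothetical helper; the threshold values are illustrative.
check_usage() {
    # df --output=pcent prints a header line plus "NN%"; strip to digits
    pct=$(df --output=pcent "$1" | tail -n 1 | tr -dc '0-9')
    if [ "$pct" -ge "$2" ]; then
        echo "WARNING: $1 is ${pct}% full (threshold $2%)"
        return 1
    fi
    echo "OK: $1 is ${pct}% full"
}

# On the degraded master-0 this would have flagged /boot at 82%.
# Demonstrated here against / with a threshold that always passes:
check_usage / 101
```

On a node one would run something like `check_usage /boot 75` from a debug shell before kicking off the next upgrade.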
Looks like a real issue with /boot running out of space. `rpm-ostree cleanup -rp` would clean up the pending and rollback deployments and free up space; running it took my environment from 28% to 18% usage. I would guess that an upgrade run directly from 4.11.0-rc.0 -> 4.11.0-rc.1 would not hit this because there is still space on the device?
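To make that workaround reviewable before touching a node, it can be wrapped in a dry-run guard. `rpm-ostree cleanup -r -p` (rollback and pending) is the command mentioned above; the `run` wrapper and the `DRY_RUN` convention are just this sketch's scaffolding:

```shell
#!/bin/sh
# Sketch of the workaround with a preview mode. Set DRY_RUN=0 on the node
# to actually execute; the default only prints what would run.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

run rpm-ostree cleanup -r -p   # drop the rollback (-r) and pending (-p) deployments
run df -h /boot                # confirm the space came back
```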
the default size of the boot partition is 384M (https://github.com/coreos/coreos-assembler/blob/main/src/create_disk.sh) - should that be increased?
We've moved the discussion about the size of the boot partition to https://github.com/coreos/fedora-coreos-tracker/issues/1247. However, there are 3 boot entries here when, if I'm not mistaken, there should only be two. Could we get the full journal log for the node?
Master-0 journalctl log: https://drive.google.com/file/d/1FyftBN9WWzYjtBlACYQLpAnRows-FeWW/view?usp=sharing

On master-1, which does not have any error, there are only 2 entries in /boot/ostree.

master-1:
sh-4.4# df -h | grep /boot
/dev/sdb3    364M  243M   99M   72%   /boot

sh-4.4# ls -lart /boot/*
lrwxrwxrwx. 1 root root 1 Oct  8  2021 /boot/boot -> .
lrwxrwxrwx. 1 root root 8 Jul  5 11:49 /boot/loader -> loader.0

/boot/lost+found:
total 14
drwx------. 2 root root 12288 Oct  8  2021 .
drwxr-xr-x. 6 root root  1024 Jul  5 11:49 ..

/boot/grub2:
total 185
drwxr-xr-x. 2 root root   8192 Oct  8  2021 powerpc-ieee1275
drwxr-xr-x. 2 root root   1024 Oct  8  2021 locale
-rw-r--r--. 1 root root   1024 Oct  8  2021 grubenv
-rwxr-xr-x. 1 root root   2625 Oct  8  2021 grub.cfg
-rw-r--r--. 1 root root 164668 Oct  8  2021 grub
drwxr-xr-x. 2 root root   1024 Oct  8  2021 fonts
drwxr-xr-x. 5 root root   1024 Oct  8  2021 .
drwxr-xr-x. 6 root root   1024 Jul  5 11:49 ..

/boot/loader.0:
total 6
drwxr-xr-x. 3 root root 1024 Jul  5 11:49 .
drwxr-xr-x. 2 root root 1024 Jul  5 11:49 entries
drwxr-xr-x. 6 root root 1024 Jul  5 11:49 ..

/boot/ostree:
total 8
drwxr-xr-x. 2 root root 1024 Jul  4 14:53 rhcos-0bfcbbf7c37c79bff574f3829112a64ecee6c084597b1d93e90bc00f07954a73
drwxr-xr-x. 2 root root 1024 Jul  5 11:49 rhcos-952c9677e944c268239aec2429e421d2a4fd4c751550a1c5f74d83bd2c5ddb1f
drwxr-xr-x. 6 root root 1024 Jul  5 11:49 ..
drwxr-xr-x. 4 root root 1024 Jul  5 11:50 .
Yeah, the default behavior is to have two boot entries. I can't seem to replicate getting 3 on my system without pinning a deployment, but there is no pinned deployment here. I wonder whether an error during one of the upgrades prevented one of them from being cleaned up. I am checking the logs for any indication of this.
ostree doesn't clean up until *after* a new upgrade is complete, so transiently we will have 3. Another way to say this is that we always keep two deployments by default, which we implement by temporarily having 3: we don't remove the rollback before trying to pull the new version.
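A toy illustration of that policy (plain shell with made-up version strings, not ostree code): staging briefly raises the deployment count to 3 before pruning back to 2.

```shell
# Steady state: booted deployment plus rollback -> 2 boot entries.
deployments="4.11.0-rc.0 4.10.21"

# Staging an upgrade prepends the new deployment without pruning anything,
# so /boot must transiently hold kernel + initramfs for 3 deployments.
deployments="4.11.0-rc.1 $deployments"
echo "$deployments" | wc -w

# Only after the new deployment is finalized is the old rollback removed.
deployments=$(echo "$deployments" | cut -d' ' -f1-2)
echo "$deployments"
```

This is why the failure happens at finalize time: the third set of boot files has to fit on /boot before anything is deleted.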
This time I did a direct deployment of 4.11.0-rc.0 and could reproduce this issue while doing an upgrade to 4.11.0-rc.1.

========================= BEFORE UPGRADE =========================

# oc get clusterversion
NAME      VERSION       AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.11.0-rc.0   True        False         12m     Cluster version is 4.11.0-rc.0

# oc get mcp
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
master   rendered-master-e40f9114a24d4f2d98b9b9581bcd07c5   True      False      False      3              3                   3                     0                      136m
worker   rendered-worker-7cc9dba1f9fcc7271b0f999e2284ea1e   True      False      False      3              3                   3                     0                      136m

On one of the masters:

sh-4.4# rpm-ostree status
State: idle
Deployments:
* pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:74b8a768d5b354263cab962106cc7faa391014acfd4897356de625667ebac5ba
    CustomOrigin: Managed by machine-config-operator
    Version: 411.86.202206280018-0 (2022-06-28T00:25:24Z)

  02573b714e58cac3fce9c9d97f8d10227de38dd9c46d197b47619afd0a7ad57d
    Version: 411.85.202203090210-0 (2022-03-09T02:17:14Z)

sh-4.4# df -h /boot
Filesystem  Size  Used  Avail  Use%  Mounted on
/dev/sdg3   362M  243M   96M   72%   /boot

/boot/ostree:
total 8
drwxr-xr-x. 2 root root 1024 Mar  9 02:20 rhcos-7022fef0ce6fe2dbc74dc84855f917bc677928b7d1f75ac0eb422b3d52472e29
drwxr-xr-x. 2 root root 1024 Jul  7 15:21 rhcos-952c9677e944c268239aec2429e421d2a4fd4c751550a1c5f74d83bd2c5ddb1f
drwxr-xr-x. 4 root root 1024 Jul  7 15:21 .
drwxr-xr-x. 6 root root 1024 Jul  7 15:21 ..
========================= AFTER UPGRADE =========================

# oc get clusterversion
NAME      VERSION       AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.11.0-rc.0   True        True          106m    Unable to apply 4.11.0-rc.1: an unknown error has occurred: MultipleErrors

# oc get nodes
NAME                                               STATUS                     ROLES    AGE     VERSION
tor01-master-0.rdr-ocp-pravin-upi-707.redhat.com   Ready                      master   3h42m   v1.24.0+9ddc8b1
tor01-master-1.rdr-ocp-pravin-upi-707.redhat.com   Ready                      master   3h36m   v1.24.0+9ddc8b1
tor01-master-2.rdr-ocp-pravin-upi-707.redhat.com   Ready,SchedulingDisabled   master   3h43m   v1.24.0+9ddc8b1
tor01-worker-0.rdr-ocp-pravin-upi-707.redhat.com   Ready,SchedulingDisabled   worker   3h6m    v1.24.0+9ddc8b1
tor01-worker-1.rdr-ocp-pravin-upi-707.redhat.com   Ready                      worker   3h3m    v1.24.0+9ddc8b1
tor01-worker-2.rdr-ocp-pravin-upi-707.redhat.com   Ready                      worker   3h3m    v1.24.0+9ddc8b1

# oc get mcp
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
master   rendered-master-e40f9114a24d4f2d98b9b9581bcd07c5   False     True       True       3              0                   0                     1                      3h39m
worker   rendered-worker-7cc9dba1f9fcc7271b0f999e2284ea1e   False     True       True       3              0                   0                     1                      3h39m

message: 'Node tor01-master-2.rdr-ocp-pravin-upi-707.redhat.com is reporting: "unexpected on-disk state validating against rendered-master-2d9c1ab445025b7d3ff2bdffd1875f7b: expected target osImageURL \"quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:2e657044831a9bd296b20d37b693698adb1d7eb2d3cd9090db7724779ecbf608\", have \"quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:74b8a768d5b354263cab962106cc7faa391014acfd4897356de625667ebac5ba\""'
reason: 1 nodes are reporting degraded status on sync

On master-2:

sh-4.4# rpm-ostree status
State: idle
Warning: failed to finalize previous deployment
         error: Installing kernel: regfile copy: No space left on device
         check `journalctl -b -1 -u ostree-finalize-staged.service`
Deployments:
* pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:74b8a768d5b354263cab962106cc7faa391014acfd4897356de625667ebac5ba
    CustomOrigin: Managed by machine-config-operator
    Version: 411.86.202206280018-0 (2022-06-28T00:25:24Z)

  02573b714e58cac3fce9c9d97f8d10227de38dd9c46d197b47619afd0a7ad57d
    Version: 411.85.202203090210-0 (2022-03-09T02:17:14Z)

sh-4.4# df -h /boot
Filesystem  Size  Used  Avail  Use%  Mounted on
/dev/sdh3   362M  277M   62M   82%   /boot

sh-4.4# ls -lart /boot/ostree/
total 10
drwxr-xr-x. 2 root root 1024 Mar  9 02:20 rhcos-7022fef0ce6fe2dbc74dc84855f917bc677928b7d1f75ac0eb422b3d52472e29
drwxr-xr-x. 2 root root 1024 Jul  7 15:20 rhcos-952c9677e944c268239aec2429e421d2a4fd4c751550a1c5f74d83bd2c5ddb1f
drwxr-xr-x. 7 root root 1024 Jul  7 18:13 ..
drwxr-xr-x. 2 root root 1024 Jul  7 18:13 rhcos-1d101bb28fbd71daf2b6ffb8cf1261576cfd670cec4604a91b20a713523370a2
drwxr-xr-x. 5 root root 1024 Jul  7 18:13 .

Journalctl of failing master-2 node: https://drive.google.com/file/d/1jkptdv1eDF9mA1toeovKYQw4RS2UzAR3/view?usp=sharing
Must-gather logs: https://drive.google.com/file/d/1h_2i8SnNQWLV00jZBZAU1daB9XxbUhkv/view?usp=sharing
Interesting that before the upgrade you already have two boot entries and /boot is at 72%. I am looking through the new logs to see if there is anything else I can identify. I have been grepping for "ostree/rhcos-" and "Resolved OSTree target to" to try to spot what is going wrong, but nothing jumps out at me yet.
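For reference, the two patterns can be combined into a single grep pass. A self-contained sketch against a fabricated three-line sample (the real input would be the node's journal dump; these log lines are invented for illustration):

```shell
# Create a tiny fake journal excerpt (illustrative lines, not from the real log):
cat > /tmp/sample-journal.log <<'EOF'
Jul 07 15:20:03 master-2 machine-config-daemon[2859]: Resolved OSTree target to 22d10ba43839
Jul 07 15:20:18 master-2 ostree[3803]: created boot entry /boot/ostree/rhcos-952c9677
Jul 07 15:20:20 master-2 systemd[1]: unrelated noise line
EOF

# Match either pattern in one pass:
grep -E 'ostree/rhcos-|Resolved OSTree target to' /tmp/sample-journal.log
```

The first two lines match; the noise line is filtered out.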
On the journalctl of the failing master-2 node (journalctl-master-2-rc0-rc1.log) I see an rhcos update happen here:

Jul 07 15:20:03 tor01-master-2.rdr-ocp-pravin-upi-707.redhat.com machine-config-daemon[2859]: I0707 15:20:03.786114 2859 rpm-ostree.go:296] Executing rebase from repo path /run/mco-machine-os-content/os-content-1715853007/srv/repo with customImageURL pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:74b8a768d5b354263cab962106cc7faa391014acfd4897356de625667ebac5ba and checksum 22d10ba43839f4f4f5f7ab51cb112a1b96448a3371e0adfe592201e8f78fe332

The kernel upgrade:

Jul 07 15:20:14 tor01-master-2.rdr-ocp-pravin-upi-707.redhat.com machine-config-daemon[2859]: kernel 4.18.0-348.12.2.el8_5 -> 4.18.0-372.9.1.el8

Followed by a first deployment that succeeds and adds a new boot entry:

```
Jul 07 15:20:15 tor01-master-2.rdr-ocp-pravin-upi-707.redhat.com ostree[3803]: Finalizing staged deployment
Jul 07 15:20:15 tor01-master-2.rdr-ocp-pravin-upi-707.redhat.com kernel: EXT4-fs (sdh3): re-mounted. Opts:
Jul 07 15:20:17 tor01-master-2.rdr-ocp-pravin-upi-707.redhat.com ostree[3803]: Copying /etc changes: 14 modified, 0 removed, 110 added
Jul 07 15:20:17 tor01-master-2.rdr-ocp-pravin-upi-707.redhat.com ostree[3803]: Copying /etc changes: 14 modified, 0 removed, 110 added
Jul 07 15:20:18 tor01-master-2.rdr-ocp-pravin-upi-707.redhat.com ostree[3803]: Bootloader updated; bootconfig swap: yes; bootversion: boot.0.1, deployment count change: 1
Jul 07 15:20:18 tor01-master-2.rdr-ocp-pravin-upi-707.redhat.com ostree[3803]: Bootloader updated; bootconfig swap: yes; bootversion: boot.0.1, deployment count change: 1
Jul 07 15:20:18 tor01-master-2.rdr-ocp-pravin-upi-707.redhat.com systemd[1]: ostree-finalize-staged.service: Succeeded.
```

Then I see a second deployment that fails here:

```
Jul 07 18:13:36 tor01-master-2.rdr-ocp-pravin-upi-707.redhat.com ostree[335838]: Finalizing staged deployment
Jul 07 18:13:38 tor01-master-2.rdr-ocp-pravin-upi-707.redhat.com ostree[335838]: Copying /etc changes: 15 modified, 0 removed, 1431 added
Jul 07 18:13:38 tor01-master-2.rdr-ocp-pravin-upi-707.redhat.com ostree[335838]: Copying /etc changes: 15 modified, 0 removed, 1431 added
Jul 07 18:13:39 tor01-master-2.rdr-ocp-pravin-upi-707.redhat.com ostree[335838]: error: Installing kernel: regfile copy: No space left on device
Jul 07 18:13:39 tor01-master-2.rdr-ocp-pravin-upi-707.redhat.com systemd[1]: ostree-finalize-staged.service: Control process exited, code=exited status=1
Jul 07 18:13:39 tor01-master-2.rdr-ocp-pravin-upi-707.redhat.com systemd[1]: ostree-finalize-staged.service: Failed with result 'exit-code'.
Jul 07 18:13:39 tor01-master-2.rdr-ocp-pravin-upi-707.redhat.com systemd[1]: Stopped OSTree Finalize Staged Deployment.
Jul 07 18:13:39 tor01-master-2.rdr-ocp-pravin-upi-707.redhat.com systemd[1]: ostree-finalize-staged.service: Consumed 1.867s CPU time
Jul 07 18:13:39 tor01-master-2.rdr-ocp-pravin-upi-707.redhat.com systemd[1]: ostree-finalize-staged.path: Succeeded.
```

Still looking.
I don't see any issues in the journal other than the failure we already know about, which happens because of lack of space. The initial package updates finish successfully and look normal. Unless one initramfs is bigger than we expect, or something else has changed size significantly, the 2 deployments should be OK. The third one we see after the failure is expected, as Colin explained, but we should have enough space for that one too. Can you please share the full sizes of the files in the directory? To get the full picture you might need to be root. For example:

➜  ~ sudo -i
[root@silverblue ~]# find /boot -type f | xargs du -b | sort -g
110       /boot/efi/EFI/fedora/BOOTX64.CSV
112       /boot/efi/EFI/fedora/BOOTIA32.CSV
144       /boot/efi/EFI/fedora/grub.cfg
161       /boot/ostree/fedora-c47d0c01d852ff47ad9699dfdd4348e0ae25b7ea639dfe4a8bdea1ebb0eeb6a1/.vmlinuz-5.17.9-300.fc36.x86_64.hmac
547       /boot/loader.1/entries/ostree-1-fedora.conf
547       /boot/loader.1/entries/ostree-2-fedora.conf
1024      /boot/grub2/grubenv
8206      /boot/loader.1/grub.cfg
61561     /boot/efi/EFI/fedora/fwupdx64.efi
68136     /boot/efi/EFI/BOOT/fbia32.efi
87152     /boot/efi/EFI/BOOT/fbx64.efi
676040    /boot/efi/EFI/fedora/mmia32.efi
740344    /boot/efi/EFI/BOOT/BOOTIA32.EFI
740344    /boot/efi/EFI/fedora/shimia32.efi
850032    /boot/efi/EFI/fedora/mmx64.efi
928592    /boot/efi/EFI/BOOT/BOOTX64.EFI
928592    /boot/efi/EFI/fedora/shim.efi
928592    /boot/efi/EFI/fedora/shimx64.efi
1639688   /boot/efi/EFI/fedora/grubia32.efi
2394108   /boot/grub2/fonts/unicode.pf2
2598152   /boot/efi/EFI/fedora/grubx64.efi
11802608  /boot/ostree/fedora-c47d0c01d852ff47ad9699dfdd4348e0ae25b7ea639dfe4a8bdea1ebb0eeb6a1/vmlinuz-5.17.9-300.fc36.x86_64
25039792  /boot/efi/EFI/fedora/fw/fwupd-55d04ffc-714a-4457-b982-d244343e1958.cap
71097429  /boot/ostree/fedora-c47d0c01d852ff47ad9699dfdd4348e0ae25b7ea639dfe4a8bdea1ebb0eeb6a1/initramfs-5.17.9-300.fc36.x86_64.img
[root@silverblue ~]#
On master-2 = tor01-master-2.rdr-ocp-pravin-upi-707.redhat.com:

[root@tor01-master-2 core]# find /boot -type f | xargs du -b | sort -g
12        /boot/grub2/powerpc-ieee1275/video.lst
17        /boot/grub2/powerpc-ieee1275/parttool.lst
37        /boot/.root_uuid
53        /boot/grub2/bootuuid.cfg
54        /boot/grub2/powerpc-ieee1275/terminal.lst
111       /boot/grub2/powerpc-ieee1275/partmap.lst
165       /boot/ostree/rhcos-952c9677e944c268239aec2429e421d2a4fd4c751550a1c5f74d83bd2c5ddb1f/.vmlinuz-4.18.0-372.9.1.el8.ppc64le.hmac
168       /boot/ostree/rhcos-7022fef0ce6fe2dbc74dc84855f917bc677928b7d1f75ac0eb422b3d52472e29/.vmlinuz-4.18.0-348.12.2.el8_5.ppc64le.hmac
219       /boot/grub2/powerpc-ieee1275/fs.lst
472       /boot/grub2/powerpc-ieee1275/all_video.mod
661       /boot/loader.0/entries/ostree-2-rhcos.conf
667       /boot/loader.0/entries/ostree-1-rhcos.conf
788       /boot/grub2/powerpc-ieee1275/setjmp.mod
936       /boot/grub2/powerpc-ieee1275/crypto.lst
1024      /boot/grub2/grubenv
1388      /boot/grub2/powerpc-ieee1275/pkcs1_v15.mod
1420      /boot/grub2/powerpc-ieee1275/hello.mod
1500      /boot/grub2/powerpc-ieee1275/div.mod
1556      /boot/grub2/powerpc-ieee1275/suspend.mod
1628      /boot/grub2/powerpc-ieee1275/halt.mod
1640      /boot/grub2/powerpc-ieee1275/reboot.mod
1656      /boot/grub2/powerpc-ieee1275/trig.mod
1712      /boot/grub2/powerpc-ieee1275/test_blockarg.mod
1732      /boot/grub2/powerpc-ieee1275/true.mod
1776      /boot/grub2/powerpc-ieee1275/exfctest.mod
1780      /boot/grub2/powerpc-ieee1275/pbkdf2.mod
1856      /boot/grub2/powerpc-ieee1275/adler32.mod
1896      /boot/grub2/powerpc-ieee1275/raid5rec.mod
2044      /boot/grub2/powerpc-ieee1275/eval.mod
2048      /boot/grub2/powerpc-ieee1275/cmosdump.mod
2048      /boot/grub2/powerpc-ieee1275/offsetio.mod
2124      /boot/grub2/powerpc-ieee1275/read.mod
2168      /boot/grub2/powerpc-ieee1275/afsplitter.mod
2272      /boot/grub2/powerpc-ieee1275/part_dvh.mod
2272      /boot/grub2/powerpc-ieee1275/time.mod
2372      /boot/grub2/powerpc-ieee1275/setjmp_test.mod
2408      /boot/grub2/powerpc-ieee1275/version.mod
2484      /boot/grub2/powerpc-ieee1275/lsmmap.mod
2492      /boot/grub2/powerpc-ieee1275/part_sun.mod
2512      /boot/grub2/powerpc-ieee1275/datehook.mod
2604      /boot/grub2/powerpc-ieee1275/part_acorn.mod
2646      /boot/grub2/powerpc-ieee1275/modinfo.sh
2672      /boot/grub2/powerpc-ieee1275/keystatus.mod
2692      /boot/grub2/powerpc-ieee1275/part_sunpc.mod
2740      /boot/grub2/powerpc-ieee1275/command.lst
2744      /boot/grub2/powerpc-ieee1275/password.mod
2760      /boot/grub2/powerpc-ieee1275/gcry_arcfour.mod
2772      /boot/grub2/powerpc-ieee1275/part_amiga.mod
2820      /boot/grub2/powerpc-ieee1275/sleep.mod
2828      /boot/grub2/powerpc-ieee1275/xnu_uuid_test.mod
2831      /boot/grub2/grub.cfg
2860      /boot/grub2/powerpc-ieee1275/dm_nv.mod
2860      /boot/grub2/powerpc-ieee1275/part_dfly.mod
2868      /boot/grub2/powerpc-ieee1275/gcry_rsa.mod
2868      /boot/grub2/powerpc-ieee1275/priority_queue.mod
2920      /boot/grub2/powerpc-ieee1275/pbkdf2_test.mod
2928      /boot/grub2/powerpc-ieee1275/crc64.mod
2992      /boot/grub2/powerpc-ieee1275/mdraid09_be.mod
3024      /boot/grub2/powerpc-ieee1275/echo.mod
3028      /boot/grub2/powerpc-ieee1275/strtoull_test.mod
3040      /boot/grub2/powerpc-ieee1275/part_plan.mod
3068      /boot/grub2/powerpc-ieee1275/cmp.mod
3112      /boot/grub2/powerpc-ieee1275/mdraid09.mod
3112      /boot/grub2/powerpc-ieee1275/part_apple.mod
3216      /boot/grub2/powerpc-ieee1275/videotest_checksum.mod
3220      /boot/grub2/powerpc-ieee1275/mul_test.mod
3240      /boot/grub2/powerpc-ieee1275/testspeed.mod
3248      /boot/grub2/powerpc-ieee1275/mdraid1x.mod
3312      /boot/grub2/powerpc-ieee1275/configfile.mod
3324      /boot/grub2/powerpc-ieee1275/gcry_dsa.mod
3360      /boot/grub2/powerpc-ieee1275/memdisk.mod
3372      /boot/grub2/powerpc-ieee1275/date.mod
3380      /boot/grub2/powerpc-ieee1275/increment.mod
3396      /boot/grub2/powerpc-ieee1275/tr.mod
3412      /boot/grub2/powerpc-ieee1275/xnu_uuid.mod
3572      /boot/grub2/powerpc-ieee1275/blocklist.mod
3636      /boot/grub2/powerpc-ieee1275/ctz_test.mod
3680      /boot/grub2/powerpc-ieee1275/msdospart.mod
3732      /boot/grub2/powerpc-ieee1275/boot.mod
3736      /boot/grub2/powerpc-ieee1275/sleep_test.mod
3788      /boot/grub2/powerpc-ieee1275/cmostest.mod
3800      /boot/grub2/powerpc-ieee1275/part_gpt.mod
3884      /boot/grub2/powerpc-ieee1275/procfs.mod
3908      /boot/grub2/powerpc-ieee1275/cmdline_cat_test.mod
3908      /boot/grub2/powerpc-ieee1275/progress.mod
3930      /boot/grub2/powerpc-ieee1275/moddep.lst
3944      /boot/grub2/powerpc-ieee1275/bufio.mod
3980      /boot/grub2/powerpc-ieee1275/raid6rec.mod
4044      /boot/grub2/powerpc-ieee1275/help.mod
4120      /boot/grub2/powerpc-ieee1275/part_msdos.mod
4224      /boot/grub2/powerpc-ieee1275/testload.mod
4372      /boot/grub2/powerpc-ieee1275/cat.mod
4388      /boot/grub2/powerpc-ieee1275/gfxterm_background.mod
4400      /boot/grub2/powerpc-ieee1275/bitmap.mod
4428      /boot/grub2/powerpc-ieee1275/memrw.mod
4504      /boot/grub2/powerpc-ieee1275/bswap_test.mod
4612      /boot/grub2/powerpc-ieee1275/hexdump.mod
4860      /boot/grub2/powerpc-ieee1275/cpio_be.mod
4860      /boot/grub2/powerpc-ieee1275/odc.mod
4872      /boot/grub2/powerpc-ieee1275/password_pbkdf2.mod
4900      /boot/grub2/powerpc-ieee1275/disk.mod
4940      /boot/grub2/powerpc-ieee1275/shift_test.mod
4956      /boot/grub2/powerpc-ieee1275/loopback.mod
4992      /boot/grub2/powerpc-ieee1275/part_bsd.mod
4996      /boot/grub2/powerpc-ieee1275/cpio.mod
5180      /boot/grub2/powerpc-ieee1275/search.mod
5184      /boot/grub2/powerpc-ieee1275/gcry_rfc2268.mod
5212      /boot/grub2/powerpc-ieee1275/search_fs_file.mod
5224      /boot/grub2/powerpc-ieee1275/gcry_md4.mod
5256      /boot/grub2/powerpc-ieee1275/search_label.mod
5320      /boot/grub2/powerpc-ieee1275/newc.mod
5416      /boot/grub2/powerpc-ieee1275/hfspluscomp.mod
5452      /boot/grub2/powerpc-ieee1275/archelp.mod
5532      /boot/grub2/powerpc-ieee1275/search_fs_uuid.mod
5544      /boot/grub2/powerpc-ieee1275/videotest.mod
5704      /boot/grub2/powerpc-ieee1275/macbless.mod
5744      /boot/grub2/powerpc-ieee1275/cbfs.mod
5840      /boot/grub2/powerpc-ieee1275/escc.mod
5864      /boot/grub2/powerpc-ieee1275/gcry_idea.mod
5872      /boot/grub2/powerpc-ieee1275/tar.mod
6120      /boot/grub2/powerpc-ieee1275/ls.mod
6192      /boot/grub2/powerpc-ieee1275/probe.mod
6280      /boot/grub2/powerpc-ieee1275/gptsync.mod
6360      /boot/grub2/powerpc-ieee1275/gcry_md5.mod
6444      /boot/grub2/powerpc-ieee1275/fshelp.mod
6476      /boot/grub2/powerpc-ieee1275/videoinfo.mod
6680      /boot/grub2/powerpc-ieee1275/minicmd.mod
6796      /boot/grub2/powerpc-ieee1275/gcry_sha256.mod
6856      /boot/grub2/powerpc-ieee1275/luks.mod
7132      /boot/grub2/powerpc-ieee1275/gfxterm_menu.mod
7776      /boot/grub2/powerpc-ieee1275/json.mod
7832      /boot/grub2/powerpc-ieee1275/video_colors.mod
8080      /boot/grub2/powerpc-ieee1275/bitmap_scale.mod
8128      /boot/grub2/powerpc-ieee1275/terminal.mod
8240      /boot/grub2/powerpc-ieee1275/romfs.mod
8252      /boot/grub2/powerpc-ieee1275/signature_test.mod
8288      /boot/grub2/powerpc-ieee1275/parttool.mod
8296      /boot/grub2/powerpc-ieee1275/minix_be.mod
8608      /boot/grub2/powerpc-ieee1275/datetime.mod
8668      /boot/grub2/powerpc-ieee1275/minix2_be.mod
8704      /boot/grub2/powerpc-ieee1275/minix3_be.mod
8752      /boot/grub2/powerpc-ieee1275/tga.mod
8820      /boot/grub2/powerpc-ieee1275/minix.mod
8864      /boot/grub2/powerpc-ieee1275/hashsum.mod
9140      /boot/grub2/powerpc-ieee1275/minix3.mod
9168      /boot/grub2/powerpc-ieee1275/minix2.mod
9280      /boot/grub2/powerpc-ieee1275/tftp.mod
9288      /boot/grub2/powerpc-ieee1275/gcry_sha1.mod
9556      /boot/grub2/powerpc-ieee1275/gettext.mod
9620      /boot/grub2/powerpc-ieee1275/ntfscomp.mod
9660      /boot/grub2/powerpc-ieee1275/crypto.mod
9876      /boot/grub2/powerpc-ieee1275/scsi.mod
9996      /boot/grub2/powerpc-ieee1275/cmp_test.mod
10048     /boot/grub2/powerpc-ieee1275/zfscrypt.mod
10344     /boot/grub2/powerpc-ieee1275/ieee1275_fb.mod
10568     /boot/grub2/powerpc-ieee1275/geli.mod
10600     /boot/grub2/powerpc-ieee1275/zfsinfo.mod
10776     /boot/grub2/powerpc-ieee1275/lzopio.mod
10876     /boot/grub2/powerpc-ieee1275/sfs.mod
10888     /boot/grub2/powerpc-ieee1275/affs.mod
10928     /boot/grub2/powerpc-ieee1275/elf.mod
11184     /boot/grub2/powerpc-ieee1275/http.mod
11236     /boot/grub2/powerpc-ieee1275/ofnet.mod
11644     /boot/grub2/powerpc-ieee1275/test.mod
11720     /boot/grub2/powerpc-ieee1275/gcry_sha512.mod
11848     /boot/grub2/powerpc-ieee1275/div_test.mod
12176     /boot/grub2/powerpc-ieee1275/gcry_rmd160.mod
12388     /boot/grub2/powerpc-ieee1275/ufs1_be.mod
12528     /boot/grub2/powerpc-ieee1275/loadenv.mod
12676     /boot/grub2/powerpc-ieee1275/gcry_crc.mod
13124     /boot/grub2/powerpc-ieee1275/ufs2.mod
13256     /boot/grub2/powerpc-ieee1275/ufs1.mod
13296     /boot/grub2/powerpc-ieee1275/exfat.mod
13568     /boot/grub2/powerpc-ieee1275/serial.mod
13712     /boot/grub2/powerpc-ieee1275/hfs.mod
13812     /boot/grub2/powerpc-ieee1275/fat.mod
13948     /boot/grub2/powerpc-ieee1275/video.mod
14576     /boot/grub2/powerpc-ieee1275/gcry_blowfish.mod
14768     /boot/grub2/powerpc-ieee1275/jfs.mod
14864     /boot/grub2/powerpc-ieee1275/afs.mod
14956     /boot/grub2/powerpc-ieee1275/ext2.mod
15376     /boot/grub2/powerpc-ieee1275/ldm.mod
15428     /boot/grub2/powerpc-ieee1275/jpeg.mod
15936     /boot/grub2/powerpc-ieee1275/lvm.mod
16092     /boot/grub2/powerpc-ieee1275/nilfs2.mod
16144     /boot/grub2/powerpc-ieee1275/bfs.mod
16336     /boot/grub2/powerpc-ieee1275/hfsplus.mod
16392     /boot/grub2/powerpc-ieee1275/png.mod
16628     /boot/grub2/powerpc-ieee1275/f2fs.mod
16824     /boot/grub2/powerpc-ieee1275/squash4.mod
17044     /boot/grub2/powerpc-ieee1275/gzio.mod
17228     /boot/grub2/powerpc-ieee1275/xfs.mod
17364     /boot/grub2/powerpc-ieee1275/linux.mod
18120     /boot/grub2/powerpc-ieee1275/macho.mod
19016     /boot/grub2/powerpc-ieee1275/iso9660.mod
19476     /boot/grub2/powerpc-ieee1275/appended_signature_test.mod
20048     /boot/grub2/powerpc-ieee1275/udf.mod
20172     /boot/grub2/powerpc-ieee1275/reiserfs.mod
22864     /boot/grub2/powerpc-ieee1275/gcry_tiger.mod
23332     /boot/grub2/powerpc-ieee1275/pgp.mod
23348     /boot/grub2/powerpc-ieee1275/gfxterm.mod
23480     /boot/grub2/powerpc-ieee1275/blscfg.mod
23876     /boot/grub2/powerpc-ieee1275/ntfs.mod
24376     /boot/grub2/powerpc-ieee1275/diskfilter.mod
24448     /boot/grub2/powerpc-ieee1275/cryptodisk.mod
27516     /boot/grub2/powerpc-ieee1275/font.mod
27604     /boot/grub2/powerpc-ieee1275/gcry_rijndael.mod
28056     /boot/grub2/powerpc-ieee1275/relocator.mod
28408     /boot/grub2/powerpc-ieee1275/luks2.mod
32472     /boot/grub2/powerpc-ieee1275/file.mod
32516     /boot/grub2/powerpc-ieee1275/gcry_cast5.mod
34836     /boot/grub2/powerpc-ieee1275/gcry_seed.mod
36756     /boot/grub2/powerpc-ieee1275/syslinuxcfg.mod
37108     /boot/grub2/powerpc-ieee1275/gcry_whirlpool.mod
37504     /boot/grub2/powerpc-ieee1275/xzio.mod
45120     /boot/grub2/powerpc-ieee1275/functional_test.mod
49780     /boot/grub2/powerpc-ieee1275/gcry_des.mod
50980     /boot/grub2/powerpc-ieee1275/test_asn1.mod
55644     /boot/grub2/powerpc-ieee1275/video_fb.mod
58880     /boot/grub2/powerpc-ieee1275/gcry_serpent.mod
58908     /boot/grub2/powerpc-ieee1275/asn1.mod
60344     /boot/grub2/powerpc-ieee1275/appendedsig.mod
64664     /boot/grub2/powerpc-ieee1275/btrfs.mod
69020     /boot/grub2/powerpc-ieee1275/mpi.mod
88536     /boot/grub2/powerpc-ieee1275/gfxmenu.mod
90436     /boot/grub2/powerpc-ieee1275/zfs.mod
90548     /boot/grub2/powerpc-ieee1275/gcry_camellia.mod
95652     /boot/grub2/powerpc-ieee1275/gcry_twofish.mod
127780    /boot/grub2/powerpc-ieee1275/net.mod
129948    /boot/grub2/powerpc-ieee1275/regexp.mod
141492    /boot/grub2/powerpc-ieee1275/zstd.mod
165660    /boot/grub2/grub
165660    /boot/grub2/powerpc-ieee1275/core.elf
235748    /boot/grub2/powerpc-ieee1275/normal.mod
2394108   /boot/grub2/fonts/unicode.pf2
30914365  /boot/ostree/rhcos-7022fef0ce6fe2dbc74dc84855f917bc677928b7d1f75ac0eb422b3d52472e29/vmlinuz-4.18.0-348.12.2.el8_5.ppc64le
35634133  /boot/ostree/rhcos-952c9677e944c268239aec2429e421d2a4fd4c751550a1c5f74d83bd2c5ddb1f/vmlinuz-4.18.0-372.9.1.el8.ppc64le
35634157  /boot/ostree/rhcos-1d101bb28fbd71daf2b6ffb8cf1261576cfd670cec4604a91b20a713523370a2/vmlinuz-4.18.0-372.13.1.el8_6.ppc64le
89933506  /boot/ostree/rhcos-952c9677e944c268239aec2429e421d2a4fd4c751550a1c5f74d83bd2c5ddb1f/initramfs-4.18.0-372.9.1.el8.ppc64le.img
91958667  /boot/ostree/rhcos-7022fef0ce6fe2dbc74dc84855f917bc677928b7d1f75ac0eb422b3d52472e29/initramfs-4.18.0-348.12.2.el8_5.ppc64le.img
It looks like the space is consumed by the kernel binaries and the current initramfs images, about 284 MB in total, with grub and the other smaller modules taking roughly another 6 MB. If the new initramfs is the same size as the current ones, around 90 MB, it will not fit; and the new kernel binary is a bit bigger, which will likely mean a bigger initramfs too. Even if there were no files other than the vmlinuz and initramfs images, there would not be space for a third initramfs even transiently, because only about 78 MB would be left. The kernel grew about 5 MB from 4.18.0-348 to 4.18.0-372 and the initramfs about 2 MB; with two copies of each, that is roughly 14 MB of additional space we now need, which might be what pushes us over the edge here.
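To make the arithmetic concrete, a quick back-of-the-envelope check using the byte counts from the master-2 listing (a sketch: df's "362M" is treated as MiB here, and filesystem overhead is ignored, so the real free space is smaller still — df actually reported only 62M available):

```shell
# Byte sizes from the `find /boot -type f | xargs du -b` output on master-2:
kernels=$((30914365 + 35634133 + 35634157))   # three vmlinuz binaries
initrds=$((89933506 + 91958667))              # the two existing initramfs images
grub=6001679                                  # grub modules, fonts, configs (~6 MB)

boot=$((362 * 1024 * 1024))                   # /boot size as reported by df -h
used=$((kernels + initrds + grub))
free=$((boot - used))

echo "used: $((used / 1048576)) MiB, free: $((free / 1048576)) MiB"

# The staged third deployment still needs its own initramfs, which is
# larger than what remains on the partition even before overhead:
test "$free" -lt 89933506 && echo "third initramfs does not fit"
```

The computed "used" figure lines up with the 277M that df reported on the failing node, which supports the size-growth explanation.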
I think the path forward will likely be: https://github.com/ostreedev/ostree/issues/2670
The ostree-side change is the most elegant fix. However, as https://github.com/coreos/fedora-coreos-tracker/issues/1247#issuecomment-1180469010 notes, it would need to be deployed on the version we're upgrading *from*. Concretely, we'd need to ship an updated ostree back to RHEL 8.4.z, get that shipped in an OCP 4.10.z release, and then ensure the upgrade graph requires going through that 4.10.z before 4.11. Whereas if we patch the MCO, as in https://github.com/openshift/machine-config-operator/pull/2302#issuecomment-1179091466, we will not need any upgrade-graph changes: the updated MCD will land and operate on older systems *before* we upgrade the host. So while I personally think the ostree fix is the most elegant and the most sustainable long term, the above may argue for reassigning this to the MCO for now.
I am OK with that. I can keep the ostree change on my plate, and we can solve this for now with https://github.com/openshift/machine-config-operator/pull/3243/
A workaround for this issue has been merged in the MCO. Could you retry the 4.10 -> 4.11 and 4.11 -> 4.11 updates with the latest releases? Thanks!
Thanks! Yes, I verified that the upgrade works fine with the latest RC, 4.11.0-rc.3. The test scenario was 4.10.23 -> 4.11.0-rc.2 -> 4.11.0-rc.3 - PASS.

Post upgrade:

sh-4.4# rpm-ostree status
State: idle
Deployments:
* pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:0615fd76754f79d3da5d5aa3627d5321e9846e276e87bb8567e32e1ac65f4fdb
    CustomOrigin: Managed by machine-config-operator
    Version: 411.86.202207150037-0 (2022-07-15T00:44:41Z)

# oc get clusterversion
NAME      VERSION       AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.11.0-rc.3   True        False         3h51m   Cluster version is 4.11.0-rc.3

# oc get nodes
NAME       STATUS   ROLES    AGE     VERSION
master-0   Ready    master   7h54m   v1.24.0+9546431
master-1   Ready    master   7h55m   v1.24.0+9546431
master-2   Ready    master   7h54m   v1.24.0+9546431
worker-0   Ready    worker   7h43m   v1.24.0+9546431
worker-1   Ready    worker   7h46m   v1.24.0+9546431

# oc get clusterversion -o json | jq ".items[0].status.history"
[
  {
    "completionTime": "2022-07-19T08:51:59Z",
    "image": "quay.io/openshift-release-dev/ocp-release@sha256:e870c897fd9d87a6839a7747c640cbe862d81f377eb2315b166bb489f2f5abf6",
    "startedTime": "2022-07-19T08:11:35Z",
    "state": "Completed",
    "verified": true,
    "version": "4.11.0-rc.3"
  },
  {
    "completionTime": "2022-07-19T08:01:28Z",
    "image": "quay.io/openshift-release-dev/ocp-release@sha256:1d1a712818baf48944e248922c1be350535347d64d8a9e27c74d38a1a84c9846",
    "startedTime": "2022-07-19T06:51:54Z",
    "state": "Completed",
    "verified": true,
    "version": "4.11.0-rc.2"
  },
  {
    "completionTime": "2022-07-19T05:02:47Z",
    "image": "quay.io/openshift-release-dev/ocp-release@sha256:0b5822f70610ff8f624a8b83378000908f25701dfea83c377abc84d82d636099",
    "startedTime": "2022-07-19T04:41:07Z",
    "state": "Completed",
    "verified": false,
    "version": "4.10.23"
  }
]

# oc get mcp -A
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
master   rendered-master-aa68b9dd37f3324224072db1bc89b694   True      False      False      3              3                   3                     0                      8h
worker   rendered-worker-00ce21b598e67079e83b697268e00b14   True      False      False      2              2                   2                     0                      8h
Setting as verified as we have https://bugzilla.redhat.com/show_bug.cgi?id=2104619#c21
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.12.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:7399