Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1953493

Summary: worker pool went degraded due to no rpm-ostree on rhel worker during applying new mc
Product: OpenShift Container Platform
Component: Machine Config Operator
Version: 4.7
Target Release: 4.6.z
Status: CLOSED ERRATA
Severity: high
Priority: high
Reporter: Sinny Kumari <skumari>
Assignee: Sinny Kumari <skumari>
QA Contact: Michael Nguyen <mnguyen>
CC: jiajliu, mnguyen, skumari
Hardware: Unspecified
OS: Unspecified
Clone Of: 1952368
Bug Depends On: 1953475
Last Closed: 2021-06-15 19:30:17 UTC

Comment 3 Michael Nguyen 2021-06-07 19:35:54 UTC
Verified on 4.6.0-0.nightly-2021-06-07-054625


$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.6.0-0.nightly-2021-06-07-054625   True        False         3h30m   Cluster version is 4.6.0-0.nightly-2021-06-07-054625
$ cat trifecta.yaml 
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: worker-extensions-usbguard
spec:
  config:
    ignition:
      version: 3.1.0
  extensions:
    - usbguard
  kernelType: realtime
  kernelArguments:
  - 'z=10'
$ oc create -f trifecta.yaml 
machineconfig.machineconfiguration.openshift.io/worker-extensions-usbguard created
$ oc get mc
NAME                                               GENERATEDBYCONTROLLER                      IGNITIONVERSION   AGE
00-master                                          fc95c0aa903c4269eea4128d8a423747138ea4be   3.1.0             3h52m
00-worker                                          fc95c0aa903c4269eea4128d8a423747138ea4be   3.1.0             3h52m
01-master-container-runtime                        fc95c0aa903c4269eea4128d8a423747138ea4be   3.1.0             3h52m
01-master-kubelet                                  fc95c0aa903c4269eea4128d8a423747138ea4be   3.1.0             3h52m
01-worker-container-runtime                        fc95c0aa903c4269eea4128d8a423747138ea4be   3.1.0             3h52m
01-worker-kubelet                                  fc95c0aa903c4269eea4128d8a423747138ea4be   3.1.0             3h52m
99-master-generated-registries                     fc95c0aa903c4269eea4128d8a423747138ea4be   3.1.0             3h52m
99-master-ssh                                                                                 3.1.0             3h57m
99-worker-generated-registries                     fc95c0aa903c4269eea4128d8a423747138ea4be   3.1.0             3h52m
99-worker-ssh                                                                                 3.1.0             3h57m
rendered-master-ceb65787dcbeeef1c6f9ea03fc92cc83   fc95c0aa903c4269eea4128d8a423747138ea4be   3.1.0             3h52m
rendered-worker-d5b114ebbaaa54d7165c9405d6735619   fc95c0aa903c4269eea4128d8a423747138ea4be   3.1.0             3h52m
worker-extensions-usbguard                                                                    3.1.0             3s
$ oc get mcp/worker
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
worker   rendered-worker-d5b114ebbaaa54d7165c9405d6735619   True      False      False      3              3                   3                     0                      3h54m
$ oc get mcp/worker
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
worker   rendered-worker-d5b114ebbaaa54d7165c9405d6735619   False     True       False      3              0                   0                     0                      3h54m
$ oc get nodes
NAME                                        STATUS                     ROLES    AGE     VERSION
ip-10-0-52-43.us-east-2.compute.internal    Ready,SchedulingDisabled   worker   3h47m   v1.19.0+c3e2e69
ip-10-0-53-65.us-east-2.compute.internal    Ready                      master   3h56m   v1.19.0+c3e2e69
ip-10-0-58-166.us-east-2.compute.internal   Ready                      worker   13m     v1.19.0+c3e2e69
ip-10-0-63-175.us-east-2.compute.internal   Ready                      master   3h55m   v1.19.0+c3e2e69
ip-10-0-76-20.us-east-2.compute.internal    Ready                      worker   3h45m   v1.19.0+c3e2e69
ip-10-0-79-202.us-east-2.compute.internal   Ready                      master   3h55m   v1.19.0+c3e2e69
$ watch oc get nodes
$ oc get mcp/worker
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
worker   rendered-worker-1f5748ea8c6414f92bf9dd7f3942eebb   True      False      False      3              3                   3                     0                      4h9m
$ oc debug node/ip-10-0-52-43.us-east-2.compute.internal
Starting pod/ip-10-0-52-43us-east-2computeinternal-debug ...
To use host binaries, run `chroot /host`
If you don't see a command prompt, try pressing enter.
sh-4.2# chroot /host
sh-4.4# rpm -qa | grep kernel
kernel-rt-modules-4.18.0-193.56.1.rt13.106.el8_2.x86_64
kernel-rt-modules-extra-4.18.0-193.56.1.rt13.106.el8_2.x86_64
kernel-rt-core-4.18.0-193.56.1.rt13.106.el8_2.x86_64
kernel-rt-kvm-4.18.0-193.56.1.rt13.106.el8_2.x86_64
sh-4.4# uname -a
Linux ip-10-0-52-43 4.18.0-193.56.1.rt13.106.el8_2.x86_64 #1 SMP PREEMPT RT Wed May 12 16:10:12 EDT 2021 x86_64 x86_64 x86_64 GNU/Linux
sh-4.4# cat /proc/cmdline 
BOOT_IMAGE=(hd0,gpt1)/ostree/rhcos-a8217562a29572884d0ec056426d6e19ab6f4c0c2b1d339f320c0c3878e640c0/vmlinuz-4.18.0-193.56.1.rt13.106.el8_2.x86_64 rhcos.root=crypt_rootfs random.trust_cpu=on console=tty0 console=ttyS0,115200n8 rd.luks.options=discard ostree=/ostree/boot.1/rhcos/a8217562a29572884d0ec056426d6e19ab6f4c0c2b1d339f320c0c3878e640c0/0 ignition.platform.id=aws z=10
sh-4.4# exit
exit
sh-4.2# exit
exit

Removing debug pod ...
$ oc debug node/ip-10-0-58-166.us-east-2.compute.internal
Starting pod/ip-10-0-58-166us-east-2computeinternal-debug ...
To use host binaries, run `chroot /host`
If you don't see a command prompt, try pressing enter.
sh-4.2# chroot /host
sh-4.2# rpm -qa | grep kernel
kernel-tools-3.10.0-1127.el7.x86_64
kernel-tools-libs-3.10.0-1127.el7.x86_64
kernel-3.10.0-1160.25.1.el7.x86_64
kernel-3.10.0-1127.el7.x86_64
sh-4.2# uname -a
Linux ip-10-0-58-166.us-east-2.compute.internal 3.10.0-1160.25.1.el7.x86_64 #1 SMP Tue Apr 13 18:55:45 EDT 2021 x86_64 x86_64 x86_64 GNU/Linux
sh-4.2# cat /proc/cmdline 
BOOT_IMAGE=/boot/vmlinuz-3.10.0-1160.25.1.el7.x86_64 root=UUID=5a000634-a1fc-467d-8ef4-5fcf5dbc6033 ro console=ttyS0,115200n8 console=tty0 net.ifnames=0 rd.blacklist=nouveau nvme_core.io_timeout=4294967295 crashkernel=auto LANG=en_US.UTF-8
sh-4.2# exit
exit
sh-4.2# exit
exit

Removing debug pod ...
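The repeated `oc get mcp/worker` checks above amount to reading the UPDATED/UPDATING/DEGRADED and machine-count columns until the pool reports fully rolled out. A minimal sketch of that column check, runnable against the status lines captured above (the `pool_rolled_out` helper is illustrative, not part of any Red Hat tooling):

```shell
# Decide whether a pool has finished rolling out a MachineConfig, given one
# data line of `oc get mcp/<pool>` output. Column order:
# NAME CONFIG UPDATED UPDATING DEGRADED MACHINECOUNT READYMACHINECOUNT ...
pool_rolled_out() {
  set -- $1                       # split the status line into fields
  local updated=$3 updating=$4 degraded=$5 machines=$6 ready=$7
  [ "$updated" = "True" ] && [ "$updating" = "False" ] && \
    [ "$degraded" = "False" ] && [ "$machines" = "$ready" ]
}

# Status lines taken from the verification transcript above.
before="worker rendered-worker-d5b114ebbaaa54d7165c9405d6735619 False True False 3 0 0 0 3h54m"
after="worker rendered-worker-1f5748ea8c6414f92bf9dd7f3942eebb True False False 3 3 3 0 4h9m"

pool_rolled_out "$before" && echo "before: done" || echo "before: still updating"
pool_rolled_out "$after"  && echo "after: done"  || echo "after: still updating"
```

In practice the same wait can be driven by `watch oc get nodes` as shown in the transcript; the helper just makes explicit which columns signal completion.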

Comment 5 errata-xmlrpc 2021-06-15 19:30:17 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6.34 bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:2267