Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1845427

Summary: Removing 'worker-cnf' label from the worker node does not revert RT Kernel to non-RT Kernel.
Product: OpenShift Container Platform
Component: Performance Addon Operator
Version: 4.4
Status: CLOSED UPSTREAM
Severity: medium
Priority: unspecified
Reporter: Gowrishankar Rajaiyan <grajaiya>
Assignee: Yanir Quinn <yquinn>
QA Contact: Gowrishankar Rajaiyan <grajaiya>
CC: aos-bugs, dshchedr, fiezzi, grajaiya, scuppett, yquinn
Target Release: 4.6.0
Hardware: Unspecified
OS: Unspecified
Doc Type: If docs needed, set a value
Type: Bug
Last Closed: 2020-09-11 06:23:33 UTC
Bug Depends On: 1827712

Description Gowrishankar Rajaiyan 2020-06-09 08:11:00 UTC
Description of problem: Removing the 'worker-cnf' label from a worker node should remove the RT kernel from that node and revert it to its previous state. Instead, the RT kernel remains installed after the label is removed.


Version-Release number of selected component (if applicable): v4.4.0-77


How reproducible: Not consistently reproducible, but we hit this issue twice during our nightly test execution.


Steps to Reproduce:
1. Install OCP 4.4.7
2. Label worker node with 'worker-cnf'.
3. Deploy Performance Addon Operator and Performance Profile with 'realTimeKernel: {enabled: true}'
4. Ensure that the RT kernel is installed on the 'worker-cnf' labeled worker node.
5. Remove the 'worker-cnf' label.
6. Verify whether the RT kernel is reverted to the non-RT kernel.
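
The label and kernel-check steps above can be sketched as commands. This is a minimal sketch, not the test automation used for this report: it only builds the `oc` invocations for labeling a node, removing the label, and reading the node's kernel version, and it does not run them. The label key is the one in the PerformanceProfile node selector; the function names and the `.rt` substring check are assumptions made for illustration.

```python
# Label key taken from the PerformanceProfile node selector in this report.
WORKER_CNF_LABEL = "node-role.kubernetes.io/worker-cnf"


def label_commands(node: str) -> dict:
    """Build (but do not run) the oc invocations for labeling, unlabeling,
    and reading the kernel version of a node."""
    return {
        # A trailing "=" sets the label with an empty value.
        "add": ["oc", "label", "node", node, f"{WORKER_CNF_LABEL}="],
        # A trailing "-" removes the label.
        "remove": ["oc", "label", "node", node, f"{WORKER_CNF_LABEL}-"],
        "kernel": ["oc", "get", "node", node, "-o",
                   "jsonpath={.status.nodeInfo.kernelVersion}"],
    }


def looks_like_rt_kernel(kernel_version: str) -> bool:
    """Heuristic (assumption): RT kernel versions in this report contain '.rt'."""
    return ".rt" in kernel_version


if __name__ == "__main__":
    for step, argv in label_commands("worker-1").items():
        print(step, " ".join(argv))
```

After the label is removed and the node has rebooted, the expected result corresponds to `looks_like_rt_kernel` returning `False` for the reported kernel version.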

Actual results: RT Kernel is still installed.


Expected results: RT Kernel is reverted back to non-RT Kernel


Additional info:

Comment 1 Gowrishankar Rajaiyan 2020-06-09 11:14:59 UTC
*Note: The following information is from a _similar_ cluster where the issue could not be reproduced.*


# oc get node -o wide
NAME       STATUS   ROLES               AGE    VERSION           INTERNAL-IP      EXTERNAL-IP   OS-IMAGE                                                       KERNEL-VERSION                         CONTAINER-RUNTIME
master-0   Ready    master              158m   v1.17.1+3f6f40d   192.168.111.20   <none>        Red Hat Enterprise Linux CoreOS 44.81.202006080130-0 (Ootpa)   4.18.0-147.8.1.el8_1.x86_64            cri-o://1.17.4-14.dev.rhaos4.4.gitb93af5d.el8
master-1   Ready    master              158m   v1.17.1+3f6f40d   192.168.111.21   <none>        Red Hat Enterprise Linux CoreOS 44.81.202006080130-0 (Ootpa)   4.18.0-147.8.1.el8_1.x86_64            cri-o://1.17.4-14.dev.rhaos4.4.gitb93af5d.el8
master-2   Ready    master              158m   v1.17.1+3f6f40d   192.168.111.22   <none>        Red Hat Enterprise Linux CoreOS 44.81.202006080130-0 (Ootpa)   4.18.0-147.8.1.el8_1.x86_64            cri-o://1.17.4-14.dev.rhaos4.4.gitb93af5d.el8
worker-0   Ready    worker,worker-cnf   137m   v1.17.1+3f6f40d   192.168.111.23   <none>        Red Hat Enterprise Linux CoreOS 44.81.202006080130-0 (Ootpa)   4.18.0-147.8.1.rt24.101.el8_1.x86_64   cri-o://1.17.4-14.dev.rhaos4.4.gitb93af5d.el8
worker-1   Ready    worker              138m   v1.17.1+3f6f40d   192.168.111.24   <none>        Red Hat Enterprise Linux CoreOS 44.81.202006080130-0 (Ootpa)   4.18.0-147.8.1.el8_1.x86_64            cri-o://1.17.4-14.dev.rhaos4.4.gitb93af5d.el8
worker-2   Ready    worker              138m   v1.17.1+3f6f40d   192.168.111.25   <none>        Red Hat Enterprise Linux CoreOS 44.81.202006080130-0 (Ootpa)   4.18.0-147.8.1.el8_1.x86_64            cri-o://1.17.4-14.dev.rhaos4.4.gitb93af5d.el8



# oc describe performanceprofiles.performance.openshift.io performance
Name:         performance
Namespace:
Labels:       <none>
Annotations:  kubectl.kubernetes.io/last-applied-configuration:
                {"apiVersion":"performance.openshift.io/v1alpha1","kind":"PerformanceProfile","metadata":{"annotations":{},"name":"performance"},"spec":{"...
API Version:  performance.openshift.io/v1alpha1
Kind:         PerformanceProfile
Metadata:
  Creation Timestamp:  2020-06-09T09:36:40Z
  Finalizers:
    foreground-deletion
  Generation:        7
  Resource Version:  91146
  Self Link:         /apis/performance.openshift.io/v1alpha1/performanceprofiles/performance
  UID:               a2bc52cc-9b13-4904-a47e-9f7a592daf5b
Spec:
  Cpu:
    Isolated:  1-3
    Reserved:  0
  Hugepages:
    Default Hugepages Size:  1G
    Pages:
      Count:  1
      Size:   1G
  Node Selector:
    node-role.kubernetes.io/worker-cnf:
  Real Time Kernel:
    Enabled:  true
Status:
  Conditions:
    Last Heartbeat Time:   2020-06-09T10:08:01Z
    Last Transition Time:  2020-06-09T10:08:01Z
    Status:                True
    Type:                  Available
    Last Heartbeat Time:   2020-06-09T10:08:01Z
    Last Transition Time:  2020-06-09T10:08:01Z
    Status:                True
    Type:                  Upgradeable
    Last Heartbeat Time:   2020-06-09T10:08:01Z
    Last Transition Time:  2020-06-09T10:08:01Z
    Status:                False
    Type:                  Progressing
    Last Heartbeat Time:   2020-06-09T10:08:01Z
    Last Transition Time:  2020-06-09T10:08:01Z
    Status:                False
    Type:                  Degraded
Events:
  Type     Reason              Age                From                            Message
  ----     ------              ----               ----                            -------
  Normal   Creation succeeded  87m                performance-profile-controller  Succeeded to create all components
  Warning  Creation failed     56m                performance-profile-controller  Failed to create all components: Operation cannot be fulfilled on machineconfigs.machineconfiguration.openshift.io "performance-performance": the object has been modified; please apply your changes to the latest version and try again
  Normal   Creation succeeded  10m (x5 over 69m)  performance-profile-controller  Succeeded to create all components





# oc get mcp
NAME         CONFIG                                                 UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
master       rendered-master-0112f787eaed5267898b3988aec444eb       True      False      False      3              3                   3                     0                      150m
worker       rendered-worker-2e3a4df0240eb8745b2845521a1f28f8       True      False      False      2              2                   2                     0                      150m
worker-cnf   rendered-worker-cnf-5c311bbe9c9f762497d1753a8fb27536   True      False      False      1              1                   1                     0                      97m




# oc describe mcp worker-cnf
Name:         worker-cnf
Namespace:
Labels:       machineconfiguration.openshift.io/role=worker-cnf
Annotations:  kubectl.kubernetes.io/last-applied-configuration:
                {"apiVersion":"machineconfiguration.openshift.io/v1","kind":"MachineConfigPool","metadata":{"annotations":{},"labels":{"machineconfigurati...
API Version:  machineconfiguration.openshift.io/v1
Kind:         MachineConfigPool
Metadata:
  Creation Timestamp:  2020-06-09T09:29:14Z
  Generation:          6
  Resource Version:    79333
  Self Link:           /apis/machineconfiguration.openshift.io/v1/machineconfigpools/worker-cnf
  UID:                 68b55a8e-1d12-4e8f-b57d-86cea701f80d
Spec:
  Configuration:
    Name:  rendered-worker-cnf-5c311bbe9c9f762497d1753a8fb27536
    Source:
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         00-worker
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         00-worker-chronyd-custom
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         01-worker-container-runtime
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         01-worker-kubelet
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         98-worker-92ef12b3-6b66-4348-b91f-f49b08f49348-kubelet
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         98-worker-cnf-68b55a8e-1d12-4e8f-b57d-86cea701f80d-kubelet
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         99-worker-92ef12b3-6b66-4348-b91f-f49b08f49348-registries
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         99-worker-cnf-68b55a8e-1d12-4e8f-b57d-86cea701f80d-kubelet
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         99-worker-registries
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         99-worker-ssh
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         performance-performance
  Machine Config Selector:
    Match Expressions:
      Key:       machineconfiguration.openshift.io/role
      Operator:  In
      Values:
        worker-cnf
        worker
  Node Selector:
    Match Labels:
      node-role.kubernetes.io/worker-cnf:
  Paused:                                  false




# oc describe mc performance-performance
Name:         performance-performance
Namespace:
Labels:       machineconfiguration.openshift.io/role=worker-cnf
Annotations:  <none>
API Version:  machineconfiguration.openshift.io/v1
Kind:         MachineConfig
Metadata:
  Creation Timestamp:  2020-06-09T09:36:49Z
  Generation:          4
  Owner References:
    API Version:           performance.openshift.io/v1alpha1
    Block Owner Deletion:  true
    Controller:            true
    Kind:                  PerformanceProfile
    Name:                  performance
    UID:                   a2bc52cc-9b13-4904-a47e-9f7a592daf5b
  Resource Version:        91150
  Self Link:               /apis/machineconfiguration.openshift.io/v1/machineconfigs/performance-performance
  UID:                     841ae0a3-4762-4602-900f-10e988fcb14e
Spec:
  Config:
    Ignition:
      Config:
      Security:
        Tls:
      Timeouts:
      Version:  2.2.0
    Networkd:
    Passwd:
    Storage:
      Files:
        Contents:
          Source:  data:text/plain;charset=utf-8;base64,IyEvdXNyL2Jpbi9lbnYgYmFzaAoKc2V0IC1ldW8gcGlwZWZhaWwKClNZU1RFTV9DT05GSUdfRklMRT0iL2V0Yy9zeXN0ZW1kL3N5c3RlbS5jb25mIgpTWVNURU1fQ09ORklHX0NVU1RPTV9GSUxFPSIvZXRjL3N5c3RlbWQvc3lzdGVtLmNvbmYuZC9zZXRBZmZpbml0eS5jb25mIgoKaWYgWyAtZiAvZXRjL3N5c2NvbmZpZy9pcnFiYWxhbmNlIF0gJiYgWyAtZiAke1NZU1RFTV9DT05GSUdfQ1VTVE9NX0ZJTEV9IF0gICYmIHJwbS1vc3RyZWUgc3RhdHVzIC1iIHwgZ3JlcCAtcSAtZSAiJHtTWVNURU1fQ09ORklHX0ZJTEV9ICR7U1lTVEVNX0NPTkZJR19DVVNUT01fRklMRX0iICYmIGVncmVwIC13cSAiXklSUUJBTEFOQ0VfQkFOTkVEX0NQVVM9JHtSRVNFUlZFRF9DUFVfTUFTS19JTlZFUlR9IiAvZXRjL3N5c2NvbmZpZy9pcnFiYWxhbmNlOyB0aGVuCiAgICBlY2hvICJQcmUgYm9vdCB0dW5pbmcgY29uZmlndXJhdGlvbiBhbHJlYWR5IGFwcGxpZWQiCmVsc2UKICAgICNTZXQgSVJRIGJhbGFuY2UgYmFubmVkIGNwdXMKICAgIGlmIFsgISAtZiAvZXRjL3N5c2NvbmZpZy9pcnFiYWxhbmNlIF07IHRoZW4KICAgICAgICB0b3VjaCAvZXRjL3N5c2NvbmZpZy9pcnFiYWxhbmNlCiAgICBmaQoKICAgIGlmIGdyZXAgLWxzICJJUlFCQUxBTkNFX0JBTk5FRF9DUFVTPSIgL2V0Yy9zeXNjb25maWcvaXJxYmFsYW5jZTsgdGhlbgogICAgICAgIHNlZCAtaSAicy9eLipJUlFCQUxBTkNFX0JBTk5FRF9DUFVTPS4qJC9JUlFCQUxBTkNFX0JBTk5FRF9DUFVTPSR7UkVTRVJWRURfQ1BVX01BU0tfSU5WRVJUfS8iIC9ldGMvc3lzY29uZmlnL2lycWJhbGFuY2UKICAgIGVsc2UKICAgICAgICBlY2hvICJJUlFCQUxBTkNFX0JBTk5FRF9DUFVTPSR7UkVTRVJWRURfQ1BVX01BU0tfSU5WRVJUfSIgPj4vZXRjL3N5c2NvbmZpZy9pcnFiYWxhbmNlCiAgICBmaQoKICAgIHJwbS1vc3RyZWUgaW5pdHJhbWZzIC0tZW5hYmxlIC0tYXJnPS1JIC0tYXJnPSIke1NZU1RFTV9DT05GSUdfRklMRX0gJHtTWVNURU1fQ09ORklHX0NVU1RPTV9GSUxFfSIgCgogICAgdG91Y2ggL3Zhci9yZWJvb3QKZmkK
          Verification:
        Filesystem:  root
        Mode:        448
        Path:        /usr/local/bin/pre-boot-tuning.sh
        Contents:
          Source:  data:text/plain;charset=utf-8;base64,IyEvdXNyL2Jpbi9lbnYgYmFzaAoKc2V0IC1ldW8gcGlwZWZhaWwKCm5vZGVzX3BhdGg9Ii9zeXMvZGV2aWNlcy9zeXN0ZW0vbm9kZSIKaHVnZXBhZ2VzX2ZpbGU9IiR7bm9kZXNfcGF0aH0vbm9kZSR7TlVNQV9OT0RFfS9odWdlcGFnZXMvaHVnZXBhZ2VzLSR7SFVHRVBBR0VTX1NJWkV9a0IvbnJfaHVnZXBhZ2VzIgoKaWYgWyAhIC1mICAke2h1Z2VwYWdlc19maWxlfSBdOyB0aGVuCiAgICBlY2hvICJFUlJPUjogJHtodWdlcGFnZXNfZmlsZX0gZG9lcyBub3QgZXhpc3QiCiAgICBleGl0IDEKZmkKCmVjaG8gJHtIVUdFUEFHRVNfQ09VTlR9ID4gJHtodWdlcGFnZXNfZmlsZX0K
          Verification:
        Filesystem:  root
        Mode:        448
        Path:        /usr/local/bin/hugepages-allocation.sh
        Contents:
          Source:  data:text/plain;charset=utf-8;base64,IyEvdXNyL2Jpbi9lbnYgYmFzaAoKc2V0IC1ldW8gcGlwZWZhaWwKCmlmIFtbIC1mIC92YXIvcmVib290IF1dOyB0aGVuIAogICAgcm0gLWYgL3Zhci9yZWJvb3QKICAgIGVjaG8gIkZpbGUgL3Zhci9yZWJvb3QgZXhpc3RzLCBpbml0aWF0ZSByZWJvb3QiCiAgICBzeXN0ZW1jdGwgcmVib290CmZpCg==
          Verification:
        Filesystem:  root
        Mode:        448
        Path:        /usr/local/bin/reboot.sh
        Contents:
          Source:  data:text/plain;charset=utf-8;base64,W01hbmFnZXJdCkNQVUFmZmluaXR5PTA=
          Verification:
        Filesystem:  root
        Mode:        448
        Path:        /etc/systemd/system.conf.d/setAffinity.conf
    Systemd:
      Units:
        Contents:  [Unit]
Description=Preboot tuning patch
Before=kubelet.service
Before=reboot.service

[Service]
Environment=RESERVED_CPUS=0
Environment=RESERVED_CPU_MASK_INVERT=ffffffff,fffffffe
Type=oneshot
RemainAfterExit=true
ExecStart=/usr/local/bin/pre-boot-tuning.sh

[Install]
WantedBy=multi-user.target

        Enabled:   true
        Name:      pre-boot-tuning.service
        Contents:  [Unit]
Description=Reboot initiated by pre-boot-tuning
Wants=network-online.target
After=network-online.target
Before=kubelet.service

[Service]
Type=oneshot
RemainAfterExit=true
ExecStart=/usr/local/bin/reboot.sh

[Install]
WantedBy=multi-user.target

        Enabled:  true
        Name:     reboot.service
  Fips:           false
  Kernel Arguments:
    nohz=on
    nosoftlockup
    skew_tick=1
    intel_pstate=disable
    intel_iommu=on
    iommu=pt
    rcu_nocbs=1-3
    tuned.non_isolcpus=00000001
    default_hugepagesz=1G
    hugepagesz=1G
    hugepages=1
  Kernel Type:   realtime
  Os Image URL:
Events:          <none>
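
The file contents in the MachineConfig above are base64 data URLs, so they are not readable in the `oc describe` output. A minimal sketch for decoding one of them follows; `decode_data_url` is a hypothetical helper name, and the sample value is the source of /etc/systemd/system.conf.d/setAffinity.conf copied from the output above.

```python
import base64


def decode_data_url(source: str) -> str:
    """Decode a 'data:<type>;base64,<payload>' source string into text."""
    header, _, payload = source.partition(",")
    if not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("not a base64 data URL")
    return base64.b64decode(payload).decode("utf-8")


# Source of /etc/systemd/system.conf.d/setAffinity.conf from the output above.
set_affinity = "data:text/plain;charset=utf-8;base64,W01hbmFnZXJdCkNQVUFmZmluaXR5PTA="

print(decode_data_url(set_affinity))
# [Manager]
# CPUAffinity=0
```

The same function should work for the longer script payloads (pre-boot-tuning.sh, hugepages-allocation.sh, reboot.sh) if their full source strings are pasted in.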





# oc get node worker-1 -o yaml
apiVersion: v1
kind: Node
metadata:
  annotations:
    machine.openshift.io/machine: openshift-machine-api/ostest-worker-0-f78ck
    machineconfiguration.openshift.io/currentConfig: rendered-worker-2e3a4df0240eb8745b2845521a1f28f8
    machineconfiguration.openshift.io/desiredConfig: rendered-worker-2e3a4df0240eb8745b2845521a1f28f8
    machineconfiguration.openshift.io/reason: ""
    machineconfiguration.openshift.io/state: Done
    volumes.kubernetes.io/controller-managed-attach-detach: "true"
  creationTimestamp: "2020-06-09T08:56:29Z"
  labels:
    beta.kubernetes.io/arch: amd64
    beta.kubernetes.io/os: linux
    kubernetes.io/arch: amd64
    kubernetes.io/hostname: worker-1
    kubernetes.io/os: linux
    node-role.kubernetes.io/worker: ""
    node.openshift.io/os_id: rhcos
    ptp/slave: ""
  name: worker-1
  resourceVersion: "97743"
  selfLink: /api/v1/nodes/worker-1
  uid: 4560f482-ce60-4aa2-8237-5864c38727da
spec: {}
status:
  addresses:
  - address: 192.168.111.24
    type: InternalIP
  - address: worker-1
    type: Hostname
  allocatable:
    cpu: 3500m
    ephemeral-storage: "17683605064"
    hugepages-1Gi: "0"
    hugepages-2Mi: "0"
    memory: 6995204Ki
    pods: "250"
  capacity:
    cpu: "4"
    ephemeral-storage: 19876Mi
    hugepages-1Gi: "0"
    hugepages-2Mi: "0"
    memory: 8146180Ki
    pods: "250"
  conditions:
  - lastHeartbeatTime: "2020-06-09T11:09:13Z"
    lastTransitionTime: "2020-06-09T10:49:13Z"
    message: kubelet has sufficient memory available
    reason: KubeletHasSufficientMemory
    status: "False"
    type: MemoryPressure
  - lastHeartbeatTime: "2020-06-09T11:09:13Z"
    lastTransitionTime: "2020-06-09T10:49:13Z"
    message: kubelet has no disk pressure
    reason: KubeletHasNoDiskPressure
    status: "False"
    type: DiskPressure
  - lastHeartbeatTime: "2020-06-09T11:09:13Z"
    lastTransitionTime: "2020-06-09T10:49:13Z"
    message: kubelet has sufficient PID available
    reason: KubeletHasSufficientPID
    status: "False"
    type: PIDPressure
  - lastHeartbeatTime: "2020-06-09T11:09:13Z"
    lastTransitionTime: "2020-06-09T10:49:13Z"
    message: kubelet is posting ready status
    reason: KubeletReady
    status: "True"
    type: Ready
  daemonEndpoints:
    kubeletEndpoint:
      Port: 10250
  images:
  - names:
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:ec5c9406f29ac98580228688db7849590f949e01df39952327adfb261197c16c
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
    sizeBytes: 773373898
  - names:
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:d5441ced3440395aa27b2d6ceec3315acf55f2dccc19d76a2f0e704d30b77cc0
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
    sizeBytes: 467404338
  - names:
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:a16b0dab9867070830d18ba5cab98d02b92fa367b69d464e6e22860f0a6293e0
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
    sizeBytes: 454102624
  - names:
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:398e40715b7e39428c2c8d8cedfcc024cf0bed8844b80bf22ff4107345e2e298
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
    sizeBytes: 429949466
  - names:
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:79a4986a92c449401e69c14f34f2e0ad92bc219dd43f716ed806d551e2e09f72
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
    sizeBytes: 428902230
  - names:
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:d879eb6e4426c976d1fffef15a38ff2454bca5d382c13c7b157181db9373f43e
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
    sizeBytes: 367621904
  - names:
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:81dce0d0947963d8ce70c74bc63eaf2f2cdf17c40b839623c39d52f6a5ef2d61
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
    sizeBytes: 364552618
  - names:
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:127b6610891c6e38d8ee474e134b9799ecc6cf0ae9be659f06f933ea84e7c877
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
    sizeBytes: 342705361
  - names:
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:b89e9a469c30c0b707766d0ae31665ccd6a13135c8074087f617ddee12860774
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
    sizeBytes: 338328160
  - names:
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:9e4d6bd70d5c481267ec2c420b7ee40d2addecb190e4787ac912996a29c90d00
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
    sizeBytes: 334993182
  - names:
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:ba38bc55a210372ca5243310f470d98e6b0d6712609eccfbcc4e5961c1cb0205
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
    sizeBytes: 333782407
  - names:
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:8376e01b6c5a4f045548a16946d5b24918ddd88f156e26249e2ff38ab539d512
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
    sizeBytes: 325752144
  - names:
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:c5ec8d53faf22153f54fc003b094fa263b769b50a457c878824c800e344b1b2f
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
    sizeBytes: 318103617
  - names:
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:19f3539c87ed84a02ee5a15438928083aa9e82294ff9a8dcb86d6eeeeff68c0b
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
    sizeBytes: 311095159
  - names:
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:3e185bf343f119d7e64d3d4297ad358c86a613bd047f8d45176a0a484f5d87f3
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
    sizeBytes: 305661722
  - names:
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:43e5bf12c08a33946f99f475ae062187242d5ec74b7d23b48621a1b9f91e44b3
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
    sizeBytes: 299586006
  - names:
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:12cce448b8888f398604eeb6a3a7a14c2850cd9dfd463229bd0735aa68ada885
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
    sizeBytes: 283513263
  - names:
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:294f21bc3baa4221f3c02ed3e2dadcf3e61490813691579be9f05d69b3639f3c
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
    sizeBytes: 277408258
  - names:
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:dbc06ac527acaed79a4b189ba091fb4a37bfaa4b061f399011e07163fa4fcd26
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
    sizeBytes: 277232582
  - names:
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:9166f5f53cbaddbf11d4895c96d73b16a9b9002cb3fadefd9f7e13e96ee16edc
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
    sizeBytes: 276916452
  - names:
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:c69cb2203988cae3506a58d8616ac928b7f2f796d147fbc6640f644398fb1949
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
    sizeBytes: 267689884
  - names:
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:6b3e62612cd9192b7af779ff34a95b039cac3058b0c39129eb37a7deec08908e
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
    sizeBytes: 264283423
  - names:
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:d7a71a527f95dd9b607cb2db3a2809003818573e6f550a96238ac2a55cc8a635
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
    sizeBytes: 258272205
  - names:
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:e08c2b9e5bf641a524636e7b699830d27464636e431dd992a4a49f56b27d4dd2
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
    sizeBytes: 257276911
  - names:
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:c14d0178b17e5184c17090bcc624da3e21f84497140865a2e41d6c3d54a0e42a
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
    sizeBytes: 255893796
  - names:
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:570784e273b695e401ef77926ca6acc2af523f706e9eff1ddd2bba1568d610a5
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
    sizeBytes: 251107160
  - names:
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:8f5e521c35f624e1d40570f476c590490ee8624c55834c22f3dfa7e7ab4fd79f
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
    sizeBytes: 243361727
  - names:
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:6b56ea96a9f19d51b2822719cb5b7e4cd34a5253f97f34538b64f3a7111489d8
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
    sizeBytes: 237382357
  nodeInfo:
    architecture: amd64
    bootID: e26520f5-f180-4826-ac7a-b206ca16f76b
    containerRuntimeVersion: cri-o://1.17.4-14.dev.rhaos4.4.gitb93af5d.el8
    kernelVersion: 4.18.0-147.8.1.el8_1.x86_64
    kubeProxyVersion: v1.17.1+3f6f40d
    kubeletVersion: v1.17.1+3f6f40d
    machineID: 7751b004e15743aea649269a173add91
    operatingSystem: linux
    osImage: Red Hat Enterprise Linux CoreOS 44.81.202006080130-0 (Ootpa)
    systemUUID: 7751b004-e157-43ae-a649-269a173add91

Comment 2 Martin Sivák 2020-06-10 11:55:10 UTC
I always wondered if the label selector in mcp is correct:

  Machine Config Selector:
    Match Expressions:
      Key:       machineconfiguration.openshift.io/role
      Operator:  In
      Values:
        worker-cnf
        worker

Why not match just on worker-cnf?
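
For context on the question above, here is a small sketch of how an `In` match expression selects objects: the key must be present and its value must be one of the listed values. One reading, not confirmed in this thread, is that `worker` is listed so that the worker-cnf pool also renders the base worker MachineConfigs, which matches the source list in the `oc describe mcp worker-cnf` output. In the sample data, the role label on performance-performance comes from comment 1, the label on 00-worker is an assumption, and 00-master is a made-up contrast entry.

```python
ROLE_KEY = "machineconfiguration.openshift.io/role"


def matches_in(labels: dict, key: str, values: set) -> bool:
    """'In' match expression: the key must exist and its value must be in the set."""
    return key in labels and labels[key] in values


def select(machine_configs: dict, values: set) -> list:
    """Return the sorted names of MachineConfigs whose role label matches."""
    return sorted(name for name, labels in machine_configs.items()
                  if matches_in(labels, ROLE_KEY, values))


# Sample data: see the lead-in for which labels are assumed.
machine_configs = {
    "00-worker": {ROLE_KEY: "worker"},
    "performance-performance": {ROLE_KEY: "worker-cnf"},
    "00-master": {ROLE_KEY: "master"},
}

print(select(machine_configs, {"worker-cnf", "worker"}))
# ['00-worker', 'performance-performance']
print(select(machine_configs, {"worker-cnf"}))
# ['performance-performance']
```

Under these assumptions, matching on worker-cnf alone would leave the pool with only the performance MachineConfig and none of the base worker configuration.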

Comment 5 Gowrishankar Rajaiyan 2020-06-23 17:04:22 UTC
Automation hit this issue again. Note that worker-1 has no 'worker-cnf' label, yet the RT kernel is still installed.

# oc get node -o wide
NAME       STATUS   ROLES               AGE     VERSION           INTERNAL-IP      EXTERNAL-IP   OS-IMAGE                                                       KERNEL-VERSION                         CONTAINER-RUNTIME
master-0   Ready    master              4h18m   v1.17.1+912792b   192.168.111.20   <none>        Red Hat Enterprise Linux CoreOS 44.81.202006161946-0 (Ootpa)   4.18.0-147.20.1.el8_1.x86_64           cri-o://1.17.4-17.dev.rhaos4.4.gitf0cfdfc.el8
master-1   Ready    master              4h17m   v1.17.1+912792b   192.168.111.21   <none>        Red Hat Enterprise Linux CoreOS 44.81.202006161946-0 (Ootpa)   4.18.0-147.20.1.el8_1.x86_64           cri-o://1.17.4-17.dev.rhaos4.4.gitf0cfdfc.el8
master-2   Ready    master              4h17m   v1.17.1+912792b   192.168.111.22   <none>        Red Hat Enterprise Linux CoreOS 44.81.202006161946-0 (Ootpa)   4.18.0-147.20.1.el8_1.x86_64           cri-o://1.17.4-17.dev.rhaos4.4.gitf0cfdfc.el8
worker-0   Ready    worker,worker-cnf   3h31m   v1.17.1+912792b   192.168.111.23   <none>        Red Hat Enterprise Linux CoreOS 44.81.202006161946-0 (Ootpa)   4.18.0-147.8.1.rt24.101.el8_1.x86_64   cri-o://1.17.4-17.dev.rhaos4.4.gitf0cfdfc.el8
worker-1   Ready    worker              3h30m   v1.17.1+912792b   192.168.111.24   <none>        Red Hat Enterprise Linux CoreOS 44.81.202006161946-0 (Ootpa)   4.18.0-147.8.1.rt24.101.el8_1.x86_64   cri-o://1.17.4-17.dev.rhaos4.4.gitf0cfdfc.el8
worker-2   Ready    worker              3h25m   v1.17.1+912792b   192.168.111.25   <none>        Red Hat Enterprise Linux CoreOS 44.81.202006161946-0 (Ootpa)   4.18.0-147.20.1.el8_1.x86_64           cri-o://1.17.4-17.dev.rhaos4.4.gitf0cfdfc.el8
worker-3   Ready    worker              3h34m   v1.17.1+912792b   192.168.111.26   <none>        Red Hat Enterprise Linux CoreOS 44.81.202006161946-0 (Ootpa)   4.18.0-147.20.1.el8_1.x86_64           cri-o://1.17.4-17.dev.rhaos4.4.gitf0cfdfc.el8
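
The mismatch in the table above (RT kernel without the 'worker-cnf' role) can be checked mechanically. The sketch below assumes the input has the shape of `oc get nodes -o json` output (a list under `items`, with labels under `metadata.labels` and the kernel under `status.nodeInfo.kernelVersion`). `stale_rt_nodes` and `make_node` are hypothetical names, and the `.rt` substring test is a heuristic based on the kernel versions shown in this report.

```python
ROLE_LABEL = "node-role.kubernetes.io/worker-cnf"


def stale_rt_nodes(node_list: dict) -> list:
    """Return names of nodes that run an RT kernel but lack the worker-cnf label."""
    stale = []
    for item in node_list.get("items", []):
        labels = item["metadata"].get("labels", {})
        kernel = item["status"]["nodeInfo"]["kernelVersion"]
        if ".rt" in kernel and ROLE_LABEL not in labels:
            stale.append(item["metadata"]["name"])
    return stale


def make_node(name: str, kernel: str, cnf: bool) -> dict:
    """Hypothetical helper: build a minimal Node object for the sample below."""
    labels = {"node-role.kubernetes.io/worker": ""}
    if cnf:
        labels[ROLE_LABEL] = ""
    return {"metadata": {"name": name, "labels": labels},
            "status": {"nodeInfo": {"kernelVersion": kernel}}}


# Three of the workers from the table above, reduced to the fields that matter.
sample = {"items": [
    make_node("worker-0", "4.18.0-147.8.1.rt24.101.el8_1.x86_64", cnf=True),
    make_node("worker-1", "4.18.0-147.8.1.rt24.101.el8_1.x86_64", cnf=False),
    make_node("worker-2", "4.18.0-147.20.1.el8_1.x86_64", cnf=False),
]}

print(stale_rt_nodes(sample))  # ['worker-1']
```

On a live cluster the same function could be fed the parsed output of `oc get nodes -o json`.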

Comment 6 Denys Shchedrivyi 2020-06-23 20:55:18 UTC
The machine-config-daemon logs show that it tries to remove the RT kernel and all the additional kernel arguments that appear in /proc/cmdline (at least, the log shows that the rpm-ostree commands were executed without errors):


I0623 15:45:13.267134    2751 update.go:1291] Running rpm-ostree [kargs --delete=nohz=on --delete=nosoftlockup --delete=skew_tick=1 --delete=intel_pstate=disable --delete=intel_iommu=on --delete=iommu=pt --delete=rcu_nocbs=1-3 --delete=tuned.non_isolcpus=00000001 --delete=default_hugepagesz=1G --delete=hugepagesz=1G --delete=hugepages=1]
I0623 15:47:47.761103    2751 update.go:1291] Initiating switch from kernel realtime to default
I0623 15:47:47.769240    2751 update.go:1291] Switching to kernelType=default, invoking rpm-ostree ["override" "reset" "kernel" "kernel-core" "kernel-modules" "kernel-modules-extra" "--uninstall" "kernel-rt-core" "--uninstall" "kernel-rt-modules" "--uninstall" "kernel-rt-modules-extra"]
I0623 15:49:48.824461    2751 update.go:1291] initiating reboot: Node will reboot into config rendered-worker-266d98cce7d051b2576afb3add50ec44
I0623 15:49:50.421631    2751 daemon.go:553] Shutting down MachineConfigDaemon
I0623 15:51:05.278092    2586 start.go:74] Version: v4.4.0-202006160135-dirty (b6c95fea3987483780994c8a5809a6afd15a633d)
I0623 15:51:05.290953    2586 start.go:84] Calling chroot("/rootfs")
I0623 15:51:05.293832    2586 rpm-ostree.go:366] Running captured: rpm-ostree status --json
I0623 15:51:05.835798    2586 daemon.go:209] Booted osImageURL: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:0dcbd38df25ba42c4f62aba0520509e0fcab5c3ca2679df2f26d1e7de78f1f67 (44.81.202006161946-0)
I0623 15:51:05.845673    2586 metrics.go:106] Registering Prometheus metrics
I0623 15:51:05.845776    2586 metrics.go:111] Starting metrics listener on 127.0.0.1:8797
I0623 15:51:05.861258    2586 update.go:1291] Starting to manage node: worker-1
I0623 15:51:05.892696    2586 rpm-ostree.go:366] Running captured: rpm-ostree status
I0623 15:51:06.170340    2586 daemon.go:778] State: idle
AutomaticUpdates: disabled
Deployments:
* pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:0dcbd38df25ba42c4f62aba0520509e0fcab5c3ca2679df2f26d1e7de78f1f67
              CustomOrigin: Managed by machine-config-operator
                   Version: 44.81.202006161946-0 (2020-06-16T19:52:18Z)
       RemovedBasePackages: kernel-core kernel-modules kernel kernel-modules-extra 4.18.0-147.20.1.el8_1
             LocalPackages: kernel-rt-core-4.18.0-147.8.1.rt24.101.el8_1.x86_64
                            kernel-rt-modules-4.18.0-147.8.1.rt24.101.el8_1.x86_64
                            kernel-rt-modules-extra-4.18.0-147.8.1.rt24.101.el8_1.x86_64
                 Initramfs: -I '/etc/systemd/system.conf /etc/systemd/system.conf.d/setAffinity.conf' 

  pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:0dcbd38df25ba42c4f62aba0520509e0fcab5c3ca2679df2f26d1e7de78f1f67
              CustomOrigin: Managed by machine-config-operator
                   Version: 44.81.202006161946-0 (2020-06-16T19:52:18Z)
       RemovedBasePackages: kernel-core kernel-modules kernel kernel-modules-extra 4.18.0-147.20.1.el8_1
             LocalPackages: kernel-rt-core-4.18.0-147.8.1.rt24.101.el8_1.x86_64
                            kernel-rt-modules-4.18.0-147.8.1.rt24.101.el8_1.x86_64
                            kernel-rt-modules-extra-4.18.0-147.8.1.rt24.101.el8_1.x86_64
I0623 15:51:06.170367    2586 rpm-ostree.go:366] Running captured: journalctl --list-boots
I0623 15:51:06.214931    2586 daemon.go:785] journalctl --list-boots:
-6 7348b2aa010442fe8f0130c400f2934e Tue 2020-06-23 13:14:00 UTC—Tue 2020-06-23 13:27:49 UTC
-5 7bddde3d635b4625abd942c8615e8d61 Tue 2020-06-23 13:28:02 UTC—Tue 2020-06-23 14:08:06 UTC
-4 9de73ead42a442d089fef57fbe57cde9 Tue 2020-06-23 14:08:19 UTC—Tue 2020-06-23 14:30:48 UTC
-3 c35644ae4cc34beba4c9a48b3c705c23 Tue 2020-06-23 14:30:59 UTC—Tue 2020-06-23 15:35:30 UTC
-2 3ca184967bfd49e29b7999321d362137 Tue 2020-06-23 15:35:44 UTC—Tue 2020-06-23 15:38:48 UTC
-1 1a83ee5e6c48431e9ef47ae63ab554dd Tue 2020-06-23 15:39:04 UTC—Tue 2020-06-23 15:49:55 UTC
 0 b749a5bc768f4a1c9a9e96e100617aeb Tue 2020-06-23 15:50:11 UTC—Tue 2020-06-23 15:51:06 UTC
I0623 15:51:06.215422    2586 daemon.go:528] Starting MachineConfigDaemon
I0623 15:51:06.215734    2586 daemon.go:535] Enabling Kubelet Healthz Monitor
E0623 15:51:09.944725    2586 reflector.go:153] github.com/openshift/machine-config-operator/pkg/generated/informers/externalversions/factory.go:101: Failed to list *v1.MachineConfig: Get https://172.30.0.1:443/apis/machineconfiguration.openshift.io/v1/machineconfigs?limit=500&resourceVersion=0: dial tcp 172.30.0.1:443: connect: no route to host
E0623 15:51:09.945801    2586 reflector.go:153] k8s.io/client-go/informers/factory.go:135: Failed to list *v1.Node: Get https://172.30.0.1:443/api/v1/nodes?limit=500&resourceVersion=0: dial tcp 172.30.0.1:443: connect: no route to host
I0623 15:56:02.145210    2586 daemon.go:731] Current config: rendered-worker-test-ca4f34a17c0142a571e1a28dbd605d89
I0623 15:56:02.145255    2586 daemon.go:732] Desired config: rendered-worker-266d98cce7d051b2576afb3add50ec44
I0623 15:56:02.161923    2586 update.go:1291] Disk currentConfig rendered-worker-266d98cce7d051b2576afb3add50ec44 overrides node annotation rendered-worker-test-ca4f34a17c0142a571e1a28dbd605d89
I0623 15:56:02.169620    2586 daemon.go:955] Validating against pending config rendered-worker-266d98cce7d051b2576afb3add50ec44
I0623 15:56:02.176061    2586 daemon.go:971] Validated on-disk state
I0623 15:56:02.196827    2586 daemon.go:1005] Completing pending config rendered-worker-266d98cce7d051b2576afb3add50ec44
I0623 15:56:02.204937    2586 update.go:1291] completed update for config rendered-worker-266d98cce7d051b2576afb3add50ec44
I0623 15:56:02.212888    2586 daemon.go:1021] In desired config rendered-worker-266d98cce7d051b2576afb3add50ec44



However, the kernel was not reverted, and the kernel arguments are still present in the cmdline:

# cat /proc/cmdline 
BOOT_IMAGE=(hd0,gpt1)/ostree/rhcos-fc107e60e7e98cdbc54ca91fd9294d4cbf2ff5447d7b81511d7204df2f0b0e6c/vmlinuz-4.18.0-147.8.1.rt24.101.el8_1.x86_64 rhcos.root=crypt_rootfs console=tty0 console=ttyS0,115200n8 rd.luks.options=discard ostree=/ostree/boot.0/rhcos/fc107e60e7e98cdbc54ca91fd9294d4cbf2ff5447d7b81511d7204df2f0b0e6c/0 ignition.platform.id=openstack nohz=on nosoftlockup skew_tick=1 intel_pstate=disable intel_iommu=on iommu=pt rcu_nocbs=1-3 tuned.non_isolcpus=00000001 default_hugepagesz=1G hugepagesz=1G hugepages=1
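
To make the comparison explicit, the sketch below checks which of the arguments passed to `rpm-ostree kargs --delete=...` in the log above are still present in a kernel command line. The function name is hypothetical, and the sample string is the tail of the cmdline shown above with the BOOT_IMAGE, rhcos.root, rd.luks and ostree entries left out for brevity.

```python
# Arguments the machine-config-daemon log shows being deleted.
EXPECTED_REMOVED = [
    "nohz=on", "nosoftlockup", "skew_tick=1", "intel_pstate=disable",
    "intel_iommu=on", "iommu=pt", "rcu_nocbs=1-3",
    "tuned.non_isolcpus=00000001", "default_hugepagesz=1G",
    "hugepagesz=1G", "hugepages=1",
]


def leftover_kargs(cmdline: str, removed: list) -> list:
    """Return the arguments from 'removed' that still appear in 'cmdline'.

    Matching is done on whole whitespace-separated tokens, so
    'hugepagesz=1G' is not confused with 'default_hugepagesz=1G'.
    """
    present = set(cmdline.split())
    return [arg for arg in removed if arg in present]


# Abbreviated copy of the cmdline reported above (see the lead-in).
sample_cmdline = (
    "console=tty0 console=ttyS0,115200n8 ignition.platform.id=openstack "
    "nohz=on nosoftlockup skew_tick=1 intel_pstate=disable intel_iommu=on "
    "iommu=pt rcu_nocbs=1-3 tuned.non_isolcpus=00000001 "
    "default_hugepagesz=1G hugepagesz=1G hugepages=1"
)

leftover = leftover_kargs(sample_cmdline, EXPECTED_REMOVED)
print(len(leftover), "of", len(EXPECTED_REMOVED), "arguments still present")
```

On the affected node the input would be the contents of /proc/cmdline; an empty result would indicate that the staged deployment was actually booted.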

Comment 8 Denys Shchedrivyi 2020-06-27 15:49:41 UTC
There is a log from journalctl:


Jun 27 14:27:25 worker-4 systemd[1]: Unmounting Boot partition...
Jun 27 14:27:25 worker-4 systemd[1]: Unmounting /var/lib/containers/storage/overlay...
Jun 27 14:27:25 worker-4 systemd[1]: Stopped target Host and Network Name Lookups.
Jun 27 14:27:25 worker-4 systemd[1]: Removed slice system-serial\x2dgetty.slice.
Jun 27 14:27:25 worker-4 systemd[1]: system-serial\x2dgetty.slice: Consumed 175ms CPU time
Jun 27 14:27:25 worker-4 systemd[1]: Stopping Permit User Sessions...
Jun 27 14:27:25 worker-4 systemd[1]: Removed slice system-getty.slice.
Jun 27 14:27:25 worker-4 systemd[1]: system-getty.slice: Consumed 37ms CPU time
Jun 27 14:27:25 worker-4 systemd[1]: Unmounted Boot partition.
Jun 27 14:27:25 worker-4 systemd[1]: boot.mount: Consumed 26ms CPU time
.
Jun 27 14:27:25 worker-4 ostree[55230]: error: Unexpected state: /run/ostree-booted found, but no /boot/loader directory
Jun 27 14:27:25 worker-4 systemd[1]: ostree-finalize-staged.service: Control process exited, code=exited status=1
Jun 27 14:27:25 worker-4 systemd[1]: ostree-finalize-staged.service: Failed with result 'exit-code'.
Jun 27 14:27:25 worker-4 systemd[1]: Stopped OSTree Finalize Staged Deployment.
Jun 27 14:27:25 worker-4 systemd[1]: ostree-finalize-staged.service: Consumed 37ms CPU time


As I understand it, the /boot partition is unmounted first, and after that ostree cannot finish its work there.
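
The ordering described above can be checked in a saved journal with a small script. This is only a sketch: the function name is hypothetical, the match strings are copied from the log lines above, and because all of those entries share the same timestamp, the check relies on line order rather than on time.

```python
def boot_unmounted_before_finalize_failed(lines: list) -> bool:
    """Return True if 'Unmounted Boot partition' appears in the journal
    before the first ostree-finalize-staged.service failure."""
    unmounted_at = None
    failed_at = None
    for index, line in enumerate(lines):
        if unmounted_at is None and "Unmounted Boot partition" in line:
            unmounted_at = index
        if failed_at is None and "ostree-finalize-staged.service: Failed" in line:
            failed_at = index
    return (unmounted_at is not None and failed_at is not None
            and unmounted_at < failed_at)


# Lines copied from the journalctl excerpt above.
journal = [
    "Jun 27 14:27:25 worker-4 systemd[1]: Unmounting Boot partition...",
    "Jun 27 14:27:25 worker-4 systemd[1]: Unmounted Boot partition.",
    "Jun 27 14:27:25 worker-4 ostree[55230]: error: Unexpected state: "
    "/run/ostree-booted found, but no /boot/loader directory",
    "Jun 27 14:27:25 worker-4 systemd[1]: ostree-finalize-staged.service: "
    "Failed with result 'exit-code'.",
]

print(boot_unmounted_before_finalize_failed(journal))  # True
```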

Comment 9 Martin Sivák 2020-09-11 06:23:33 UTC
Closing as the fix happened in RHCOS and will be part of the next OCP release.

Comment 10 Red Hat Bugzilla 2023-09-15 00:32:36 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days