Bug 1845427 - Removing 'worker-cnf' label from the worker node does not revert RT Kernel to non-RT Kernel.
Keywords:
Status: CLOSED UPSTREAM
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Performance Addon Operator
Version: 4.4
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 4.6.0
Assignee: Yanir Quinn
QA Contact: Gowrishankar Rajaiyan
URL:
Whiteboard:
Depends On: 1827712
Blocks:
Reported: 2020-06-09 08:11 UTC by Gowrishankar Rajaiyan
Modified: 2023-09-15 00:32 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-09-11 06:23:33 UTC
Target Upstream Version:
Embargoed:



Description Gowrishankar Rajaiyan 2020-06-09 08:11:00 UTC
Description of problem: Removing the 'worker-cnf' label from a worker node should remove the RT kernel from that node and revert it to its previous (non-RT) state. Instead, the RT kernel stays installed.


Version-Release number of selected component (if applicable): v4.4.0-77


How reproducible: Not consistently reproducible, but we hit this issue twice during our nightly test execution.


Steps to Reproduce:
1. Install OCP 4.4.7.
2. Label a worker node with 'worker-cnf'.
3. Deploy the Performance Addon Operator and a Performance Profile with 'realTimeKernel: {enabled: true}'.
4. Ensure that the RT kernel is installed on the 'worker-cnf' labeled worker node.
5. Remove the 'worker-cnf' label.
6. Verify that the node has reverted from the RT kernel to the non-RT kernel.

Actual results: RT Kernel is still installed.


Expected results: RT Kernel is reverted back to non-RT Kernel


Additional info:

Comment 1 Gowrishankar Rajaiyan 2020-06-09 11:14:59 UTC
*Note: The following information is from a _similar_ cluster where the issue could not be reproduced.*


# oc get node -o wide
NAME       STATUS   ROLES               AGE    VERSION           INTERNAL-IP      EXTERNAL-IP   OS-IMAGE                                                       KERNEL-VERSION                         CONTAINER-RUNTIME
master-0   Ready    master              158m   v1.17.1+3f6f40d   192.168.111.20   <none>        Red Hat Enterprise Linux CoreOS 44.81.202006080130-0 (Ootpa)   4.18.0-147.8.1.el8_1.x86_64            cri-o://1.17.4-14.dev.rhaos4.4.gitb93af5d.el8
master-1   Ready    master              158m   v1.17.1+3f6f40d   192.168.111.21   <none>        Red Hat Enterprise Linux CoreOS 44.81.202006080130-0 (Ootpa)   4.18.0-147.8.1.el8_1.x86_64            cri-o://1.17.4-14.dev.rhaos4.4.gitb93af5d.el8
master-2   Ready    master              158m   v1.17.1+3f6f40d   192.168.111.22   <none>        Red Hat Enterprise Linux CoreOS 44.81.202006080130-0 (Ootpa)   4.18.0-147.8.1.el8_1.x86_64            cri-o://1.17.4-14.dev.rhaos4.4.gitb93af5d.el8
worker-0   Ready    worker,worker-cnf   137m   v1.17.1+3f6f40d   192.168.111.23   <none>        Red Hat Enterprise Linux CoreOS 44.81.202006080130-0 (Ootpa)   4.18.0-147.8.1.rt24.101.el8_1.x86_64   cri-o://1.17.4-14.dev.rhaos4.4.gitb93af5d.el8
worker-1   Ready    worker              138m   v1.17.1+3f6f40d   192.168.111.24   <none>        Red Hat Enterprise Linux CoreOS 44.81.202006080130-0 (Ootpa)   4.18.0-147.8.1.el8_1.x86_64            cri-o://1.17.4-14.dev.rhaos4.4.gitb93af5d.el8
worker-2   Ready    worker              138m   v1.17.1+3f6f40d   192.168.111.25   <none>        Red Hat Enterprise Linux CoreOS 44.81.202006080130-0 (Ootpa)   4.18.0-147.8.1.el8_1.x86_64            cri-o://1.17.4-14.dev.rhaos4.4.gitb93af5d.el8



# oc describe performanceprofiles.performance.openshift.io performance
Name:         performance
Namespace:
Labels:       <none>
Annotations:  kubectl.kubernetes.io/last-applied-configuration:
                {"apiVersion":"performance.openshift.io/v1alpha1","kind":"PerformanceProfile","metadata":{"annotations":{},"name":"performance"},"spec":{"...
API Version:  performance.openshift.io/v1alpha1
Kind:         PerformanceProfile
Metadata:
  Creation Timestamp:  2020-06-09T09:36:40Z
  Finalizers:
    foreground-deletion
  Generation:        7
  Resource Version:  91146
  Self Link:         /apis/performance.openshift.io/v1alpha1/performanceprofiles/performance
  UID:               a2bc52cc-9b13-4904-a47e-9f7a592daf5b
Spec:
  Cpu:
    Isolated:  1-3
    Reserved:  0
  Hugepages:
    Default Hugepages Size:  1G
    Pages:
      Count:  1
      Size:   1G
  Node Selector:
    node-role.kubernetes.io/worker-cnf:
  Real Time Kernel:
    Enabled:  true
Status:
  Conditions:
    Last Heartbeat Time:   2020-06-09T10:08:01Z
    Last Transition Time:  2020-06-09T10:08:01Z
    Status:                True
    Type:                  Available
    Last Heartbeat Time:   2020-06-09T10:08:01Z
    Last Transition Time:  2020-06-09T10:08:01Z
    Status:                True
    Type:                  Upgradeable
    Last Heartbeat Time:   2020-06-09T10:08:01Z
    Last Transition Time:  2020-06-09T10:08:01Z
    Status:                False
    Type:                  Progressing
    Last Heartbeat Time:   2020-06-09T10:08:01Z
    Last Transition Time:  2020-06-09T10:08:01Z
    Status:                False
    Type:                  Degraded
Events:
  Type     Reason              Age                From                            Message
  ----     ------              ----               ----                            -------
  Normal   Creation succeeded  87m                performance-profile-controller  Succeeded to create all components
  Warning  Creation failed     56m                performance-profile-controller  Failed to create all components: Operation cannot be fulfilled on machineconfigs.machineconfiguration.openshift.io "performance-performance": the object has been modified; please apply your changes to the latest version and try again
  Normal   Creation succeeded  10m (x5 over 69m)  performance-profile-controller  Succeeded to create all components





# oc get mcp
NAME         CONFIG                                                 UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
master       rendered-master-0112f787eaed5267898b3988aec444eb       True      False      False      3              3                   3                     0                      150m
worker       rendered-worker-2e3a4df0240eb8745b2845521a1f28f8       True      False      False      2              2                   2                     0                      150m
worker-cnf   rendered-worker-cnf-5c311bbe9c9f762497d1753a8fb27536   True      False      False      1              1                   1                     0                      97m




# oc describe mcp worker-cnf
Name:         worker-cnf
Namespace:
Labels:       machineconfiguration.openshift.io/role=worker-cnf
Annotations:  kubectl.kubernetes.io/last-applied-configuration:
                {"apiVersion":"machineconfiguration.openshift.io/v1","kind":"MachineConfigPool","metadata":{"annotations":{},"labels":{"machineconfigurati...
API Version:  machineconfiguration.openshift.io/v1
Kind:         MachineConfigPool
Metadata:
  Creation Timestamp:  2020-06-09T09:29:14Z
  Generation:          6
  Resource Version:    79333
  Self Link:           /apis/machineconfiguration.openshift.io/v1/machineconfigpools/worker-cnf
  UID:                 68b55a8e-1d12-4e8f-b57d-86cea701f80d
Spec:
  Configuration:
    Name:  rendered-worker-cnf-5c311bbe9c9f762497d1753a8fb27536
    Source:
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         00-worker
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         00-worker-chronyd-custom
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         01-worker-container-runtime
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         01-worker-kubelet
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         98-worker-92ef12b3-6b66-4348-b91f-f49b08f49348-kubelet
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         98-worker-cnf-68b55a8e-1d12-4e8f-b57d-86cea701f80d-kubelet
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         99-worker-92ef12b3-6b66-4348-b91f-f49b08f49348-registries
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         99-worker-cnf-68b55a8e-1d12-4e8f-b57d-86cea701f80d-kubelet
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         99-worker-registries
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         99-worker-ssh
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         performance-performance
  Machine Config Selector:
    Match Expressions:
      Key:       machineconfiguration.openshift.io/role
      Operator:  In
      Values:
        worker-cnf
        worker
  Node Selector:
    Match Labels:
      node-role.kubernetes.io/worker-cnf:
  Paused:                                  false




# oc describe mc performance-performance
Name:         performance-performance
Namespace:
Labels:       machineconfiguration.openshift.io/role=worker-cnf
Annotations:  <none>
API Version:  machineconfiguration.openshift.io/v1
Kind:         MachineConfig
Metadata:
  Creation Timestamp:  2020-06-09T09:36:49Z
  Generation:          4
  Owner References:
    API Version:           performance.openshift.io/v1alpha1
    Block Owner Deletion:  true
    Controller:            true
    Kind:                  PerformanceProfile
    Name:                  performance
    UID:                   a2bc52cc-9b13-4904-a47e-9f7a592daf5b
  Resource Version:        91150
  Self Link:               /apis/machineconfiguration.openshift.io/v1/machineconfigs/performance-performance
  UID:                     841ae0a3-4762-4602-900f-10e988fcb14e
Spec:
  Config:
    Ignition:
      Config:
      Security:
        Tls:
      Timeouts:
      Version:  2.2.0
    Networkd:
    Passwd:
    Storage:
      Files:
        Contents:
          Source:  data:text/plain;charset=utf-8;base64,IyEvdXNyL2Jpbi9lbnYgYmFzaAoKc2V0IC1ldW8gcGlwZWZhaWwKClNZU1RFTV9DT05GSUdfRklMRT0iL2V0Yy9zeXN0ZW1kL3N5c3RlbS5jb25mIgpTWVNURU1fQ09ORklHX0NVU1RPTV9GSUxFPSIvZXRjL3N5c3RlbWQvc3lzdGVtLmNvbmYuZC9zZXRBZmZpbml0eS5jb25mIgoKaWYgWyAtZiAvZXRjL3N5c2NvbmZpZy9pcnFiYWxhbmNlIF0gJiYgWyAtZiAke1NZU1RFTV9DT05GSUdfQ1VTVE9NX0ZJTEV9IF0gICYmIHJwbS1vc3RyZWUgc3RhdHVzIC1iIHwgZ3JlcCAtcSAtZSAiJHtTWVNURU1fQ09ORklHX0ZJTEV9ICR7U1lTVEVNX0NPTkZJR19DVVNUT01fRklMRX0iICYmIGVncmVwIC13cSAiXklSUUJBTEFOQ0VfQkFOTkVEX0NQVVM9JHtSRVNFUlZFRF9DUFVfTUFTS19JTlZFUlR9IiAvZXRjL3N5c2NvbmZpZy9pcnFiYWxhbmNlOyB0aGVuCiAgICBlY2hvICJQcmUgYm9vdCB0dW5pbmcgY29uZmlndXJhdGlvbiBhbHJlYWR5IGFwcGxpZWQiCmVsc2UKICAgICNTZXQgSVJRIGJhbGFuY2UgYmFubmVkIGNwdXMKICAgIGlmIFsgISAtZiAvZXRjL3N5c2NvbmZpZy9pcnFiYWxhbmNlIF07IHRoZW4KICAgICAgICB0b3VjaCAvZXRjL3N5c2NvbmZpZy9pcnFiYWxhbmNlCiAgICBmaQoKICAgIGlmIGdyZXAgLWxzICJJUlFCQUxBTkNFX0JBTk5FRF9DUFVTPSIgL2V0Yy9zeXNjb25maWcvaXJxYmFsYW5jZTsgdGhlbgogICAgICAgIHNlZCAtaSAicy9eLipJUlFCQUxBTkNFX0JBTk5FRF9DUFVTPS4qJC9JUlFCQUxBTkNFX0JBTk5FRF9DUFVTPSR7UkVTRVJWRURfQ1BVX01BU0tfSU5WRVJUfS8iIC9ldGMvc3lzY29uZmlnL2lycWJhbGFuY2UKICAgIGVsc2UKICAgICAgICBlY2hvICJJUlFCQUxBTkNFX0JBTk5FRF9DUFVTPSR7UkVTRVJWRURfQ1BVX01BU0tfSU5WRVJUfSIgPj4vZXRjL3N5c2NvbmZpZy9pcnFiYWxhbmNlCiAgICBmaQoKICAgIHJwbS1vc3RyZWUgaW5pdHJhbWZzIC0tZW5hYmxlIC0tYXJnPS1JIC0tYXJnPSIke1NZU1RFTV9DT05GSUdfRklMRX0gJHtTWVNURU1fQ09ORklHX0NVU1RPTV9GSUxFfSIgCgogICAgdG91Y2ggL3Zhci9yZWJvb3QKZmkK
          Verification:
        Filesystem:  root
        Mode:        448
        Path:        /usr/local/bin/pre-boot-tuning.sh
        Contents:
          Source:  data:text/plain;charset=utf-8;base64,IyEvdXNyL2Jpbi9lbnYgYmFzaAoKc2V0IC1ldW8gcGlwZWZhaWwKCm5vZGVzX3BhdGg9Ii9zeXMvZGV2aWNlcy9zeXN0ZW0vbm9kZSIKaHVnZXBhZ2VzX2ZpbGU9IiR7bm9kZXNfcGF0aH0vbm9kZSR7TlVNQV9OT0RFfS9odWdlcGFnZXMvaHVnZXBhZ2VzLSR7SFVHRVBBR0VTX1NJWkV9a0IvbnJfaHVnZXBhZ2VzIgoKaWYgWyAhIC1mICAke2h1Z2VwYWdlc19maWxlfSBdOyB0aGVuCiAgICBlY2hvICJFUlJPUjogJHtodWdlcGFnZXNfZmlsZX0gZG9lcyBub3QgZXhpc3QiCiAgICBleGl0IDEKZmkKCmVjaG8gJHtIVUdFUEFHRVNfQ09VTlR9ID4gJHtodWdlcGFnZXNfZmlsZX0K
          Verification:
        Filesystem:  root
        Mode:        448
        Path:        /usr/local/bin/hugepages-allocation.sh
        Contents:
          Source:  data:text/plain;charset=utf-8;base64,IyEvdXNyL2Jpbi9lbnYgYmFzaAoKc2V0IC1ldW8gcGlwZWZhaWwKCmlmIFtbIC1mIC92YXIvcmVib290IF1dOyB0aGVuIAogICAgcm0gLWYgL3Zhci9yZWJvb3QKICAgIGVjaG8gIkZpbGUgL3Zhci9yZWJvb3QgZXhpc3RzLCBpbml0aWF0ZSByZWJvb3QiCiAgICBzeXN0ZW1jdGwgcmVib290CmZpCg==
          Verification:
        Filesystem:  root
        Mode:        448
        Path:        /usr/local/bin/reboot.sh
        Contents:
          Source:  data:text/plain;charset=utf-8;base64,W01hbmFnZXJdCkNQVUFmZmluaXR5PTA=
          Verification:
        Filesystem:  root
        Mode:        448
        Path:        /etc/systemd/system.conf.d/setAffinity.conf
    Systemd:
      Units:
        Contents:  [Unit]
Description=Preboot tuning patch
Before=kubelet.service
Before=reboot.service

[Service]
Environment=RESERVED_CPUS=0
Environment=RESERVED_CPU_MASK_INVERT=ffffffff,fffffffe
Type=oneshot
RemainAfterExit=true
ExecStart=/usr/local/bin/pre-boot-tuning.sh

[Install]
WantedBy=multi-user.target

        Enabled:   true
        Name:      pre-boot-tuning.service
        Contents:  [Unit]
Description=Reboot initiated by pre-boot-tuning
Wants=network-online.target
After=network-online.target
Before=kubelet.service

[Service]
Type=oneshot
RemainAfterExit=true
ExecStart=/usr/local/bin/reboot.sh

[Install]
WantedBy=multi-user.target

        Enabled:  true
        Name:     reboot.service
  Fips:           false
  Kernel Arguments:
    nohz=on
    nosoftlockup
    skew_tick=1
    intel_pstate=disable
    intel_iommu=on
    iommu=pt
    rcu_nocbs=1-3
    tuned.non_isolcpus=00000001
    default_hugepagesz=1G
    hugepagesz=1G
    hugepages=1
  Kernel Type:   realtime
  Os Image URL:
Events:          <none>
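The file contents in the MachineConfig above are stored as base64 `data:` URLs, so they are hard to read at a glance. A minimal sketch for decoding one of them follows; the helper name `decode_data_url` is hypothetical, and the sample value is the `Source` of `/etc/systemd/system.conf.d/setAffinity.conf` copied from the output above.

```python
import base64
from urllib.parse import unquote


def decode_data_url(source):
    """Return the text payload of a 'data:' URL (hypothetical helper)."""
    if not source.startswith("data:"):
        raise ValueError("not a data URL")
    # Everything before the first comma is metadata (media type, encoding).
    meta, _, payload = source[len("data:"):].partition(",")
    if meta.endswith(";base64"):
        return base64.b64decode(payload).decode("utf-8")
    # Without ';base64' the payload is percent-encoded text.
    return unquote(payload)


# Source of setAffinity.conf, copied from the MachineConfig above.
SET_AFFINITY = "data:text/plain;charset=utf-8;base64,W01hbmFnZXJdCkNQVUFmZmluaXR5PTA="

print(decode_data_url(SET_AFFINITY))
# [Manager]
# CPUAffinity=0
```

The same function can be applied to the longer `Source` values to read the pre-boot-tuning.sh, hugepages-allocation.sh and reboot.sh scripts.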





# oc get node worker-1 -o yaml
apiVersion: v1
kind: Node
metadata:
  annotations:
    machine.openshift.io/machine: openshift-machine-api/ostest-worker-0-f78ck
    machineconfiguration.openshift.io/currentConfig: rendered-worker-2e3a4df0240eb8745b2845521a1f28f8
    machineconfiguration.openshift.io/desiredConfig: rendered-worker-2e3a4df0240eb8745b2845521a1f28f8
    machineconfiguration.openshift.io/reason: ""
    machineconfiguration.openshift.io/state: Done
    volumes.kubernetes.io/controller-managed-attach-detach: "true"
  creationTimestamp: "2020-06-09T08:56:29Z"
  labels:
    beta.kubernetes.io/arch: amd64
    beta.kubernetes.io/os: linux
    kubernetes.io/arch: amd64
    kubernetes.io/hostname: worker-1
    kubernetes.io/os: linux
    node-role.kubernetes.io/worker: ""
    node.openshift.io/os_id: rhcos
    ptp/slave: ""
  name: worker-1
  resourceVersion: "97743"
  selfLink: /api/v1/nodes/worker-1
  uid: 4560f482-ce60-4aa2-8237-5864c38727da
spec: {}
status:
  addresses:
  - address: 192.168.111.24
    type: InternalIP
  - address: worker-1
    type: Hostname
  allocatable:
    cpu: 3500m
    ephemeral-storage: "17683605064"
    hugepages-1Gi: "0"
    hugepages-2Mi: "0"
    memory: 6995204Ki
    pods: "250"
  capacity:
    cpu: "4"
    ephemeral-storage: 19876Mi
    hugepages-1Gi: "0"
    hugepages-2Mi: "0"
    memory: 8146180Ki
    pods: "250"
  conditions:
  - lastHeartbeatTime: "2020-06-09T11:09:13Z"
    lastTransitionTime: "2020-06-09T10:49:13Z"
    message: kubelet has sufficient memory available
    reason: KubeletHasSufficientMemory
    status: "False"
    type: MemoryPressure
  - lastHeartbeatTime: "2020-06-09T11:09:13Z"
    lastTransitionTime: "2020-06-09T10:49:13Z"
    message: kubelet has no disk pressure
    reason: KubeletHasNoDiskPressure
    status: "False"
    type: DiskPressure
  - lastHeartbeatTime: "2020-06-09T11:09:13Z"
    lastTransitionTime: "2020-06-09T10:49:13Z"
    message: kubelet has sufficient PID available
    reason: KubeletHasSufficientPID
    status: "False"
    type: PIDPressure
  - lastHeartbeatTime: "2020-06-09T11:09:13Z"
    lastTransitionTime: "2020-06-09T10:49:13Z"
    message: kubelet is posting ready status
    reason: KubeletReady
    status: "True"
    type: Ready
  daemonEndpoints:
    kubeletEndpoint:
      Port: 10250
  images:
  - names:
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:ec5c9406f29ac98580228688db7849590f949e01df39952327adfb261197c16c
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
    sizeBytes: 773373898
  - names:
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:d5441ced3440395aa27b2d6ceec3315acf55f2dccc19d76a2f0e704d30b77cc0
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
    sizeBytes: 467404338
  - names:
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:a16b0dab9867070830d18ba5cab98d02b92fa367b69d464e6e22860f0a6293e0
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
    sizeBytes: 454102624
  - names:
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:398e40715b7e39428c2c8d8cedfcc024cf0bed8844b80bf22ff4107345e2e298
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
    sizeBytes: 429949466
  - names:
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:79a4986a92c449401e69c14f34f2e0ad92bc219dd43f716ed806d551e2e09f72
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
    sizeBytes: 428902230
  - names:
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:d879eb6e4426c976d1fffef15a38ff2454bca5d382c13c7b157181db9373f43e
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
    sizeBytes: 367621904
  - names:
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:81dce0d0947963d8ce70c74bc63eaf2f2cdf17c40b839623c39d52f6a5ef2d61
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
    sizeBytes: 364552618
  - names:
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:127b6610891c6e38d8ee474e134b9799ecc6cf0ae9be659f06f933ea84e7c877
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
    sizeBytes: 342705361
  - names:
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:b89e9a469c30c0b707766d0ae31665ccd6a13135c8074087f617ddee12860774
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
    sizeBytes: 338328160
  - names:
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:9e4d6bd70d5c481267ec2c420b7ee40d2addecb190e4787ac912996a29c90d00
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
    sizeBytes: 334993182
  - names:
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:ba38bc55a210372ca5243310f470d98e6b0d6712609eccfbcc4e5961c1cb0205
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
    sizeBytes: 333782407
  - names:
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:8376e01b6c5a4f045548a16946d5b24918ddd88f156e26249e2ff38ab539d512
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
    sizeBytes: 325752144
  - names:
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:c5ec8d53faf22153f54fc003b094fa263b769b50a457c878824c800e344b1b2f
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
    sizeBytes: 318103617
  - names:
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:19f3539c87ed84a02ee5a15438928083aa9e82294ff9a8dcb86d6eeeeff68c0b
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
    sizeBytes: 311095159
  - names:
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:3e185bf343f119d7e64d3d4297ad358c86a613bd047f8d45176a0a484f5d87f3
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
    sizeBytes: 305661722
  - names:
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:43e5bf12c08a33946f99f475ae062187242d5ec74b7d23b48621a1b9f91e44b3
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
    sizeBytes: 299586006
  - names:
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:12cce448b8888f398604eeb6a3a7a14c2850cd9dfd463229bd0735aa68ada885
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
    sizeBytes: 283513263
  - names:
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:294f21bc3baa4221f3c02ed3e2dadcf3e61490813691579be9f05d69b3639f3c
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
    sizeBytes: 277408258
  - names:
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:dbc06ac527acaed79a4b189ba091fb4a37bfaa4b061f399011e07163fa4fcd26
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
    sizeBytes: 277232582
  - names:
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:9166f5f53cbaddbf11d4895c96d73b16a9b9002cb3fadefd9f7e13e96ee16edc
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
    sizeBytes: 276916452
  - names:
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:c69cb2203988cae3506a58d8616ac928b7f2f796d147fbc6640f644398fb1949
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
    sizeBytes: 267689884
  - names:
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:6b3e62612cd9192b7af779ff34a95b039cac3058b0c39129eb37a7deec08908e
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
    sizeBytes: 264283423
  - names:
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:d7a71a527f95dd9b607cb2db3a2809003818573e6f550a96238ac2a55cc8a635
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
    sizeBytes: 258272205
  - names:
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:e08c2b9e5bf641a524636e7b699830d27464636e431dd992a4a49f56b27d4dd2
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
    sizeBytes: 257276911
  - names:
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:c14d0178b17e5184c17090bcc624da3e21f84497140865a2e41d6c3d54a0e42a
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
    sizeBytes: 255893796
  - names:
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:570784e273b695e401ef77926ca6acc2af523f706e9eff1ddd2bba1568d610a5
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
    sizeBytes: 251107160
  - names:
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:8f5e521c35f624e1d40570f476c590490ee8624c55834c22f3dfa7e7ab4fd79f
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
    sizeBytes: 243361727
  - names:
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:6b56ea96a9f19d51b2822719cb5b7e4cd34a5253f97f34538b64f3a7111489d8
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
    sizeBytes: 237382357
  nodeInfo:
    architecture: amd64
    bootID: e26520f5-f180-4826-ac7a-b206ca16f76b
    containerRuntimeVersion: cri-o://1.17.4-14.dev.rhaos4.4.gitb93af5d.el8
    kernelVersion: 4.18.0-147.8.1.el8_1.x86_64
    kubeProxyVersion: v1.17.1+3f6f40d
    kubeletVersion: v1.17.1+3f6f40d
    machineID: 7751b004e15743aea649269a173add91
    operatingSystem: linux
    osImage: Red Hat Enterprise Linux CoreOS 44.81.202006080130-0 (Ootpa)
    systemUUID: 7751b004-e157-43ae-a649-269a173add91
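The node annotations above are what the machine-config-daemon uses to record the current and desired rendered configs. As a rough sketch (the function name `node_settled` is an assumption, not part of any OpenShift tooling), a node can be treated as settled when both annotations agree and the state is `Done`:

```python
MCO_PREFIX = "machineconfiguration.openshift.io/"


def node_settled(annotations):
    """True when currentConfig equals desiredConfig and state is Done."""
    current = annotations.get(MCO_PREFIX + "currentConfig")
    desired = annotations.get(MCO_PREFIX + "desiredConfig")
    state = annotations.get(MCO_PREFIX + "state")
    return current is not None and current == desired and state == "Done"


# Annotations of worker-1, copied from the YAML above.
worker_1 = {
    MCO_PREFIX + "currentConfig": "rendered-worker-2e3a4df0240eb8745b2845521a1f28f8",
    MCO_PREFIX + "desiredConfig": "rendered-worker-2e3a4df0240eb8745b2845521a1f28f8",
    MCO_PREFIX + "state": "Done",
}

print(node_settled(worker_1))
```

Note that this only compares what the daemon has recorded. It does not inspect the running kernel, so it cannot by itself detect a node that reports the desired config while still booted into the RT kernel.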

Comment 2 Martin Sivák 2020-06-10 11:55:10 UTC
I always wondered if the label selector in mcp is correct:

  Machine Config Selector:
    Match Expressions:
      Key:       machineconfiguration.openshift.io/role
      Operator:  In
      Values:
        worker-cnf
        worker

Why not match just on worker-cnf?
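For reference, here is a small sketch of how an `In` match expression selects MachineConfigs by label. It only models the `In` and `NotIn` operators. The label on `00-worker` is an assumption inferred from its name, while the label on `performance-performance` is taken from the `oc describe mc` output in comment 1.

```python
def matches(labels, expressions):
    """Evaluate Kubernetes-style matchExpressions (In / NotIn only)."""
    for expr in expressions:
        value = labels.get(expr["key"])
        if expr["operator"] == "In":
            if value not in expr["values"]:
                return False
        elif expr["operator"] == "NotIn":
            if value in expr["values"]:
                return False
        else:
            raise ValueError("operator not modelled: " + expr["operator"])
    return True


ROLE = "machineconfiguration.openshift.io/role"
both_roles = [{"key": ROLE, "operator": "In", "values": ["worker-cnf", "worker"]}]
cnf_only = [{"key": ROLE, "operator": "In", "values": ["worker-cnf"]}]

configs = {
    "00-worker": {ROLE: "worker"},                    # label assumed from the name
    "performance-performance": {ROLE: "worker-cnf"},  # label shown in comment 1
}

for name, labels in configs.items():
    print(name, matches(labels, both_roles), matches(labels, cnf_only))
```

Under these assumptions, a selector that lists only `worker-cnf` would no longer pick up the base `worker` MachineConfigs, which may be why both values are listed.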

Comment 5 Gowrishankar Rajaiyan 2020-06-23 17:04:22 UTC
Automation hit this issue again. Note that worker-1 has no 'worker-cnf' label, yet the RT kernel is still installed.

# oc get node -o wide
NAME       STATUS   ROLES               AGE     VERSION           INTERNAL-IP      EXTERNAL-IP   OS-IMAGE                                                       KERNEL-VERSION                         CONTAINER-RUNTIME
master-0   Ready    master              4h18m   v1.17.1+912792b   192.168.111.20   <none>        Red Hat Enterprise Linux CoreOS 44.81.202006161946-0 (Ootpa)   4.18.0-147.20.1.el8_1.x86_64           cri-o://1.17.4-17.dev.rhaos4.4.gitf0cfdfc.el8
master-1   Ready    master              4h17m   v1.17.1+912792b   192.168.111.21   <none>        Red Hat Enterprise Linux CoreOS 44.81.202006161946-0 (Ootpa)   4.18.0-147.20.1.el8_1.x86_64           cri-o://1.17.4-17.dev.rhaos4.4.gitf0cfdfc.el8
master-2   Ready    master              4h17m   v1.17.1+912792b   192.168.111.22   <none>        Red Hat Enterprise Linux CoreOS 44.81.202006161946-0 (Ootpa)   4.18.0-147.20.1.el8_1.x86_64           cri-o://1.17.4-17.dev.rhaos4.4.gitf0cfdfc.el8
worker-0   Ready    worker,worker-cnf   3h31m   v1.17.1+912792b   192.168.111.23   <none>        Red Hat Enterprise Linux CoreOS 44.81.202006161946-0 (Ootpa)   4.18.0-147.8.1.rt24.101.el8_1.x86_64   cri-o://1.17.4-17.dev.rhaos4.4.gitf0cfdfc.el8
worker-1   Ready    worker              3h30m   v1.17.1+912792b   192.168.111.24   <none>        Red Hat Enterprise Linux CoreOS 44.81.202006161946-0 (Ootpa)   4.18.0-147.8.1.rt24.101.el8_1.x86_64   cri-o://1.17.4-17.dev.rhaos4.4.gitf0cfdfc.el8
worker-2   Ready    worker              3h25m   v1.17.1+912792b   192.168.111.25   <none>        Red Hat Enterprise Linux CoreOS 44.81.202006161946-0 (Ootpa)   4.18.0-147.20.1.el8_1.x86_64           cri-o://1.17.4-17.dev.rhaos4.4.gitf0cfdfc.el8
worker-3   Ready    worker              3h34m   v1.17.1+912792b   192.168.111.26   <none>        Red Hat Enterprise Linux CoreOS 44.81.202006161946-0 (Ootpa)   4.18.0-147.20.1.el8_1.x86_64           cri-o://1.17.4-17.dev.rhaos4.4.gitf0cfdfc.el8
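The mismatch in the listing above can be spotted mechanically. The sketch below parses rows in the `oc get node -o wide` layout and reports nodes that lack the 'worker-cnf' role but still run an RT kernel. It assumes that RT builds carry a `.rt<N>.` component in the kernel version, as the versions in this bug do; the function names are hypothetical, and the sample rows are copied from the listing with the column padding condensed.

```python
import re

# Rows condensed from the listing above (columns separated by two spaces).
NODE_LISTING = """\
NAME  STATUS  ROLES  AGE  VERSION  INTERNAL-IP  EXTERNAL-IP  OS-IMAGE  KERNEL-VERSION  CONTAINER-RUNTIME
worker-0  Ready  worker,worker-cnf  3h31m  v1.17.1+912792b  192.168.111.23  <none>  Red Hat Enterprise Linux CoreOS 44.81.202006161946-0 (Ootpa)  4.18.0-147.8.1.rt24.101.el8_1.x86_64  cri-o://1.17.4-17.dev.rhaos4.4.gitf0cfdfc.el8
worker-1  Ready  worker  3h30m  v1.17.1+912792b  192.168.111.24  <none>  Red Hat Enterprise Linux CoreOS 44.81.202006161946-0 (Ootpa)  4.18.0-147.8.1.rt24.101.el8_1.x86_64  cri-o://1.17.4-17.dev.rhaos4.4.gitf0cfdfc.el8
worker-2  Ready  worker  3h25m  v1.17.1+912792b  192.168.111.25  <none>  Red Hat Enterprise Linux CoreOS 44.81.202006161946-0 (Ootpa)  4.18.0-147.20.1.el8_1.x86_64  cri-o://1.17.4-17.dev.rhaos4.4.gitf0cfdfc.el8
"""

# Assumption: an RT kernel version contains ".rt<digits>.".
RT_MARKER = re.compile(r"\.rt\d+\.")


def parse_nodes(listing):
    """Split each row on runs of 2+ spaces so OS-IMAGE stays one field."""
    lines = [ln for ln in listing.splitlines() if ln.strip()]
    header = re.split(r"\s{2,}", lines[0].strip())
    return [dict(zip(header, re.split(r"\s{2,}", ln.strip()))) for ln in lines[1:]]


def stale_rt_nodes(listing, role="worker-cnf"):
    """Names of nodes without `role` that still report an RT kernel."""
    stale = []
    for node in parse_nodes(listing):
        roles = node["ROLES"].split(",")
        if role not in roles and RT_MARKER.search(node["KERNEL-VERSION"]):
            stale.append(node["NAME"])
    return stale


print(stale_rt_nodes(NODE_LISTING))  # ['worker-1']
```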

Comment 6 Denys Shchedrivyi 2020-06-23 20:55:18 UTC
The machine-config-daemon logs show that it tries to remove the RT kernel and all the additional kernel arguments from /proc/cmdline (at least, the log shows that the rpm-ostree commands were executed without errors):


I0623 15:45:13.267134    2751 update.go:1291] Running rpm-ostree [kargs --delete=nohz=on --delete=nosoftlockup --delete=skew_tick=1 --delete=intel_pstate=disable --delete=intel_iommu=on --delete=iommu=pt --delete=rcu_nocbs=1-3 --delete=tuned.non_isolcpus=00000001 --delete=default_hugepagesz=1G --delete=hugepagesz=1G --delete=hugepages=1]
I0623 15:47:47.761103    2751 update.go:1291] Initiating switch from kernel realtime to default
I0623 15:47:47.769240    2751 update.go:1291] Switching to kernelType=default, invoking rpm-ostree ["override" "reset" "kernel" "kernel-core" "kernel-modules" "kernel-modules-extra" "--uninstall" "kernel-rt-core" "--uninstall" "kernel-rt-modules" "--uninstall" "kernel-rt-modules-extra"]
I0623 15:49:48.824461    2751 update.go:1291] initiating reboot: Node will reboot into config rendered-worker-266d98cce7d051b2576afb3add50ec44
I0623 15:49:50.421631    2751 daemon.go:553] Shutting down MachineConfigDaemon
I0623 15:51:05.278092    2586 start.go:74] Version: v4.4.0-202006160135-dirty (b6c95fea3987483780994c8a5809a6afd15a633d)
I0623 15:51:05.290953    2586 start.go:84] Calling chroot("/rootfs")
I0623 15:51:05.293832    2586 rpm-ostree.go:366] Running captured: rpm-ostree status --json
I0623 15:51:05.835798    2586 daemon.go:209] Booted osImageURL: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:0dcbd38df25ba42c4f62aba0520509e0fcab5c3ca2679df2f26d1e7de78f1f67 (44.81.202006161946-0)
I0623 15:51:05.845673    2586 metrics.go:106] Registering Prometheus metrics
I0623 15:51:05.845776    2586 metrics.go:111] Starting metrics listener on 127.0.0.1:8797
I0623 15:51:05.861258    2586 update.go:1291] Starting to manage node: worker-1
I0623 15:51:05.892696    2586 rpm-ostree.go:366] Running captured: rpm-ostree status
I0623 15:51:06.170340    2586 daemon.go:778] State: idle
AutomaticUpdates: disabled
Deployments:
* pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:0dcbd38df25ba42c4f62aba0520509e0fcab5c3ca2679df2f26d1e7de78f1f67
              CustomOrigin: Managed by machine-config-operator
                   Version: 44.81.202006161946-0 (2020-06-16T19:52:18Z)
       RemovedBasePackages: kernel-core kernel-modules kernel kernel-modules-extra 4.18.0-147.20.1.el8_1
             LocalPackages: kernel-rt-core-4.18.0-147.8.1.rt24.101.el8_1.x86_64
                            kernel-rt-modules-4.18.0-147.8.1.rt24.101.el8_1.x86_64
                            kernel-rt-modules-extra-4.18.0-147.8.1.rt24.101.el8_1.x86_64
                 Initramfs: -I '/etc/systemd/system.conf /etc/systemd/system.conf.d/setAffinity.conf' 

  pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:0dcbd38df25ba42c4f62aba0520509e0fcab5c3ca2679df2f26d1e7de78f1f67
              CustomOrigin: Managed by machine-config-operator
                   Version: 44.81.202006161946-0 (2020-06-16T19:52:18Z)
       RemovedBasePackages: kernel-core kernel-modules kernel kernel-modules-extra 4.18.0-147.20.1.el8_1
             LocalPackages: kernel-rt-core-4.18.0-147.8.1.rt24.101.el8_1.x86_64
                            kernel-rt-modules-4.18.0-147.8.1.rt24.101.el8_1.x86_64
                            kernel-rt-modules-extra-4.18.0-147.8.1.rt24.101.el8_1.x86_64
I0623 15:51:06.170367    2586 rpm-ostree.go:366] Running captured: journalctl --list-boots
I0623 15:51:06.214931    2586 daemon.go:785] journalctl --list-boots:
-6 7348b2aa010442fe8f0130c400f2934e Tue 2020-06-23 13:14:00 UTC—Tue 2020-06-23 13:27:49 UTC
-5 7bddde3d635b4625abd942c8615e8d61 Tue 2020-06-23 13:28:02 UTC—Tue 2020-06-23 14:08:06 UTC
-4 9de73ead42a442d089fef57fbe57cde9 Tue 2020-06-23 14:08:19 UTC—Tue 2020-06-23 14:30:48 UTC
-3 c35644ae4cc34beba4c9a48b3c705c23 Tue 2020-06-23 14:30:59 UTC—Tue 2020-06-23 15:35:30 UTC
-2 3ca184967bfd49e29b7999321d362137 Tue 2020-06-23 15:35:44 UTC—Tue 2020-06-23 15:38:48 UTC
-1 1a83ee5e6c48431e9ef47ae63ab554dd Tue 2020-06-23 15:39:04 UTC—Tue 2020-06-23 15:49:55 UTC
 0 b749a5bc768f4a1c9a9e96e100617aeb Tue 2020-06-23 15:50:11 UTC—Tue 2020-06-23 15:51:06 UTC
I0623 15:51:06.215422    2586 daemon.go:528] Starting MachineConfigDaemon
I0623 15:51:06.215734    2586 daemon.go:535] Enabling Kubelet Healthz Monitor
E0623 15:51:09.944725    2586 reflector.go:153] github.com/openshift/machine-config-operator/pkg/generated/informers/externalversions/factory.go:101: Failed to list *v1.MachineConfig: Get https://172.30.0.1:443/apis/machineconfiguration.openshift.io/v1/machineconfigs?limit=500&resourceVersion=0: dial tcp 172.30.0.1:443: connect: no route to host
E0623 15:51:09.945801    2586 reflector.go:153] k8s.io/client-go/informers/factory.go:135: Failed to list *v1.Node: Get https://172.30.0.1:443/api/v1/nodes?limit=500&resourceVersion=0: dial tcp 172.30.0.1:443: connect: no route to host
I0623 15:56:02.145210    2586 daemon.go:731] Current config: rendered-worker-test-ca4f34a17c0142a571e1a28dbd605d89
I0623 15:56:02.145255    2586 daemon.go:732] Desired config: rendered-worker-266d98cce7d051b2576afb3add50ec44
I0623 15:56:02.161923    2586 update.go:1291] Disk currentConfig rendered-worker-266d98cce7d051b2576afb3add50ec44 overrides node annotation rendered-worker-test-ca4f34a17c0142a571e1a28dbd605d89
I0623 15:56:02.169620    2586 daemon.go:955] Validating against pending config rendered-worker-266d98cce7d051b2576afb3add50ec44
I0623 15:56:02.176061    2586 daemon.go:971] Validated on-disk state
I0623 15:56:02.196827    2586 daemon.go:1005] Completing pending config rendered-worker-266d98cce7d051b2576afb3add50ec44
I0623 15:56:02.204937    2586 update.go:1291] completed update for config rendered-worker-266d98cce7d051b2576afb3add50ec44
I0623 15:56:02.212888    2586 daemon.go:1021] In desired config rendered-worker-266d98cce7d051b2576afb3add50ec44



However, the kernel was not reverted, and the kernel arguments are still present in the cmdline:

# cat /proc/cmdline 
BOOT_IMAGE=(hd0,gpt1)/ostree/rhcos-fc107e60e7e98cdbc54ca91fd9294d4cbf2ff5447d7b81511d7204df2f0b0e6c/vmlinuz-4.18.0-147.8.1.rt24.101.el8_1.x86_64 rhcos.root=crypt_rootfs console=tty0 console=ttyS0,115200n8 rd.luks.options=discard ostree=/ostree/boot.0/rhcos/fc107e60e7e98cdbc54ca91fd9294d4cbf2ff5447d7b81511d7204df2f0b0e6c/0 ignition.platform.id=openstack nohz=on nosoftlockup skew_tick=1 intel_pstate=disable intel_iommu=on iommu=pt rcu_nocbs=1-3 tuned.non_isolcpus=00000001 default_hugepagesz=1G hugepagesz=1G hugepages=1
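To make the comparison explicit, the following sketch extracts the `--delete=` arguments from the machine-config-daemon log line above and checks which of them still appear in the kernel command line. The function names are hypothetical; the log line is copied from the log above, and `CMDLINE_TAIL` is the tail of the `/proc/cmdline` output above.

```python
import re

MCD_LOG_LINE = (
    "Running rpm-ostree [kargs --delete=nohz=on --delete=nosoftlockup "
    "--delete=skew_tick=1 --delete=intel_pstate=disable --delete=intel_iommu=on "
    "--delete=iommu=pt --delete=rcu_nocbs=1-3 --delete=tuned.non_isolcpus=00000001 "
    "--delete=default_hugepagesz=1G --delete=hugepagesz=1G --delete=hugepages=1]"
)

CMDLINE_TAIL = (
    "ignition.platform.id=openstack nohz=on nosoftlockup skew_tick=1 "
    "intel_pstate=disable intel_iommu=on iommu=pt rcu_nocbs=1-3 "
    "tuned.non_isolcpus=00000001 default_hugepagesz=1G hugepagesz=1G hugepages=1"
)


def deleted_kargs(log_line):
    """Kernel arguments that the logged rpm-ostree call asked to delete."""
    return re.findall(r"--delete=([^\s\]]+)", log_line)


def leftover_kargs(cmdline, deleted):
    """Arguments from `deleted` that are still present as whole tokens."""
    present = set(cmdline.split())
    return [arg for arg in deleted if arg in present]


leftover = leftover_kargs(CMDLINE_TAIL, deleted_kargs(MCD_LOG_LINE))
print(len(leftover), "of", len(deleted_kargs(MCD_LOG_LINE)), "deleted kargs still present")
```

With the values from this comment, every argument that rpm-ostree was asked to delete is still on the booted command line, which is consistent with the staged deployment never having been finalized.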

Comment 8 Denys Shchedrivyi 2020-06-27 15:49:41 UTC
Here is the relevant log from journalctl:


Jun 27 14:27:25 worker-4 systemd[1]: Unmounting Boot partition...
Jun 27 14:27:25 worker-4 systemd[1]: Unmounting /var/lib/containers/storage/overlay...
Jun 27 14:27:25 worker-4 systemd[1]: Stopped target Host and Network Name Lookups.
Jun 27 14:27:25 worker-4 systemd[1]: Removed slice system-serial\x2dgetty.slice.
Jun 27 14:27:25 worker-4 systemd[1]: system-serial\x2dgetty.slice: Consumed 175ms CPU time
Jun 27 14:27:25 worker-4 systemd[1]: Stopping Permit User Sessions...
Jun 27 14:27:25 worker-4 systemd[1]: Removed slice system-getty.slice.
Jun 27 14:27:25 worker-4 systemd[1]: system-getty.slice: Consumed 37ms CPU time
Jun 27 14:27:25 worker-4 systemd[1]: Unmounted Boot partition.
Jun 27 14:27:25 worker-4 systemd[1]: boot.mount: Consumed 26ms CPU time
.
Jun 27 14:27:25 worker-4 ostree[55230]: error: Unexpected state: /run/ostree-booted found, but no /boot/loader directory
Jun 27 14:27:25 worker-4 systemd[1]: ostree-finalize-staged.service: Control process exited, code=exited status=1
Jun 27 14:27:25 worker-4 systemd[1]: ostree-finalize-staged.service: Failed with result 'exit-code'.
Jun 27 14:27:25 worker-4 systemd[1]: Stopped OSTree Finalize Staged Deployment.
Jun 27 14:27:25 worker-4 systemd[1]: ostree-finalize-staged.service: Consumed 37ms CPU time


As I understand it, the /boot partition is unmounted first, and after that ostree cannot finish its job there.
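A quick way to check other shutdown journals for the same ordering is sketched below. It looks for the two messages seen in the log above and reports whether the boot partition was unmounted before ostree-finalize-staged.service failed. Because all entries share the same one-second timestamp, it relies on line order only; the function name is hypothetical.

```python
def unmounted_before_finalize(journal_lines):
    """True if 'Unmounted Boot partition' precedes the finalize failure."""
    unmount_idx = None
    fail_idx = None
    for i, line in enumerate(journal_lines):
        if unmount_idx is None and "Unmounted Boot partition" in line:
            unmount_idx = i
        if fail_idx is None and "ostree-finalize-staged.service: Failed" in line:
            fail_idx = i
    if unmount_idx is None or fail_idx is None:
        return False
    return unmount_idx < fail_idx


# Lines copied from the journal excerpt above.
JOURNAL = [
    "Jun 27 14:27:25 worker-4 systemd[1]: Unmounted Boot partition.",
    "Jun 27 14:27:25 worker-4 ostree[55230]: error: Unexpected state: "
    "/run/ostree-booted found, but no /boot/loader directory",
    "Jun 27 14:27:25 worker-4 systemd[1]: ostree-finalize-staged.service: "
    "Failed with result 'exit-code'.",
]

print(unmounted_before_finalize(JOURNAL))
```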

Comment 9 Martin Sivák 2020-09-11 06:23:33 UTC
Closing as the fix happened in RHCOS and will be part of the next OCP release.

Comment 10 Red Hat Bugzilla 2023-09-15 00:32:36 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days

