Bug 1845427
| Summary: | Removing 'worker-cnf' label from the worker node does not revert RT Kernel to non-RT Kernel. | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Gowrishankar Rajaiyan <grajaiya> |
| Component: | Performance Addon Operator | Assignee: | Yanir Quinn <yquinn> |
| Status: | CLOSED UPSTREAM | QA Contact: | Gowrishankar Rajaiyan <grajaiya> |
| Severity: | medium | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 4.4 | CC: | aos-bugs, dshchedr, fiezzi, grajaiya, scuppett, yquinn |
| Target Milestone: | --- | ||
| Target Release: | 4.6.0 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2020-09-11 06:23:33 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | 1827712 | ||
| Bug Blocks: | |||
*Note: The following information is from a _similar_ cluster where the issue could not be reproduced.*
# oc get node -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
master-0 Ready master 158m v1.17.1+3f6f40d 192.168.111.20 <none> Red Hat Enterprise Linux CoreOS 44.81.202006080130-0 (Ootpa) 4.18.0-147.8.1.el8_1.x86_64 cri-o://1.17.4-14.dev.rhaos4.4.gitb93af5d.el8
master-1 Ready master 158m v1.17.1+3f6f40d 192.168.111.21 <none> Red Hat Enterprise Linux CoreOS 44.81.202006080130-0 (Ootpa) 4.18.0-147.8.1.el8_1.x86_64 cri-o://1.17.4-14.dev.rhaos4.4.gitb93af5d.el8
master-2 Ready master 158m v1.17.1+3f6f40d 192.168.111.22 <none> Red Hat Enterprise Linux CoreOS 44.81.202006080130-0 (Ootpa) 4.18.0-147.8.1.el8_1.x86_64 cri-o://1.17.4-14.dev.rhaos4.4.gitb93af5d.el8
worker-0 Ready worker,worker-cnf 137m v1.17.1+3f6f40d 192.168.111.23 <none> Red Hat Enterprise Linux CoreOS 44.81.202006080130-0 (Ootpa) 4.18.0-147.8.1.rt24.101.el8_1.x86_64 cri-o://1.17.4-14.dev.rhaos4.4.gitb93af5d.el8
worker-1 Ready worker 138m v1.17.1+3f6f40d 192.168.111.24 <none> Red Hat Enterprise Linux CoreOS 44.81.202006080130-0 (Ootpa) 4.18.0-147.8.1.el8_1.x86_64 cri-o://1.17.4-14.dev.rhaos4.4.gitb93af5d.el8
worker-2 Ready worker 138m v1.17.1+3f6f40d 192.168.111.25 <none> Red Hat Enterprise Linux CoreOS 44.81.202006080130-0 (Ootpa) 4.18.0-147.8.1.el8_1.x86_64 cri-o://1.17.4-14.dev.rhaos4.4.gitb93af5d.el8
# oc describe performanceprofiles.performance.openshift.io performance
Name: performance
Namespace:
Labels: <none>
Annotations: kubectl.kubernetes.io/last-applied-configuration:
{"apiVersion":"performance.openshift.io/v1alpha1","kind":"PerformanceProfile","metadata":{"annotations":{},"name":"performance"},"spec":{"...
API Version: performance.openshift.io/v1alpha1
Kind: PerformanceProfile
Metadata:
Creation Timestamp: 2020-06-09T09:36:40Z
Finalizers:
foreground-deletion
Generation: 7
Resource Version: 91146
Self Link: /apis/performance.openshift.io/v1alpha1/performanceprofiles/performance
UID: a2bc52cc-9b13-4904-a47e-9f7a592daf5b
Spec:
Cpu:
Isolated: 1-3
Reserved: 0
Hugepages:
Default Hugepages Size: 1G
Pages:
Count: 1
Size: 1G
Node Selector:
node-role.kubernetes.io/worker-cnf:
Real Time Kernel:
Enabled: true
Status:
Conditions:
Last Heartbeat Time: 2020-06-09T10:08:01Z
Last Transition Time: 2020-06-09T10:08:01Z
Status: True
Type: Available
Last Heartbeat Time: 2020-06-09T10:08:01Z
Last Transition Time: 2020-06-09T10:08:01Z
Status: True
Type: Upgradeable
Last Heartbeat Time: 2020-06-09T10:08:01Z
Last Transition Time: 2020-06-09T10:08:01Z
Status: False
Type: Progressing
Last Heartbeat Time: 2020-06-09T10:08:01Z
Last Transition Time: 2020-06-09T10:08:01Z
Status: False
Type: Degraded
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Creation succeeded 87m performance-profile-controller Succeeded to create all components
Warning Creation failed 56m performance-profile-controller Failed to create all components: Operation cannot be fulfilled on machineconfigs.machineconfiguration.openshift.io "performance-performance": the object has been modified; please apply your changes to the latest version and try again
Normal Creation succeeded 10m (x5 over 69m) performance-profile-controller Succeeded to create all components
# oc get mcp
NAME CONFIG UPDATED UPDATING DEGRADED MACHINECOUNT READYMACHINECOUNT UPDATEDMACHINECOUNT DEGRADEDMACHINECOUNT AGE
master rendered-master-0112f787eaed5267898b3988aec444eb True False False 3 3 3 0 150m
worker rendered-worker-2e3a4df0240eb8745b2845521a1f28f8 True False False 2 2 2 0 150m
worker-cnf rendered-worker-cnf-5c311bbe9c9f762497d1753a8fb27536 True False False 1 1 1 0 97m
# oc describe mcp worker-cnf
Name: worker-cnf
Namespace:
Labels: machineconfiguration.openshift.io/role=worker-cnf
Annotations: kubectl.kubernetes.io/last-applied-configuration:
{"apiVersion":"machineconfiguration.openshift.io/v1","kind":"MachineConfigPool","metadata":{"annotations":{},"labels":{"machineconfigurati...
API Version: machineconfiguration.openshift.io/v1
Kind: MachineConfigPool
Metadata:
Creation Timestamp: 2020-06-09T09:29:14Z
Generation: 6
Resource Version: 79333
Self Link: /apis/machineconfiguration.openshift.io/v1/machineconfigpools/worker-cnf
UID: 68b55a8e-1d12-4e8f-b57d-86cea701f80d
Spec:
Configuration:
Name: rendered-worker-cnf-5c311bbe9c9f762497d1753a8fb27536
Source:
API Version: machineconfiguration.openshift.io/v1
Kind: MachineConfig
Name: 00-worker
API Version: machineconfiguration.openshift.io/v1
Kind: MachineConfig
Name: 00-worker-chronyd-custom
API Version: machineconfiguration.openshift.io/v1
Kind: MachineConfig
Name: 01-worker-container-runtime
API Version: machineconfiguration.openshift.io/v1
Kind: MachineConfig
Name: 01-worker-kubelet
API Version: machineconfiguration.openshift.io/v1
Kind: MachineConfig
Name: 98-worker-92ef12b3-6b66-4348-b91f-f49b08f49348-kubelet
API Version: machineconfiguration.openshift.io/v1
Kind: MachineConfig
Name: 98-worker-cnf-68b55a8e-1d12-4e8f-b57d-86cea701f80d-kubelet
API Version: machineconfiguration.openshift.io/v1
Kind: MachineConfig
Name: 99-worker-92ef12b3-6b66-4348-b91f-f49b08f49348-registries
API Version: machineconfiguration.openshift.io/v1
Kind: MachineConfig
Name: 99-worker-cnf-68b55a8e-1d12-4e8f-b57d-86cea701f80d-kubelet
API Version: machineconfiguration.openshift.io/v1
Kind: MachineConfig
Name: 99-worker-registries
API Version: machineconfiguration.openshift.io/v1
Kind: MachineConfig
Name: 99-worker-ssh
API Version: machineconfiguration.openshift.io/v1
Kind: MachineConfig
Name: performance-performance
Machine Config Selector:
Match Expressions:
Key: machineconfiguration.openshift.io/role
Operator: In
Values:
worker-cnf
worker
Node Selector:
Match Labels:
node-role.kubernetes.io/worker-cnf:
Paused: false
# oc describe mc performance-performance
Name: performance-performance
Namespace:
Labels: machineconfiguration.openshift.io/role=worker-cnf
Annotations: <none>
API Version: machineconfiguration.openshift.io/v1
Kind: MachineConfig
Metadata:
Creation Timestamp: 2020-06-09T09:36:49Z
Generation: 4
Owner References:
API Version: performance.openshift.io/v1alpha1
Block Owner Deletion: true
Controller: true
Kind: PerformanceProfile
Name: performance
UID: a2bc52cc-9b13-4904-a47e-9f7a592daf5b
Resource Version: 91150
Self Link: /apis/machineconfiguration.openshift.io/v1/machineconfigs/performance-performance
UID: 841ae0a3-4762-4602-900f-10e988fcb14e
Spec:
Config:
Ignition:
Config:
Security:
Tls:
Timeouts:
Version: 2.2.0
Networkd:
Passwd:
Storage:
Files:
Contents:
Source: data:text/plain;charset=utf-8;base64,IyEvdXNyL2Jpbi9lbnYgYmFzaAoKc2V0IC1ldW8gcGlwZWZhaWwKClNZU1RFTV9DT05GSUdfRklMRT0iL2V0Yy9zeXN0ZW1kL3N5c3RlbS5jb25mIgpTWVNURU1fQ09ORklHX0NVU1RPTV9GSUxFPSIvZXRjL3N5c3RlbWQvc3lzdGVtLmNvbmYuZC9zZXRBZmZpbml0eS5jb25mIgoKaWYgWyAtZiAvZXRjL3N5c2NvbmZpZy9pcnFiYWxhbmNlIF0gJiYgWyAtZiAke1NZU1RFTV9DT05GSUdfQ1VTVE9NX0ZJTEV9IF0gICYmIHJwbS1vc3RyZWUgc3RhdHVzIC1iIHwgZ3JlcCAtcSAtZSAiJHtTWVNURU1fQ09ORklHX0ZJTEV9ICR7U1lTVEVNX0NPTkZJR19DVVNUT01fRklMRX0iICYmIGVncmVwIC13cSAiXklSUUJBTEFOQ0VfQkFOTkVEX0NQVVM9JHtSRVNFUlZFRF9DUFVfTUFTS19JTlZFUlR9IiAvZXRjL3N5c2NvbmZpZy9pcnFiYWxhbmNlOyB0aGVuCiAgICBlY2hvICJQcmUgYm9vdCB0dW5pbmcgY29uZmlndXJhdGlvbiBhbHJlYWR5IGFwcGxpZWQiCmVsc2UKICAgICNTZXQgSVJRIGJhbGFuY2UgYmFubmVkIGNwdXMKICAgIGlmIFsgISAtZiAvZXRjL3N5c2NvbmZpZy9pcnFiYWxhbmNlIF07IHRoZW4KICAgICAgICB0b3VjaCAvZXRjL3N5c2NvbmZpZy9pcnFiYWxhbmNlCiAgICBmaQoKICAgIGlmIGdyZXAgLWxzICJJUlFCQUxBTkNFX0JBTk5FRF9DUFVTPSIgL2V0Yy9zeXNjb25maWcvaXJxYmFsYW5jZTsgdGhlbgogICAgICAgIHNlZCAtaSAicy9eLipJUlFCQUxBTkNFX0JBTk5FRF9DUFVTPS4qJC9JUlFCQUxBTkNFX0JBTk5FRF9DUFVTPSR7UkVTRVJWRURfQ1BVX01BU0tfSU5WRVJUfS8iIC9ldGMvc3lzY29uZmlnL2lycWJhbGFuY2UKICAgIGVsc2UKICAgICAgICBlY2hvICJJUlFCQUxBTkNFX0JBTk5FRF9DUFVTPSR7UkVTRVJWRURfQ1BVX01BU0tfSU5WRVJUfSIgPj4vZXRjL3N5c2NvbmZpZy9pcnFiYWxhbmNlCiAgICBmaQoKICAgIHJwbS1vc3RyZWUgaW5pdHJhbWZzIC0tZW5hYmxlIC0tYXJnPS1JIC0tYXJnPSIke1NZU1RFTV9DT05GSUdfRklMRX0gJHtTWVNURU1fQ09ORklHX0NVU1RPTV9GSUxFfSIgCgogICAgdG91Y2ggL3Zhci9yZWJvb3QKZmkK
Verification:
Filesystem: root
Mode: 448
Path: /usr/local/bin/pre-boot-tuning.sh
Contents:
Source: data:text/plain;charset=utf-8;base64,IyEvdXNyL2Jpbi9lbnYgYmFzaAoKc2V0IC1ldW8gcGlwZWZhaWwKCm5vZGVzX3BhdGg9Ii9zeXMvZGV2aWNlcy9zeXN0ZW0vbm9kZSIKaHVnZXBhZ2VzX2ZpbGU9IiR7bm9kZXNfcGF0aH0vbm9kZSR7TlVNQV9OT0RFfS9odWdlcGFnZXMvaHVnZXBhZ2VzLSR7SFVHRVBBR0VTX1NJWkV9a0IvbnJfaHVnZXBhZ2VzIgoKaWYgWyAhIC1mICAke2h1Z2VwYWdlc19maWxlfSBdOyB0aGVuCiAgICBlY2hvICJFUlJPUjogJHtodWdlcGFnZXNfZmlsZX0gZG9lcyBub3QgZXhpc3QiCiAgICBleGl0IDEKZmkKCmVjaG8gJHtIVUdFUEFHRVNfQ09VTlR9ID4gJHtodWdlcGFnZXNfZmlsZX0K
Verification:
Filesystem: root
Mode: 448
Path: /usr/local/bin/hugepages-allocation.sh
Contents:
Source: data:text/plain;charset=utf-8;base64,IyEvdXNyL2Jpbi9lbnYgYmFzaAoKc2V0IC1ldW8gcGlwZWZhaWwKCmlmIFtbIC1mIC92YXIvcmVib290IF1dOyB0aGVuIAogICAgcm0gLWYgL3Zhci9yZWJvb3QKICAgIGVjaG8gIkZpbGUgL3Zhci9yZWJvb3QgZXhpc3RzLCBpbml0aWF0ZSByZWJvb3QiCiAgICBzeXN0ZW1jdGwgcmVib290CmZpCg==
Verification:
Filesystem: root
Mode: 448
Path: /usr/local/bin/reboot.sh
Contents:
Source: data:text/plain;charset=utf-8;base64,W01hbmFnZXJdCkNQVUFmZmluaXR5PTA=
Verification:
Filesystem: root
Mode: 448
Path: /etc/systemd/system.conf.d/setAffinity.conf
Systemd:
Units:
Contents: [Unit]
Description=Preboot tuning patch
Before=kubelet.service
Before=reboot.service
[Service]
Environment=RESERVED_CPUS=0
Environment=RESERVED_CPU_MASK_INVERT=ffffffff,fffffffe
Type=oneshot
RemainAfterExit=true
ExecStart=/usr/local/bin/pre-boot-tuning.sh
[Install]
WantedBy=multi-user.target
Enabled: true
Name: pre-boot-tuning.service
Contents: [Unit]
Description=Reboot initiated by pre-boot-tuning
Wants=network-online.target
After=network-online.target
Before=kubelet.service
[Service]
Type=oneshot
RemainAfterExit=true
ExecStart=/usr/local/bin/reboot.sh
[Install]
WantedBy=multi-user.target
Enabled: true
Name: reboot.service
Fips: false
Kernel Arguments:
nohz=on
nosoftlockup
skew_tick=1
intel_pstate=disable
intel_iommu=on
iommu=pt
rcu_nocbs=1-3
tuned.non_isolcpus=00000001
default_hugepagesz=1G
hugepagesz=1G
hugepages=1
Kernel Type: realtime
Os Image URL:
Events: <none>
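The `Source:` fields in the MachineConfig output above are `data:` URLs that carry base64-encoded file contents. As a minimal sketch, they can be decoded with the Python standard library to see what the MachineConfig writes to the node. The helper name `decode_data_url` is hypothetical and not part of any OpenShift tooling. The example uses the shortest payload, the one for /etc/systemd/system.conf.d/setAffinity.conf.

```python
import base64


def decode_data_url(source: str) -> str:
    """Decode a base64 'data:' URL such as the Source fields shown above."""
    header, _, payload = source.partition(",")
    if not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("expected a base64-encoded data: URL")
    return base64.b64decode(payload).decode("utf-8")


# Source value of /etc/systemd/system.conf.d/setAffinity.conf, copied from the output above.
source = "data:text/plain;charset=utf-8;base64,W01hbmFnZXJdCkNQVUFmZmluaXR5PTA="
print(decode_data_url(source))
# prints:
# [Manager]
# CPUAffinity=0
```

The same helper can be applied to the longer payloads to read the pre-boot-tuning.sh, hugepages-allocation.sh and reboot.sh scripts.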
# oc get node worker-1 -o yaml
apiVersion: v1
kind: Node
metadata:
annotations:
machine.openshift.io/machine: openshift-machine-api/ostest-worker-0-f78ck
machineconfiguration.openshift.io/currentConfig: rendered-worker-2e3a4df0240eb8745b2845521a1f28f8
machineconfiguration.openshift.io/desiredConfig: rendered-worker-2e3a4df0240eb8745b2845521a1f28f8
machineconfiguration.openshift.io/reason: ""
machineconfiguration.openshift.io/state: Done
volumes.kubernetes.io/controller-managed-attach-detach: "true"
creationTimestamp: "2020-06-09T08:56:29Z"
labels:
beta.kubernetes.io/arch: amd64
beta.kubernetes.io/os: linux
kubernetes.io/arch: amd64
kubernetes.io/hostname: worker-1
kubernetes.io/os: linux
node-role.kubernetes.io/worker: ""
node.openshift.io/os_id: rhcos
ptp/slave: ""
name: worker-1
resourceVersion: "97743"
selfLink: /api/v1/nodes/worker-1
uid: 4560f482-ce60-4aa2-8237-5864c38727da
spec: {}
status:
addresses:
- address: 192.168.111.24
type: InternalIP
- address: worker-1
type: Hostname
allocatable:
cpu: 3500m
ephemeral-storage: "17683605064"
hugepages-1Gi: "0"
hugepages-2Mi: "0"
memory: 6995204Ki
pods: "250"
capacity:
cpu: "4"
ephemeral-storage: 19876Mi
hugepages-1Gi: "0"
hugepages-2Mi: "0"
memory: 8146180Ki
pods: "250"
conditions:
- lastHeartbeatTime: "2020-06-09T11:09:13Z"
lastTransitionTime: "2020-06-09T10:49:13Z"
message: kubelet has sufficient memory available
reason: KubeletHasSufficientMemory
status: "False"
type: MemoryPressure
- lastHeartbeatTime: "2020-06-09T11:09:13Z"
lastTransitionTime: "2020-06-09T10:49:13Z"
message: kubelet has no disk pressure
reason: KubeletHasNoDiskPressure
status: "False"
type: DiskPressure
- lastHeartbeatTime: "2020-06-09T11:09:13Z"
lastTransitionTime: "2020-06-09T10:49:13Z"
message: kubelet has sufficient PID available
reason: KubeletHasSufficientPID
status: "False"
type: PIDPressure
- lastHeartbeatTime: "2020-06-09T11:09:13Z"
lastTransitionTime: "2020-06-09T10:49:13Z"
message: kubelet is posting ready status
reason: KubeletReady
status: "True"
type: Ready
daemonEndpoints:
kubeletEndpoint:
Port: 10250
images:
- names:
- quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:ec5c9406f29ac98580228688db7849590f949e01df39952327adfb261197c16c
- quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
sizeBytes: 773373898
- names:
- quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:d5441ced3440395aa27b2d6ceec3315acf55f2dccc19d76a2f0e704d30b77cc0
- quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
sizeBytes: 467404338
- names:
- quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:a16b0dab9867070830d18ba5cab98d02b92fa367b69d464e6e22860f0a6293e0
- quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
sizeBytes: 454102624
- names:
- quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:398e40715b7e39428c2c8d8cedfcc024cf0bed8844b80bf22ff4107345e2e298
- quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
sizeBytes: 429949466
- names:
- quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:79a4986a92c449401e69c14f34f2e0ad92bc219dd43f716ed806d551e2e09f72
- quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
sizeBytes: 428902230
- names:
- quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:d879eb6e4426c976d1fffef15a38ff2454bca5d382c13c7b157181db9373f43e
- quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
sizeBytes: 367621904
- names:
- quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:81dce0d0947963d8ce70c74bc63eaf2f2cdf17c40b839623c39d52f6a5ef2d61
- quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
sizeBytes: 364552618
- names:
- quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:127b6610891c6e38d8ee474e134b9799ecc6cf0ae9be659f06f933ea84e7c877
- quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
sizeBytes: 342705361
- names:
- quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:b89e9a469c30c0b707766d0ae31665ccd6a13135c8074087f617ddee12860774
- quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
sizeBytes: 338328160
- names:
- quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:9e4d6bd70d5c481267ec2c420b7ee40d2addecb190e4787ac912996a29c90d00
- quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
sizeBytes: 334993182
- names:
- quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:ba38bc55a210372ca5243310f470d98e6b0d6712609eccfbcc4e5961c1cb0205
- quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
sizeBytes: 333782407
- names:
- quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:8376e01b6c5a4f045548a16946d5b24918ddd88f156e26249e2ff38ab539d512
- quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
sizeBytes: 325752144
- names:
- quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:c5ec8d53faf22153f54fc003b094fa263b769b50a457c878824c800e344b1b2f
- quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
sizeBytes: 318103617
- names:
- quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:19f3539c87ed84a02ee5a15438928083aa9e82294ff9a8dcb86d6eeeeff68c0b
- quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
sizeBytes: 311095159
- names:
- quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:3e185bf343f119d7e64d3d4297ad358c86a613bd047f8d45176a0a484f5d87f3
- quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
sizeBytes: 305661722
- names:
- quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:43e5bf12c08a33946f99f475ae062187242d5ec74b7d23b48621a1b9f91e44b3
- quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
sizeBytes: 299586006
- names:
- quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:12cce448b8888f398604eeb6a3a7a14c2850cd9dfd463229bd0735aa68ada885
- quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
sizeBytes: 283513263
- names:
- quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:294f21bc3baa4221f3c02ed3e2dadcf3e61490813691579be9f05d69b3639f3c
- quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
sizeBytes: 277408258
- names:
- quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:dbc06ac527acaed79a4b189ba091fb4a37bfaa4b061f399011e07163fa4fcd26
- quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
sizeBytes: 277232582
- names:
- quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:9166f5f53cbaddbf11d4895c96d73b16a9b9002cb3fadefd9f7e13e96ee16edc
- quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
sizeBytes: 276916452
- names:
- quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:c69cb2203988cae3506a58d8616ac928b7f2f796d147fbc6640f644398fb1949
- quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
sizeBytes: 267689884
- names:
- quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:6b3e62612cd9192b7af779ff34a95b039cac3058b0c39129eb37a7deec08908e
- quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
sizeBytes: 264283423
- names:
- quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:d7a71a527f95dd9b607cb2db3a2809003818573e6f550a96238ac2a55cc8a635
- quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
sizeBytes: 258272205
- names:
- quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:e08c2b9e5bf641a524636e7b699830d27464636e431dd992a4a49f56b27d4dd2
- quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
sizeBytes: 257276911
- names:
- quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:c14d0178b17e5184c17090bcc624da3e21f84497140865a2e41d6c3d54a0e42a
- quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
sizeBytes: 255893796
- names:
- quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:570784e273b695e401ef77926ca6acc2af523f706e9eff1ddd2bba1568d610a5
- quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
sizeBytes: 251107160
- names:
- quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:8f5e521c35f624e1d40570f476c590490ee8624c55834c22f3dfa7e7ab4fd79f
- quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
sizeBytes: 243361727
- names:
- quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:6b56ea96a9f19d51b2822719cb5b7e4cd34a5253f97f34538b64f3a7111489d8
- quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<none>
sizeBytes: 237382357
nodeInfo:
architecture: amd64
bootID: e26520f5-f180-4826-ac7a-b206ca16f76b
containerRuntimeVersion: cri-o://1.17.4-14.dev.rhaos4.4.gitb93af5d.el8
kernelVersion: 4.18.0-147.8.1.el8_1.x86_64
kubeProxyVersion: v1.17.1+3f6f40d
kubeletVersion: v1.17.1+3f6f40d
machineID: 7751b004e15743aea649269a173add91
operatingSystem: linux
osImage: Red Hat Enterprise Linux CoreOS 44.81.202006080130-0 (Ootpa)
systemUUID: 7751b004-e157-43ae-a649-269a173add91
I always wondered if the label selector in mcp is correct:
Machine Config Selector:
Match Expressions:
Key: machineconfiguration.openshift.io/role
Operator: In
Values:
worker-cnf
worker
Why not match just on worker-cnf?
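One possible reading, offered as a sketch and not as the operator authors' stated rationale: with the `In` operator, a MachineConfig matches when its `machineconfiguration.openshift.io/role` label equals any of the listed values, so the worker-cnf pool picks up the base `worker` configs as well as the `worker-cnf` ones. This is consistent with the Source list in the `oc describe mcp worker-cnf` output above. The function below is a hypothetical re-implementation of that matching rule for illustration only; it is not the machine-config-operator's code, and the labels on `00-worker` and `00-master` are assumed.

```python
ROLE_KEY = "machineconfiguration.openshift.io/role"


def matches_in(labels: dict, key: str, values: list) -> bool:
    """'In' match expression: the label exists and its value is one of the listed values."""
    return key in labels and labels[key] in values


# Labels as they would appear on three MachineConfigs.
# Only the performance-performance label is taken from the output above; the others are assumed.
configs = {
    "00-worker": {ROLE_KEY: "worker"},
    "performance-performance": {ROLE_KEY: "worker-cnf"},
    "00-master": {ROLE_KEY: "master"},
}

selected = sorted(
    name
    for name, labels in configs.items()
    if matches_in(labels, ROLE_KEY, ["worker-cnf", "worker"])
)
print(selected)  # ['00-worker', 'performance-performance']
```

Under this reading, matching on `worker-cnf` alone would select `performance-performance` but not the base worker configs.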
Automation hit this issue again. Observe that worker-1 has no 'worker-cnf' label; however, the RT kernel is still installed.
# oc get node -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
master-0 Ready master 4h18m v1.17.1+912792b 192.168.111.20 <none> Red Hat Enterprise Linux CoreOS 44.81.202006161946-0 (Ootpa) 4.18.0-147.20.1.el8_1.x86_64 cri-o://1.17.4-17.dev.rhaos4.4.gitf0cfdfc.el8
master-1 Ready master 4h17m v1.17.1+912792b 192.168.111.21 <none> Red Hat Enterprise Linux CoreOS 44.81.202006161946-0 (Ootpa) 4.18.0-147.20.1.el8_1.x86_64 cri-o://1.17.4-17.dev.rhaos4.4.gitf0cfdfc.el8
master-2 Ready master 4h17m v1.17.1+912792b 192.168.111.22 <none> Red Hat Enterprise Linux CoreOS 44.81.202006161946-0 (Ootpa) 4.18.0-147.20.1.el8_1.x86_64 cri-o://1.17.4-17.dev.rhaos4.4.gitf0cfdfc.el8
worker-0 Ready worker,worker-cnf 3h31m v1.17.1+912792b 192.168.111.23 <none> Red Hat Enterprise Linux CoreOS 44.81.202006161946-0 (Ootpa) 4.18.0-147.8.1.rt24.101.el8_1.x86_64 cri-o://1.17.4-17.dev.rhaos4.4.gitf0cfdfc.el8
worker-1 Ready worker 3h30m v1.17.1+912792b 192.168.111.24 <none> Red Hat Enterprise Linux CoreOS 44.81.202006161946-0 (Ootpa) 4.18.0-147.8.1.rt24.101.el8_1.x86_64 cri-o://1.17.4-17.dev.rhaos4.4.gitf0cfdfc.el8
worker-2 Ready worker 3h25m v1.17.1+912792b 192.168.111.25 <none> Red Hat Enterprise Linux CoreOS 44.81.202006161946-0 (Ootpa) 4.18.0-147.20.1.el8_1.x86_64 cri-o://1.17.4-17.dev.rhaos4.4.gitf0cfdfc.el8
worker-3 Ready worker 3h34m v1.17.1+912792b 192.168.111.26 <none> Red Hat Enterprise Linux CoreOS 44.81.202006161946-0 (Ootpa) 4.18.0-147.20.1.el8_1.x86_64 cri-o://1.17.4-17.dev.rhaos4.4.gitf0cfdfc.el8
From the machine-config-daemon logs, I see that it tries to remove the RT kernel and all the additional kernel arguments from the command line (at least the log shows that the rpm-ostree commands were executed without errors):
I0623 15:45:13.267134 2751 update.go:1291] Running rpm-ostree [kargs --delete=nohz=on --delete=nosoftlockup --delete=skew_tick=1 --delete=intel_pstate=disable --delete=intel_iommu=on --delete=iommu=pt --delete=rcu_nocbs=1-3 --delete=tuned.non_isolcpus=00000001 --delete=default_hugepagesz=1G --delete=hugepagesz=1G --delete=hugepages=1]
I0623 15:47:47.761103 2751 update.go:1291] Initiating switch from kernel realtime to default
I0623 15:47:47.769240 2751 update.go:1291] Switching to kernelType=default, invoking rpm-ostree ["override" "reset" "kernel" "kernel-core" "kernel-modules" "kernel-modules-extra" "--uninstall" "kernel-rt-core" "--uninstall" "kernel-rt-modules" "--uninstall" "kernel-rt-modules-extra"]
I0623 15:49:48.824461 2751 update.go:1291] initiating reboot: Node will reboot into config rendered-worker-266d98cce7d051b2576afb3add50ec44
I0623 15:49:50.421631 2751 daemon.go:553] Shutting down MachineConfigDaemon
I0623 15:51:05.278092 2586 start.go:74] Version: v4.4.0-202006160135-dirty (b6c95fea3987483780994c8a5809a6afd15a633d)
I0623 15:51:05.290953 2586 start.go:84] Calling chroot("/rootfs")
I0623 15:51:05.293832 2586 rpm-ostree.go:366] Running captured: rpm-ostree status --json
I0623 15:51:05.835798 2586 daemon.go:209] Booted osImageURL: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:0dcbd38df25ba42c4f62aba0520509e0fcab5c3ca2679df2f26d1e7de78f1f67 (44.81.202006161946-0)
I0623 15:51:05.845673 2586 metrics.go:106] Registering Prometheus metrics
I0623 15:51:05.845776 2586 metrics.go:111] Starting metrics listener on 127.0.0.1:8797
I0623 15:51:05.861258 2586 update.go:1291] Starting to manage node: worker-1
I0623 15:51:05.892696 2586 rpm-ostree.go:366] Running captured: rpm-ostree status
I0623 15:51:06.170340 2586 daemon.go:778] State: idle
AutomaticUpdates: disabled
Deployments:
* pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:0dcbd38df25ba42c4f62aba0520509e0fcab5c3ca2679df2f26d1e7de78f1f67
CustomOrigin: Managed by machine-config-operator
Version: 44.81.202006161946-0 (2020-06-16T19:52:18Z)
RemovedBasePackages: kernel-core kernel-modules kernel kernel-modules-extra 4.18.0-147.20.1.el8_1
LocalPackages: kernel-rt-core-4.18.0-147.8.1.rt24.101.el8_1.x86_64
kernel-rt-modules-4.18.0-147.8.1.rt24.101.el8_1.x86_64
kernel-rt-modules-extra-4.18.0-147.8.1.rt24.101.el8_1.x86_64
Initramfs: -I '/etc/systemd/system.conf /etc/systemd/system.conf.d/setAffinity.conf'
pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:0dcbd38df25ba42c4f62aba0520509e0fcab5c3ca2679df2f26d1e7de78f1f67
CustomOrigin: Managed by machine-config-operator
Version: 44.81.202006161946-0 (2020-06-16T19:52:18Z)
RemovedBasePackages: kernel-core kernel-modules kernel kernel-modules-extra 4.18.0-147.20.1.el8_1
LocalPackages: kernel-rt-core-4.18.0-147.8.1.rt24.101.el8_1.x86_64
kernel-rt-modules-4.18.0-147.8.1.rt24.101.el8_1.x86_64
kernel-rt-modules-extra-4.18.0-147.8.1.rt24.101.el8_1.x86_64
I0623 15:51:06.170367 2586 rpm-ostree.go:366] Running captured: journalctl --list-boots
I0623 15:51:06.214931 2586 daemon.go:785] journalctl --list-boots:
-6 7348b2aa010442fe8f0130c400f2934e Tue 2020-06-23 13:14:00 UTC—Tue 2020-06-23 13:27:49 UTC
-5 7bddde3d635b4625abd942c8615e8d61 Tue 2020-06-23 13:28:02 UTC—Tue 2020-06-23 14:08:06 UTC
-4 9de73ead42a442d089fef57fbe57cde9 Tue 2020-06-23 14:08:19 UTC—Tue 2020-06-23 14:30:48 UTC
-3 c35644ae4cc34beba4c9a48b3c705c23 Tue 2020-06-23 14:30:59 UTC—Tue 2020-06-23 15:35:30 UTC
-2 3ca184967bfd49e29b7999321d362137 Tue 2020-06-23 15:35:44 UTC—Tue 2020-06-23 15:38:48 UTC
-1 1a83ee5e6c48431e9ef47ae63ab554dd Tue 2020-06-23 15:39:04 UTC—Tue 2020-06-23 15:49:55 UTC
0 b749a5bc768f4a1c9a9e96e100617aeb Tue 2020-06-23 15:50:11 UTC—Tue 2020-06-23 15:51:06 UTC
I0623 15:51:06.215422 2586 daemon.go:528] Starting MachineConfigDaemon
I0623 15:51:06.215734 2586 daemon.go:535] Enabling Kubelet Healthz Monitor
E0623 15:51:09.944725 2586 reflector.go:153] github.com/openshift/machine-config-operator/pkg/generated/informers/externalversions/factory.go:101: Failed to list *v1.MachineConfig: Get https://172.30.0.1:443/apis/machineconfiguration.openshift.io/v1/machineconfigs?limit=500&resourceVersion=0: dial tcp 172.30.0.1:443: connect: no route to host
E0623 15:51:09.945801 2586 reflector.go:153] k8s.io/client-go/informers/factory.go:135: Failed to list *v1.Node: Get https://172.30.0.1:443/api/v1/nodes?limit=500&resourceVersion=0: dial tcp 172.30.0.1:443: connect: no route to host
I0623 15:56:02.145210 2586 daemon.go:731] Current config: rendered-worker-test-ca4f34a17c0142a571e1a28dbd605d89
I0623 15:56:02.145255 2586 daemon.go:732] Desired config: rendered-worker-266d98cce7d051b2576afb3add50ec44
I0623 15:56:02.161923 2586 update.go:1291] Disk currentConfig rendered-worker-266d98cce7d051b2576afb3add50ec44 overrides node annotation rendered-worker-test-ca4f34a17c0142a571e1a28dbd605d89
I0623 15:56:02.169620 2586 daemon.go:955] Validating against pending config rendered-worker-266d98cce7d051b2576afb3add50ec44
I0623 15:56:02.176061 2586 daemon.go:971] Validated on-disk state
I0623 15:56:02.196827 2586 daemon.go:1005] Completing pending config rendered-worker-266d98cce7d051b2576afb3add50ec44
I0623 15:56:02.204937 2586 update.go:1291] completed update for config rendered-worker-266d98cce7d051b2576afb3add50ec44
I0623 15:56:02.212888 2586 daemon.go:1021] In desired config rendered-worker-266d98cce7d051b2576afb3add50ec44
However, the kernel was not reverted, and the kernel arguments are still present in the command line:
# cat /proc/cmdline
BOOT_IMAGE=(hd0,gpt1)/ostree/rhcos-fc107e60e7e98cdbc54ca91fd9294d4cbf2ff5447d7b81511d7204df2f0b0e6c/vmlinuz-4.18.0-147.8.1.rt24.101.el8_1.x86_64 rhcos.root=crypt_rootfs console=tty0 console=ttyS0,115200n8 rd.luks.options=discard ostree=/ostree/boot.0/rhcos/fc107e60e7e98cdbc54ca91fd9294d4cbf2ff5447d7b81511d7204df2f0b0e6c/0 ignition.platform.id=openstack nohz=on nosoftlockup skew_tick=1 intel_pstate=disable intel_iommu=on iommu=pt rcu_nocbs=1-3 tuned.non_isolcpus=00000001 default_hugepagesz=1G hugepagesz=1G hugepages=1
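The check above can be automated. The following is a minimal sketch, assuming the kernel argument list from the performance-performance MachineConfig and assuming that an RT kernel can be recognized by `.rt` in the vmlinuz file name, as in this report's output. Both function names are hypothetical, and the sample command line is an abbreviated version of the one shown above.

```python
# Kernel arguments listed under "Kernel Arguments" in the performance-performance MachineConfig.
PROFILE_KARGS = [
    "nohz=on", "nosoftlockup", "skew_tick=1", "intel_pstate=disable",
    "intel_iommu=on", "iommu=pt", "rcu_nocbs=1-3",
    "tuned.non_isolcpus=00000001", "default_hugepagesz=1G",
    "hugepagesz=1G", "hugepages=1",
]


def leftover_kargs(cmdline: str, kargs: list) -> list:
    """Return the profile arguments that are still present on the kernel command line.

    Whole tokens are compared, so 'hugepagesz=1G' is not confused
    with 'default_hugepagesz=1G'.
    """
    present = set(cmdline.split())
    return [arg for arg in kargs if arg in present]


def boots_rt_kernel(cmdline: str) -> bool:
    """Heuristic (an assumption): the RT kernel image name contains '.rt'."""
    for token in cmdline.split():
        if token.startswith("BOOT_IMAGE="):
            return ".rt" in token.rsplit("/", 1)[-1]
    return False


# Abbreviated sample; on a node, read the real value from /proc/cmdline instead.
sample = (
    "BOOT_IMAGE=(hd0,gpt1)/ostree/rhcos-fc107e60e7e98cdbc54ca91fd9294d4cbf2ff5447d7b81511d7204df2f0b0e6c"
    "/vmlinuz-4.18.0-147.8.1.rt24.101.el8_1.x86_64 rhcos.root=crypt_rootfs "
    "nohz=on nosoftlockup skew_tick=1 hugepages=1"
)
print(boots_rt_kernel(sample))
print(leftover_kargs(sample, PROFILE_KARGS))
```

On a node that was reverted correctly, one would expect `boots_rt_kernel` to return False and `leftover_kargs` to return an empty list.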
Here is a log from journalctl:
Jun 27 14:27:25 worker-4 systemd[1]: Unmounting Boot partition...
Jun 27 14:27:25 worker-4 systemd[1]: Unmounting /var/lib/containers/storage/overlay...
Jun 27 14:27:25 worker-4 systemd[1]: Stopped target Host and Network Name Lookups.
Jun 27 14:27:25 worker-4 systemd[1]: Removed slice system-serial\x2dgetty.slice.
Jun 27 14:27:25 worker-4 systemd[1]: system-serial\x2dgetty.slice: Consumed 175ms CPU time
Jun 27 14:27:25 worker-4 systemd[1]: Stopping Permit User Sessions...
Jun 27 14:27:25 worker-4 systemd[1]: Removed slice system-getty.slice.
Jun 27 14:27:25 worker-4 systemd[1]: system-getty.slice: Consumed 37ms CPU time
Jun 27 14:27:25 worker-4 systemd[1]: Unmounted Boot partition.
Jun 27 14:27:25 worker-4 systemd[1]: boot.mount: Consumed 26ms CPU time
.
Jun 27 14:27:25 worker-4 ostree[55230]: error: Unexpected state: /run/ostree-booted found, but no /boot/loader directory
Jun 27 14:27:25 worker-4 systemd[1]: ostree-finalize-staged.service: Control process exited, code=exited status=1
Jun 27 14:27:25 worker-4 systemd[1]: ostree-finalize-staged.service: Failed with result 'exit-code'.
Jun 27 14:27:25 worker-4 systemd[1]: Stopped OSTree Finalize Staged Deployment.
Jun 27 14:27:25 worker-4 systemd[1]: ostree-finalize-staged.service: Consumed 37ms CPU time
As I understand it, the /boot partition is unmounted first, and after that ostree cannot finish its job there.
Closing as the fix happened in RHCOS and will be part of the next OCP release.
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days
Description of problem:
Removing the 'worker-cnf' label from the worker node should remove the RT Kernel from the worker node, thereby reverting it to its previous state.
Version-Release number of selected component (if applicable):
v4.4.0-77
How reproducible:
Not consistently reproducible, but we hit this issue twice during our nightly test execution.
Steps to Reproduce:
1. Install OCP 4.4.7
2. Label worker node with 'worker-cnf'.
3. Deploy Performance Addon Operator and Performance Profile with 'realTimeKernel: {enabled: true}'
4. Ensure that RT Kernel is installed on 'worker-cnf' labeled worker node.
5. Remove 'worker-cnf' label.
6. Verify if RT Kernel is reverted back to non-RT Kernel.
Actual results:
RT Kernel is still installed.
Expected results:
RT Kernel is reverted back to non-RT Kernel
Additional info:
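The final verification step in the reproduction steps can be sketched at the cluster level. This is a minimal, hypothetical helper and not part of the reporter's automation: it assumes the node list has the shape returned by `oc get nodes -o json` (an `items` array whose entries carry `metadata.labels` and `status.nodeInfo.kernelVersion`, the same fields visible in the node YAML earlier in this report), and it assumes that an RT kernel version string contains `.rt`.

```python
CNF_LABEL = "node-role.kubernetes.io/worker-cnf"


def unexpected_rt_nodes(node_list: dict) -> list:
    """Names of nodes that lack the worker-cnf label but still report an RT kernel."""
    names = []
    for node in node_list.get("items", []):
        labels = node["metadata"].get("labels", {})
        kernel = node["status"]["nodeInfo"]["kernelVersion"]
        # Assumption: RT kernel versions contain '.rt', as in this report's listings.
        if CNF_LABEL not in labels and ".rt" in kernel:
            names.append(node["metadata"]["name"])
    return names


def make_node(name: str, kernel: str, cnf: bool) -> dict:
    """Build a trimmed node object for the example below."""
    labels = {"node-role.kubernetes.io/worker": ""}
    if cnf:
        labels[CNF_LABEL] = ""
    return {
        "metadata": {"name": name, "labels": labels},
        "status": {"nodeInfo": {"kernelVersion": kernel}},
    }


# Trimmed sample modelled on the worker rows of the listing from the failing run.
sample = {
    "items": [
        make_node("worker-0", "4.18.0-147.8.1.rt24.101.el8_1.x86_64", cnf=True),
        make_node("worker-1", "4.18.0-147.8.1.rt24.101.el8_1.x86_64", cnf=False),
        make_node("worker-2", "4.18.0-147.20.1.el8_1.x86_64", cnf=False),
    ]
}
print(unexpected_rt_nodes(sample))  # ['worker-1']
```

Against a live cluster, the same function could be fed the parsed output of `oc get nodes -o json`; an empty result would indicate that every node without the label has reverted to the non-RT kernel.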