Description of problem:
The kernel bug in the HRTICK subsystem has been resolved, and the fixed versions are now getting into RHCOS. It is time for us to re-enable stalld in the PAO tuned profile.

Version-Release number of selected component (if applicable):
The fix is present in:
- kernel-4.18.0-240.20.1.el8_3 (released on Apr 6th, see bug 1930735)
- RHEL 8.4 nightly (bug 1912118)

RHCOS that contains this kernel:
- 4.7: builds from Apr 8th and Apr 9th have kernel-4.18.0-240.22.1.el8_3
- 4.8: none yet (last build from Apr 4th is too old)

Action items:
Let's prepare the patches for 4.8 and merge them to the 4.8 branch once there is a sane RHCOS build, then backport to 4.7 right after that (have the patches ready).
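For anyone checking a given RHCOS build by hand, the comparison against the fixed kernel can be sketched roughly as below. This is only an illustration: the helper names `release_key` and `has_hrtick_fix` are made up here, and the comparison is a plain numeric one on the part of the release string before `.el`, not a full RPM version comparison. It is only meaningful within the el8_3 z-stream; a RHEL 8.4 kernel would need to be checked against bug 1912118 separately.

```python
import re

# First el8_3 kernel that carries the HRTICK fix, as stated in this bug.
FIXED_KERNEL = "4.18.0-240.20.1.el8_3"


def release_key(kernel):
    """Return the numeric fields before '.el' as a tuple of ints.

    '4.18.0-240.22.1.el8_3' -> (4, 18, 0, 240, 22, 1)
    """
    base = kernel.split(".el", 1)[0]
    return tuple(int(n) for n in re.findall(r"\d+", base))


def has_hrtick_fix(kernel, fixed=FIXED_KERNEL):
    """True if 'kernel' is numerically at or above the fixed release.

    Assumption: both strings come from the same el8_3 stream.
    """
    return release_key(kernel) >= release_key(fixed)


if __name__ == "__main__":
    # Kernel reported for the 4.7 RHCOS builds from Apr 8th and Apr 9th.
    print(has_hrtick_fix("4.18.0-240.22.1.el8_3"))
```

A leading `kernel-` prefix or a trailing architecture suffix does not affect the result, since only the digits before `.el` are compared.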
What happens with existing deployments of OpenShift that are upgraded? Will stalld be re-enabled on those clusters, or just net-new installs?
I think it will be re-enabled, but obviously we need to test and verify that.
The current OCP 4.7 nightlies already contain the right kernel: 4.7.0-0.nightly-2021-04-10-082109
https://openshift-release.apps.ci.l2s4.p1.openshiftapps.com/releasestream/4.7.0-0.nightly/release/4.7.0-0.nightly-2021-04-10-082109
https://releases-rhcos-art.cloud.privileged.psi.redhat.com/contents.html?stream=releases%2Frhcos-4.7&release=47.83.202104090345-0
However, the 4.8 nightlies have been broken for some time, and the RHCOS there is not yet up to date:
https://openshift-release.apps.ci.l2s4.p1.openshiftapps.com/#4.8.0-0.nightly
OCP 4.8 CI already contains the fixed kernel (based on 8.4 beta)!
https://openshift-release.apps.ci.l2s4.p1.openshiftapps.com/releasestream/4.8.0-0.ci/release/4.8.0-0.ci-2021-04-14-161658
https://releases-rhcos-art.cloud.privileged.psi.redhat.com/contents.html?stream=releases%2Frhcos-4.8&release=48.84.202104131355-0
kernel 4.18.0 240.22.1.el8_3 x86_64
Awesome:
[service]
service.stalld=start,enable
This is what we were looking for.
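If this check needs to be repeated on other clusters, for example after an upgrade, a rough sketch along these lines could confirm that a rendered tuned profile carries the stalld setting. It assumes the profile text has already been obtained somehow and that it parses as plain INI; real tuned profiles may contain constructs that `configparser` does not handle. The function name `stalld_enabled` is hypothetical.

```python
import configparser


def stalld_enabled(profile_text):
    """Return True if [service] has service.stalld with both start and enable."""
    # strict=False tolerates repeated sections or keys; interpolation is
    # disabled so that '%' characters in a profile are not interpreted.
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(profile_text)
    value = parser.get("service", "service.stalld", fallback="")
    actions = {item.strip() for item in value.split(",")}
    return {"start", "enable"} <= actions


if __name__ == "__main__":
    # The fragment quoted in the comment above.
    sample = "[service]\nservice.stalld=start,enable\n"
    print(stalld_enabled(sample))
```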