Bug 2111817 - rpm-ostreed start timeout on nodes with medium/high load
Summary: rpm-ostreed start timeout on nodes with medium/high load
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Machine Config Operator
Version: 4.11
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.12.0
Assignee: Colin Walters
QA Contact: Rio Liu
URL:
Whiteboard: UpdateRecommendationsBlocked
Depends On:
Blocks:
 
Reported: 2022-07-28 09:03 UTC by Rio Liu
Modified: 2023-04-19 11:20 UTC
CC List: 21 users

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-01-17 19:53:39 UTC
Target Upstream Version:
Embargoed:


Attachments
rpm-ostreed full log (33.33 KB, text/plain), 2022-07-28 09:03 UTC, Rio Liu


Links
Github coreos/rpm-ostree pull 3905: Merged, "unit: Bump `TimeoutStartSec=5m`" (last updated 2022-08-15 18:48:37 UTC)
Github openshift/machine-config-operator pull 3291: Merged, "Bug 2111817: daemon: Add a workaround for bug 2111817" (last updated 2022-08-17 09:38:21 UTC)
Red Hat Knowledge Base (Solution) 6972394 (last updated 2022-08-17 21:46:38 UTC)
Red Hat Product Errata RHSA-2022:7399 (last updated 2023-01-17 19:54:27 UTC)

Description Rio Liu 2022-07-28 09:03:45 UTC
Created attachment 1899889 [details]
rpm-ostreed full log

Thanks for reporting your issue!

In order for the CoreOS team to be able to quickly and successfully triage your issue, please fill out the following template as completely as possible.

Be ready for follow-up questions and please respond in a timely manner.

If we can't reproduce a bug, we might close your issue.

---

OCP Version at Install Time: 4.11.0-rc.5
RHCOS Version at Install Time: 411.86.202207210724-0
OCP Version after Upgrade (if applicable): 4.11.0-rc.6
RHCOS Version after Upgrade (if applicable): 411.86.202207260558-0
Platform: AWS
Architecture: aarch64


What are you trying to do? What is your use case?

upgrade cluster from 4.11.0-rc.5-aarch64 to 4.11.0-rc.6-aarch64

What happened? What went wrong or what did you expect?

One node is degraded, with the error below.
>> MCD log

2022-07-27T16:23:02.460274359+00:00 stderr F I0727 16:23:02.460033    2824 rpm-ostree.go:324] Running captured: rpm-ostree kargs
2022-07-27T16:25:59.136027675+00:00 stderr F W0727 16:25:58.797941    2824 daemon.go:937] Got an error from auxiliary tools: kubelet health check has failed 1 times: Get "http://localhost:10248/healthz": context deadline exceeded
2022-07-27T16:26:31.156364046+00:00 stderr F E0727 16:26:31.037973    2824 daemon.go:579] Preflight config drift check failed: error running rpm-ostree kargs: exit status 1
2022-07-27T16:26:31.156364046+00:00 stderr F Job for rpm-ostreed.service failed because a timeout was exceeded.
2022-07-27T16:26:31.156364046+00:00 stderr F See "systemctl status rpm-ostreed.service" and "journalctl -xe" for details.
2022-07-27T16:26:31.156364046+00:00 stderr F error: exit status: 1
2022-07-27T16:26:31.506536490+00:00 stderr F E0727 16:26:31.506484    2824 writer.go:200] Marking Degraded due to: error running rpm-ostree kargs: exit status 1
2022-07-27T16:26:31.506536490+00:00 stderr F Job for rpm-ostreed.service failed because a timeout was exceeded.
2022-07-27T16:26:31.506536490+00:00 stderr F See "systemctl status rpm-ostreed.service" and "journalctl -xe" for details.
2022-07-27T16:26:31.506536490+00:00 stderr F error: exit status: 1
2022-07-27T16:26:34.666869340+00:00 stderr F I0727 16:26:34.666830    2824 daemon.go:500] Transitioned from state: Done -> Degraded

>> check the rpm-ostreed service status

sh-4.4# systemctl status rpm-ostreed
● rpm-ostreed.service - rpm-ostree System Management Daemon
   Loaded: loaded (/usr/lib/systemd/system/rpm-ostreed.service; static; vendor preset: disabled)
  Drop-In: /usr/lib/systemd/system/rpm-ostreed.service.d
           └─startlimit.conf
   Active: inactive (dead)
     Docs: man:rpm-ostree(1)

>> errors in the rpm-ostreed journal log

Jul 27 15:39:58 ip-10-0-77-165 systemd[1]: rpm-ostreed.service: Succeeded.
Jul 27 15:39:58 ip-10-0-77-165 systemd[1]: rpm-ostreed.service: Consumed 1.989s CPU time
Jul 27 16:23:02 ip-10-0-77-165 systemd[1]: Starting rpm-ostree System Management Daemon...
Jul 27 16:24:33 ip-10-0-77-165 systemd[1]: rpm-ostreed.service: start operation timed out. Terminating.
Jul 27 16:24:34 ip-10-0-77-165 systemd[1]: rpm-ostreed.service: Failed with result 'timeout'.
Jul 27 16:24:34 ip-10-0-77-165 systemd[1]: Failed to start rpm-ostree System Management Daemon.
Jul 27 16:24:34 ip-10-0-77-165 systemd[1]: rpm-ostreed.service: Consumed 31.054s CPU time
Jul 27 16:26:37 ip-10-0-77-165 systemd[1]: Starting rpm-ostree System Management Daemon...
Jul 27 16:28:29 ip-10-0-77-165 systemd[1]: rpm-ostreed.service: start operation timed out. Terminating.
Jul 27 16:28:33 ip-10-0-77-165 systemd[1]: rpm-ostreed.service: Failed with result 'timeout'.
Jul 27 16:28:33 ip-10-0-77-165 systemd[1]: Failed to start rpm-ostree System Management Daemon.

The full log is attached.
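
The gaps between "Starting" and "start operation timed out" above show the service start timeout being exceeded. A minimal sketch for confirming the effective limit and spotting repeated timeouts on an affected node (commands only; not from the original report):

sh-4.4# systemctl show rpm-ostreed -p TimeoutStartUSec
sh-4.4# journalctl -u rpm-ostreed | grep -E 'Starting|start operation timed out'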

What are the steps to reproduce your issue? Please try to reduce these steps to something that can be reproduced with a single RHCOS node.

Upgrade from OCP 4.11.0-rc.5 to 4.11.0-rc.6 via job https://mastern-jenkins-csb-openshift-qe.apps.ocp-c1.prod.psi.redhat.com/job/ocp-upgrade/job/upgrade-pipeline/18076


Please add anything else that might be useful, for example:
- kernel command line (`cat /proc/cmdline`)

sh-4.4# cat /proc/cmdline
BOOT_IMAGE=(hd0,gpt3)/ostree/rhcos-4c33b63a3c959abf94a68f35de38c77aa8fbdc29b059e4174ccd9b639bc93df8/vmlinuz-4.18.0-372.16.1.el8_6.aarch64 random.trust_cpu=on ostree=/ostree/boot.0/rhcos/4c33b63a3c959abf94a68f35de38c77aa8fbdc29b059e4174ccd9b639bc93df8/0 ignition.platform.id=aws fips=1 boot=LABEL=boot root=UUID=4d7c7757-3243-4f71-bd21-8075be032112 rw rootflags=prjquota

- contents of `/etc/NetworkManager/system-connections/`

sh-4.4# ls /etc/NetworkManager/system-connections/
br-ex.nmconnection  ovs-if-br-ex.nmconnection  ovs-if-phys0.nmconnection  ovs-port-br-ex.nmconnection  ovs-port-phys0.nmconnection

- contents of `/etc/sysconfig/network-scripts/`

empty

Comment 1 Colin Walters 2022-07-28 13:47:13 UTC
Do these AWS instances not have provisioned IOPS?  When we run out of IOPS on non-provisioned instances, I think we'll see symptoms like this.
More information: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-io-characteristics.html

rpm-ostree is one of the first somewhat I/O intensive processes we run, but notice also:

2022-07-27T16:25:59.136027675+00:00 stderr F W0727 16:25:58.797941    2824 daemon.go:937] Got an error from auxiliary tools: kubelet health check has failed 1 times: Get "http://localhost:10248/healthz": context deadline exceeded

i.e. kubelet is also failing to respond.
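
(A quick way to check the root volume's type and IOPS for a suspect node, a sketch assuming the aws CLI is configured; the instance ID below is a placeholder, not from this report:)

$ aws ec2 describe-volumes \
    --filters Name=attachment.instance-id,Values=i-0123456789abcdef0 \
    --query 'Volumes[].{Type:VolumeType,Iops:Iops,Size:Size}'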

Comment 2 Rio Liu 2022-07-29 02:59:03 UTC
Colin,

I rebuilt the job to set up the same cluster and found that the EC2 instance of the worker node does not have a provisioned-IOPS SSD volume; the volume type is gp3 with 3000 IOPS. Screenshot attached.

Comment 4 zhaozhanqi 2022-07-29 07:42:27 UTC
It seems I can reproduce this issue after restarting the openvswitch service with 'systemctl restart openvswitch' on one worker.


with profile:  ipi-on-aws/versioned-installer-private_cluster-ovn-fips-etcd_encryption-arm-ci
version: 4.11.0-rc.5

# oc get node
NAME                                        STATUS     ROLES    AGE     VERSION
ip-10-0-48-223.us-east-2.compute.internal   Ready      worker   4h51m   v1.24.0+9546431
ip-10-0-57-4.us-east-2.compute.internal     Ready      master   4h59m   v1.24.0+9546431
ip-10-0-61-41.us-east-2.compute.internal    Ready      worker   4h49m   v1.24.0+9546431
ip-10-0-64-187.us-east-2.compute.internal   Ready      master   4h59m   v1.24.0+9546431
ip-10-0-73-158.us-east-2.compute.internal   Ready      master   4h59m   v1.24.0+9546431
ip-10-0-78-236.us-east-2.compute.internal   NotReady   worker   4h49m   v1.24.0+9546431


Events:
  Type     Reason                           Age                   From                 Message
  ----     ------                           ----                  ----                 -------
  Normal   NodeNotReady                     146m (x2 over 4h22m)  node-controller      Node ip-10-0-78-236.us-east-2.compute.internal status is now: NodeNotReady
  Warning  PreflightConfigDriftCheckFailed  136m (x5 over 150m)   machineconfigdaemon  error running rpm-ostree kargs: exit status 1
Job for rpm-ostreed.service failed because a timeout was exceeded.
See "systemctl status rpm-ostreed.service" and "journalctl -xe" for details.
error: exit status: 1

Comment 5 zhaozhanqi 2022-07-29 08:50:25 UTC
Please ignore comment 4; it is also due to the upgrade.

Comment 6 Rio Liu 2022-07-29 15:04:53 UTC
I set up a cluster with volume type io1 and 4900 IOPS.
>> job param: https://mastern-jenkins-csb-openshift-qe.apps.ocp-c1.prod.psi.redhat.com/job/ocp-common/job/Flexy-install/125635/parameters/
>> launch vars
vm_rootvolume_masters:
        type: io1
        iops: 4900
        size: 100
vm_rootvolume_workers:
        type: io1
        iops: 4900
        size: 100

>> verify the volume info on aws
$ aws ec2 describe-instances --instance-ids i-0030a7a0de27c7f73 |grep VolumeId
          VolumeId: vol-0f9d425cf7ebfa57c

$ aws ec2 describe-volumes --volume-ids vol-0f9d425cf7ebfa57c
- Volumes:
  - Attachments:
    - AttachTime: '2022-07-29T11:31:17+00:00'
      DeleteOnTermination: true
      Device: /dev/xvda
      InstanceId: i-0030a7a0de27c7f73
      State: attached
      VolumeId: vol-0f9d425cf7ebfa57c
    AvailabilityZone: us-east-2b
    CreateTime: '2022-07-29T11:31:17.883000+00:00'
    Encrypted: true
>>    Iops: 4900
    KmsKeyId: arn:aws:kms:us-east-2:301721915996:key/9fe48bb2-a03e-4efe-86ec-4796e044cc8b
    MultiAttachEnabled: false
    Size: 100
    SnapshotId: snap-0e17aaeecd7bd0701
    State: in-use
    Tags:
    - Key: kubernetes.io/cluster/rioliu-072905-kgx5q
      Value: owned
    - Key: Name
      Value: rioliu-072905-kgx5q-worker-us-east-2b-ncknw
    VolumeId: vol-0f9d425cf7ebfa57c
>>    VolumeType: io1

>> trigger upgrade job with this cluster
https://mastern-jenkins-csb-openshift-qe.apps.ocp-c1.prod.psi.redhat.com/job/ocp-upgrade/job/upgrade-pipeline/18297/console

The issue is reproduced again.

# oc describe ip-10-0-71-208.us-east-2.compute.internal
error: the server doesn't have a resource type "ip-10-0-71-208"
[root@rioliu0712 ~]#
[root@rioliu0712 ~]# oc describe node/ip-10-0-71-208.us-east-2.compute.internal
Name:               ip-10-0-71-208.us-east-2.compute.internal
Roles:              worker
Labels:             beta.kubernetes.io/arch=arm64
                    beta.kubernetes.io/instance-type=m6g.xlarge
                    beta.kubernetes.io/os=linux
                    failure-domain.beta.kubernetes.io/region=us-east-2
                    failure-domain.beta.kubernetes.io/zone=us-east-2b
                    kubernetes.io/arch=arm64
                    kubernetes.io/hostname=ip-10-0-71-208.us-east-2.compute.internal
                    kubernetes.io/os=linux
                    node-role.kubernetes.io/worker=
                    node.kubernetes.io/instance-type=m6g.xlarge
                    node.openshift.io/os_id=rhcos
                    topology.ebs.csi.aws.com/zone=us-east-2b
                    topology.kubernetes.io/region=us-east-2
                    topology.kubernetes.io/zone=us-east-2b
Annotations:        cloud.network.openshift.io/egress-ipconfig:
                      [{"interface":"eni-03bdec19acb488a1c","ifaddr":{"ipv4":"10.0.64.0/20"},"capacity":{"ipv4":14,"ipv6":15}}]
                    csi.volume.kubernetes.io/nodeid: {"ebs.csi.aws.com":"i-087c0b6b14cb40907"}
                    k8s.ovn.org/host-addresses: ["10.0.71.208"]
                    k8s.ovn.org/l3-gateway-config:
                      {"default":{"mode":"shared","interface-id":"br-ex_ip-10-0-71-208.us-east-2.compute.internal","mac-address":"06:15:da:9b:23:34","ip-address...
                    k8s.ovn.org/node-chassis-id: 89f00fd6-c371-4154-92c1-1a1ee1a5fc75
                    k8s.ovn.org/node-gateway-router-lrp-ifaddr: {"ipv4":"100.64.0.7/16"}
                    k8s.ovn.org/node-mgmt-port-mac-address: 2a:08:52:f5:18:9f
                    k8s.ovn.org/node-primary-ifaddr: {"ipv4":"10.0.71.208/20"}
                    k8s.ovn.org/node-subnets: {"default":"10.129.2.0/23"}
                    machine.openshift.io/machine: openshift-machine-api/rioliu-072905-kgx5q-worker-us-east-2b-6s2f7
                    machineconfiguration.openshift.io/controlPlaneTopology: HighlyAvailable
                    machineconfiguration.openshift.io/currentConfig: rendered-worker-1bc2b91629f6578254beac09dec237fa
                    machineconfiguration.openshift.io/desiredConfig: rendered-worker-8d1ce60505440cf572067c1dc4cfe7d3
                    machineconfiguration.openshift.io/desiredDrain: uncordon-rendered-worker-1bc2b91629f6578254beac09dec237fa
                    machineconfiguration.openshift.io/lastAppliedDrain: uncordon-rendered-worker-1bc2b91629f6578254beac09dec237fa
                    machineconfiguration.openshift.io/reason:
                      error running rpm-ostree kargs: exit status 1
                      Job for rpm-ostreed.service failed because a timeout was exceeded.
                      See "systemctl status rpm-ostreed.service" and "journalctl -xe" for details.
                      error: exit status: 1
                    machineconfiguration.openshift.io/state: Degraded
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Fri, 29 Jul 2022 07:43:08 -0400
Taints:             <none>
Unschedulable:      false
Lease:
  HolderIdentity:  ip-10-0-71-208.us-east-2.compute.internal
  AcquireTime:     <unset>
  RenewTime:       Fri, 29 Jul 2022 09:40:46 -0400
Conditions:
  Type             Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----             ------  -----------------                 ------------------                ------                       -------
  MemoryPressure   False   Fri, 29 Jul 2022 09:39:55 -0400   Fri, 29 Jul 2022 09:38:37 -0400   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure     False   Fri, 29 Jul 2022 09:39:55 -0400   Fri, 29 Jul 2022 09:38:37 -0400   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure      False   Fri, 29 Jul 2022 09:39:55 -0400   Fri, 29 Jul 2022 09:38:37 -0400   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready            True    Fri, 29 Jul 2022 09:39:55 -0400   Fri, 29 Jul 2022 09:39:55 -0400   KubeletReady                 kubelet is posting ready status
Addresses:
  InternalIP:   10.0.71.208
  Hostname:     ip-10-0-71-208.us-east-2.compute.internal
  InternalDNS:  ip-10-0-71-208.us-east-2.compute.internal
Capacity:
  attachable-volumes-aws-ebs:  39
  cpu:                         4
  ephemeral-storage:           104322028Ki
  hugepages-16Gi:              0
  hugepages-2Mi:               0
  hugepages-512Mi:             0
  memory:                      16275968Ki
  pods:                        250
Allocatable:
  attachable-volumes-aws-ebs:  39
  cpu:                         3500m
  ephemeral-storage:           96143180846
  hugepages-16Gi:              0
  hugepages-2Mi:               0
  hugepages-512Mi:             0
  memory:                      15124992Ki
  pods:                        250
System Info:
  Machine ID:                             e78999077c92451a932a1935833cbd5e
  System UUID:                            ec239dd1-ab11-bda3-667f-2ac9fc480fa0
  Boot ID:                                8eff4ff0-b4e9-49b3-865a-8f9b81e5f6ca
  Kernel Version:                         4.18.0-372.16.1.el8_6.aarch64
  OS Image:                               Red Hat Enterprise Linux CoreOS 411.86.202207201758-0 (Ootpa)
  Operating System:                       linux
  Architecture:                           arm64
  Container Runtime Version:              cri-o://1.24.1-11.rhaos4.11.gitb0d2ef3.el8
  Kubelet Version:                        v1.24.0+9546431
  Kube-Proxy Version:                     v1.24.0+9546431
ProviderID:                               aws:///us-east-2b/i-087c0b6b14cb40907
Non-terminated Pods:                      (60 in total)
  Namespace                               Name                                                      CPU Requests  CPU Limits  Memory Requests  Memory Limits  Age
  ---------                               ----                                                      ------------  ----------  ---------------  -------------  ---
  clusterbuster-0                         clusterbuster-0-1-client-6f747bd756-5j57w                 0 (0%)        0 (0%)      0 (0%)           0 (0%)         18m
  clusterbuster-0                         clusterbuster-0-1-client-6f747bd756-b8dp9                 0 (0%)        0 (0%)      0 (0%)           0 (0%)         18m
  clusterbuster-0                         clusterbuster-0-1-client-6f747bd756-p7vjq                 0 (0%)        0 (0%)      0 (0%)           0 (0%)         18m
  clusterbuster-0                         clusterbuster-0-1-client-6f747bd756-txdlk                 0 (0%)        0 (0%)      0 (0%)           0 (0%)         18m
  clusterbuster-1                         clusterbuster-1-1-client-7cf6c68748-dwzrl                 0 (0%)        0 (0%)      0 (0%)           0 (0%)         38m
  clusterbuster-1                         clusterbuster-1-1-client-7cf6c68748-n5jc2                 0 (0%)        0 (0%)      0 (0%)           0 (0%)         38m
  clusterbuster-1                         clusterbuster-1-1-client-7cf6c68748-q24p8                 0 (0%)        0 (0%)      0 (0%)           0 (0%)         38m
  clusterbuster-1                         clusterbuster-1-1-client-7cf6c68748-r794b                 0 (0%)        0 (0%)      0 (0%)           0 (0%)         38m
  clusterbuster-2                         clusterbuster-2-0-client-65884ccdff-28fhb                 0 (0%)        0 (0%)      0 (0%)           0 (0%)         38m
  clusterbuster-2                         clusterbuster-2-0-client-65884ccdff-wjb2p                 0 (0%)        0 (0%)      0 (0%)           0 (0%)         38m
  clusterbuster-2                         clusterbuster-2-1-client-7d9884598f-225p2                 0 (0%)        0 (0%)      0 (0%)           0 (0%)         38m
  clusterbuster-2                         clusterbuster-2-1-client-7d9884598f-5xt45                 0 (0%)        0 (0%)      0 (0%)           0 (0%)         38m
  clusterbuster-3                         clusterbuster-3-0-client-7c86b7d9f-5pp52                  0 (0%)        0 (0%)      0 (0%)           0 (0%)         38m
  clusterbuster-3                         clusterbuster-3-0-client-7c86b7d9f-6k8vg                  0 (0%)        0 (0%)      0 (0%)           0 (0%)         38m
  clusterbuster-3                         clusterbuster-3-0-client-7c86b7d9f-86qxh                  0 (0%)        0 (0%)      0 (0%)           0 (0%)         38m
  clusterbuster-3                         clusterbuster-3-0-client-7c86b7d9f-lvdcp                  0 (0%)        0 (0%)      0 (0%)           0 (0%)         38m
  clusterbuster-3                         clusterbuster-3-1-client-cb6d77c5f-ltqv9                  0 (0%)        0 (0%)      0 (0%)           0 (0%)         38m
  clusterbuster-3                         clusterbuster-3-1-client-cb6d77c5f-rhmnl                  0 (0%)        0 (0%)      0 (0%)           0 (0%)         38m
  clusterbuster-3                         clusterbuster-3-1-client-cb6d77c5f-x24x4                  0 (0%)        0 (0%)      0 (0%)           0 (0%)         38m
  clusterbuster-4                         clusterbuster-4-0-client-54f75b8ffb-29bsz                 0 (0%)        0 (0%)      0 (0%)           0 (0%)         38m
  clusterbuster-4                         clusterbuster-4-0-client-54f75b8ffb-qhv4x                 0 (0%)        0 (0%)      0 (0%)           0 (0%)         38m
  clusterbuster-4                         clusterbuster-4-0-client-54f75b8ffb-tqlwk                 0 (0%)        0 (0%)      0 (0%)           0 (0%)         38m
  clusterbuster-4                         clusterbuster-4-1-client-dcb4c7665-7s7fp                  0 (0%)        0 (0%)      0 (0%)           0 (0%)         38m
  clusterbuster-4                         clusterbuster-4-1-client-dcb4c7665-hdht8                  0 (0%)        0 (0%)      0 (0%)           0 (0%)         38m
  clusterbuster-4                         clusterbuster-4-1-client-dcb4c7665-tpr56                  0 (0%)        0 (0%)      0 (0%)           0 (0%)         38m
  clusterbuster-5                         clusterbuster-5-0-client-99d8f8498-9wjfr                  0 (0%)        0 (0%)      0 (0%)           0 (0%)         38m
  clusterbuster-5                         clusterbuster-5-0-client-99d8f8498-c5l96                  0 (0%)        0 (0%)      0 (0%)           0 (0%)         38m
  clusterbuster-5                         clusterbuster-5-0-client-99d8f8498-nf8db                  0 (0%)        0 (0%)      0 (0%)           0 (0%)         38m
  clusterbuster-5                         clusterbuster-5-0-client-99d8f8498-pcjfx                  0 (0%)        0 (0%)      0 (0%)           0 (0%)         38m
  clusterbuster-5                         clusterbuster-5-1-client-76b5d7bc58-7nfdq                 0 (0%)        0 (0%)      0 (0%)           0 (0%)         18m
  clusterbuster-5                         clusterbuster-5-1-client-76b5d7bc58-mgmsn                 0 (0%)        0 (0%)      0 (0%)           0 (0%)         18m
  clusterbuster-5                         clusterbuster-5-1-client-76b5d7bc58-ql5ft                 0 (0%)        0 (0%)      0 (0%)           0 (0%)         18m
  clusterbuster-5                         clusterbuster-5-1-client-76b5d7bc58-xg5g8                 0 (0%)        0 (0%)      0 (0%)           0 (0%)         18m
  clusterbuster-6                         clusterbuster-6-0-client-6ff49b8868-pn9tw                 0 (0%)        0 (0%)      0 (0%)           0 (0%)         38m
  clusterbuster-6                         clusterbuster-6-0-client-6ff49b8868-rsgmj                 0 (0%)        0 (0%)      0 (0%)           0 (0%)         38m
  clusterbuster-6                         clusterbuster-6-0-client-6ff49b8868-vgmjn                 0 (0%)        0 (0%)      0 (0%)           0 (0%)         38m
  clusterbuster-6                         clusterbuster-6-1-client-6c6d7f6f7c-5s7bl                 0 (0%)        0 (0%)      0 (0%)           0 (0%)         38m
  clusterbuster-6                         clusterbuster-6-1-client-6c6d7f6f7c-prmrp                 0 (0%)        0 (0%)      0 (0%)           0 (0%)         38m
  clusterbuster-6                         clusterbuster-6-1-client-6c6d7f6f7c-s7zkx                 0 (0%)        0 (0%)      0 (0%)           0 (0%)         38m
  clusterbuster-7                         clusterbuster-7-0-client-5cb78d8cd4-6wwlv                 0 (0%)        0 (0%)      0 (0%)           0 (0%)         38m
  clusterbuster-7                         clusterbuster-7-0-client-5cb78d8cd4-fxg8f                 0 (0%)        0 (0%)      0 (0%)           0 (0%)         38m
  clusterbuster-7                         clusterbuster-7-0-client-5cb78d8cd4-kngfh                 0 (0%)        0 (0%)      0 (0%)           0 (0%)         38m
  clusterbuster-7                         clusterbuster-7-0-client-5cb78d8cd4-vg8fc                 0 (0%)        0 (0%)      0 (0%)           0 (0%)         38m
  clusterbuster-7                         clusterbuster-7-1-client-8bd8c65c7-64pmj                  0 (0%)        0 (0%)      0 (0%)           0 (0%)         38m
  clusterbuster-7                         clusterbuster-7-1-client-8bd8c65c7-67fkd                  0 (0%)        0 (0%)      0 (0%)           0 (0%)         38m
  clusterbuster-7                         clusterbuster-7-1-client-8bd8c65c7-s5pg2                  0 (0%)        0 (0%)      0 (0%)           0 (0%)         38m
  openshift-cluster-csi-drivers           aws-ebs-csi-driver-node-qcpcr                             30m (0%)      0 (0%)      150Mi (1%)       0 (0%)         117m
  openshift-cluster-node-tuning-operator  tuned-z24rj                                               10m (0%)      0 (0%)      50Mi (0%)        0 (0%)         117m
  openshift-dns                           dns-default-fvpm7                                         60m (1%)      0 (0%)      110Mi (0%)       0 (0%)         116m
  openshift-dns                           node-resolver-ghfhp                                       5m (0%)       0 (0%)      21Mi (0%)        0 (0%)         117m
  openshift-image-registry                node-ca-gtc8q                                             10m (0%)      0 (0%)      10Mi (0%)        0 (0%)         117m
  openshift-ingress-canary                ingress-canary-fvpm7                                      10m (0%)      0 (0%)      20Mi (0%)        0 (0%)         116m
  openshift-machine-config-operator       machine-config-daemon-7rzsx                               40m (1%)      0 (0%)      100Mi (0%)       0 (0%)         117m
  openshift-monitoring                    node-exporter-f7xfk                                       9m (0%)       0 (0%)      47Mi (0%)        0 (0%)         117m
  openshift-monitoring                    prometheus-operator-admission-webhook-6bcb565bc9-8vkvm    5m (0%)       0 (0%)      30Mi (0%)        0 (0%)         18m
  openshift-multus                        multus-88dxq                                              10m (0%)      0 (0%)      65Mi (0%)        0 (0%)         117m
  openshift-multus                        multus-additional-cni-plugins-v52ts                       10m (0%)      0 (0%)      10Mi (0%)        0 (0%)         117m
  openshift-multus                        network-metrics-daemon-zgzzv                              20m (0%)      0 (0%)      120Mi (0%)       0 (0%)         117m
  openshift-network-diagnostics           network-check-target-xz7vt                                10m (0%)      0 (0%)      15Mi (0%)        0 (0%)         117m
  openshift-ovn-kubernetes                ovnkube-node-scj5x                                        50m (1%)      0 (0%)      660Mi (4%)       0 (0%)         117m
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource                    Requests     Limits
  --------                    --------     ------
  cpu                         279m (7%)    0 (0%)
  memory                      1408Mi (9%)  0 (0%)
  ephemeral-storage           0 (0%)       0 (0%)
  hugepages-16Gi              0 (0%)       0 (0%)
  hugepages-2Mi               0 (0%)       0 (0%)
  hugepages-512Mi             0 (0%)       0 (0%)
  attachable-volumes-aws-ebs  0            0
Events:
  Type     Reason                   Age               From     Message
  ----     ------                   ----              ----     -------
  Warning  SystemOOM                14m               kubelet  System OOM encountered, victim process: prometheus, pid: 272621
  Warning  SystemOOM                2m26s             kubelet  System OOM encountered, victim process: elasticsearch-o, pid: 20033
  Warning  SystemOOM                2m26s             kubelet  System OOM encountered, victim process: opm, pid: 18415
  Warning  SystemOOM                2m26s             kubelet  System OOM encountered, victim process: adapter, pid: 5317
  Normal   NodeHasSufficientMemory  2m18s             kubelet  Node ip-10-0-71-208.us-east-2.compute.internal status is now: NodeHasSufficientMemory
  Normal   NodeHasNoDiskPressure    2m18s             kubelet  Node ip-10-0-71-208.us-east-2.compute.internal status is now: NodeHasNoDiskPressure
  Normal   NodeHasSufficientPID     2m18s             kubelet  Node ip-10-0-71-208.us-east-2.compute.internal status is now: NodeHasSufficientPID
  Normal   NodeNotReady             2m18s             kubelet  Node ip-10-0-71-208.us-east-2.compute.internal status is now: NodeNotReady
  Warning  SystemOOM                93s               kubelet  System OOM encountered, victim process: cluster-logging, pid: 20034
  Warning  SystemOOM                93s               kubelet  System OOM encountered, victim process: learn-operator, pid: 270595
  Warning  SystemOOM                93s               kubelet  System OOM encountered, victim process: runc:[2:INIT], pid: 292380
  Warning  SystemOOM                93s               kubelet  System OOM encountered, victim process: runc:[2:INIT], pid: 292370
  Warning  SystemOOM                93s               kubelet  System OOM encountered, victim process: kube-rbac-proxy, pid: 19797
  Warning  SystemOOM                93s               kubelet  System OOM encountered, victim process: runc:[2:INIT], pid: 292608
  Warning  SystemOOM                8s (x9 over 10s)  kubelet  (combined from similar events): System OOM encountered, victim process: runc:[2:INIT], pid: 290698

#  oc debug node/ip-10-0-71-208.us-east-2.compute.internal
Warning: would violate PodSecurity "restricted:latest": host namespaces (hostNetwork=true, hostPID=true), privileged (container "container-00" must not set securityContext.privileged=true), allowPrivilegeEscalation != false (container "container-00" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (container "container-00" must set securityContext.capabilities.drop=["ALL"]), restricted volume types (volume "host" uses restricted volume type "hostPath"), runAsNonRoot != true (pod or container "container-00" must set securityContext.runAsNonRoot=true), runAsUser=0 (container "container-00" must not set runAsUser=0), seccompProfile (pod or container "container-00" must set securityContext.seccompProfile.type to "RuntimeDefault" or "Localhost")
Starting pod/ip-10-0-71-208us-east-2computeinternal-debug ...
To use host binaries, run `chroot /host`
Pod IP: 10.0.71.208
If you don't see a command prompt, try pressing enter.
sh-4.4#
sh-4.4# chroot /host
sh-4.4# systemctl status rpm-ostreed
● rpm-ostreed.service - rpm-ostree System Management Daemon
   Loaded: loaded (/usr/lib/systemd/system/rpm-ostreed.service; static; vendor preset: disabled)
  Drop-In: /usr/lib/systemd/system/rpm-ostreed.service.d
           └─startlimit.conf
   Active: activating (start) since Fri 2022-07-29 13:49:01 UTC; 27s ago
     Docs: man:rpm-ostree(1)
 Main PID: 330415 ((m-ostree))
    Tasks: 1 (limit: 25302)
   Memory: 20.6M
      CPU: 10.136s
   CGroup: /system.slice/rpm-ostreed.service
           └─330415 (m-ostree)

Jul 29 13:49:01 ip-10-0-71-208 systemd[1]: Starting rpm-ostree System Management Daemon...

sh-4.4# journalctl -u rpm-ostreed |tail
Jul 29 13:47:30 ip-10-0-71-208 systemd[1]: rpm-ostreed.service: start operation timed out. Terminating.
Jul 29 13:47:30 ip-10-0-71-208 systemd[1]: rpm-ostreed.service: Failed with result 'timeout'.
Jul 29 13:47:30 ip-10-0-71-208 systemd[1]: Failed to start rpm-ostree System Management Daemon.
Jul 29 13:47:30 ip-10-0-71-208 systemd[1]: rpm-ostreed.service: Consumed 44.362s CPU time
Jul 29 13:47:31 ip-10-0-71-208 systemd[1]: Starting rpm-ostree System Management Daemon...
Jul 29 13:49:01 ip-10-0-71-208 systemd[1]: rpm-ostreed.service: start operation timed out. Terminating.
Jul 29 13:49:01 ip-10-0-71-208 systemd[1]: rpm-ostreed.service: Failed with result 'timeout'.
Jul 29 13:49:01 ip-10-0-71-208 systemd[1]: Failed to start rpm-ostree System Management Daemon.
Jul 29 13:49:01 ip-10-0-71-208 systemd[1]: rpm-ostreed.service: Consumed 36.345s CPU time
Jul 29 13:49:01 ip-10-0-71-208 systemd[1]: Starting rpm-ostree System Management Daemon...

>> finally, the rpm-ostreed service started successfully

#  oc debug node/ip-10-0-71-208.us-east-2.compute.internal -- chroot /host systemctl status rpm-ostreed
Warning: would violate PodSecurity "restricted:latest": host namespaces (hostNetwork=true, hostPID=true), privileged (container "container-00" must not set securityContext.privileged=true), allowPrivilegeEscalation != false (container "container-00" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (container "container-00" must set securityContext.capabilities.drop=["ALL"]), restricted volume types (volume "host" uses restricted volume type "hostPath"), runAsNonRoot != true (pod or container "container-00" must set securityContext.runAsNonRoot=true), runAsUser=0 (container "container-00" must not set runAsUser=0), seccompProfile (pod or container "container-00" must set securityContext.seccompProfile.type to "RuntimeDefault" or "Localhost")
Starting pod/ip-10-0-71-208us-east-2computeinternal-debug ...
To use host binaries, run `chroot /host`
● rpm-ostreed.service - rpm-ostree System Management Daemon
   Loaded: loaded (/usr/lib/systemd/system/rpm-ostreed.service; static; vendor preset: disabled)
  Drop-In: /usr/lib/systemd/system/rpm-ostreed.service.d
           └─startlimit.conf
   Active: inactive (dead)
     Docs: man:rpm-ostree(1)

Jul 29 13:49:01 ip-10-0-71-208 systemd[1]: rpm-ostreed.service: Consumed 36.345s CPU time
Jul 29 13:49:01 ip-10-0-71-208 systemd[1]: Starting rpm-ostree System Management Daemon...
Jul 29 13:50:03 ip-10-0-71-208 rpm-ostree[330415]: Reading config file '/etc/rpm-ostreed.conf'
Jul 29 13:50:03 ip-10-0-71-208 rpm-ostree[330415]: In idle state; will auto-exit in 60 seconds
Jul 29 13:50:03 ip-10-0-71-208 systemd[1]: Started rpm-ostree System Management Daemon.
Jul 29 13:50:03 ip-10-0-71-208 rpm-ostree[330415]: client(id:machine-config-operator dbus:1.13851 unit:crio-68cd1942d5e8cc7bf235289ff12786eeda8928b10722eb9d1bc52042d72be3f8.scope uid:0) added; new total=1
Jul 29 13:50:03 ip-10-0-71-208 rpm-ostree[330415]: client(id:machine-config-operator dbus:1.13851 unit:crio-68cd1942d5e8cc7bf235289ff12786eeda8928b10722eb9d1bc52042d72be3f8.scope uid:0) vanished; remaining=0
Jul 29 13:50:03 ip-10-0-71-208 rpm-ostree[330415]: In idle state; will auto-exit in 63 seconds
Jul 29 13:51:06 ip-10-0-71-208 systemd[1]: rpm-ostreed.service: Succeeded.
Jul 29 13:51:06 ip-10-0-71-208 systemd[1]: rpm-ostreed.service: Consumed 19.021s CPU time

Removing debug pod ...

>> node is in Ready state

# oc describe node ip-10-0-71-208.us-east-2.compute.internal
Name:               ip-10-0-71-208.us-east-2.compute.internal
Roles:              worker
Labels:             beta.kubernetes.io/arch=arm64
                    beta.kubernetes.io/instance-type=m6g.xlarge
                    beta.kubernetes.io/os=linux
                    failure-domain.beta.kubernetes.io/region=us-east-2
                    failure-domain.beta.kubernetes.io/zone=us-east-2b
                    kubernetes.io/arch=arm64
                    kubernetes.io/hostname=ip-10-0-71-208.us-east-2.compute.internal
                    kubernetes.io/os=linux
                    node-role.kubernetes.io/worker=
                    node.kubernetes.io/instance-type=m6g.xlarge
                    node.openshift.io/os_id=rhcos
                    topology.ebs.csi.aws.com/zone=us-east-2b
                    topology.kubernetes.io/region=us-east-2
                    topology.kubernetes.io/zone=us-east-2b
Annotations:        cloud.network.openshift.io/egress-ipconfig:
                      [{"interface":"eni-03bdec19acb488a1c","ifaddr":{"ipv4":"10.0.64.0/20"},"capacity":{"ipv4":14,"ipv6":15}}]
                    csi.volume.kubernetes.io/nodeid: {"ebs.csi.aws.com":"i-087c0b6b14cb40907"}
                    k8s.ovn.org/host-addresses: ["10.0.71.208"]
                    k8s.ovn.org/l3-gateway-config:
                      {"default":{"mode":"shared","interface-id":"br-ex_ip-10-0-71-208.us-east-2.compute.internal","mac-address":"06:15:da:9b:23:34","ip-address...
                    k8s.ovn.org/node-chassis-id: 89f00fd6-c371-4154-92c1-1a1ee1a5fc75
                    k8s.ovn.org/node-gateway-router-lrp-ifaddr: {"ipv4":"100.64.0.7/16"}
                    k8s.ovn.org/node-mgmt-port-mac-address: 2a:08:52:f5:18:9f
                    k8s.ovn.org/node-primary-ifaddr: {"ipv4":"10.0.71.208/20"}
                    k8s.ovn.org/node-subnets: {"default":"10.129.2.0/23"}
                    machine.openshift.io/machine: openshift-machine-api/rioliu-072905-kgx5q-worker-us-east-2b-6s2f7
                    machineconfiguration.openshift.io/controlPlaneTopology: HighlyAvailable
                    machineconfiguration.openshift.io/currentConfig: rendered-worker-8d1ce60505440cf572067c1dc4cfe7d3
                    machineconfiguration.openshift.io/desiredConfig: rendered-worker-8d1ce60505440cf572067c1dc4cfe7d3
                    machineconfiguration.openshift.io/desiredDrain: uncordon-rendered-worker-8d1ce60505440cf572067c1dc4cfe7d3
                    machineconfiguration.openshift.io/lastAppliedDrain: uncordon-rendered-worker-8d1ce60505440cf572067c1dc4cfe7d3
                    machineconfiguration.openshift.io/reason:
                    machineconfiguration.openshift.io/state: Done
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Fri, 29 Jul 2022 07:43:08 -0400
Taints:             <none>
Unschedulable:      false
Lease:
  HolderIdentity:  ip-10-0-71-208.us-east-2.compute.internal
  AcquireTime:     <unset>
  RenewTime:       Fri, 29 Jul 2022 09:52:09 -0400
Conditions:
  Type             Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----             ------  -----------------                 ------------------                ------                       -------
  MemoryPressure   False   Fri, 29 Jul 2022 09:50:40 -0400   Fri, 29 Jul 2022 09:38:37 -0400   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure     False   Fri, 29 Jul 2022 09:50:40 -0400   Fri, 29 Jul 2022 09:38:37 -0400   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure      False   Fri, 29 Jul 2022 09:50:40 -0400   Fri, 29 Jul 2022 09:38:37 -0400   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready            True    Fri, 29 Jul 2022 09:50:40 -0400   Fri, 29 Jul 2022 09:39:55 -0400   KubeletReady                 kubelet is posting ready status
Addresses:
  InternalIP:   10.0.71.208
  Hostname:     ip-10-0-71-208.us-east-2.compute.internal
  InternalDNS:  ip-10-0-71-208.us-east-2.compute.internal
Capacity:
  attachable-volumes-aws-ebs:  39
  cpu:                         4
  ephemeral-storage:           104322028Ki
  hugepages-16Gi:              0
  hugepages-2Mi:               0
  hugepages-512Mi:             0
  memory:                      16275968Ki
  pods:                        250
Allocatable:
  attachable-volumes-aws-ebs:  39
  cpu:                         3500m
  ephemeral-storage:           96143180846
  hugepages-16Gi:              0
  hugepages-2Mi:               0
  hugepages-512Mi:             0
  memory:                      15124992Ki
  pods:                        250
System Info:
  Machine ID:                             e78999077c92451a932a1935833cbd5e
  System UUID:                            ec239dd1-ab11-bda3-667f-2ac9fc480fa0
  Boot ID:                                8eff4ff0-b4e9-49b3-865a-8f9b81e5f6ca
  Kernel Version:                         4.18.0-372.16.1.el8_6.aarch64
  OS Image:                               Red Hat Enterprise Linux CoreOS 411.86.202207201758-0 (Ootpa)
  Operating System:                       linux
  Architecture:                           arm64
  Container Runtime Version:              cri-o://1.24.1-11.rhaos4.11.gitb0d2ef3.el8
  Kubelet Version:                        v1.24.0+9546431
  Kube-Proxy Version:                     v1.24.0+9546431
ProviderID:                               aws:///us-east-2b/i-087c0b6b14cb40907
Non-terminated Pods:                      (13 in total)
  Namespace                               Name                                   CPU Requests  CPU Limits  Memory Requests  Memory Limits  Age
  ---------                               ----                                   ------------  ----------  ---------------  -------------  ---
  openshift-cluster-csi-drivers           aws-ebs-csi-driver-node-qcpcr          30m (0%)      0 (0%)      150Mi (1%)       0 (0%)         129m
  openshift-cluster-node-tuning-operator  tuned-z24rj                            10m (0%)      0 (0%)      50Mi (0%)        0 (0%)         129m
  openshift-dns                           dns-default-fvpm7                      60m (1%)      0 (0%)      110Mi (0%)       0 (0%)         128m
  openshift-dns                           node-resolver-ghfhp                    5m (0%)       0 (0%)      21Mi (0%)        0 (0%)         129m
  openshift-image-registry                node-ca-gtc8q                          10m (0%)      0 (0%)      10Mi (0%)        0 (0%)         129m
  openshift-ingress-canary                ingress-canary-fvpm7                   10m (0%)      0 (0%)      20Mi (0%)        0 (0%)         128m
  openshift-machine-config-operator       machine-config-daemon-7rzsx            40m (1%)      0 (0%)      100Mi (0%)       0 (0%)         129m
  openshift-monitoring                    node-exporter-f7xfk                    9m (0%)       0 (0%)      47Mi (0%)        0 (0%)         128m
  openshift-multus                        multus-88dxq                           10m (0%)      0 (0%)      65Mi (0%)        0 (0%)         129m
  openshift-multus                        multus-additional-cni-plugins-v52ts    10m (0%)      0 (0%)      10Mi (0%)        0 (0%)         129m
  openshift-multus                        network-metrics-daemon-zgzzv           20m (0%)      0 (0%)      120Mi (0%)       0 (0%)         129m
  openshift-network-diagnostics           network-check-target-xz7vt             10m (0%)      0 (0%)      15Mi (0%)        0 (0%)         129m
  openshift-ovn-kubernetes                ovnkube-node-scj5x                     50m (1%)      0 (0%)      660Mi (4%)       0 (0%)         129m
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource                    Requests     Limits
  --------                    --------     ------
  cpu                         274m (7%)    0 (0%)
  memory                      1378Mi (9%)  0 (0%)
  ephemeral-storage           0 (0%)       0 (0%)
  hugepages-16Gi              0 (0%)       0 (0%)
  hugepages-2Mi               0 (0%)       0 (0%)
  hugepages-512Mi             0 (0%)       0 (0%)
  attachable-volumes-aws-ebs  0            0
Events:
  Type     Reason                   Age                From     Message
  ----     ------                   ----               ----     -------
  Warning  SystemOOM                26m                kubelet  System OOM encountered, victim process: prometheus, pid: 272621
  Warning  SystemOOM                13m                kubelet  System OOM encountered, victim process: elasticsearch-o, pid: 20033
  Warning  SystemOOM                13m                kubelet  System OOM encountered, victim process: opm, pid: 18415
  Warning  SystemOOM                13m                kubelet  System OOM encountered, victim process: adapter, pid: 5317
  Normal   NodeHasSufficientMemory  13m                kubelet  Node ip-10-0-71-208.us-east-2.compute.internal status is now: NodeHasSufficientMemory
  Normal   NodeHasNoDiskPressure    13m                kubelet  Node ip-10-0-71-208.us-east-2.compute.internal status is now: NodeHasNoDiskPressure
  Normal   NodeHasSufficientPID     13m                kubelet  Node ip-10-0-71-208.us-east-2.compute.internal status is now: NodeHasSufficientPID
  Normal   NodeNotReady             13m                kubelet  Node ip-10-0-71-208.us-east-2.compute.internal status is now: NodeNotReady
  Warning  SystemOOM                12m                kubelet  System OOM encountered, victim process: cluster-logging, pid: 20034
  Warning  SystemOOM                12m                kubelet  System OOM encountered, victim process: learn-operator, pid: 270595
  Warning  SystemOOM                12m                kubelet  System OOM encountered, victim process: runc:[2:INIT], pid: 292380
  Warning  SystemOOM                12m                kubelet  System OOM encountered, victim process: runc:[2:INIT], pid: 292370
  Warning  SystemOOM                12m                kubelet  System OOM encountered, victim process: kube-rbac-proxy, pid: 19797
  Warning  SystemOOM                12m                kubelet  System OOM encountered, victim process: runc:[2:INIT], pid: 292608
  Warning  SystemOOM                11m (x9 over 11m)  kubelet  (combined from similar events): System OOM encountered, victim process: runc:[2:INIT], pid: 290698
  Normal   NodeNotSchedulable       2m12s              kubelet  Node ip-10-0-71-208.us-east-2.compute.internal status is now: NodeNotSchedulable
  Normal   NodeSchedulable          81s                kubelet  Node ip-10-0-71-208.us-east-2.compute.internal status is now: NodeSchedulable

>> mcp state looks good

# oc get mcp
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
master   rendered-master-1d58b9a40f20fc20d1a774a5477fe29d   True      False      False      3              3                   3                     0                      145m
worker   rendered-worker-8d1ce60505440cf572067c1dc4cfe7d3   True      False      False      3              3                   3                     0                      145m

>> Checked the volume status; I/O looks good. Screenshot attached.
>> Observed SystemOOM events; many pods like clusterbuster-$i were running at that time.

>> The rpm-ostreed issue looks like an intermediate state of the node: the service recovers eventually, and the upgrade is not blocked and completes successfully.

Comment 8 Colin Walters 2022-08-03 18:56:43 UTC
We haven't had customer reports of this yet; this still needs more triage/analysis.

Is this cluster idle during the upgrade, or are there any (simulated?) workloads?

Comment 9 Rio Liu 2022-08-08 01:02:13 UTC
(In reply to Colin Walters from comment #8)
> Haven't had customer reports of this yet, this still needs more
> triage/analysis.
> 
> Is this cluster idle during the upgrade, or is there any (simulated?)
> workloads?
The pipeline has a stage that triggers pre-upgrade tests; per the node details above, some pods such as clusterbuster-0, 1, 2, etc. were launched.

Comment 12 Colin Walters 2022-08-15 19:06:20 UTC
I forgot to note that I'd done https://github.com/coreos/rpm-ostree/pull/3905 for this, but haven't started the backporting yet.

Interesting.  So far this doesn't seem to be hit in any CI runs: https://search.ci.openshift.org/?search=failed+because+a+timeout+was+exceeded&maxAge=48h&context=1&type=bug%2Bissue%2Bjunit&name=&excludeName=&maxMatches=5&maxBytes=20971520&groupBy=job

But AFAIK we don't have any periodics that actually load the cluster.

Comment 13 Colin Walters 2022-08-15 19:17:47 UTC
OK so one interesting bit here is that by default we turn on global accounting:
https://github.com/openshift/machine-config-operator/blob/9c6c2bfd7ed498bfbc296d530d1839bd6a177b0b/templates/worker/00-worker/_base/files/kubelet-cgroups.yaml#L7

That's why we get the "Consumed 1min 22.919s CPU time" log message.

Now that in and of itself is very interesting data, because it seems like the process was actively scheduled and using a CPU core, but without emitting any logging at all.

@wking if you've got an active shell/oc-debug-node open, can you just check with things like `top` to see if you see rpm-ostreed using 100% of a core?  If so (or really, regardless) gathering any information about what it's doing would be useful via e.g. strace.

Alternatively, if you see the process in a stuck state, `kill -SEGV <pid>` will produce a core dump; extract that core dump and attach it.
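
(A minimal data-gathering sketch along the lines suggested above, run as root on the affected node. It assumes strace is available there, e.g. via a toolbox container, that systemd-coredump handles cores, and it uses the unit's MainPID so it works even while the service is still "activating":)

# PID=$(systemctl show -p MainPID --value rpm-ostreed)
# top -b -n 1 -p "$PID"                       # is it pinning a core?
# strace -f -T -p "$PID" -o /tmp/strace.log   # per-syscall timings of what it is doing
# kill -SEGV "$PID"                           # if stuck: force a core dump...
# coredumpctl dump "$PID" -o /tmp/rpm-ostreed.core   # ...and collect it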

Comment 16 Colin Walters 2022-08-15 21:46:35 UTC
OK this is easy to reproduce by creating a ton of dummy containers via podman.
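
(A hedged sketch of such a reproducer; the image, count, and container names are illustrative and not from this report. Each running root container adds an overlay mount under /var/lib/containers/storage, which the unit start then has to mask:)

# for i in $(seq 1 200); do podman run -d --name bug2111817-$i registry.access.redhat.com/ubi8/ubi sleep infinity; done
# time systemctl restart rpm-ostreed
# podman rm -f $(podman ps -aq --filter name=bug2111817)   # cleanup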

The problem seems to be our setting:

# grep InaccessiblePaths /host/usr/lib/systemd/system/rpm-ostreed.service
InaccessiblePaths=-/var/lib/containers
#

Looking at a strace, there's a ton of these:

[root@cosa-devsh /]# grep ' <0.0[1-9]' /tmp/strace.log | head
1029333 umount2("/run/systemd/unit-root/var/lib/containers/storage/overlay/c360fce228303eb09d92eff577ff4e7dffefdaa0ffcc7b8830cef70f2f913fc6/merged", UMOUNT_NOFOLLOW) = 0 <0.012978>
1029333 umount2("/run/systemd/unit-root/var/lib/containers/storage/overlay/d4d4aeedf80b586f327c02b8bcb976dc774a6b829a73d0bcc6877b7cc8854130/merged", UMOUNT_NOFOLLOW) = 0 <0.013055>
1029333 umount2("/run/systemd/unit-root/var/lib/containers/storage/overlay/dbbf5b4aacd96309460d8fbae6723ec00c96fd0e4578e54ce8cabda3d80c49d8/merged", UMOUNT_NOFOLLOW) = 0 <0.011388>
1029333 umount2("/run/systemd/unit-root/var/lib/containers/storage/overlay/5028e5799dc92b48e346921fb10c5aa4df7a9deafedc75a0e8688f38235f5da4/merged", UMOUNT_NOFOLLOW) = 0 <0.014569>
1029333 umount2("/run/systemd/unit-root/var/lib/containers/storage/overlay/ba30ce1dad38337e07cdb41699cc54b2d128d5cda86b3080ecbe77b1f9f48d32/merged", UMOUNT_NOFOLLOW) = 0 <0.010967>
1029333 umount2("/run/systemd/unit-root/var/lib/containers/storage/overlay/3048b006d2bc325e3116b7dd29c63c5af582c97e7b7e9d03f7bb928e51e7d743/merged", UMOUNT_NOFOLLOW) = 0 <0.010937>
1029333 umount2("/run/systemd/unit-root/var/lib/containers/storage/overlay/07e600fbb8dc6602e315eb86d7f487731a2cb6d6d22973b8d4ed05d8cc5700be/merged", UMOUNT_NOFOLLOW) = 0 <0.017238>
1029333 umount2("/run/systemd/unit-root/var/lib/containers/storage/overlay/1c70acdd66d50c1cebfbda3619bcba0cede44e57e33c6f2d8234983a4f6abe92/merged", UMOUNT_NOFOLLOW) = 0 <0.010017>
1029333 umount2("/run/systemd/unit-root/var/lib/containers/storage/overlay/a9e104b04b34a695eb2b3c4f6cdd659977b1a74be579d856b58bbf0658e478e2/merged", UMOUNT_NOFOLLOW) = 0 <0.012963>
1029333 umount2("/run/systemd/unit-root/var/lib/containers/storage/overlay/ad5b8510041a102234286db863f675d87c3bc466d791d2cb38c1d442edec659d/merged", UMOUNT_NOFOLLOW) = 0 <0.013663>

And after each unmount, systemd ends up re-parsing /proc/self/mountinfo! Since the length of mountinfo is itself proportional to the number of container mounts, there's really an O(N^2) effect going on here.

Comment 19 Colin Walters 2022-08-15 23:54:12 UTC
Fix in https://github.com/coreos/rpm-ostree/pull/3939

Here's a simple way to test this out - ensure you have a bunch of containers running on the host.  Then compare:

[root@cosa-devsh /]# time systemd-run -P -p InaccessiblePaths=/var/lib/containers true
Running as unit: run-u184425.service

real    0m6.802s
user    0m0.004s
sys     0m0.004s
[root@cosa-devsh /]# time systemd-run -P -p BindReadOnlyPaths=/var/lib/containers true
Running as unit: run-u184653.service

real    0m0.123s
user    0m0.002s
sys     0m0.003s
[root@cosa-devsh /]#

Comment 20 Colin Walters 2022-08-16 00:01:20 UTC
This only affects OCP 4.11 (which currently has rpm-ostree v2022.2) and 4.12.
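
(To check what a given node is running, a small sketch, with the node name as a placeholder:)

$ oc debug node/<node> -- chroot /host rpm-ostree --version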

Comment 21 Scott Dodson 2022-08-16 13:07:02 UTC
We're asking the following questions to evaluate whether or not this bug warrants changing update recommendations from either the previous X.Y or X.Y.Z. The ultimate goal is to avoid delivering an update which introduces new risk or reduces cluster functionality in any way. Sample answers are provided to give more context and the ImpactStatementRequested label has been added to this bug. When responding, please remove ImpactStatementRequested and set the ImpactStatementProposed label. The expectation is that the assignee answers these questions.

Which 4.y.z to 4.y'.z' updates increase vulnerability? Which types of clusters?

    reasoning: This allows us to populate from, to, and matchingRules in conditional update recommendations for "the $SOURCE_RELEASE to $TARGET_RELEASE update is not recommended for clusters like $THIS".
    example: Customers upgrading from 4.y.Z to 4.y+1.z running on GCP with thousands of namespaces, approximately 5% of the subscribed fleet. Check your vulnerability with oc ... or the following PromQL count (...) > 0.
    example: All customers upgrading from 4.y.z to 4.y+1.z fail. Check your vulnerability with oc adm upgrade to show your current cluster version.

What is the impact? Is it serious enough to warrant removing update recommendations?

    reasoning: This allows us to populate name and message in conditional update recommendations for "...because if you update, $THESE_CONDITIONS may cause $THESE_UNFORTUNATE_SYMPTOMS".
    example: Around 2 minute disruption in edge routing for 10% of clusters. Check with oc ....
    example: Up to 90 seconds of API downtime. Check with curl ....
    example: etcd loses quorum and you have to restore from backup. Check with ssh ....

How involved is remediation?

    reasoning: This allows administrators who are already vulnerable, or who chose to waive conditional-update risks, to recover their cluster. And even moderately serious impacts might be acceptable if they are easy to mitigate.
    example: Issue resolves itself after five minutes.
    example: Admin can run a single: oc ....
    example: Admin must SSH to hosts, restore from backups, or other non standard admin activities.

Is this a regression?

    reasoning: Updating between two vulnerable releases may not increase exposure (unless rebooting during the update increases vulnerability, etc.). We only qualify update recommendations if the update increases exposure.
    example: No, it has always been like this we just never noticed.
    example: Yes, from 4.y.z to 4.y+1.z Or 4.y.z to 4.y.z+1.

Comment 22 Colin Walters 2022-08-16 14:26:26 UTC
> Which 4.y.z to 4.y'.z' updates increase vulnerability? Which types of clusters?

Bug was introduced in 4.11 and is present in 4.12.

> What is the impact? Is it serious enough to warrant removing update recommendations?

Nodes with substantial numbers of containers (e.g. 200+) *and* contention for CPU time are likely to have rpm-ostreed.service fail to start, and the MCD will not be able to progress.
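
(A rough way to gauge exposure, not from this report: count running containers per node via crictl from a debug pod; the node name is a placeholder:)

$ oc debug node/<node> -- chroot /host sh -c 'crictl ps -q | wc -l'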

> How involved is remediation?

Draining nodes will allow the MCD to progress.  Additionally, a workaround is to:

```
$ mkdir -p /etc/systemd/system/rpm-ostreed.service.d
$ cat >  /etc/systemd/system/rpm-ostreed.service.d/bug2111817.conf << EOF
[Service]
InaccessiblePaths=
EOF
$ systemctl daemon-reload
```
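
To confirm the drop-in took effect after the daemon-reload (a small verification sketch, not from the report):

```
$ systemctl cat rpm-ostreed
```

The output should list the bug2111817.conf drop-in with the empty `InaccessiblePaths=` override.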

> Is this a regression?

Yes, in 4.11.

Comment 23 Colin Walters 2022-08-16 14:58:13 UTC
This is a bug in the interaction between systemd and rpm-ostree, so it's a RHEL bug. But I'm reassigning it to the MCO because it impacts upgrade reliability, and so, to unstick things (similar to https://bugzilla.redhat.com/show_bug.cgi?id=2104619), we need to add a workaround to the MCD, which is upgraded before the OS.

Comment 24 Colin Walters 2022-08-16 19:34:46 UTC
RHEL8 BZ is https://bugzilla.redhat.com/show_bug.cgi?id=2118774

Comment 25 Lalatendu Mohanty 2022-08-16 21:47:13 UTC
We have moved all upgrade paths from 4.10 to 4.11 to conditional update paths in https://github.com/openshift/cincinnati-graph-data/pull/2365

Comment 27 Rio Liu 2022-08-18 12:18:39 UTC
verified on 4.12.0-0.nightly-2022-08-17-114919

trigger upgrade pipeline, upgrade from 4.11.0-rc.5-x86_64 to 4.12.0-0.nightly-2022-08-17-114919
https://mastern-jenkins-csb-openshift-qe.apps.ocp-c1.prod.psi.redhat.com/job/ocp-upgrade/job/upgrade-pipeline/19966/

>> the upgrade succeeded

$ oc get clusterversion -o yaml|yq -y '.items[].status.history'
- completionTime: '2022-08-18T10:29:45Z'
  image: registry.ci.openshift.org/ocp/release@sha256:53ec64973439f970f814b1737a8ebe03f864881962c0fef09101192c10ecd941
  startedTime: '2022-08-18T09:26:20Z'
  state: Completed
  verified: false
  version: 4.12.0-0.nightly-2022-08-17-114919
- completionTime: '2022-08-18T08:11:33Z'
  image: quay.io/openshift-release-dev/ocp-release@sha256:5b1a987e21b199321d200ac20ae27390a75c1f44b83805dadfae7e5a967b9e5d
  startedTime: '2022-08-18T07:46:17Z'
  state: Completed
  verified: false
  version: 4.11.0-rc.5

>> check out drop-in file

$ oc debug node/ip-10-0-212-251.us-east-2.compute.internal -- chroot /host cat /run/systemd/system/rpm-ostreed.service.d/bug2111817.conf 2>&1|grep -v Warning
Starting pod/ip-10-0-212-251us-east-2computeinternal-debug ...
To use host binaries, run `chroot /host`
[Service]
InaccessiblePaths=

Removing debug pod ...

>> check the rpm-ostreed journal log for timeout entries on all nodes; no timeouts found

$ for node in $(oc get node -o name);do echo;echo $node;oc debug $node -- chroot /host bash -c "journalctl -u rpm-ostreed|grep 'start operation timed out'" 2>&1|grep -v Warning;done

node/ip-10-0-143-56.us-east-2.compute.internal
Starting pod/ip-10-0-143-56us-east-2computeinternal-debug ...
To use host binaries, run `chroot /host`

Removing debug pod ...
error: non-zero exit code from debug container

node/ip-10-0-147-76.us-east-2.compute.internal
Starting pod/ip-10-0-147-76us-east-2computeinternal-debug ...
To use host binaries, run `chroot /host`

Removing debug pod ...
error: non-zero exit code from debug container

node/ip-10-0-181-37.us-east-2.compute.internal
Starting pod/ip-10-0-181-37us-east-2computeinternal-debug ...
To use host binaries, run `chroot /host`

Removing debug pod ...
error: non-zero exit code from debug container

node/ip-10-0-184-215.us-east-2.compute.internal
Starting pod/ip-10-0-184-215us-east-2computeinternal-debug ...
To use host binaries, run `chroot /host`

Removing debug pod ...
error: non-zero exit code from debug container

node/ip-10-0-200-149.us-east-2.compute.internal
Starting pod/ip-10-0-200-149us-east-2computeinternal-debug ...
To use host binaries, run `chroot /host`

Removing debug pod ...
error: non-zero exit code from debug container

node/ip-10-0-205-27.us-east-2.compute.internal
Starting pod/ip-10-0-205-27us-east-2computeinternal-debug ...
To use host binaries, run `chroot /host`

Removing debug pod ...
error: non-zero exit code from debug container

node/ip-10-0-212-251.us-east-2.compute.internal
Starting pod/ip-10-0-212-251us-east-2computeinternal-debug ...
To use host binaries, run `chroot /host`

Removing debug pod ...
error: non-zero exit code from debug container

Comment 30 W. Trevor King 2022-08-23 07:41:39 UTC
[1] is tracking the 4.11.z backport of an InaccessiblePaths drop-in workaround, and may be sufficient to resolve exposure.  I'm unclear about how [2] fits into the backport/release flow.

[1]: https://issues.redhat.com/browse/OCPBUGS-197
[2]: https://github.com/coreos/rpm-ostree/pull/3905

Comment 31 Colin Walters 2022-08-23 12:16:36 UTC
https://github.com/openshift/machine-config-operator/pull/3274 (once enabled) would also have prevented this.  "kubens" is really about avoiding the impact of all of the container mounts on systemd in general, of which the rpm-ostree InaccessiblePaths was a special case.


> I'm unclear about how [2] fits into the backport/release flow.
> [2]: https://github.com/coreos/rpm-ostree/pull/3905

I chose not to backport it because it's at best a workaround and I didn't want to distract from the real fix. That said, the timeout bump will still land in Fedora and RHEL 9.

Comment 32 W. Trevor King 2022-08-29 17:44:42 UTC
[1] shipped a fix to 4.11.2 [2].

[1]: https://issues.redhat.com//browse/OCPBUGS-197
[2]: https://access.redhat.com/errata/RHBA-2022:6143

Comment 35 errata-xmlrpc 2023-01-17 19:53:39 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.12.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:7399

