Bug 1891915 - node reboot taking too long and causing e2-gcp-op tests to timeout
Summary: node reboot taking too long and causing e2-gcp-op tests to timeout
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Machine Config Operator
Version: 4.7
Hardware: Unspecified
OS: Unspecified
unspecified
unspecified
Target Milestone: ---
: ---
Assignee: Kirsten Garrison
QA Contact: Rio Liu
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2020-10-27 16:43 UTC by Sinny Kumari
Modified: 2022-08-25 21:48 UTC (History)
5 users (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-08-25 21:48:12 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Github openshift cluster-dns-operator pull 213 0 None closed Revert "Configure CoreDNS to shut down gracefully" 2021-01-21 05:41:56 UTC

Description Sinny Kumari 2020-10-27 16:43:07 UTC
Description of problem:

e2e-gcp-op tests are failing as a result of timeout reach.

By doing comparison between some of the recent and old tests, it looks like reboot time has increased from ~60 seconds to ~90 seconds.

Some of the logs are:
By analyzing some of gcp-op test run logs, it seems system reboot time has increased by around 30 seconds.

1. With increased reboot time ~90 seconds
https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/pr-logs/pull/openshift_machine-config-operator/2182/pull-ci-openshift-machine-config-operator-master-e2e-gcp-op/1320812936498778112/artifacts/e2e-gcp-op/pods/openshift-machine-config-operator_machine-config-daemon-5c86g_machine-config-daemon.log

```
I1026 20:31:17.810600    1972 update.go:1590] initiating reboot: Node will reboot into config rendered-worker-ddf64801127323a695f308d91109d951
I1026 20:31:17.893443    1972 daemon.go:641] Shutting down MachineConfigDaemon
I1026 20:32:50.940436    2088 start.go:108] Version: machine-config-daemon-4.6.0-202006240615.p0-370-g999521b6-dirty (999521b61c81577b156331b7bf8495347a8503c1)
I1026 20:32:50.949591    2088 start.go:121] Calling chroot("/rootfs")
I1026 20:32:50.949813    2088 rpm-ostree.go:261] Running captured: rpm-ostree status --json
I1026 20:32:51.457544    2088 daemon.go:226] Booted osImageURL: registry.build01.ci.openshift.org/ci-op-y4rk2lpr/stable@sha256:ce348cfb50d39297969c9a0c2f928d23eb2ab8ded7cacd5e39685bd0931bbfac (47.82.202010261347-0)
```

https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/pr-logs/pull/openshift_machine-config-operator/2181/pull-ci-openshift-machine-config-operator-master-e2e-gcp-op/1321047711511744512/artifacts/e2e-gcp-op/pods/openshift-machine-config-operator_machine-config-daemon-h5bd9_machine-config-daemon.log

```
I1027 12:09:08.575381    1848 update.go:1607] initiating reboot: Node will reboot into config rendered-worker-4ad27ca30531f77416a00e38bda8c8e6
I1027 12:09:08.668785    1848 daemon.go:641] Shutting down MachineConfigDaemon
I1027 12:10:44.764737    2142 start.go:108] Version: machine-config-daemon-4.6.0-202006240615.p0-370-gf72a5ace-dirty (f72a5ace0b5432ad96bba59b2b91633d8bb8315c)
I1027 12:10:44.772268    2142 start.go:121] Calling chroot("/rootfs")
I1027 12:10:44.772468    2142 rpm-ostree.go:261] Running captured: rpm-ostree status --json
I1027 12:10:45.343467    2142 daemon.go:226] Booted osImageURL: registry.build01.ci.openshift.org/ci-op-gdsvbywh/stable@sha256:cd6d0d6c4cbaa7ceaca75e7a00fad2ced6344bc89c38cadfa121b24209038e2f (47.82.202010270142-0)
```

2. Previous reboot time ~60 seconds

https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/pr-logs/pull/openshift_machine-config-operator/2177/pull-ci-openshift-machine-config-operator-master-e2e-gcp-op/1319724839077941248/artifacts/e2e-gcp-op/pods/openshift-machine-config-operator_machine-config-daemon-wmvdr_machine-config-daemon.log

```
I1023 21:06:25.466932    1842 update.go:1590] initiating reboot: Node will reboot into config rendered-infra-54a8f08a03796f8c94187a986edcbc7a
I1023 21:06:25.594958    1842 daemon.go:641] Shutting down MachineConfigDaemon
I1023 21:07:30.693671    1850 start.go:108] Version: machine-config-daemon-4.6.0-202006240615.p0-370-g56ded555-dirty (56ded5550c030d88cdffa8de630ff1a1287303f3)
I1023 21:07:30.700093    1850 start.go:121] Calling chroot("/rootfs")
I1023 21:07:30.700507    1850 rpm-ostree.go:261] Running captured: rpm-ostree status --json
I1023 21:07:31.107832    1850 daemon.go:226] Booted osImageURL: registry.build01.ci.openshift.org/ci-op-vcw3wmdq/stable@sha256:113a36da35d4aff5f8ef43ff97bed0e97cdaea2139c8db44fbac31051bec43c8 (47.82.202010231442-0)
```

Expected Result:

e2e-gcp-op test run should successfully finish.


Note You need to log in before you can comment on or make changes to this bug.