Bug 2083886
| Summary: | e2e-metal-ipi-ovn-ipv6 taking a lot longer to complete in 4.11 | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Stephen Benjamin <stbenjam> |
| Component: | Installer | Assignee: | Tomas Sedovic <tsedovic> |
| Installer sub component: | OpenShift on Bare Metal IPI | QA Contact: | Amit Ugol <augol> |
| Status: | CLOSED DEFERRED | Docs Contact: | |
| Severity: | medium | ||
| Priority: | medium | CC: | derekh, janders, wking |
| Version: | 4.11 | Keywords: | Triaged |
| Target Milestone: | --- | ||
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2023-03-09 01:19:08 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
|
Description
Stephen Benjamin
2022-05-10 22:43:51 UTC
Picking a recent run from each job [1,2]: $ curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.10-e2e-metal-ipi-ovn-ipv6/1524094298423824384/artifacts/e2e-metal-ipi-ovn-ipv6/baremetalds-devscripts-setup/artifacts/root/dev-scripts/logs/06_create_cluster-2022-05-10-192143.log | grep -A6 'Time elapsed per stage' 2022-05-10 20:10:24 level=debug msg=Time elapsed per stage: 2022-05-10 20:10:24 level=debug msg= bootstrap: 2m12s 2022-05-10 20:10:24 level=debug msg= masters: 14m32s 2022-05-10 20:10:24 level=debug msg=Bootstrap Complete: 12m40s 2022-05-10 20:10:24 level=debug msg= Bootstrap Destroy: 17s 2022-05-10 20:10:24 level=debug msg= Cluster Operators: 18m29s 2022-05-10 20:10:24 level=info msg=Time elapsed: 48m39s $ curl -s https://storage.googleapis.com/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.10-e2e-metal-ipi-ovn-ipv6/1524094298423824384/build-log.txt | grep 'devscripts-setup succe eded after' INFO[2022-05-10T20:10:53Z] Step e2e-metal-ipi-ovn-ipv6-baremetalds-devscripts-setup succeeded after 1h0m50s. $ curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.11-e2e-metal-ipi-ovn-ipv6/1524084786757373952/artifacts/e2e-metal-ipi-ovn-ipv6/baremetalds-devscripts-setup/artifacts/root/dev-scripts/logs/06_create_cluster-2022-05-10-182006.log | grep -A6 'Time elapsed per stage' 2022-05-10 19:07:22 level=debug msg=Time elapsed per stage: 2022-05-10 19:07:22 level=debug msg= bootstrap: 10s 2022-05-10 19:07:22 level=debug msg= masters: 12m59s 2022-05-10 19:07:22 level=debug msg=Bootstrap Complete: 14m45s 2022-05-10 19:07:22 level=debug msg= Bootstrap Destroy: 8s 2022-05-10 19:07:22 level=debug msg= Cluster Operators: 18m42s 2022-05-10 19:07:22 level=info msg=Time elapsed: 47m14s $ curl -s https://storage.googleapis.com/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.11-e2e-metal-ipi-ovn-ipv6/1524084786757373952/build-log.txt | grep 'devscripts-setup succe eded after' INFO[2022-05-10T19:07:41Z] Step e2e-metal-ipi-ovn-ipv6-baremetalds-devscripts-setup succeeded after 57m30s. So... that's pretty close? Maybe I'm interpreting "metal installs" too narrowly. $ curl -s https://storage.googleapis.com/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.10-e2e-metal-ipi-ovn-ipv6/1524094298423824384/build-log.txt | grep 'succeeded after'INFO[2022-05-10T19:08:43Z] Step e2e-metal-ipi-ovn-ipv6-baremetalds-packet-setup succeeded after 9m40s. INFO[2022-05-10T19:08:53Z] Step e2e-metal-ipi-ovn-ipv6-ipi-install-rbac succeeded after 10s. INFO[2022-05-10T19:10:03Z] Step e2e-metal-ipi-ovn-ipv6-baremetalds-devscripts-proxy succeeded after 1m10s. INFO[2022-05-10T20:10:53Z] Step e2e-metal-ipi-ovn-ipv6-baremetalds-devscripts-setup succeeded after 1h0m50s. INFO[2022-05-10T20:10:53Z] Step phase pre succeeded after 1h11m50s. INFO[2022-05-10T21:30:33Z] Step e2e-metal-ipi-ovn-ipv6-baremetalds-e2e-test succeeded after 1h19m40s. INFO[2022-05-10T21:30:33Z] Step phase test succeeded after 1h19m40s. INFO[2022-05-10T21:31:33Z] Step e2e-metal-ipi-ovn-ipv6-baremetalds-devscripts-gather succeeded after 1m0s. INFO[2022-05-10T21:33:33Z] Step e2e-metal-ipi-ovn-ipv6-gather-must-gather succeeded after 2m0s. INFO[2022-05-10T21:36:43Z] Step e2e-metal-ipi-ovn-ipv6-gather-extra succeeded after 3m10s. INFO[2022-05-10T21:37:34Z] Step e2e-metal-ipi-ovn-ipv6-gather-audit-logs succeeded after 50s. INFO[2022-05-10T21:37:54Z] Step e2e-metal-ipi-ovn-ipv6-baremetalds-packet-teardown succeeded after 20s. INFO[2022-05-10T21:37:54Z] Step phase post succeeded after 7m20s. $ curl -s https://storage.googleapis.com/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.11-e2e-metal-ipi-ovn-ipv6/1524084786757373952/build-log.txt | grep 'succeeded after'INFO[2022-05-10T18:09:01Z] Step e2e-metal-ipi-ovn-ipv6-baremetalds-packet-setup succeeded after 10s. INFO[2022-05-10T18:09:11Z] Step e2e-metal-ipi-ovn-ipv6-ipi-install-rbac succeeded after 10s. INFO[2022-05-10T18:10:11Z] Step e2e-metal-ipi-ovn-ipv6-baremetalds-devscripts-proxy succeeded after 1m0s. INFO[2022-05-10T19:07:41Z] Step e2e-metal-ipi-ovn-ipv6-baremetalds-devscripts-setup succeeded after 57m30s. INFO[2022-05-10T19:07:41Z] Step phase pre succeeded after 58m50s. INFO[2022-05-10T20:51:22Z] Step e2e-metal-ipi-ovn-ipv6-baremetalds-e2e-test succeeded after 1h43m40s. INFO[2022-05-10T20:51:22Z] Step phase test succeeded after 1h43m40s. INFO[2022-05-10T20:52:02Z] Step e2e-metal-ipi-ovn-ipv6-baremetalds-devscripts-gather succeeded after 40s. INFO[2022-05-10T20:54:02Z] Step e2e-metal-ipi-ovn-ipv6-gather-must-gather succeeded after 2m0s. INFO[2022-05-10T20:56:42Z] Step e2e-metal-ipi-ovn-ipv6-gather-extra succeeded after 2m40s. INFO[2022-05-10T20:57:22Z] Step e2e-metal-ipi-ovn-ipv6-gather-audit-logs succeeded after 40s. INFO[2022-05-10T20:57:32Z] Step e2e-metal-ipi-ovn-ipv6-baremetalds-packet-teardown succeeded after 10s. INFO[2022-05-10T20:57:32Z] Step phase post succeeded after 6m10s. That's packet-setup bumping from 10s to ~10m. And e2e-test going from ~1h20m to ~1h44m. Nothing else looks significantly different. [1]: https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.10-e2e-metal-ipi-ovn-ipv6/1524094298423824384 [2]: https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.11-e2e-metal-ipi-ovn-ipv6/1524084786757373952 > That's packet-setup bumping from 10s to ~10m This discrepancy can happen on any job now that we are using the resource pooler, if a host is available in the pool it will take ~10s else 10 to 20 minutes. Its not release/job specific, to confirm this in the examples you referenced you can see the 4.10 fails to get a host from the pool > curl: (22) The requested URL returned error: 409 CONFLICT and then continues to use the old on demand method > And e2e-test going from ~1h20m to ~1h44m This is definitely a thing we need to look into, I've compared 3 jobs from March 27/28th[1] to 3 Jobs from Apr 21/22nd[2] I couldn't see any single test taking up 20 minutes, the longest single tests were 10 minutes (2 of them) but they existed on all 6 jobs, the one thing I noticed is the difference in log file size, has the debug level been increased? [derekh@laptop i]$ ls -lh total 71M -rw-rw-r--. 1 derekh derekh 9.3M May 11 14:59 1507979108498280448.log -rw-rw-r--. 1 derekh derekh 9.3M May 11 14:59 1508084230968578048.log -rw-rw-r--. 1 derekh derekh 9.1M May 11 14:59 1508302866471718912.log -rw-rw-r--. 1 derekh derekh 15M May 11 14:59 1517223154093658112.log -rw-rw-r--. 1 derekh derekh 14M May 11 15:00 1517299324424294400.log -rw-rw-r--. 1 derekh derekh 15M May 11 15:00 1517389993293647872.log [1] https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.11-e2e-metal-ipi-ovn-ipv6/1508302866471718912/artifacts/e2e-metal-ipi-ovn-ipv6/baremetalds-e2e-test/build-log.txt https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.11-e2e-metal-ipi-ovn-ipv6/1508084230968578048/artifacts/e2e-metal-ipi-ovn-ipv6/baremetalds-e2e-test/build-log.txt https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.11-e2e-metal-ipi-ovn-ipv6/1507979108498280448/artifacts/e2e-metal-ipi-ovn-ipv6/baremetalds-e2e-test/build-log.txt [2] https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.11-e2e-metal-ipi-ovn-ipv6/1517223154093658112/artifacts/e2e-metal-ipi-ovn-ipv6/baremetalds-e2e-test/build-log.txt https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.11-e2e-metal-ipi-ovn-ipv6/1517299324424294400/artifacts/e2e-metal-ipi-ovn-ipv6/baremetalds-e2e-test/build-log.txt https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.11-e2e-metal-ipi-ovn-ipv6/1517389993293647872/artifacts/e2e-metal-ipi-ovn-ipv6/baremetalds-e2e-test/build-log.txt OpenShift has moved to Jira for its defect tracking! This bug can now be found in the OCPBUGS project in Jira. https://issues.redhat.com/browse/OCPBUGS-9264 |