Bug 1844469
| Summary: | s2i builds are failing on OCP 4.3.z that are successfully building on OCP 3.7 | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Anand Paladugu <apaladug> |
| Component: | Build | Assignee: | Adam Kaplan <adam.kaplan> |
| Status: | CLOSED ERRATA | QA Contact: | wewang <wewang> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 4.3.z | CC: | adam.kaplan, aos-bugs, gmontero, nalin, palonsor, wzheng |
| Target Milestone: | --- | | |
| Target Release: | 4.6.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | Cause: buildah had an extraneous call that read layers from its blob cache. Consequence: layers could fail to read, particularly if a layer was large. Fix: removed the extraneous call to read layers. Result: buildah builds should succeed and not fail to read an image layer that had already been pulled. | | |
| Story Points: | --- | | |
| Clone Of: | | Environment: | |
| Last Closed: | 2020-10-27 16:05:27 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
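To make the Doc Text above concrete, here is a minimal sketch of the failure mode it describes. This is not buildah's actual code: every type and function name below is hypothetical, and the "large layer fails to read" condition is simulated. It contrasts a commit path that performs an extra blob-cache read (the bug) with one that skips the read when the layer was already pulled (the behavior after the fix in buildah PR 2502).

```go
package main

import (
	"errors"
	"fmt"
)

// blobCache stands in for buildah's on-disk blob cache (hypothetical type).
type blobCache map[string][]byte

var errShortRead = errors.New("short read: layer truncated while reading from cache")

// readFromCache simulates re-reading a layer's bytes out of the cache.
// Per the Doc Text, large layers were the ones that tended to fail this read.
func readFromCache(cache blobCache, digest string) ([]byte, error) {
	data, ok := cache[digest]
	if !ok {
		return nil, fmt.Errorf("layer %s not in cache", digest)
	}
	if len(data) > 1<<20 { // simulate: reads of large layers can fail
		return nil, errShortRead
	}
	return data, nil
}

// commitLayerBuggy mirrors the pre-fix behavior: even when the layer has
// already been pulled and applied, it re-reads the blob from the cache,
// so a failed read aborts the whole build.
func commitLayerBuggy(cache blobCache, digest string, alreadyPulled bool) error {
	_ = alreadyPulled
	if _, err := readFromCache(cache, digest); err != nil { // extraneous read
		return err
	}
	return nil
}

// commitLayerFixed mirrors the post-fix behavior: the extraneous read is
// gone, so a layer that was already pulled is simply reused.
func commitLayerFixed(cache blobCache, digest string, alreadyPulled bool) error {
	if alreadyPulled {
		return nil // nothing left to read; the layer is already on disk
	}
	_, err := readFromCache(cache, digest)
	return err
}

func main() {
	// A ~2 MiB "large" layer that has already been pulled.
	cache := blobCache{"sha256:big": make([]byte, 2<<20)}
	fmt.Println("buggy:", commitLayerBuggy(cache, "sha256:big", true)) // fails
	fmt.Println("fixed:", commitLayerFixed(cache, "sha256:big", true)) // succeeds
}
```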
Description
Anand Paladugu, 2020-06-05 13:39:44 UTC
OK, I've attached the latest round of debug data from the customer. There is a hiccup in buildah during the copy after the build completes. As noted in the description, there certainly appears to be adequate storage available on the host. For the last few weeks Nalin from our buildah team has been in the middle of some buildah copy optimizations; it is quite possible those have a bearing here. Making him the owner (but leaving this under OCP/build for now) so he can look at the data I attached.

Some of the errors in the linked issue look similar to bug #1720730, though I'm not familiar enough with what's happening in the assemble scripts to be able to diagnose what part we're playing when they don't succeed.

https://github.com/inteliquent is accessible, Nalin, but not https://github.com/inteliquent/ng911-common from their build config, nor do I see it in the customer case attachments. Let's ask for it: Anand, we need any s2i scripts related to the build config with `name: common` and `namespace: ng911`.

@Nalin Please confirm whether the attachments are OK or whether you need any other info.

It looks like the customer was able to fix the issue by updating the base gradle-spring-boot image. The new image is optimized for space: the previous image was approximately 400 MB, the new one is approximately 30 MB, and all builds are succeeding. I will probably close the case, but let me know if this BZ still needs to stay open. Thx, Anand.

The customer posted this question today (I think he meant 400 GB): "Although we have been able to work past our current issue, it appears there may be size limits and/or bugs related to image build size. We do not have plans to deploy images of 400 MB anytime soon, but would like to understand if there are size limitations within the OpenShift environment."

Adam: Any update on this ticket?

This bug was caused by an issue with buildah having an extraneous call to read an image from its blob cache [1]. This was fixed in buildah v1.14.11, which was vendored into OpenShift builds in 4.6.0 [2] and 4.5.z [3].

[1] https://github.com/containers/buildah/pull/2502
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1720730
[3] https://bugzilla.redhat.com/show_bug.cgi?id=1868401

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days.
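As a footnote to the resolution above: "vendored into OpenShift builds" mechanically means bumping the buildah requirement in the builder's Go module. The snippet below is illustrative only; the module path and surrounding layout are assumptions (the real go.mod has many more entries), with the buildah version taken from the comment above.

```
// go.mod (illustrative sketch, not the actual openshift builder go.mod)
module github.com/openshift/builder

go 1.14

// v1.14.11 carries the fix from https://github.com/containers/buildah/pull/2502,
// which removed the extraneous blob-cache read.
require github.com/containers/buildah v1.14.11
```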