Description of problem:
s2i builds that build successfully on OCP 3.7 are failing on OCP 4.3.z.

Version-Release number of selected component (if applicable):
OCP 4.3.z

How reproducible:
Readily

Steps to Reproduce:
1. Run the builds psap or common

Actual results:
Builds that produce larger images are failing.

Expected results:
All builds should succeed, as they were passing on OCP 3.7.

Additional info:
1. Debug level 6 logs are attached to the case.
2. The node and pod have sufficient ephemeral storage.
3. The Dockerfile is common to all builds; only the source code changes from build to build.
4. The Dockerfile and build config are attached to the case.
OK, I've attached the latest round of debug data from the customer. There is a hiccup in buildah during the copy after the build completes. As noted in the description, there certainly appears to be adequate storage available on the host. For the last few weeks Nalin from our buildah team has been in the middle of some buildah copy optimizations. It is quite possible those have a bearing here. Making him the owner (but will leave under OCP/build for now) so he can look at the data I attached.
Some of the errors in the linked issue look similar to bug #1720730, though I'm not familiar enough with what's happening in the assemble scripts to be able to diagnose what part we're playing when they don't succeed.
Nalin, https://github.com/inteliquent is accessible, but https://github.com/inteliquent/ng911-common from their build config is not. Nor do I see it in the customer case attachments. Let's ask for it: Anand - we need any s2i scripts that are related to the build config with name: common, namespace: ng911.
@Nalin Please confirm whether the attachments are OK or whether you need any other info.
It looks like the customer was able to fix the issue by updating the base gradle-spring-boot image. The new image is optimized for space. The previous image was approximately 400MB; the new one is approximately 30MB, and all builds are succeeding. I will probably close the case, but let me know if the BZ still needs to be open. Thx Anand
The customer also posted this question today. I think he meant 400 GB. "Although we have been able to work past our current issue. It appears there may be size limits and/or bugs related to image build size. We do not have plans to deploy images of 400MB anytime soon, but would like to understand if there are size limitations within OpenShift environment"
Adam: Any update on this ticket?
This bug was caused by an issue with buildah having an extraneous call to read an image from its blob cache [1]. This was fixed in buildah v1.14.11, which was vendored into OpenShift builds in 4.6.0 [2] and 4.5.z [3].

[1] https://github.com/containers/buildah/pull/2502
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1720730
[3] https://bugzilla.redhat.com/show_bug.cgi?id=1868401
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:4196
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days