Bug 1630265

Summary: Image jenkins-2-rhel7:v3.10.45-3 is broken
Product: OpenShift Container Platform
Reporter: XiuJuan Wang <xiuwang>
Component: Image
Assignee: Gabe Montero <gmontero>
Status: CLOSED ERRATA
QA Contact: XiuJuan Wang <xiuwang>
Severity: high
Docs Contact:
Priority: high
Version: 3.10.0
CC: abradshaw, ahaile, aos-bugs, bparees, dapark, gmontero, jack.ottofaro, jokerman, mlabonte, mmccomas, xiuwang
Target Milestone: ---
Keywords: Regression, Reopened
Target Release: 3.10.z
Hardware: Unspecified
OS: Unspecified
Doc Type: No Doc Update
Last Closed: 2019-01-10 09:27:09 UTC
Type: Bug
Bug Blocks: 1724792

Description XiuJuan Wang 2018-09-18 09:51:27 UTC
Description of problem:
The java binary was not included in jenkins-2-rhel7:v3.10.45-3, which causes the pod to go into CrashLoopBackOff.

jenkins-2-rhel7:v3.10.45-2 and jenkins-2-rhel7:v3.11 work well

Version-Release number of selected component (if applicable):

jenkins-2-rhel7:v3.10.45-3

How reproducible:
always

Steps to Reproduce:
1. Create a Jenkins app with jenkins-2-rhel7:v3.10.45-3

Actual results:
$ oc logs  -f jenkins-2-kqpxc  -n test 
alternatives version 1.7.4 - Copyright (C) 2001 Red Hat, Inc.
This may be freely redistributed under the terms of the GNU Public License.

usage: alternatives --install <link> <name> <path> <priority>
                    [--initscript <service>]
                    [--family <family>]
                    [--slave <link> <name> <path>]*
       alternatives --remove <name> <path>
       alternatives --auto <name>
       alternatives --config <name>
       alternatives --display <name>
       alternatives --set <name> <path>
       alternatives --list

common options: --verbose --test --help --usage --version --keep-missing
                --altdir <directory> --admindir <directory>
alternatives version 1.7.4 - Copyright (C) 2001 Red Hat, Inc.
This may be freely redistributed under the terms of the GNU Public License.

usage: alternatives --install <link> <name> <path> <priority>
                    [--initscript <service>]
                    [--family <family>]
                    [--slave <link> <name> <path>]*
       alternatives --remove <name> <path>
       alternatives --auto <name>
       alternatives --config <name>
       alternatives --display <name>
       alternatives --set <name> <path>
       alternatives --list

common options: --verbose --test --help --usage --version --keep-missing
                --altdir <directory> --admindir <directory>
OPENSHIFT_JENKINS_JVM_ARCH='', CONTAINER_MEMORY_IN_MB='512', using /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.161-2.b14.el7.x86_64/jre/bin/java and 
/usr/local/bin/jenkins-common.sh: line 33: java: command not found
mkdir: cannot create directory ‘/var/lib/jenkins/logs’: File exists
Administrative monitors that contact the update center will remain active
Detected password environment variable change, updating Jenkins configuration ...
Migrating slave image configuration to current version tag ...
+ exec java -XX:+UseParallelGC -XX:MinHeapFreeRatio=5 -XX:MaxHeapFreeRatio=10 -XX:GCTimeRatio=4 -XX:AdaptiveSizePolicyWeight=90 -Xmx256m -Dfile.encoding=UTF8 -Djavamelody.displayed-counters=log,error -Duser.home=/var/lib/jenkins -Djavamelody.application-name=JENKINS -jar /usr/lib/jenkins/jenkins.war
/usr/libexec/s2i/run: line 472: exec: java: not found


Expected results:
Jenkins pod should be running

Additional info:

Comment 1 Ben Parees 2018-09-18 13:35:00 UTC
Gabe, guessing you'll need to pull the release team in on this.

Comment 2 Gabe Montero 2018-09-18 14:20:15 UTC
OK, first, a clarification of the registry used would help.

I'm assuming we are not talking registry.redhat.io or registry.access.redhat.com, as I don't even see those tags there.

I do see the v3.10.45-3 listed for jenkins-2-rhel7 on brew-pulp, but when I  try to pull, I get this:

gmontero ~ $ docker pull brew-pulp-docker01.web.prod.ext.phx2.redhat.com/openshift3/jenkins-2-rhel7:v3.10.45-3
Error response from daemon: error parsing HTTP 404 response body: invalid character '<' looking for beginning of value: "<!DOCTYPE HTML PUBLIC \"-//IETF//DTD HTML 2.0//EN\">\n<html><head>\n<title>404 Not Found</title>\n</head><body>\n<h1>Not Found</h1>\n<p>The requested URL /v2/openshift3/jenkins-2-rhel7/manifests/v3.10.45-3 was not found on this server.</p>\n</body></html>\n"
gmontero ~ $ 



Makes me think the image was removed (perhaps because the release team knows it was bad).

I'll track that possibility down separately.  

But if QA can clarify the precise docker spec used, and which registry it was pulled from, that would also help.

Comment 3 Gabe Montero 2018-09-18 14:29:31 UTC
Adam Haile set me straight ... missing the port on my url ... have the image now and am looking.

Comment 4 Gabe Montero 2018-09-18 15:28:13 UTC
OK, according to the build log for that image at http://download.eng.bos.redhat.com/brewroot/packages/openshift-jenkins-2-container/v3.10.45/3/data/logs/x86_64-build.log, java was installed

And when I run the image manually, it comes up fine, both standalone and in a pod.


Here are the image particulars:

brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/jenkins-2-rhel7   v3.10.45-3          d14de96593e6        22 hours ago        1.06GB

It would seem to be environment specific then.

Another round of information needed from QA:
1) confirm the image sha matches what I list above ... or did you in fact pull v3.10.45-3 from another registry besides brew-pulp?

2) provide the deploymentconfig and pod yaml when the problem occurs ... perhaps there is a configuration difference from what we are running

3) purge and repull the images ... does it still occur?

4) purge/repull the image, delete the dc/pods/any PVs, redeploy the template .... does it still occur?

Comment 5 Gabe Montero 2018-09-18 17:11:55 UTC
Among other things, I'm curious if an attempt is being made to run 32 bit java.

Comment 6 Gabe Montero 2018-09-18 17:26:53 UTC
OK, got some more feedback from Adam Haile based on an anomaly with 32 bit I saw in the logs for that build.

Turns out that build was a special case thing done by the power pc folks, where among other things they manipulated our Dockerfile in non-standard ways.

Effectively, per Adam, it is a bad build for our QA to test on x86.

I'm assuming the earlier and later versions that QA already noted as working are the solution path here.

Closing out.

Comment 7 Ben Parees 2018-09-18 17:30:51 UTC
moving back to on_qa as we still need qa to confirm that the v3.10 image we intend to ship is valid.  We definitely don't want to inadvertently ship a v3.10 errata w/ that bad build.

Comment 8 XiuJuan Wang 2018-09-19 02:18:55 UTC
Sorry, I didn't clarify the registry I used.
I used the brew registry, the image brew-***/openshift3/jenkins-2-rhel7:v3.10 is pointing to v3.10.45-3

jenkins-2-rhel7:v3.10 in the stage registry for errata 36306 is pointing to v3.10.45-2 and jenkins-2-rhel7:v3.11 in registry.dev.redhat.io, both work well.

Comment 9 XiuJuan Wang 2018-09-19 02:27:06 UTC
IMO, it's better to rebuild a working image, just in case we re-push images to the stage registry.
Now the image brew-***/openshift3/jenkins-2-rhel7:v3.10 is still pointing to v3.10.45-3

brew-**/openshift3/jenkins-2-rhel7   v3.10               d14de96593e6        32 hours ago        1.06 GB
brew-**/openshift3/jenkins-2-rhel7   v3.10.45-3          d14de96593e6        32 hours ago        1.06 GB

Comment 10 Ben Parees 2018-09-19 02:29:05 UTC
Good idea, but that request will need to be filed w/ the release team.

Comment 11 Gabe Montero 2018-09-19 13:31:15 UTC
Marking no doc as the image should not make it to the public registry.

Comment 12 Gabe Montero 2018-10-05 02:00:04 UTC
Per https://github.com/openshift/jenkins/issues/713 it looks like some RHEL images missing 32-bit Java made it to registry.access.redhat.com

Perhaps related to the problematic powerpc build Adam had pointed us to?

Comment 13 Gabe Montero 2018-10-05 14:42:45 UTC
Yeah the v3.10 tags for the images on brew-pulp are still missing the 32 bit java RPMs.

Via an email to aos-art-request@redhat.com, ticket https://jira.coreos.com/browse/ART-205 has been opened.

QA should subscribe to that Jira and attempt to re-verify against the vanilla v3.10 tag.

After talking to Brenton, it looks like there is no ART component in Bugzilla, and they track all their work in Jira now.

So moving this to POST until the jira is complete.  Will then move to on qa once we have brew-pulp images with the v3.10 tag in them to try.

Comment 14 XiuJuan Wang 2018-10-08 03:28:45 UTC
I have checked the Jenkins v3.10 related images in the registry.access.redhat.com registry.
Those images are broken; I have added a comment to the Jira issue.

registry.access.redhat.com/openshift3/jenkins-2-rhel7               v3.10   86253d8d3c1b   10 days ago   1.18 GB
registry.access.redhat.com/openshift3/jenkins-agent-nodejs-8-rhel7  v3.10   5eadb373bbf1   10 days ago   1.11 GB
registry.access.redhat.com/openshift3/jenkins-agent-maven-35-rhel7  v3.10   06afbf69c3f1   3 weeks ago   1.38 GB

Comment 15 Gabe Montero 2018-10-08 14:06:14 UTC
Yep https://projects.engineering.redhat.com/browse/RCM-43033 has been opened to address the core jenkins and nodejs

the maven image is in fact OK ... I just checked it ...

Same sha as you noted

registry.access.redhat.com/openshift3/jenkins-agent-maven-35-rhel7                            v3.10               06afbf69c3f1        3 weeks ago         1.38GB

Comment 16 Gabe Montero 2018-10-08 14:07:12 UTC
See ... the 32-bit JVM is there:

gmontero ~ $ docker run -it 06afbf69c3f1 /bin/bash
OPENSHIFT_JENKINS_JVM_ARCH='', CONTAINER_MEMORY_IN_MB='8796093022207', using /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/jre/bin/java
OPENSHIFT_JENKINS_JVM_ARCH='', CONTAINER_MEMORY_IN_MB='8796093022207', using /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/bin/javac
bash-4.2$ rpm -qa | grep java-1.8
java-1.8.0-openjdk-headless-1.8.0.181-3.b13.el7_5.x86_64
java-1.8.0-openjdk-javadoc-1.8.0.181-3.b13.el7_5.noarch
java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.i686
java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64
java-1.8.0-openjdk-devel-1.8.0.181-3.b13.el7_5.i686
java-1.8.0-openjdk-headless-1.8.0.181-3.b13.el7_5.i686
java-1.8.0-openjdk-devel-1.8.0.181-3.b13.el7_5.x86_64
bash-4.2$
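
The manual check above can be scripted. This is a minimal sketch, not part of any Jenkins image tooling: `has_both_java_arches` is a hypothetical helper name, and it assumes that the base `java-1.8.0-openjdk` package being present for both `i686` and `x86_64` is what "both JVMs installed" means here.

```shell
#!/bin/sh
# Hypothetical helper: reads `rpm -qa` output on stdin and succeeds only if
# the base java-1.8.0-openjdk package is present for both i686 and x86_64.
# The -headless, -devel and -javadoc subpackages are deliberately not matched.
has_both_java_arches() {
    # Strip carriage returns in case the output came from a TTY (docker run -t).
    pkgs=$(tr -d '\r')
    printf '%s\n' "$pkgs" | grep -Eq '^java-1\.8\.0-openjdk-[0-9].*\.i686$' &&
    printf '%s\n' "$pkgs" | grep -Eq '^java-1\.8\.0-openjdk-[0-9].*\.x86_64$'
}
```

Possible usage, assuming the image runs `rpm -qa` directly as its command: `docker run --rm 06afbf69c3f1 rpm -qa | has_both_java_arches && echo "both JVMs present"`.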

Comment 18 Gabe Montero 2018-10-09 14:28:31 UTC
See https://jira.coreos.com/browse/ART-205

we have v3.10 images on brew-pulp with both 32 and 64 bit java

let's have QA actually try to use the images, make sure basic functionality is there

- bring up jenkins with jenkins-2-rhel7:v3.10 from brew-pulp
- run a maven and nodejs slave build with the maven and nodejs images

Comment 19 Gabe Montero 2018-10-09 15:24:55 UTC
@Jack setting the OPENSHIFT_JENKINS_JVM_ARCH env var on the pod to "x86_64" is the workaround

Comment 22 Ben Parees 2018-10-09 17:25:09 UTC
Regarding QE testing, QE should bring it up with OPENSHIFT_JENKINS_JVM_ARCH set to both i386 and x86_64 to fully confirm both JVMs are installed/available/working.
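
A rough sketch of how that two-architecture check could be driven from the CLI. The helper name `jvm_arch_commands` is hypothetical, and the deploymentconfig name and namespace are caller-supplied assumptions (this bug does not state the dc name). The function only prints the `oc set env` commands so they can be reviewed before being run.

```shell
#!/bin/sh
# Hypothetical helper: prints (does not run) one `oc set env` command per
# JVM architecture that QE is asked to verify.
# $1 = deploymentconfig name (assumption), $2 = namespace (assumption)
jvm_arch_commands() {
    dc=$1
    ns=$2
    for arch in i386 x86_64; do
        printf 'oc set env dc/%s OPENSHIFT_JENKINS_JVM_ARCH=%s -n %s\n' \
            "$dc" "$arch" "$ns"
    done
}
```

For example, `jvm_arch_commands jenkins test` prints two commands. After running each one and letting the new pod come up, the `OPENSHIFT_JENKINS_JVM_ARCH='...'` line in `oc logs` for that pod should show which JVM path was selected.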

Comment 24 XiuJuan Wang 2018-10-10 03:41:48 UTC
Jenkins-related images with the v3.10 tag in the brew registry point to:
jenkins-2-rhel7:v3.10              --> v3.10.45-8
jenkins-agent-maven-35-rhel7:v3.10 --> v3.10.45-8
jenkins-agent-nodejs-8-rhel7:v3.10 --> v3.10.45-8
jenkins-slave-base-rhel7:v3.10     --> v3.10.45-10

I set up the Jenkins pod with OPENSHIFT_JENKINS_JVM_ARCH set to i386 and to x86_64; both are running. The maven and nodejs slave builds also succeed.

Comment 28 Gabe Montero 2018-10-11 16:53:11 UTC
*** Bug 1638466 has been marked as a duplicate of this bug. ***

Comment 29 Gabe Montero 2018-10-12 14:22:38 UTC
Also, I have confirmed that all 3 images with the v3.10 tag at registry.access.redhat.com have been successfully reverted and have both 32 and 64 bit java.

As an aside, for general awareness of folks subscribed here and interested in the issue: I thought this detail was relevant with respect to expediting the push of new v3.10 images through our ART / distgit pipeline.

Comment 30 Ade Bradshaw 2018-10-13 11:50:49 UTC
If all the images are reverted, why am I still getting the same problem?

Comment 31 Gabe Montero 2018-10-13 13:34:59 UTC
@Ade Often times there are registry caches that do not get updated in a timely fashion unfortunately.

They eventually get refreshed, but the time for refresh seems to vary, and is not something we have control over.

In fact, when I do pulls of the image from home vs. work, I get different images.  I got the good/reverted image at work, and the corrupted one at home.

The sha of the bad jenkins-2 image is 86253d8d3c1b

Comment 32 XiuJuan Wang 2018-10-15 03:03:39 UTC
@Gabe,
Only the nodejs agent image has been reverted to v3.10.45-2. jenkins-2-rhel7:v3.10 is still 86253d8d3c1b

registry.access.redhat.com/openshift3/jenkins-2-rhel7                v3.10               86253d8d3c1b        2 weeks ago         1.18 GB
registry.access.redhat.com/openshift3/jenkins-agent-nodejs-8-rhel7   v3.10               a4ac0f64e476        4 weeks ago         1.257 GB

Comment 33 Gabe Montero 2018-10-15 13:41:44 UTC
Again it depends on where you are pulling from unfortunately, as there are registry caches all over the place that delay these things (and we have no control over updating these caches).

When I pull from the office:

gmontero ~ $ docker pull registry.access.redhat.com/openshift3/jenkins-2-rhel7:v3.10
v3.10: Pulling from openshift3/jenkins-2-rhel7
Digest: sha256:97ecffb2ecd8f8517abe9c8296a1404e5661038a1327f95f068c1fcc5c6e08b8
Status: Downloaded newer image for registry.access.redhat.com/openshift3/jenkins-2-rhel7:v3.10
gmontero ~ $ docker images | grep jenkins
registry.access.redhat.com/openshift3/jenkins-2-rhel7   v3.10               ed4b21aabcd9        4 weeks ago         1.37GB
gmontero ~ $ docker run -it ed4b21aabcd9 rpm -qa | grep java-1
java-1.8.0-openjdk-headless-1.8.0.181-3.b13.el7_5.i686
java-1.8.0-openjdk-devel-1.8.0.181-3.b13.el7_5.i686
java-1.8.0-openjdk-headless-1.8.0.181-3.b13.el7_5.x86_64
java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64
java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.i686
gmontero ~ $ 


The "ed4b21aabcd9" sha is the one you want.  If the v3.10 tag is still defaulting to the broken image for you, pull the sha directly.
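
One caveat when pulling "by sha": `docker pull` takes a repository digest in the form `repo@sha256:<digest>`, not the short image ID that `docker images` prints. The sketch below uses a hypothetical helper, `digest_ref`, that builds such a reference and rejects a bare image ID. It assumes the `Digest:` line in the pull output above belongs to the good image.

```shell
#!/bin/sh
# Hypothetical helper: builds a pull-by-digest reference <repository>@<digest>.
# Rejects values without the sha256: prefix, such as a short image ID.
digest_ref() {
    repo=$1
    digest=$2
    case $digest in
        sha256:*) ;;
        *)
            echo "expected a sha256:<hex> digest, got: $digest" >&2
            return 1
            ;;
    esac
    printf '%s@%s\n' "$repo" "$digest"
}
```

Possible usage: `docker pull "$(digest_ref registry.access.redhat.com/openshift3/jenkins-2-rhel7 sha256:97ecffb2ecd8f8517abe9c8296a1404e5661038a1327f95f068c1fcc5c6e08b8)"`, then confirm with `docker images` that the resulting image ID is ed4b21aabcd9.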

Comment 35 errata-xmlrpc 2019-01-10 09:27:09 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0026