Bug 1705123 - jenkins-slave produce process defunct [ Jenkins "SLAVE" ]
Summary: jenkins-slave produce process defunct [ Jenkins "SLAVE" ]
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: ImageStreams
Version: 4.1.z
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 4.2.0
Assignee: Gabe Montero
QA Contact: XiuJuan Wang
URL:
Whiteboard:
Depends On: 1700314 1707447 1707448 1718379
Blocks:
Reported: 2019-05-01 14:36 UTC by Gabe Montero
Modified: 2019-10-16 06:28 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: Long-running Jenkins agent/slave pods can experience the defunct process phenomenon previously observed with the Jenkins master.
Consequence: Many defunct processes show up in process listings until the pod is terminated.
Fix: Employ `dumb-init`, as in the openshift/jenkins master image, to clean up the defunct processes that occur during Jenkins job processing.
Result: Process listings within agent/slave pods, and on the hosts where those pods reside, no longer include the defunct processes.
Clone Of: 1700314
Environment:
Last Closed: 2019-10-16 06:28:21 UTC
Target Upstream Version:
Embargoed:
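
The Doc Text above attributes the fix to dumb-init, which runs as PID 1 in the container and reaps exited children. The following is a minimal Python sketch of that reaping idea only, assuming a Linux host with /proc. It is not dumb-init's actual implementation (dumb-init is a small C program), and the function names are hypothetical, chosen for illustration.

```python
import os
import time
from typing import List


def process_state(pid: int) -> str:
    """Return the one-letter state from /proc/<pid>/stat (Linux only); 'Z' means zombie/defunct."""
    with open(f"/proc/{pid}/stat") as f:
        data = f.read()
    # The command name is wrapped in parentheses and may contain spaces,
    # so split on the last ')' before reading the state field.
    return data.rsplit(")", 1)[1].split()[0]


def spawn_short_lived_child() -> int:
    """Fork a child that exits at once; the parent deliberately does not wait for it."""
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child: exit immediately
    return pid


def reap_children() -> List[int]:
    """Collect every child that has already exited, without blocking."""
    reaped = []
    while True:
        try:
            pid, _status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # this process has no children at all
        if pid == 0:
            break  # children exist, but none has exited yet
        reaped.append(pid)
    return reaped


if __name__ == "__main__":
    child = spawn_short_lived_child()
    time.sleep(0.2)  # give the child time to exit
    print("state before reaping:", process_state(child))
    print("reaped:", reap_children())
```

Until the parent calls waitpid, the exited child stays in the process table as a defunct entry. An init process such as dumb-init performs this wait for any child it inherits, which is why the listings stay clean.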




Links
System                  ID              Private  Priority  Status  Summary  Last Updated
Red Hat Product Errata  RHBA-2019:2922  0        None      None    None     2019-10-16 06:28:33 UTC

Comment 1 Gabe Montero 2019-05-01 14:41:21 UTC
PR https://github.com/openshift/jenkins/pull/845 is up

The current thinking on the devex team is that this is a post-4.1.0 GA item.

Comment 2 Gabe Montero 2019-05-14 17:19:55 UTC
The PR has merged and is in for 4.2.

I will use this bug to track that, and will clone it for 4.1.z inclusion.

Comment 3 Gabe Montero 2019-05-14 20:11:57 UTC
We also need https://github.com/openshift/ocp-build-data/pull/126 so that the OSBS/Brew side can install dumb-init.

Comment 4 XiuJuan Wang 2019-05-23 08:11:32 UTC
Cannot reproduce this with the 4.2.0-0.ci-2019-05-23-003410 payload.

Steps:
1. Create a Jenkins server and maven/nodejs pipeline buildconfigs.
2. Log in to the Jenkins console and set the maven/nodejs pods to stay idle for 30 minutes.
3. Trigger the maven and nodejs pipeline builds.
4. Rsh into each slave pod shortly before the idle time runs out.

Result: the dumb-init process has cleaned up the defunct processes; none remain.

$ oc get pods 
NAME                                  READY   STATUS      RESTARTS   AGE
maven-d1dvn                           1/1     Running     0          22m
nodejs-l6p30                          1/1     Running     0          22m

$ oc rsh nodejs-l6p30 
sh-4.2$ ps -ef 
UID         PID   PPID  C STIME TTY          TIME CMD
default       1      0  0 07:46 ?        00:00:00 /usr/bin/dumb-init -- /usr/local/bin/run-jnlp-client 89e8841a50290a213f74d378a5f2939031e6b330ef27d24e052439ffa294ad44 nodejs-l6p30
default       6      1  1 07:46 ?        00:00:25 java -XX:+UseParallelGC -XX:MinHeapFreeRatio=5 -XX:MaxHeapFreeRatio=10 -XX:GCTimeRatio=4 -XX:AdaptiveSizePolicyWeight=90 -cp /home/jenkins/remoting.jar hudson.re
default     319      0  0 08:09 pts/0    00:00:00 /bin/sh
default     327    319  0 08:09 pts/0    00:00:00 ps -ef
sh-4.2$ exit
$ oc rsh maven-d1dvn 
sh-4.2$ ps -ef
UID         PID   PPID  C STIME TTY          TIME CMD
default       1      0  0 07:46 ?        00:00:00 /usr/bin/dumb-init -- /usr/local/bin/run-jnlp-client 26293759c3dac125c3fc913b3d9381a5f5b6a3972f4e1bf96b6ef3cf706e2b89 maven-d1dvn
default       6      1  2 07:46 ?        00:00:30 java -XX:+UseParallelGC -XX:MinHeapFreeRatio=5 -XX:MaxHeapFreeRatio=10 -XX:GCTimeRatio=4 -XX:AdaptiveSizePolicyWeight=90 -cp /home/jenkins/remoting.jar hudson.re
default     461      0  0 08:10 pts/0    00:00:00 /bin/sh
default     469    461  0 08:10 pts/0    00:00:00 ps -ef
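
The manual check in the transcripts above (reading `ps -ef` output for defunct entries) could be scripted. The sketch below is one possible way to do it, not part of the original verification. It assumes `ps` marks zombies with `<defunct>` in the command column, that `oc` is on the PATH and logged in, and the helper names `find_defunct` and `defunct_in_pod` are hypothetical.

```python
import subprocess
from typing import List


def find_defunct(ps_output: str) -> List[str]:
    """Return the lines of `ps -ef` output that describe defunct (zombie) processes."""
    # ps appends "<defunct>" to the command column of a zombie process.
    return [line for line in ps_output.splitlines() if "<defunct>" in line]


def defunct_in_pod(pod: str) -> List[str]:
    """Run `ps -ef` inside a pod via `oc rsh` and return any defunct entries (assumes oc is logged in)."""
    result = subprocess.run(
        ["oc", "rsh", pod, "ps", "-ef"],
        capture_output=True,
        text=True,
        check=True,
    )
    return find_defunct(result.stdout)
```

For the pods in this comment, `defunct_in_pod("nodejs-l6p30")` and `defunct_in_pod("maven-d1dvn")` would be expected to return empty lists, matching the listings shown.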

Comment 5 errata-xmlrpc 2019-10-16 06:28:21 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:2922

