Bug 1491126

Summary: CentOS regression failing
Product: [Community] GlusterFS
Reporter: Sunil Kumar Acharya <sheggodu>
Component: project-infrastructure
Assignee: Nigel Babu <nigelb>
Status: CLOSED CURRENTRELEASE
Severity: unspecified
Priority: unspecified
Version: mainline
CC: bugs, gluster-infra, nigelb, rgowdapp
Hardware: Unspecified   
OS: Unspecified   
Last Closed: 2017-09-14 15:16:35 UTC
Type: Bug

Description Sunil Kumar Acharya 2017-09-13 07:13:46 UTC
The CentOS regression job is failing for https://review.gluster.org/#/c/17151/

Log: https://build.gluster.org/job/centos6-regression/6351/

Comment 1 Nigel Babu 2017-09-13 07:16:58 UTC
java.io.EOFException
12:08:01 	at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2638)
12:08:01 	at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:3113)
12:08:01 	at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:853)
12:08:01 	at java.io.ObjectInputStream.<init>(ObjectInputStream.java:349)
12:08:01 	at hudson.remoting.ObjectInputStreamEx.<init>(ObjectInputStreamEx.java:48)
12:08:01 	at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:34)
12:08:01 	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:59)
12:08:01 Caused: java.io.IOException: Unexpected termination of the channel
12:08:01 	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:73)
12:08:01 Caused: hudson.remoting.ChannelClosedException: channel is already closed
12:08:01 	at hudson.remoting.Channel.send(Channel.java:605)
12:08:01 	at hudson.remoting.Request.call(Request.java:130)
12:08:01 	at hudson.remoting.Channel.call(Channel.java:829)
12:08:01 	at hudson.FilePath.act(FilePath.java:987)
12:08:01 Caused: java.io.IOException: remote file operation failed: /home/jenkins/root/workspace/centos6-regression at hudson.remoting.Channel@38fcce0e:slave1.cloud.gluster.org
12:08:01 	at hudson.FilePath.act(FilePath.java:994)
12:08:01 	at hudson.FilePath.act(FilePath.java:976)
12:08:01 	at hudson.FilePath.mkdirs(FilePath.java:1159)
12:08:01 	at hudson.model.AbstractProject.checkout(AbstractProject.java:1274)
12:08:01 	at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:560)
12:08:01 	at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:86)
12:08:01 	at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:485)
12:08:01 	at hudson.model.Run.execute(Run.java:1735)
12:08:01 	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
12:08:01 	at hudson.model.ResourceController.execute(ResourceController.java:97)
12:08:01 	at hudson.model.Executor.run(Executor.java:405)

Comment 2 Nigel Babu 2017-09-13 07:17:43 UTC
The best guess I have is intermittent network issues between the cage and this node. I've taken the node offline for now.

Comment 4 Nigel Babu 2017-09-14 15:16:35 UTC
This was a lovely issue. All of the RAM was filled with rogue gluster processes that looked like this in the ps ax output:

30301 ?        Ssl    2:05 /build/install/sbin/glusterfsd -s slave1.cloud.gluster.org --volfile-id patchy.slave1.cloud.gluster.org.d-backends-patchy194 -p /var/run/gluster/vols/patchy/slave1.cloud.gluster.org-d-backends-patchy194.pid -S /var/run/gluster/a4855a3707133667e741a8b25f777ba6.socket --brick-name /d/backends/patchy194 -l /var/log/glusterfs/bricks/d-backends-patchy194.log --xlator-option *-posix.glusterd-uuid=aece481c-aa10-493f-94c9-1b5e7306e1b3 --process-name brick --brick-port 49345 --xlator-option patchy-server.listen-port=49345
30322 ?        Ssl    2:04 /build/install/sbin/glusterfsd -s slave1.cloud.gluster.org --volfile-id patchy.slave1.cloud.gluster.org.d-backends-patchy195 -p /var/run/gluster/vols/patchy/slave1.cloud.gluster.org-d-backends-patchy195.pid -S /var/run/gluster/06c756cd2a900bafabc3a2cdfec9f252.socket --brick-name /d/backends/patchy195 -l /var/log/glusterfs/bricks/d-backends-patchy195.log --xlator-option *-posix.glusterd-uuid=aece481c-aa10-493f-94c9-1b5e7306e1b3 --process-name brick --brick-port 49346 --xlator-option patchy-server.listen-port=49346
30343 ?        Ssl    2:06 /build/install/sbin/glusterfsd -s slave1.cloud.gluster.org --volfile-id patchy.slave1.cloud.gluster.org.d-backends-patchy196 -p /var/run/gluster/vols/patchy/slave1.cloud.gluster.org-d-backends-patchy196.pid -S /var/run/gluster/1f718d220e02fef7d8c75178aaba87db.socket --brick-name /d/backends/patchy196 -l /var/log/glusterfs/bricks/d-backends-patchy196.log --xlator-option *-posix.glusterd-uuid=aece481c-aa10-493f-94c9-1b5e7306e1b3 --process-name brick --brick-port 49347 --xlator-option patchy-server.listen-port=49347
30364 ?        Ssl    2:05 /build/install/sbin/glusterfsd -s slave1.cloud.gluster.org --volfile-id patchy.slave1.cloud.gluster.org.d-backends-patchy197 -p /var/run/gluster/vols/patchy/slave1.cloud.gluster.org-d-backends-patchy197.pid -S /var/run/gluster/9c477977bd939896de48eb8256bcea6d.socket --brick-name /d/backends/patchy197 -l /var/log/glusterfs/bricks/d-backends-patchy197.log --xlator-option *-posix.glusterd-uuid=aece481c-aa10-493f-94c9-1b5e7306e1b3 --process-name brick --brick-port 49348 --xlator-option patchy-server.listen-port=49348
30385 ?        Ssl    2:06 /build/install/sbin/glusterfsd -s slave1.cloud.gluster.org --volfile-id patchy.slave1.cloud.gluster.org.d-backends-patchy198 -p /var/run/gluster/vols/patchy/slave1.cloud.gluster.org-d-backends-patchy198.pid -S /var/run/gluster/2f1edccbaa1bd898a747124c9ae89916.socket --brick-name /d/backends/patchy198 -l /var/log/glusterfs/bricks/d-backends-patchy198.log --xlator-option *-posix.glusterd-uuid=aece481c-aa10-493f-94c9-1b5e7306e1b3 --process-name brick --brick-port 49349 --xlator-option patchy-server.listen-port=49349
30406 ?        Ssl    2:05 /build/install/sbin/glusterfsd -s slave1.cloud.gluster.org --volfile-id patchy.slave1.cloud.gluster.org.d-backends-patchy199 -p /var/run/gluster/vols/patchy/slave1.cloud.gluster.org-d-backends-patchy199.pid -S /var/run/gluster/ded1419c8068570d94d7a2473dbbef38.socket --brick-name /d/backends/patchy199 -l /var/log/glusterfs/bricks/d-backends-patchy199.log --xlator-option *-posix.glusterd-uuid=aece481c-aa10-493f-94c9-1b5e7306e1b3 --process-name brick --brick-port 49350 --xlator-option patchy-server.listen-port=49350
30427 ?        Ssl    2:06 /build/install/sbin/glusterfsd -s slave1.cloud.gluster.org --volfile-id patchy.slave1.cloud.gluster.org.d-backends-patchy200 -p /var/run/gluster/vols/patchy/slave1.cloud.gluster.org-d-backends-patchy200.pid -S /var/run/gluster/4cd50c30298df50845c505b386973409.socket --brick-name /d/backends/patchy200 -l /var/log/glusterfs/bricks/d-backends-patchy200.log --xlator-option *-posix.glusterd-uuid=aece481c-aa10-493f-94c9-1b5e7306e1b3 --process-name brick --brick-port 49351 --xlator-option patchy-server.listen-port=49351
30448 ?        Ssl    2:05 /build/install/sbin/glusterfsd -s slave1.cloud.gluster.org --volfile-id patchy.slave1.cloud.gluster.org.d-backends-patchy201 -p /var/run/gluster/vols/patchy/slave1.cloud.gluster.org-d-backends-patchy201.pid -S /var/run/gluster/ff3242ef1538df4f31d0ba570496a62f.socket --brick-name /d/backends/patchy201 -l /var/log/glusterfs/bricks/d-backends-patchy201.log --xlator-option *-posix.glusterd-uuid=aece481c-aa10-493f-94c9-1b5e7306e1b3 --process-name brick --brick-port 49352 --xlator-option patchy-server.listen-port=49352
30469 ?        Ssl    2:05 /build/install/sbin/glusterfsd -s slave1.cloud.gluster.org --volfile-id patchy.slave1.cloud.gluster.org.d-backends-patchy202 -p /var/run/gluster/vols/patchy/slave1.cloud.gluster.org-d-backends-patchy202.pid -S /var/run/gluster/b2bcbc8daa903c74174e1c7b126c6b9b.socket --brick-name /d/backends/patchy202 -l /var/log/glusterfs/bricks/d-backends-patchy202.log --xlator-option *-posix.glusterd-uuid=aece481c-aa10-493f-94c9-1b5e7306e1b3 --process-name brick --brick-port 49353 --xlator-option patchy-server.listen-port=49353
30490 ?        Ssl    2:05 /build/install/sbin/glusterfsd -s slave1.cloud.gluster.org --volfile-id patchy.slave1.cloud.gluster.org.d-backends-patchy203 -p /var/run/gluster/vols/patchy/slave1.cloud.gluster.org-d-backends-patchy203.pid -S /var/run/gluster/37c7b919cf70a79744a339b5576762ac.socket --brick-name /d/backends/patchy203 -l /var/log/glusterfs/bricks/d-backends-patchy203.log --xlator-option *-posix.glusterd-uuid=aece481c-aa10-493f-94c9-1b5e7306e1b3 --process-name brick --brick-port 49354 --xlator-option patchy-server.listen-port=49354
30511 ?        Ssl    2:06 /build/install/sbin/glusterfsd -s slave1.cloud.gluster.org --volfile-id patchy.slave1.cloud.gluster.org.d-backends-patchy204 -p /var/run/gluster/vols/patchy/slave1.cloud.gluster.org-d-backends-patchy204.pid -S /var/run/gluster/7e939e1e271855d5657aaaf10471117a.socket --brick-name /d/backends/patchy204 -l /var/log/glusterfs/bricks/d-backends-patchy204.log --xlator-option *-posix.glusterd-uuid=aece481c-aa10-493f-94c9-1b5e7306e1b3 --process-name brick --brick-port 49355 --xlator-option patchy-server.listen-port=49355
30532 ?        Ssl    2:05 /build/install/sbin/glusterfsd -s slave1.cloud.gluster.org --volfile-id patchy.slave1.cloud.gluster.org.d-backends-patchy205 -p /var/run/gluster/vols/patchy/slave1.cloud.gluster.org-d-backends-patchy205.pid -S /var/run/gluster/d83845a6167892ec46adc9c4849a754d.socket --brick-name /d/backends/patchy205 -l /var/log/glusterfs/bricks/d-backends-patchy205.log --xlator-option *-posix.glusterd-uuid=aece481c-aa10-493f-94c9-1b5e7306e1b3 --process-name brick --brick-port 49356 --xlator-option patchy-server.listen-port=49356
30553 ?        Ssl    2:05 /build/install/sbin/glusterfsd -s slave1.cloud.gluster.org --volfile-id patchy.slave1.cloud.gluster.org.d-backends-patchy206 -p /var/run/gluster/vols/patchy/slave1.cloud.gluster.org-d-backends-patchy206.pid -S /var/run/gluster/060cb1f6d4c375a6b6e42b48c8cb7eab.socket --brick-name /d/backends/patchy206 -l /var/log/glusterfs/bricks/d-backends-patchy206.log --xlator-option *-posix.glusterd-uuid=aece481c-aa10-493f-94c9-1b5e7306e1b3 --process-name brick --brick-port 49357 --xlator-option patchy-server.listen-port=49357


There was so little free memory that Java kept getting killed by the OOM killer. I ran pkill on the gluster processes and rebooted the machine to be completely sure things were back to normal.
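
For anyone who wants to check a build node for the same symptom before a job starts, here is a minimal sketch in Python. It is not what was run on slave1; it only assumes that leftover brick processes appear in ps output as glusterfsd with a --brick-name argument, as in the listing above. The function names find_brick_processes and list_local_bricks are hypothetical, and ps axww assumes a Linux procps ps.

```python
import re
import subprocess

# Captures the PID at the start of a `ps ax` line and the value of the
# --brick-name argument of a glusterfsd process on that line.
BRICK_RE = re.compile(r"^\s*(\d+)\s.*\bglusterfsd\b.*--brick-name\s+(\S+)")


def find_brick_processes(ps_output):
    """Return (pid, brick_path) pairs for glusterfsd brick processes."""
    found = []
    for line in ps_output.splitlines():
        match = BRICK_RE.match(line)
        if match:
            found.append((int(match.group(1)), match.group(2)))
    return found


def list_local_bricks():
    """Run ps on this machine and report the brick processes it shows.

    The `ww` modifier asks ps not to truncate long command lines.
    """
    result = subprocess.run(
        ["ps", "axww"], capture_output=True, text=True, check=True
    )
    return find_brick_processes(result.stdout)
```

A non-empty result from list_local_bricks() on a node that is supposed to be idle would suggest that an earlier regression run did not clean up after itself. The sketch only reports; whether to kill the processes or reboot the node is left to the operator.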