Bug 1375521 - slave33.cloud.gluster.org is out of space
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: project-infrastructure
Version: mainline
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Nigel Babu
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2016-09-13 10:03 UTC by Niels de Vos
Modified: 2016-09-15 03:57 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2016-09-13 10:41:26 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:


Description Niels de Vos 2016-09-13 10:03:13 UTC
Description of problem:
slave33.cloud.gluster.org has been marked offline in Jenkins.

https://build.gluster.org/job/centos6-regression/737/console failed with a weird Jenkins error:

ERROR: Error fetching remote repo 'origin'
hudson.plugins.git.GitException: Failed to fetch from git://review.gluster.org/glusterfs.git
	at hudson.plugins.git.GitSCM.fetchFrom(GitSCM.java:810)
	at hudson.plugins.git.GitSCM.retrieveChanges(GitSCM.java:1066)
	at hudson.plugins.git.GitSCM.checkout(GitSCM.java:1097)
	at hudson.scm.SCM.checkout(SCM.java:485)
	at hudson.model.AbstractProject.checkout(AbstractProject.java:1269)
	at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:607)
	at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:86)
	at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:529)
	at hudson.model.Run.execute(Run.java:1738)
	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
	at hudson.model.ResourceController.execute(ResourceController.java:98)
	at hudson.model.Executor.run(Executor.java:410)
Caused by: hudson.plugins.git.GitException: Command "git config remote.origin.url git://review.gluster.org/glusterfs.git" returned status code 4:
stdout: 
stderr: error: failed to write new configuration file .git/config.lock


Another job on the same system failed with ENOSPC:
https://build.gluster.org/job/devrpm-el7/1408/console

ERROR: Error cloning remote repo 'origin'
hudson.plugins.git.GitException: Could not init /home/jenkins/root/workspace/devrpm-el7
	at org.jenkinsci.plugins.gitclient.CliGitAPIImpl$5.execute(CliGitAPIImpl.java:656)
	at org.jenkinsci.plugins.gitclient.CliGitAPIImpl$2.execute(CliGitAPIImpl.java:463)
	at org.jenkinsci.plugins.gitclient.RemoteGitImpl$CommandInvocationHandler$1.call(RemoteGitImpl.java:152)
	at org.jenkinsci.plugins.gitclient.RemoteGitImpl$CommandInvocationHandler$1.call(RemoteGitImpl.java:145)
	at hudson.remoting.UserRequest.perform(UserRequest.java:120)
	at hudson.remoting.UserRequest.perform(UserRequest.java:48)
	at hudson.remoting.Request$2.run(Request.java:332)
	at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:68)
	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)
	at ......remote call to slave33.cloud.gluster.org(Native Method)
	at hudson.remoting.Channel.attachCallSiteStackTrace(Channel.java:1416)
	at hudson.remoting.UserResponse.retrieve(UserRequest.java:220)
	at hudson.remoting.Channel.call(Channel.java:781)
	at org.jenkinsci.plugins.gitclient.RemoteGitImpl$CommandInvocationHandler.execute(RemoteGitImpl.java:145)
	at sun.reflect.GeneratedMethodAccessor129.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.jenkinsci.plugins.gitclient.RemoteGitImpl$CommandInvocationHandler.invoke(RemoteGitImpl.java:131)
	at com.sun.proxy.$Proxy48.execute(Unknown Source)
	at hudson.plugins.git.GitSCM.retrieveChanges(GitSCM.java:1057)
	at hudson.plugins.git.GitSCM.checkout(GitSCM.java:1097)
	at hudson.scm.SCM.checkout(SCM.java:485)
	at hudson.model.AbstractProject.checkout(AbstractProject.java:1269)
	at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:607)
	at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:86)
	at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:529)
	at hudson.model.Run.execute(Run.java:1738)
	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
	at hudson.model.ResourceController.execute(ResourceController.java:98)
	at hudson.model.Executor.run(Executor.java:410)
Caused by: hudson.plugins.git.GitException: Command "git init /home/jenkins/root/workspace/devrpm-el7" returned status code 1:
stdout: 
stderr: /home/jenkins/root/workspace/devrpm-el7/.git: No space left on device
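Both failures are consistent with a full filesystem: git could not write .git/config.lock in one job, and git init hit "No space left on device" in the other. A minimal sketch of a pre-build free-space check follows, assuming Python is available on the slave. The function name and the 5% threshold are hypothetical and not part of the Jenkins setup described here.

```python
import shutil


def free_space_report(path="/", min_free_fraction=0.05):
    """Return (free_bytes, free_fraction, is_low) for the filesystem holding `path`.

    The 5% default threshold is an illustrative assumption, not a project setting.
    """
    usage = shutil.disk_usage(path)
    # Guard against pseudo-filesystems that report a total size of zero.
    free_fraction = usage.free / usage.total if usage.total else 0.0
    return usage.free, free_fraction, free_fraction < min_free_fraction


if __name__ == "__main__":
    free, fraction, low = free_space_report("/")
    print(f"free: {free} bytes ({fraction:.1%}), low: {low}")
```

Note that ENOSPC can also be caused by inode exhaustion, which this byte-based check would not catch.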

Comment 1 Nigel Babu 2016-09-13 10:08:26 UTC
Seeing lots of these:

Sep 11 04:20:42 slave33 sm-notify[16681]: Already notifying clients; Exiting!
Sep 11 04:20:42 slave33 sm-notify[16684]: Version 1.2.3 starting
Sep 11 04:20:42 slave33 sm-notify[16684]: Already notifying clients; Exiting!
Sep 11 04:20:42 slave33 sm-notify[16689]: Version 1.2.3 starting
Sep 11 04:20:42 slave33 sm-notify[16689]: Already notifying clients; Exiting!
Sep 11 04:20:42 slave33 sm-notify[16692]: Version 1.2.3 starting
Sep 11 04:20:42 slave33 sm-notify[16692]: Already notifying clients; Exiting!
Sep 11 04:20:42 slave33 sm-notify[16695]: Version 1.2.3 starting
Sep 11 04:20:42 slave33 sm-notify[16695]: Already notifying clients; Exiting!
Sep 11 04:20:42 slave33 sm-notify[16698]: Version 1.2.3 starting
Sep 11 04:20:42 slave33 sm-notify[16698]: Already notifying clients; Exiting!
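To see which program is flooding the log, the lines can be tallied per program name. This is a rough sketch that assumes the traditional syslog line layout shown above ("Mon DD HH:MM:SS host program[pid]: message"). The helper name is hypothetical, and the sample lines are copied from this comment.

```python
import re
from collections import Counter

# Traditional syslog prefix: "Mon DD HH:MM:SS host program[pid]:"
SYSLOG_LINE = re.compile(
    r"^\w{3}\s+\d+\s+[\d:]{8}\s+\S+\s+([^\s\[:]+)(?:\[\d+\])?:"
)


def count_by_program(lines):
    """Count log lines per program name; lines that do not match are ignored."""
    counts = Counter()
    for line in lines:
        match = SYSLOG_LINE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "Sep 11 04:20:42 slave33 sm-notify[16681]: Already notifying clients; Exiting!",
        "Sep 11 04:20:42 slave33 sm-notify[16684]: Version 1.2.3 starting",
    ]
    for program, count in count_by_program(sample).most_common():
        print(program, count)
```

On the slave itself, the same function could be fed the open log file line by line instead of the sample list.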

Comment 2 Nigel Babu 2016-09-13 10:09:12 UTC
Also relevant:
[root@slave33 log]# ps ax | grep rpc
 2154 ?        Ss     0:00 /sbin/rpc.statd
 6606 ?        S      0:25 [rpciod/0]
 6607 ?        S      0:27 [rpciod/1]
 6786 ?        Ss   500:19 /sbin/rpc.statd
 8108 ?        Ss     0:00 /sbin/rpc.statd
16957 ?        Ss     3:02 rpcbind -w
18389 pts/0    S+     0:00 grep rpc
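The listing shows more than one /sbin/rpc.statd instance. A small sketch for counting duplicate commands in `ps ax` output follows, assuming the five-column layout above (PID, TTY, STAT, TIME, COMMAND). The function name is hypothetical.

```python
from collections import Counter


def count_commands(ps_output):
    """Count processes per executable name in `ps ax`-style output."""
    counts = Counter()
    for line in ps_output.splitlines():
        # Split into PID, TTY, STAT, TIME and the full command line.
        fields = line.split(None, 4)
        if len(fields) < 5 or not fields[0].isdigit():
            continue  # skip the header row and malformed lines
        executable = fields[4].split()[0]
        counts[executable] += 1
    return counts


if __name__ == "__main__":
    listing = """\
 2154 ?        Ss     0:00 /sbin/rpc.statd
 6786 ?        Ss   500:19 /sbin/rpc.statd
 8108 ?        Ss     0:00 /sbin/rpc.statd
16957 ?        Ss     3:02 rpcbind -w
"""
    for executable, count in count_commands(listing).items():
        if count > 1:
            print(f"{executable}: {count} instances")
```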

Comment 3 Nigel Babu 2016-09-13 10:32:10 UTC
Filed bug 1375526 for the test harness issue. I'll clean up the /var/log/messages file and bring the node back online.

Comment 4 Nigel Babu 2016-09-13 10:41:26 UTC
Back online.

Comment 5 Nigel Babu 2016-09-15 03:57:43 UTC
This needed a restart so that the process writing to /var/log/messages would release the freed space.

Now done.
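For background, on POSIX systems a removed file keeps its disk blocks allocated until the last process holding it open closes it, which is presumably why cleaning up the log did not free the space until the restart. The sketch below demonstrates the effect on a temporary file. It does not touch /var/log/messages, and the function name is hypothetical.

```python
import os
import tempfile


def unlinked_file_still_allocated(size=1024 * 1024):
    """Write `size` bytes, unlink the file while it is open, then inspect it.

    Returns (link_count, size_in_bytes, path_exists) as seen through the
    still-open file descriptor.
    """
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as handle:
        handle.write(b"\0" * size)
        handle.flush()
        os.unlink(path)  # the name is gone, but the open descriptor remains
        info = os.fstat(handle.fileno())
        return info.st_nlink, info.st_size, os.path.exists(path)
    # Leaving the with-block closes the file, and only then is the space freed.


if __name__ == "__main__":
    links, size, exists = unlinked_file_still_allocated()
    print(f"links={links} size={size} path_exists={exists}")
```

On Linux, `lsof +L1` lists open files with a link count below one, which is one way to find the process still holding such a file.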
