Description of problem:

https://build.gluster.org/job/strfmt_errors/13347/console :

16:43:52 Building remotely on builder17.int.rht.gluster.org (smoke7 rpm7 regression7) in workspace /home/jenkins/root/workspace/strfmt_errors
...
16:53:02 INFO: Cleaning up build root ('cleanup_on_success=True')
16:53:02 Start: clean chroot
16:53:08 Finish: clean chroot
16:53:08 Finish: run
16:53:08 Archiving artifacts
16:53:08 ERROR: Step ‘Archive the artifacts’ aborted due to exception:
16:53:08 java.nio.file.FileSystemException: /var/lib/jenkins/jobs/strfmt_errors/builds/13347/archive: No space left on device
16:53:08 	at sun.nio.fs.UnixException.translateToIOException(UnixException.java:91)
16:53:08 	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
16:53:08 	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
16:53:08 	at sun.nio.fs.UnixFileSystemProvider.createDirectory(UnixFileSystemProvider.java:384)
16:53:08 	at java.nio.file.Files.createDirectory(Files.java:674)
16:53:08 	at java.nio.file.Files.createAndCheckIsDirectory(Files.java:781)
16:53:08 	at java.nio.file.Files.createDirectories(Files.java:767)
16:53:08 	at hudson.FilePath.mkdirs(FilePath.java:3102)
16:53:08 	at hudson.FilePath.readFromTar(FilePath.java:2458)
16:53:08 Caused: java.io.IOException: Failed to extract /home/jenkins/root/workspace/strfmt_errors/transfer of 5 files
16:53:08 	at hudson.FilePath.readFromTar(FilePath.java:2474)
16:53:08 	at hudson.FilePath.copyRecursiveTo(FilePath.java:2360)
16:53:08 	at jenkins.model.StandardArtifactManager.archive(StandardArtifactManager.java:61)
16:53:08 	at hudson.tasks.ArtifactArchiver.perform(ArtifactArchiver.java:235)
16:53:08 	at hudson.tasks.BuildStepCompatibilityLayer.perform(BuildStepCompatibilityLayer.java:81)
16:53:08 	at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
16:53:08 	at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:744)
16:53:08 	at hudson.model.AbstractBuild$AbstractBuildExecution.performAllBuildSteps(AbstractBuild.java:690)
16:53:08 	at hudson.model.Build$BuildExecution.post2(Build.java:186)
16:53:08 	at hudson.model.AbstractBuild$AbstractBuildExecution.post(AbstractBuild.java:635)
16:53:08 	at hudson.model.Run.execute(Run.java:1823)
16:53:08 	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
16:53:08 	at hudson.model.ResourceController.execute(ResourceController.java:97)
16:53:08 	at hudson.model.Executor.run(Executor.java:429)
16:53:08 Finished: FAILURE
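A quick way to see which job directories are consuming the disk on the Jenkins master (a sketch only; the /var/lib/jenkins/jobs path is taken from the traceback above, and the commands assume shell access with read permission on those directories):

df -h /var/lib/jenkins
# per-job build/archive usage, largest last
du -sh /var/lib/jenkins/jobs/*/builds 2>/dev/null | sort -h | tail -20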
Huh, I was under the impression we added more space to Jenkins when we did the restart. Seemingly not. Going to correct this right away.
Root cause of this is that we've run more jobs in the last 2 weeks than we normally do. This has blown away the space estimates that we had. For now I've deleted archives older than 2 weeks. We need to do some clean up on that server so we have more free space. Once that's done we'll attach more space. Leaving the bug open to track the increased space.
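For reference, a cleanup along these lines would do the job (a sketch, not the exact command that was run; the jobs path comes from the traceback in the description, and the 14-day window is the two weeks mentioned above):

# remove archived artifacts older than two weeks
find /var/lib/jenkins/jobs/*/builds/*/archive -type f -mtime +14 -delete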
We did add 40G: the default was 150G and it's now 190G.
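For the record, an online grow of that volume would look roughly like this (a sketch only; the volume group and logical volume names, and the use of LVM at all, are assumptions about the Jenkins host rather than details from this bug):

# extend the LV by 40G and grow the filesystem in one step
lvextend --resizefs -L +40G /dev/vg_jenkins/lv_jenkins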
(In reply to Nigel Babu from comment #2)
> Root cause of this is that we've run more jobs in the last 2 weeks than we
> normally do. This has blown away the space estimates that we had. For now
> I've deleted archives older than 2 weeks.
>
> We need to do some clean up on that server so we have more free space. Once
> that's done we'll attach more space. Leaving the bug open to track the
> increased space.

I wonder if VDO can help...

I also think we need to look at shallow cloning:

[ykaul@ykaul tmp]$ time git clone ssh://mykaul.org/glusterfs
Cloning into 'glusterfs'...
remote: Counting objects: 2933, done
remote: Finding sources: 100% (71/71)
remote: Total 165164 (delta 0), reused 165119 (delta 0)
Receiving objects: 100% (165164/165164), 89.17 MiB | 2.43 MiB/s, done.
Resolving deltas: 100% (102537/102537), done.

real	0m52.042s
user	0m25.482s
sys	0m1.876s

[ykaul@ykaul tmp]$ du -ch glusterfs |grep total
124M	total

[ykaul@ykaul tmp]$ ls -lR glusterfs | wc -l
3764

[ykaul@ykaul tmp]$ time git clone --depth 1 ssh://mykaul.org/glusterfs
Cloning into 'glusterfs'...
remote: Counting objects: 2486, done
remote: Finding sources: 100% (2486/2486)
remote: Total 2486 (delta 86), reused 1325 (delta 86)
Receiving objects: 100% (2486/2486), 4.50 MiB | 1.56 MiB/s, done.
Resolving deltas: 100% (86/86), done.

real	0m10.380s
user	0m0.603s
sys	0m0.352s

[ykaul@ykaul tmp]$ du -ch glusterfs |grep total
35M	total

[ykaul@ykaul tmp]$ ls -lR glusterfs | wc -l
3764
VDO is still in beta, and it would require a new partition, and so space to copy to. I doubt we can do miracles here. It would also be trading space for CPU, and I'm not sure that's worthwhile.

The git clones are done on the builders, so I doubt that's gonna decrease the space on the Jenkins server itself.
(In reply to M. Scherer from comment #5)
> VDO is still in beta, and it would require a new partition, and so space to
> copy to. I doubt we can do miracles here. It would also be trading space for
> CPU, and I'm not sure that's worthwhile.

VDO is NOT in beta. It was released as GA in RHEL 7.5 (I'm talking about VDO on our Jenkins hosts).

> The git clones are done on the builders, so I doubt that's gonna decrease
> the space on the Jenkins server itself.

It's going to decrease time and space usage on the builders. That's important too. (In fact, I believe we should check out to /dev/shm, make sure GCC's temp dir is on /dev/shm, and compile there.)
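A minimal sketch of the /dev/shm idea (assumptions: the builders have enough RAM to hold the checkout and build in tmpfs, the clone URL is illustrative only, and glusterfs builds with its usual autogen/configure/make sequence):

export TMPDIR=/dev/shm/build-tmp   # gcc and most tools honour TMPDIR for their temporary files
mkdir -p "$TMPDIR"
git clone --depth 1 ssh://git.example.org/glusterfs /dev/shm/glusterfs
cd /dev/shm/glusterfs
./autogen.sh && ./configure && make -j"$(nproc)"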
Oh, indeed, I misread the title of a blog post, which was "VDO in RHEL 7.5 beta", as meaning beta for VDO, when the beta was in fact for RHEL 7.5. But we would still need to make some heavy partition changes, and I'd rather avoid doing that right now. Longer term, yeah, compression could help a lot; I'm not sure about dedup.

As for the builders, that is irrelevant to this bug. If you wish, open a new one, but I am not sure we can do it, since the git clone is done by a Jenkins plugin, and maybe it doesn't support that. I also think it would be only a minimal improvement to regression run time, though reducing the time for a git clone might help by having fewer processes on the Gerrit side. But again, let's not mix bugs, or it is gonna become a mess.
All of the original problems with space are fixed, so I'm closing this bug now.