Bug 1498390
| Summary: | Centos regressions fail on slave20 | | |
|---|---|---|---|
| Product: | [Community] GlusterFS | Reporter: | Nithya Balachandran <nbalacha> |
| Component: | project-infrastructure | Assignee: | bugs <bugs> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | mainline | CC: | bugs, gluster-infra, mscherer, nigelb |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2017-12-06 08:46:24 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: | | | |
Description
Nithya Balachandran
2017-10-04 08:12:13 UTC
Created attachment 1334089
Centos regression failure.
slave27 is also affected. I am rebooting it, after taking slave20 out of rotation and inspecting the others. This might be related to https://review.gluster.org/#/c/17789/, since that patch ran on slave20 and slave27, which are both broken, and on slave25, which I am investigating right now.

So I re-enabled slave27 and rebooted slave25. I will dig a bit more on slave20 to find out what the error is, but I will likely reboot it once I run out of ideas for debugging what is going on with it.

The process on slave20 seems to have been started by https://review.gluster.org/18271

# (for i in $(ps fax |grep gluster |awk '{ print $1}' ); do cat /proc/$i/environ |sed 's/=/\n/g' |grep -a -A 1 GERRIT_CHANGE_URL |sed 's/SUDO_COMMAND//' ; done;) |grep -a http | sort -u
https://review.gluster.org/18271

I am going to reboot and keep an eye on this one. If any other builder fails, please ping me on IRC.

So the slave27 issue was different. For some reason I had to wipe /home/jenkins/root to make it work again. I looked at the rpm content and found no issue; there was no issue with SELinux either, and the disk was not full. I am a bit puzzled.

So today it is slave22, with a ton of broken processes after running regressions for https://review.gluster.org/#/c/18271/

For the record, I did reboot slave22, but forgot to write it down, sorry about that.

slave22 wasn't fully recovered; I suspect we have a second issue on our hands. I terminated all Java processes on that server and restarted the agent. The logs say nothing useful, from what I can see. This issue needs a full node restart and a disconnect/reconnect.

We don't see this problem anymore.
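For anyone who needs to repeat the check done on slave20, here is a more readable sketch of the same idea: read the NUL-separated environment of each leftover gluster process and print the Gerrit change URL it was started with. It assumes, as the one-liner above does, that the Jenkins job exported GERRIT_CHANGE_URL into the environment of the processes it spawned. The function names are made up for this example and are not part of any existing builder tooling.

```shell
#!/bin/sh
# Hypothetical helpers, not part of the Gluster infra scripts.

# Print the value of GERRIT_CHANGE_URL from one environ-style file
# (NUL-separated KEY=VALUE entries, as in /proc/<pid>/environ).
# Prints nothing if the file is unreadable or the variable is not set.
change_url_from_environ() {
    [ -r "$1" ] || return 0
    tr '\000' '\n' < "$1" | sed -n 's/^GERRIT_CHANGE_URL=//p'
}

# Print the unique change URLs of all running processes whose
# command line mentions "gluster".
gluster_change_urls() {
    for pid in $(pgrep -f gluster); do
        change_url_from_environ "/proc/$pid/environ"
    done | sort -u
}
```

Running `gluster_change_urls` as root on an affected builder should list one URL per change that left processes behind. Root is needed because /proc/<pid>/environ is only readable by the process owner or root.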