Bug 1381433

Summary: Need access to machines which run smoke
Product: [Community] GlusterFS
Reporter: Pranith Kumar K <pkarampu>
Component: project-infrastructure
Assignee: Nigel Babu <nigelb>
Status: CLOSED CURRENTRELEASE
QA Contact:
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: mainline
CC: bugs, gluster-infra, misc, pkarampu
Target Milestone: ---
Keywords: Triaged
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-10-19 04:15:53 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Collect statedumps before bricks/mounts are killed in smoke.sh (flags: none)

Description Pranith Kumar K 2016-10-04 05:49:00 UTC
Description of problem:
Hi Nigel,
     This is my public key:
11:17:50 :) ⚡ cat ~/.ssh/id_rsa.pub 
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDkduuGBq++zm/JKYVUcfM6YOqzYp2Dj0ag3OvlkFTXyNZ1QVOoEWuH9MAeF/MlHd14nLvFKSdpI+qr+faY+Wtyt/Za09YnizyMBuEo9hIw307EwynOdfAO8N/PKLAvtsNQ7Xk3UHUfHrvVuJr5qZFs1sWNau67/DBxd3bUO/FUl3FZoZqWg3/qsG8ZTCVEPc4N0qY9xiDFxgDh81lmK8t24S8d9RfMrKtpPbSe75HW1CxqM6AGLpQtDscIydGqmRYYcYSn9box4T3erbVxNpcpSlk6K1akMJhbuNoEbDfD7n4t8X/BLj/h3gJIUTlrXnpPj+hluiHDmeBlhu7a7ctd pk.eng.blr.redhat.com



Comment 1 Michael S. 2016-10-04 07:37:51 UTC
I would like to get a bit more justification, if only for tracking.

Comment 2 Pranith Kumar K 2016-10-04 08:45:13 UTC
Oops, sorry. The context is that we are trying to find the root cause (RC) of why smoke is hanging. To find that out, we need a machine on which we can run smoke in a loop and see where it hangs. To reserve one of the machines for this purpose, Nigel asked me to file a bug with my public key.
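
For tracking purposes, here is a rough sketch of what "run smoke in a loop" could look like. It is only an illustration: the helper name run_until_hang, the run count, and the time limit are hypothetical, and it relies on the coreutils timeout command, which exits with status 124 when the time limit is reached.

```shell
#!/bin/sh
# Hypothetical helper: run a command repeatedly until it fails or appears to hang.
# Usage: run_until_hang MAX_RUNS TIMEOUT_SECS CMD [ARGS...]
# The body runs in a subshell so its variables do not leak into the caller.
run_until_hang() (
    max_runs=$1
    limit=$2
    shift 2
    i=1
    while [ "$i" -le "$max_runs" ]; do
        rc=0
        # timeout(1) stops the command and exits with 124 once the limit is reached.
        timeout "$limit" "$@" || rc=$?
        if [ "$rc" -eq 124 ]; then
            echo "run $i: timed out after ${limit}s (possible hang)"
            return 124
        elif [ "$rc" -ne 0 ]; then
            echo "run $i: failed with exit code $rc"
            return "$rc"
        fi
        i=$((i + 1))
    done
    echo "all $max_runs runs completed"
    return 0
)
```

A call such as `run_until_hang 50 1800 ./smoke.sh` would stop at the first run that exceeds 30 minutes, leaving the machine in the hung state for inspection. The path to smoke.sh and both numbers are assumptions, not values from this bug.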

Comment 3 Nigel Babu 2016-10-12 09:02:59 UTC
You should be all set on slave33.cloud.gluster.org. Please ssh in as jenkins.gluster.org.

Remember to delete any temporary files you create and undo any script changes you make.

Comment 4 Nigel Babu 2016-10-17 17:02:22 UTC
Apologies, it's slave34, not slave33.

Comment 5 Nigel Babu 2016-10-18 12:34:05 UTC
I'm guessing you need more time, since the usual one-week reservation ends tomorrow. I'll check back in next Tuesday or Wednesday.

Comment 6 Pranith Kumar K 2016-10-18 14:54:40 UTC
Nigel,
     There is some peculiar behavior: as Milind mentioned yesterday, the hang happened only the first time and never after that. I wonder what the trigger is. I think the best way forward for this hang is to improve our smoke.sh to take statedumps before killing the hung mount, which would give us more clues about why it hung. Let me ping you on IRC about the necessary things we can do. You can take the machine back.

Pranith
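
To make the idea concrete, a minimal sketch of "statedump first, then kill" might look like the following. This is not the attached smoke.sh; the helper name statedump_then_kill and the wait time are hypothetical. It relies on the documented GlusterFS behavior that sending SIGUSR1 to a gluster process asks it to write a statedump (by default under /var/run/gluster).

```shell
#!/bin/sh
# Hypothetical helper, not the attached smoke.sh.
# Usage: statedump_then_kill WAIT_SECS PID...
# The body runs in a subshell so its variables do not leak into the caller.
statedump_then_kill() (
    wait_secs=$1
    shift
    for pid in "$@"; do
        # SIGUSR1 asks a gluster process to write a statedump.
        kill -USR1 "$pid" 2>/dev/null || echo "could not signal $pid" >&2
    done
    # Give the processes time to finish writing their dumps.
    sleep "$wait_secs"
    for pid in "$@"; do
        # Only now kill the (possibly hung) processes.
        kill -KILL "$pid" 2>/dev/null
    done
    return 0
)
```

In a cleanup step this could be invoked as `statedump_then_kill 5 $(pgrep -x glusterfs) $(pgrep -x glusterfsd)`, assuming the mount and brick processes carry those names on the build machine. The dump files would then need to be archived before the workspace is wiped.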

Comment 7 Pranith Kumar K 2016-10-18 17:36:27 UTC
Created attachment 1211787
Collect statedumps before bricks/mounts are killed in smoke.sh

Attaching the modified smoke.sh. Can we deploy this on the build machines that run smoke? We should probably get it reviewed before deploying it on all the machines, but I am not sure what the usual practice is.

Comment 8 Nigel Babu 2016-10-19 03:36:31 UTC
Hrm, have you thought of changing the gluster commit, rebuilding, and trying smoke again? That seems to be the difference between the "first time" and later attempts.

Comment 9 Pranith Kumar K 2016-10-19 03:41:22 UTC
I did that once, but it didn't help. Since we need smoke.sh to get better anyway, improving it seemed like the better approach for now. I will submit a pull request sometime today. Let's see.

Comment 10 Nigel Babu 2016-10-19 04:15:53 UTC
I'll close this bug and put the machine back in the pool, then.