Bug 1393748

Summary: BEAH was not installed - job got stuck
Product: [Retired] Beaker
Reporter: Ivo Hradek <ihradek>
Component: beah
Assignee: beaker-dev-list
Status: CLOSED NOTABUG
QA Contact: tools-bugs <tools-bugs>
Severity: low
Docs Contact:
Priority: unspecified
Version: 23
CC: dcallagh, jstourac, mjia, rjoost
Target Milestone: ---
Target Release: ---
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-10-24 01:04:43 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:

Description Ivo Hradek 2016-11-10 09:10:15 UTC
Hello,

During testing I encountered an issue on RHEL 7.3 where beah was not installed successfully and the Beaker job got stuck. I re-ran the job and everything was fine.

Comparing the unsuccessful [1] and successful [2] installations, I see that different Repo-baseurl values ([3] and [4]) were used for the beaker-harness repo.

Is this expected? Moreover, Beaker reported the installation as completed despite the errors.

Note: The "connection refused" seen in [1] might be an issue with the *.mw.lab.eng.bos.redhat.com machines.

Thanks,

Ivo H.
--
[1] https://beaker.engineering.redhat.com/recipes/3241171#installation
[2] https://beaker.engineering.redhat.com/recipes/3241543#installation
[3] http://download-node-02.eng.bos.redhat.com/*
[4] http://beaker.engineering.redhat.com/*

Comment 1 Ivo Hradek 2016-11-10 14:21:21 UTC
EDIT: Repo-baseurl [4] was http://download.eng.bos.redhat.com/*

More info: Notice that Beaker silently continued with the installation and got stuck at "dev131 login:", probably when calling _beah-fwd-backend_, which was not even installed. From my point of view, I would expect the installation to fail at some point.

Comment 2 Dan Callaghan 2017-10-24 01:04:43 UTC
Sorry Ivo, the logs from those recipes are gone now so it is hard to tell exactly what happened.

I am assuming it was a network problem (or outage on download.eng.bos.redhat.com or similar) which caused the beah installation to fail.

Unfortunately, since that step just executes in the kickstart %post, there is no good way for Beaker to know whether it failed, apart from the fact that the machine will never actually start running anything. Eventually the recipe's watchdog timer expires and the recipe is aborted.
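To illustrate why the failure is only visible indirectly, here is a minimal Python sketch of the decision described above. It is a simplification, not Beaker's actual code: the function name, its parameters, and the state names are all hypothetical. The only inputs are things the server can observe, namely whether the harness has ever checked in and how much time has passed.

```python
from datetime import datetime, timedelta
from typing import Optional


def recipe_state(started: datetime,
                 last_harness_checkin: Optional[datetime],
                 watchdog: timedelta,
                 now: datetime) -> str:
    """Classify a recipe using only what the server can observe.

    Hypothetical helper for illustration; not part of Beaker.
    """
    if last_harness_checkin is not None:
        # The harness reported in, so %post must have installed it.
        return "running"
    if now - started >= watchdog:
        # Nothing ever checked in and the watchdog timer ran out.
        return "aborted"
    # A failed %post looks exactly like a slow but healthy install here.
    return "waiting"


# Example values are invented for illustration.
start = datetime(2016, 11, 10, 9, 0)
print(recipe_state(start, None, timedelta(hours=2), start + timedelta(hours=1)))  # waiting
print(recipe_state(start, None, timedelta(hours=2), start + timedelta(hours=3)))  # aborted
```

Until the timer expires, a machine whose harness installation failed is indistinguishable from one that is still installing, which is why the job appears stuck rather than failed.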

One other thing worth noting: normally Beaker would install the harness packages from the local cache on the Beaker server itself, at http://beaker.engineering.redhat.com/harness/. This is specifically to avoid adding an extra possible failure point (the download server). So I'm not sure why your recipe would have used download.eng.bos.redhat.com instead. I'm guessing it was a system-specific configuration on those *.mw.lab.eng.bos machines at the time.
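As a rough aid for spotting this kind of misconfiguration, the following Python sketch checks whether a Repo-baseurl points at the Beaker server's harness cache. The function and constant names are hypothetical, and the check assumes the cache always lives under /harness/ on the Beaker server, as in the URL above.

```python
from urllib.parse import urlsplit

# Host taken from the harness URL mentioned above.
EXPECTED_HARNESS_HOST = "beaker.engineering.redhat.com"


def uses_local_harness_cache(baseurl: str,
                             expected_host: str = EXPECTED_HARNESS_HOST) -> bool:
    """Return True if baseurl points at the Beaker server's /harness/ tree.

    Hypothetical helper for illustration; not part of Beaker.
    """
    parts = urlsplit(baseurl)
    return parts.hostname == expected_host and parts.path.startswith("/harness/")


print(uses_local_harness_cache("http://beaker.engineering.redhat.com/harness/"))  # True
print(uses_local_harness_cache("http://download.eng.bos.redhat.com/"))            # False
```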

So I'm going to close this bug now, as I don't think there is any defect in Beaker itself.

Comment 3 Dan Callaghan 2017-10-24 01:07:20 UTC
FYI, relevant to this: there are some existing, fairly pie-in-the-sky RFEs which would be useful in this scenario but are quite hard to implement:

Bug 893075: integrate with Anaconda (as a plugin) to better detect failures
Bug ???: make the console install failure detector catch yum errors

(I can't find the bug number for the latter one but it has been discussed before.)
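For a sense of what the second idea might involve, here is a minimal Python sketch that scans console output for yum error messages. It is not Beaker's console failure detector: the function name and the pattern list are assumptions chosen for illustration, and the sample console lines are invented.

```python
import re
from typing import Iterable, List

# Illustrative patterns only; a real detector would need a vetted list.
YUM_ERROR_PATTERNS = [
    re.compile(r"No package \S+ available"),
    re.compile(r"Error: Nothing to do"),
    re.compile(r"Cannot retrieve repository metadata"),
    re.compile(r"Connection refused"),
]


def find_yum_errors(console_lines: Iterable[str]) -> List[str]:
    """Return the console lines that match any known yum error pattern."""
    hits = []
    for line in console_lines:
        if any(p.search(line) for p in YUM_ERROR_PATTERNS):
            hits.append(line.rstrip("\n"))
    return hits


# Invented sample console output.
sample = [
    "Installing beah\n",
    "No package beah available.\n",
    "Error: Nothing to do\n",
    "dev131 login:\n",
]
for hit in find_yum_errors(sample):
    print(hit)
```

A detector along these lines could let the recipe be marked as failed as soon as the error appears on the console, instead of waiting for the watchdog to expire.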