Description of problem: beah is sporadically giving up immediately after installation: [ 21.706545] beah-fwd-backend[4687]: 2016-08-10 05:14:54,476 backend linfo: INFO BackendFactory: Started to connect. [ 21.715520] beah-fwd-backend[4687]: 2016-08-10 05:14:54,480 backend linfo: INFO BackendFactory: Connection failed. Reason: [Failure instance: Traceback (failure with no frames): <class 'twisted.internet.error.ConnectError'>: An error occurred while connecting: 2: No such file or directory. [ 21.718006] beah-fwd-backend[4687]: ] [ 21.787252] beah-srv[4686]: 2016-08-10 05:14:54,559 beah start_server: INFO ################################ [ 21.791648] beah-srv[4686]: 2016-08-10 05:14:54,559 beah start_server: INFO # Starting a Controller... # [ 21.792904] beah-srv[4686]: 2016-08-10 05:14:54,560 beah start_server: INFO ################################ [ 21.794503] beah-srv[4686]: 2016-08-10 05:14:54,566 beah start_server: INFO Controller: BackendListener listening on :: port 12432 [ 21.795918] beah-srv[4686]: 2016-08-10 05:14:54,567 beah start_server: INFO Controller: BackendListener listening on /var/beah/backend12432.socket [ 21.800033] beah-srv[4686]: 2016-08-10 05:14:54,571 beah start_server: INFO Controller: TaskListener listening on :: port 12434 [ 21.801298] beah-srv[4686]: 2016-08-10 05:14:54,572 beah start_server: INFO Controller: TaskListener listening on /var/beah/task12434.socket [ 21.962790] beah-beaker-backend[4681]: 2016-08-10 05:14:54,734 backend has_ipv6: INFO Resolved IPv6 address of the LC [ 21.965173] beah-beaker-backend[4681]: 2016-08-10 05:14:54,737 backend make_proxy: INFO Communicating to LC over IPv6 (2620:52:0:1040:5054:ff:fe38:5099) [ 21.973151] beah-beaker-backend[4681]: 2016-08-10 05:14:54,744 backend linfo: INFO BackendFactory: Started to connect. [ 21.976648] beah-beaker-backend[4681]: 2016-08-10 05:14:54,748 backend linfo: INFO BackendFactory: Connected. Address: UNIXAddress('/var/beah/backend12432.socket') [ 21.978915] beah-beaker-backend[4681]: 2016-08-10 05:14:54,750 backend linfo: INFO BackendFactory: Resetting reconnection delay [ 21.980551] beah-srv[4686]: 2016-08-10 05:14:54,748 beah buildProtocol: INFO BackendListener: Connected. Address: UNIXAddress('') [ 21.982402] beah-beaker-backend[4681]: 2016-08-10 05:14:54,751 backend set_controller: INFO Connection to controller lost. [ 21.987383] beah-beaker-backend[4681]: 2016-08-10 05:14:54,759 backend set_controller: INFO Connected to controller. [ 22.287510] beah-beaker-backend[4681]: 2016-08-10 05:14:55,058 backend handle_recipe_exception: INFO The recipe has finished. [ 24.663004] beah-fwd-backend[4687]: 2016-08-10 05:14:57,434 backend linfo: INFO BackendFactory: Started to connect. [ 24.664735] beah-fwd-backend[4687]: 2016-08-10 05:14:57,436 backend linfo: INFO BackendFactory: Connected. Address: UNIXAddress('/var/beah/backend12432.socket') [ 24.666359] beah-srv[4686]: 2016-08-10 05:14:57,436 beah buildProtocol: INFO BackendListener: Connected. Address: UNIXAddress('') [ 24.668014] beah-fwd-backend[4687]: 2016-08-10 05:14:57,436 backend linfo: INFO BackendFactory: Resetting reconnection delay This is presumably because of unexpected recipe state, as I recall seeing occasionally also "Installing" state. While beah has this code: rs = xml_attr(recipe_parser.recipe_node, 'status') if rs not in ['Running', 'Waiting']: raise NothingToDoException("The recipe has finished.") Version-Release number of selected component (if applicable): 0.7.9-2.el7 How reproducible: sporadically Steps to Reproduce: Actual results: recipe runs no tasks Expected results: recipe runs all tasks Additional info:
And it looks patch for this has been posted already: https://gerrit.beaker-project.org/#/c/4818/ Bug 991245 comment 12 says that Beaker will move status back to Waiting once the installation is finished, but as I recall status updates are asynchronous and presumably race with reboot and first run of beah.
Hmm yes and the asynchronous status updates are happening slower than normal right now, due to the mess with bug 1362414. So it's more likely for beah to see a recipe still in Installing state. But that beah patch should have been released with this update, for this exact reason... I will have to figure out what's gone wrong.
(In reply to Jan Stancek from comment #2) > And it looks patch for this has been posted already: > https://gerrit.beaker-project.org/#/c/4818/ Oh damn, turns out I never tagged a new version with this patch. :-( I guess I forgot since it didn't have a dedicated bz, it was done as part of bug 991245. Sometimes I think we might have to bite the bullet and make a dedicated bz product for Beah...
Tagged and built beah 0.7.10 with this fix.
This fix has been deployed in production by beah 0.7.10-1.el6.