Bug 1286829

Summary: fedmsg-tail "hangs" when host cannot be resolved
Product: [Fedora] Fedora
Reporter: Zbigniew Jędrzejewski-Szmek <zbyszek>
Component: fedmsg
Assignee: Ralph Bean <rbean>
Status: CLOSED NOTABUG
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Priority: unspecified
Version: 23
CC: lmacken, rbean
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Last Closed: 2016-03-17 14:59:43 UTC
Type: Bug

Description Zbigniew Jędrzejewski-Szmek 2015-11-30 20:36:13 UTC
Description of problem:
$ fedmsg-tail --pretty
[2015-11-30 15:21:42][    fedmsg WARNING] Couldn't resolve 'hub.fedoraproject.org'
^C()

If the host cannot be resolved, continuing doesn't seem to make much sense. fedmsg-tail should exit with an error instead.

Comment 1 Ralph Bean 2016-03-17 14:55:46 UTC
Good idea, yes.

Comment 2 Ralph Bean 2016-03-17 14:59:43 UTC
Actually, on second thought - take a look at the code here:  https://github.com/fedora-infra/fedmsg/blob/develop/fedmsg/core.py#L359-L364

A fedmsg client can (and usually does) connect to many message-producing endpoints.  For a Fedora end-user, you are typically only subscribed to one:  the Fedora Infrastructure outbound gateway at hub.fedoraproject.org:9940.  But you could also be subscribed to Debian's endpoint, or others.  Internally in Fedora Infrastructure, our message consumers are connected to hundreds of message-producing endpoints.

If *one* of those endpoints is not resolvable for some reason, should the consumer crash?  fedmsg's design suggests "no".  One of our main goals is to be able to fail gracefully in the face of partial network outages, so things can continue to operate "as best they can".  That's why there's a warning there, and not an error.

Thanks for the report!  I'm going to close as NOTABUG.  Feel free to take it up on the upstream fedmsg issue tracker if you like and we can debate it more :)
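
Note: the warn-and-continue behaviour described in Comment 2 can be illustrated with a minimal sketch. This is not the code in fedmsg/core.py; the function name resolvable_endpoints, its resolve parameter, and the (host, port) endpoint representation are assumptions made for illustration only. It uses the standard library's socket.getaddrinfo to check each endpoint and logs a warning for any host that fails, rather than aborting.

```python
import logging
import socket

log = logging.getLogger("fedmsg")


def resolvable_endpoints(endpoints, resolve=socket.getaddrinfo):
    """Return the (host, port) pairs that resolve; warn about the rest.

    Hypothetical helper, not part of fedmsg. `resolve` is injectable so
    the behaviour can be exercised without real DNS lookups.
    """
    usable = []
    for host, port in endpoints:
        try:
            resolve(host, port)
        except socket.gaierror:
            # One bad endpoint should not take the whole consumer down.
            log.warning("Couldn't resolve %r", host)
            continue
        usable.append((host, port))
    return usable
```

With a single configured endpoint that does not resolve, this returns an empty list and the caller keeps waiting for messages that can never arrive, which matches the "hang" in the original report.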

Comment 3 Zbigniew Jędrzejewski-Szmek 2016-03-17 15:08:17 UTC
It could fail if *all* endpoints failed (in particular in this case all==one).

But the whole issue is a bit of an edge case, so I'm not going to open a bug upstream.
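
Note: the policy suggested in Comment 3 (tolerate partial failures, but give up when every endpoint is unresolvable) could look roughly like the sketch below. The exception class NoEndpointsResolved and the function require_some_endpoint are hypothetical names introduced here; nothing in this report indicates that fedmsg implements such a check.

```python
class NoEndpointsResolved(RuntimeError):
    """Raised when none of the requested endpoints could be resolved."""


def require_some_endpoint(requested, resolved):
    """Return `resolved`, or raise if endpoints were requested but none resolved.

    Both arguments are lists of (host, port) tuples. A partial failure
    (some endpoints resolved) is tolerated; a total failure is an error.
    """
    if requested and not resolved:
        hosts = ", ".join(host for host, _port in requested)
        raise NoEndpointsResolved("could not resolve any endpoint: " + hosts)
    return resolved
```

Under this policy the single-endpoint case from the report (all == one) would exit with an error, while a consumer subscribed to many endpoints would still survive a partial outage.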