Bug 1324922
Summary: | Log handler repeatedly crashes | ||
---|---|---|---|
Product: | [Fedora] Fedora EPEL | Reporter: | John Eckersberg <jeckersb> |
Component: | erlang | Assignee: | Peter Lemenkov <lemenkov> |
Status: | CLOSED ERRATA | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
Severity: | urgent | Docs Contact: | |
Priority: | urgent | ||
Version: | epel7 | CC: | apevec, binarin, draganHR, ealcaniz, erlang, fdinitto, jeckersb, jschluet, lemenkov, lhh, oblaut, rjones, s, steven.dake, ushkalim |
Target Milestone: | --- | Keywords: | Regression, ZStream |
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | erlang-R16B-03.17.el7 | Doc Type: | Bug Fix |
Doc Text: | Story Points: | --- | |
Clone Of: | 1322609 | Environment: | |
Last Closed: | 2016-07-29 06:50:15 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | 1322609 | ||
Bug Blocks: | 1324185 |
Description
John Eckersberg
2016-04-07 15:12:57 UTC
erlang-R16B-03.17.el7 has been submitted as an update to Fedora EPEL 7. https://bodhi.fedoraproject.org/updates/FEDORA-EPEL-2016-e1035fad90

erlang-R16B-03.17.el7 has been pushed to the Fedora EPEL 7 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-EPEL-2016-e1035fad90

I think your speculation is incorrect that the depth logging change, whatever that was, introduced this regression. The problem was introduced in -16 (adding IPv6 support). This fundamentally changes how epmd operates: epmd binds to either IPv4 or IPv6, depending on config, but not both. One workaround mentioned here: https://github.com/openstack/kolla/blob/master/ansible/roles/rabbitmq/templates/rabbitmq-env.conf.j2#L8 in comment #6 works on -16, but triggers epmd to bind to 0.0.0.0 (all interfaces), which could interfere with neutron, the tenant network, etc. If you're going to enable IPv6, you might as well fix the epmd binding so it's handled properly. By the way, the otp-23 patch is a disaster. I have yet to try -17 with the EPMD binding removed, which would be a good short-term workaround but not a good long-term one. Long term this will cause heisenbugs in neutron and other parts of the system that you just haven't discovered yet ;)

I have tried -17 and it suffers from this same binding problem consistently. Removing the EPMD binding solves the "epmd: could not bind to any interface" error, which is followed by an Erlang crash. Unfortunately, in this mode of operation a wildcard bind is done to all interfaces on the control nodes in OpenStack.

Steven, is there some part of the conversation that is missing, or have you posted your comments to the wrong bug? This one is only about broken logging; everything else should function just fine.
Alexey, this is a follow-up to the Bodhi comment in the linked update https://bodhi.fedoraproject.org/updates/FEDORA-EPEL-2016-e1035fad90

OK, let's regroup and clarify some things here before we get more confused. Part of this is my fault because I directed you on IRC to post on the bodhi update about your crash. I didn't realize at first that you were seeing an IPv6 crash and thought it was just the logging crash. Anyway...

This particular bug is about broken logging. The current released version (R16B-03.16.el7) has broken logging. The only change[1] in the .17 release is to revert the patch that added the broken logging. So I would ask two things.

(1) Ignore the IPv6 thing for this bug. It would be a huge help if you could just sanity check that .16 has broken logging and that .17 is correct (and update karma on the update accordingly). Then we can either ship that update ASAP or bundle it with an IPv6 fix (if we can get it quickly).

(2) We'll file another bug for the IPv6 issue. We already fixed one crash bug in https://bugzilla.redhat.com/show_bug.cgi?id=1310808 (incidentally this is the update you said introduced your crash). If you can get it to reproduce and capture a coredump of the crash, that would be awesome. I will try to reproduce as well by toying with ERL_EPMD_ADDRESS on my end.

[1] http://pkgs.fedoraproject.org/cgit/rpms/erlang.git/commit/?h=epel7&id=6515854c294bc6be60987407a54d9680fd8faf65

Any updates?

I think what happened here is that I conflated .15 and .16 into one change, going by jeckersb's statement. The issue (with .15, then) is that EPMD wildcard binds, which could result in some really weird behavior if anyone in a cloud environment uses that port while neutron is in use on the box. I'm not sure if this is a legitimate situation, but no service in OpenStack should wildcard bind. That said, .16 is totally bust with logging - you're right on that point.
I don't recall where the erlang repo is to test -17 with, but if you could provide that I'll test Kolla's current master with it. It takes about 2 hours to test as soon as I have a repo to work with.

Thanks
-steve

Steve, in RDO Mitaka testing repo http://buildlogs.centos.org/centos/7/cloud/x86_64/openstack-mitaka/ we have:
erlang-R16B-03.17.el7
rabbitmq-server-3.3.5-17.el7

erlang-R16B-03.17.el7 has been pushed to the Fedora EPEL 7 stable repository. If problems still persist, please make note of it in this bug report.
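
For anyone reproducing the epmd binding behavior discussed in the comments above (bound to IPv4 or IPv6 but not both, or wildcard-bound to all interfaces), a small probe can show which addresses actually accept connections. This is only a sketch under stated assumptions: it assumes epmd is on its default port 4369, the helper names `probe` and `epmd_reachability` are hypothetical, and it only tests TCP reachability; it does not confirm that the listener is epmd.

```python
import socket

# Assumption: epmd is listening on its default port, 4369.
EPMD_PORT = 4369


def probe(host, port=EPMD_PORT, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and an unavailable
        # address family (for example, no IPv6 on this host).
        return False


def epmd_reachability(port=EPMD_PORT, hosts=("127.0.0.1", "::1")):
    """Map each candidate address to whether something accepts on it."""
    return {host: probe(host, port) for host in hosts}
```

Calling `epmd_reachability()` on a control node checks the IPv4 and IPv6 loopback addresses. Passing the node's other interface addresses via `hosts` can help tell a wildcard bind apart from a bind to a single address, since a wildcard-bound listener would be expected to answer on every local address.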