Bug 989934
Summary: | corosync 1.4.6 crash when an unplugged network cable is plugged back in udpu mode | |
---|---|---|---|
Product: | Red Hat Enterprise Linux 6 | Reporter: | Shining <nshi_nb> |
Component: | corosync | Assignee: | Jan Friesse <jfriesse> |
Status: | CLOSED DUPLICATE | QA Contact: | Cluster QE <mspqa-list> |
Severity: | urgent | Docs Contact: | |
Priority: | unspecified | ||
Version: | 6.2 | CC: | ccaulfie, cluster-maint, sdake |
Target Milestone: | rc | ||
Target Release: | --- | ||
Hardware: | x86_64 | ||
OS: | Linux | ||
Fixed In Version: | | Doc Type: | Bug Fix
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2013-08-05 08:10:45 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Description
Shining
2013-07-30 07:47:38 UTC
Corosync 1.4.6 is not part of RHEL-6; the shipped version is 1.4.1. Does this happen on a supported version?

I built corosync-1.4.6 from the latest corosync source code and the corosync source RPM package from RHEL 6. I will run another test on corosync-1.4.1 to confirm whether the bug also exists in 1.4.1.

I am sorry; this bug was caused by a service I wrote myself. After removing my service from corosync, corosync works correctly again. No problem.

Can you close this BZ then please :)

Reporting the bug as gone was a mistake; it is still there. Because I had enabled the core-file flag in my service, I could detect the corosync crash by the existence of a core file. After removing my service, no core file is generated when corosync crashes, so I missed it.

-----------------------------------------------------------------
Aug 01 14:09:45 corosync [TOTEM ] The network interface [172.20.0.128] is now up.
Aug 01 14:09:45 corosync [TOTEM ] adding new UDPU member {172.20.0.128}
my_failed_list 1 my_proc_list 2 token_memb_entries 1
Aug 01 14:09:45 corosync [TOTEM ] entering GATHER state from 15.
my_failed_list 1 my_proc_list 2 token_memb_entries 1
my_failed_list 1 my_proc_list 2 token_memb_entries 1
...
...
my_failed_list 1 my_proc_list 2 token_memb_entries 1
my_failed_list 2 my_proc_list 2 token_memb_entries 0
corosync: totemsrp.c:1258: memb_consensus_agreed: Assertion `token_memb_entries >= 1' failed.
Aug 01 14:09:46 corosync [TOTEM ] entering GATHER state from 0.
./myrun: line 3: 2003 Aborted (core dumped) ./corosync -f "$@"
-----------------------------------------------------------------

Before the crash:

my_failed_list 1: 172.20.0.128
my_proc_list 2: 172.20.0.128 127.0.0.1

At the point of the crash:

my_failed_list 2: 172.20.0.128 127.0.0.1
my_proc_list 2: 172.20.0.128 127.0.0.1

Do my_failed_list or my_proc_list need to be reinitialized after the network interface comes up? That is,

---------------------
my_failed_list 1: 172.20.0.128
my_proc_list 2: 172.20.0.128 127.0.0.1
---------------------

should be

---------------------
my_failed_list 2: 172.20.0.128 127.0.0.1
my_proc_list 1: 172.20.0.128
---------------------

Ifdown is unsupported. The only supported way to simulate a failure is an iptables drop (of both unicast and multicast traffic) or unplugging the cable WITHOUT NetworkManager (NM does an ifdown on cable unplug). Also, this is a clone of bug 881694.

*** This bug has been marked as a duplicate of bug 881694 ***
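To make the assertion failure above concrete, here is a minimal, self-contained C sketch of the consensus check. It is not the actual corosync source: the struct, the address-as-string representation, and the memb_set_subtract helper are simplified stand-ins that only mirror the shape of totemsrp.c. It shows how, once 127.0.0.1 lands in my_failed_list, subtracting my_failed_list from my_proc_list leaves an empty token membership, so `assert(token_memb_entries >= 1)` aborts the daemon exactly as in the log.

-----------------------------------------------------------------
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define PROCESSOR_COUNT_MAX 16

/* Hypothetical stand-in for totemsrp's processor address type:
 * just an IPv4 address string here. */
struct node_addr {
        char ip[16];
};

/* out = a \ b: every member of 'a' that is not also in 'b'.
 * Mirrors the shape of totemsrp's member-set subtraction. */
static void memb_set_subtract(struct node_addr *out, int *out_entries,
                              const struct node_addr *a, int a_entries,
                              const struct node_addr *b, int b_entries)
{
        int i, j, found;

        *out_entries = 0;
        for (i = 0; i < a_entries; i++) {
                found = 0;
                for (j = 0; j < b_entries; j++) {
                        if (strcmp(a[i].ip, b[j].ip) == 0) {
                                found = 1;
                                break;
                        }
                }
                if (!found) {
                        out[(*out_entries)++] = a[i];
                }
        }
}

int main(void)
{
        /* State reported at the point of the crash: both processors
         * appear in my_proc_list AND in my_failed_list. */
        struct node_addr my_proc_list[]   = { { "172.20.0.128" }, { "127.0.0.1" } };
        struct node_addr my_failed_list[] = { { "172.20.0.128" }, { "127.0.0.1" } };
        struct node_addr token_memb[PROCESSOR_COUNT_MAX];
        int token_memb_entries;

        memb_set_subtract(token_memb, &token_memb_entries,
                          my_proc_list, 2, my_failed_list, 2);
        printf("token_memb_entries = %d\n", token_memb_entries); /* prints 0 */

        /* With every known processor also marked failed, the agreed
         * membership is empty and this assertion aborts, matching
         * totemsrp.c:1258 in the log above. */
        assert(token_memb_entries >= 1);
        return 0;
}
-----------------------------------------------------------------

Read this way, the reporter's question is whether totemsrp should clear my_failed_list when the interface comes back up, so that this subtraction can never empty the set.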