Created attachment 346900
Core file

Description of problem:
Corosync segfaults

Version-Release number of selected component (if applicable):

Steps to Reproduce:
1. not sure, but I think it happens during startup of cman

Additional info:
(gdb) bt full
#0  0x0000003cb3280e70 in strlen () from /lib64/libc.so.6
No symbol table info available.
#1  0x00007f8461ea4e2d in ?? () from /usr/libexec/lcrso/config_cmanpre.lcrso
No symbol table info available.
#2  0x00007f8461ea5d49 in ?? () from /usr/libexec/lcrso/config_cmanpre.lcrso
No symbol table info available.
#3  0x00007f8461ea70d5 in ?? () from /usr/libexec/lcrso/config_cmanpre.lcrso
No symbol table info available.
#4  0x0000000000406440 in main (argc=<value optimized out>, argv=<value optimized out>) at main.c:601
        error_string = 0x7f84622aa360 "Successfully read config from /etc/cluster/cluster.conf\n"
        main_config = {logfile = 0x0, logmode = 1783457056, syslog_facility = 32767, minimum_priority = 1649243536, user = 0x3cb333849b "clock_gettime", group = 0x7fff6a515230 ""}
        totem_config = {version = 1783456920, interfaces = 0x3cb2e0a1ce, interface_count = 0, node_id = 0, private_key = " eMj�\177", '\0' <repeats 26 times>, "[?��<", '\0' <repeats 35 times>, " eMj�\177\000\000�K#�<\000\000\000\beMj�\177\000\000`C3�<\000\000\0000dMj�\177\000\000�WV\000\000\000\000\000�\000\000\000\000\000\000", private_key_len = 191, token_timeout = 0, token_retransmit_timeout = 1783456816, token_hold_timeout = 32767, token_retransmits_before_loss_const = 6475784, join_timeout = 0, send_join_timeout = 0, consensus_timeout = 0, merge_timeout = 1783457256, downcheck_timeout = 32767, fail_to_recv_const = 4342669, seqno_unchanged_const = 0, rrp_token_expired_timeout = 4353661, rrp_problem_count_timeout = 0, rrp_problem_count_threshold = 1783456944, rrp_mode = "�\177\000\000\030\000\000\000\000\000\000\000�eMj�\177\000\000\001\000\000\000\000\000\000\000\000�b\000\000\000\000\000f1B\000\000\000\000\000�eMj�\177\000\000\000�b\000\000\000\000\000�eMj", totem_logging_configuration = {log_printf = 0x40603b <logsys_system_init+43>, log_level_security = 0, log_level_error = 0, log_level_warning = -1289180463, log_level_notice = 60, log_level_debug = 1}, secauth = 4342486, net_mtu = 0, threads = 1, heartbeat_failures_allowed = 750006350, max_network_delay = 4342272, window_size = 0, max_messages = 0, vsf_type = 0x405783 "H\203�\b��52}\""}
        objdb_handle = 0
        config_handle = 2
        objdb_p = (void *) 0x7f84624b02a0
        config_p = (void *) 0x7f84620a85d0
        config_iface = 0x163d760 "xmlconfig"
        iface = <value optimized out>
        res = 0
        ch = <value optimized out>
        background = <value optimized out>
        setprio = 1
        __PRETTY_FUNCTION__ = "main"
Sorry, I forgot the corosync version: it is corosync-0.92-2.fc10.x86_64
Jakob, we plan to update corosync/cman to the latest versions once a more stable version is available. Look for a solution in 7-10 days. Reassigning to Jan.
OK, I found several other crashes in cman/corosync with our lab setup. I'll refrain from filing these until the next update. Can you please ping me when the updated versions are built in koji so I can give them a spin and don't have to wait for the whole updates-testing/updates-stable machinery? Thanks!
Ya, the version you are using is really old. We had to stop updates because of dependency updates, and our changing ABI was creating problems in Fedora. I'll ping you when we have koji rpms. Regards -steve
Hello, is there any update on the new packages?
Yes, almost everything is sorted. rawhide just got a new set of packages that are very close to corosync 1.0 stable, and the whole chain is being tested at the moment. I backported those packages to F10 a couple of days ago for testing and they seem to work fine. Once corosync is 1.0 we will start the official update process into Fedora. In the meantime you can take the rawhide srpm and rebuild it on F10. Fabio
Problems should be fixed in f10/f11/rawhide. Please reopen if you encounter the problem again. Thanks -steve