Bug 2039675
| Summary: | pacemaker coredump with ocf:heartbeat:mysql resource | |||
|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 8 | Reporter: | Martin Juricek <mjuricek> | |
| Component: | pacemaker | Assignee: | Ken Gaillot <kgaillot> | |
| Status: | CLOSED ERRATA | QA Contact: | cluster-qe <cluster-qe> | |
| Severity: | high | Docs Contact: | ||
| Priority: | urgent | |||
| Version: | 8.6 | CC: | cluster-maint, msmazova, nwahl, sbradley | |
| Target Milestone: | rc | Keywords: | Regression, Triaged | |
| Target Release: | 8.6 | Flags: | pm-rhel:
mirror+
|
|
| Hardware: | Unspecified | |||
| OS: | Linux | |||
| Whiteboard: | ||||
| Fixed In Version: | pacemaker-2.1.2-3.el8 | Doc Type: | No Doc Update | |
| Doc Text: |
This bug is in an intermediate build not yet released in RHEL
|
Story Points: | --- | |
| Clone Of: | ||||
| : | 2040443 (view as bug list) | Environment: | ||
| Last Closed: | 2022-05-10 14:10:03 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Embargoed: | ||||
| Bug Depends On: | ||||
| Bug Blocks: | 2040443 | |||
The coredumps start with https://github.com/ClusterLabs/pacemaker/pull/2506/commits/31c7fa8a. To sum up: - An event string gets converted to an XML node in lrmd_ipc_dispatch. - lrmd_dispatch_internal uses the value of the XML node's F_LRMD_RSC_EXIT_REASON attribute directly when building an lrmd_event_data_t object. The string is not copied. - Via a callback, this event makes its way down to pcmk__create_history_xml (call trace in comment 2). - pcmk__create_history_xml frees the exit-reason for a notify operation as part of its way of ignoring the result. if (pcmk__str_eq(task, CRMD_ACTION_NOTIFY, pcmk__str_none)) { const char *n_type = crm_meta_value(op->params, "notify_type"); const char *n_task = crm_meta_value(op->params, "notify_operation"); CRM_LOG_ASSERT(n_type != NULL); CRM_LOG_ASSERT(n_task != NULL); op_id = pcmk__notify_key(op->rsc_id, n_type, n_task); if (op->op_status != PCMK_EXEC_PENDING) { /* Ignore notify errors. * * @TODO It might be better to keep the correct result here, and * ignore it in process_graph_event(). */ lrmd__set_result(op, 0, PCMK_EXEC_DONE, NULL); } - lrmd_ipc_dispatch later frees the whole XML node that it created. That XML node contains the exit-reason string that's already been freed. Fixed upstream as of commit ed8b2c86a thanks to nwahl Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (pacemaker bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2022:1885 |
Notes in case I don't make it back to this: Issue seems to disappear if I remove notify=true. I didn't set up mysql, so my resource fails differently, but valgrind shows a double free of xml. ==71171== Invalid free() / delete / delete[] / realloc() ==71171== at 0x4C3610C: free (vg_replace_malloc.c:538) ==71171== by 0x84AD447: xmlFreeNodeList__internal_alias (tree.c:3703) ==71171== by 0x84AD447: xmlFreeNodeList (tree.c:3665) ==71171== by 0x84AD52A: xmlFreeProp (tree.c:2097) ==71171== by 0x84AD6A8: xmlFreePropList__internal_alias (tree.c:2072) ==71171== by 0x84AD6A8: xmlFreePropList (tree.c:2067) ==71171== by 0x84AD36A: xmlFreeNodeList__internal_alias (tree.c:3697) ==71171== by 0x84AD36A: xmlFreeNodeList (tree.c:3665) ==71171== by 0x84AD119: xmlFreeDoc (tree.c:1253) ==71171== by 0x5DC770A: free_xml_with_position (xml.c:775) ==71171== by 0x574CAA2: lrmd_ipc_dispatch (lrmd_client.c:341) ==71171== by 0x574CAA2: lrmd_ipc_dispatch (lrmd_client.c:332) ==71171== by 0x5DA7753: mainloop_gio_callback (mainloop.c:770) ==71171== by 0x818A95C: g_main_dispatch (gmain.c:3193) ==71171== by 0x818A95C: g_main_context_dispatch (gmain.c:3873) ==71171== by 0x818AD17: g_main_context_iterate.isra.21 (gmain.c:3946) ==71171== by 0x818B041: g_main_loop_run (gmain.c:4142) ==71171== by 0x110DE2: crmd_init (pacemaker-controld.c:160) ==71171== by 0x110BFD: main (pacemaker-controld.c:130) ==71171== Address 0x12446e50 is 0 bytes inside a block of size 56 free'd ==71171== at 0x4C3610C: free (vg_replace_malloc.c:538) ==71171== by 0x57506E0: lrmd__set_result (lrmd_client.c:2374) ==71171== by 0x50AEC61: pcmk__create_history_xml (pcmk_sched_utils.c:616) ==71171== by 0x119DF9: build_operation_update.part.13 (controld_execd.c:697) ==71171== by 0x11C012: build_operation_update (controld_execd.c:2127) ==71171== by 0x11C012: controld_ack_event_directly (controld_execd.c:2126) ==71171== by 0x11D3CC: process_lrm_event (controld_execd.c:2877) ==71171== by 0x574B524: lrmd_dispatch_internal (lrmd_client.c:323) ==71171== by 0x574CA9A: lrmd_ipc_dispatch (lrmd_client.c:340) ==71171== by 0x574CA9A: lrmd_ipc_dispatch (lrmd_client.c:332) ==71171== by 0x5DA7753: mainloop_gio_callback (mainloop.c:770) ==71171== by 0x818A95C: g_main_dispatch (gmain.c:3193) ==71171== by 0x818A95C: g_main_context_dispatch (gmain.c:3873) ==71171== by 0x818AD17: g_main_context_iterate.isra.21 (gmain.c:3946) ==71171== by 0x818B041: g_main_loop_run (gmain.c:4142) ==71171== by 0x110DE2: crmd_init (pacemaker-controld.c:160) ==71171== by 0x110BFD: main (pacemaker-controld.c:130) ==71171== Block was alloc'd at ==71171== at 0x4C34F0B: malloc (vg_replace_malloc.c:307) ==71171== by 0x850495B: xmlBufResize (buf.c:827) ==71171== by 0x850495B: xmlBufResize (buf.c:745) ==71171== by 0x8504A1B: xmlBufAdd (buf.c:902) ==71171== by 0x8504A1B: xmlBufAdd (buf.c:868) ==71171== by 0x84B07D5: xmlStringLenGetNodeList (tree.c:1452) ==71171== by 0x85562AB: xmlSAX2AttributeNs (SAX2.c:2051) ==71171== by 0x8556995: xmlSAX2StartElementNs (SAX2.c:2416) ==71171== by 0x849FA0D: xmlParseStartTag2 (parser.c:9637) ==71171== by 0x84A370E: xmlParseElement (parser.c:9982) ==71171== by 0x84A3D29: xmlParseDocument (parser.c:10765) ==71171== by 0x84A4136: xmlDoRead (parser.c:15237) ==71171== by 0x5DCABEA: string2xml (xml.c:887) ==71171== by 0x574CA8C: lrmd_ipc_dispatch (lrmd_client.c:338) ==71171== by 0x574CA8C: lrmd_ipc_dispatch (lrmd_client.c:332) ==71171== by 0x5DA7753: mainloop_gio_callback (mainloop.c:770) ==71171== by 0x818A95C: g_main_dispatch (gmain.c:3193) ==71171== by 0x818A95C: g_main_context_dispatch (gmain.c:3873) ==71171== by 0x818AD17: g_main_context_iterate.isra.21 (gmain.c:3946) ==71171== by 0x818B041: g_main_loop_run (gmain.c:4142) ==71171== by 0x110DE2: crmd_init (pacemaker-controld.c:160) ==71171== by 0x110BFD: main (pacemaker-controld.c:130) ==71171== { <insert_a_suppression_name_here> Memcheck:Free fun:free fun:xmlFreeNodeList__internal_alias fun:xmlFreeNodeList fun:xmlFreeProp fun:xmlFreePropList__internal_alias fun:xmlFreePropList fun:xmlFreeNodeList__internal_alias fun:xmlFreeNodeList fun:xmlFreeDoc fun:free_xml_with_position fun:lrmd_ipc_dispatch fun:lrmd_ipc_dispatch fun:mainloop_gio_callback fun:g_main_dispatch fun:g_main_context_dispatch fun:g_main_context_iterate.isra.21 fun:g_main_loop_run fun:crmd_init fun:main } ==71171== ==71171== HEAP SUMMARY: ==71171== in use at exit: 23,621 bytes in 70 blocks ==71171== total heap usage: 211,342 allocs, 211,276 frees, 43,953,549 bytes allocated ==71171== ==71171== LEAK SUMMARY: ==71171== definitely lost: 0 bytes in 0 blocks ==71171== indirectly lost: 0 bytes in 0 blocks ==71171== possibly lost: 0 bytes in 0 blocks ==71171== still reachable: 23,069 bytes in 62 blocks ==71171== suppressed: 552 bytes in 8 blocks ==71171== Reachable blocks (those to which a pointer was found) are not shown. ==71171== To see them, rerun with: --leak-check=full --show-leak-kinds=all ==71171== ==71171== For lists of detected and suppressed errors, rerun with: -s ==71171== ERROR SUMMARY: 4 errors from 1 contexts (suppressed: 0 from 0)