RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "RHEL project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023. Bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September as per pre-agreed dates. Bugs against components "kernel", "kernel-rt", and "kpatch" are only migrated if still in "NEW" or "ASSIGNED". If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "RHEL project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.
Bug 2039675 - pacemaker coredump with ocf:heartbeat:mysql resource
Summary: pacemaker coredump with ocf:heartbeat:mysql resource
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: pacemaker
Version: 8.6
Hardware: Unspecified
OS: Linux
urgent
high
Target Milestone: rc
: 8.6
Assignee: Ken Gaillot
QA Contact: cluster-qe@redhat.com
URL:
Whiteboard:
Depends On:
Blocks: 2040443
TreeView+ depends on / blocked
 
Reported: 2022-01-12 08:36 UTC by Martin Juricek
Modified: 2022-06-01 15:26 UTC (History)
4 users (show)

Fixed In Version: pacemaker-2.1.2-3.el8
Doc Type: No Doc Update
Doc Text:
This bug is in an intermediate build not yet released in RHEL
Clone Of:
: 2040443 (view as bug list)
Environment:
Last Closed: 2022-05-10 14:10:03 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Github ClusterLabs pacemaker pull 2603/commits/ed8b2c8 0 None None None 2022-01-13 21:35:51 UTC
Red Hat Issue Tracker KCSOPP-1858 0 None None None 2022-06-01 15:26:01 UTC
Red Hat Issue Tracker RHELPLAN-107535 0 None None None 2022-01-12 08:39:39 UTC
Red Hat Product Errata RHBA-2022:1885 0 None None None 2022-05-10 14:10:10 UTC

Comment 2 Reid Wahl 2022-01-12 11:21:39 UTC
Notes in case I don't make it back to this:

Issue seems to disappear if I remove notify=true.

I didn't set up mysql, so my resource fails differently, but valgrind shows a double free of xml.

==71171== Invalid free() / delete / delete[] / realloc()
==71171==    at 0x4C3610C: free (vg_replace_malloc.c:538)
==71171==    by 0x84AD447: xmlFreeNodeList__internal_alias (tree.c:3703)
==71171==    by 0x84AD447: xmlFreeNodeList (tree.c:3665)
==71171==    by 0x84AD52A: xmlFreeProp (tree.c:2097)
==71171==    by 0x84AD6A8: xmlFreePropList__internal_alias (tree.c:2072)
==71171==    by 0x84AD6A8: xmlFreePropList (tree.c:2067)
==71171==    by 0x84AD36A: xmlFreeNodeList__internal_alias (tree.c:3697)
==71171==    by 0x84AD36A: xmlFreeNodeList (tree.c:3665)
==71171==    by 0x84AD119: xmlFreeDoc (tree.c:1253)
==71171==    by 0x5DC770A: free_xml_with_position (xml.c:775)
==71171==    by 0x574CAA2: lrmd_ipc_dispatch (lrmd_client.c:341)
==71171==    by 0x574CAA2: lrmd_ipc_dispatch (lrmd_client.c:332)
==71171==    by 0x5DA7753: mainloop_gio_callback (mainloop.c:770)
==71171==    by 0x818A95C: g_main_dispatch (gmain.c:3193)
==71171==    by 0x818A95C: g_main_context_dispatch (gmain.c:3873)
==71171==    by 0x818AD17: g_main_context_iterate.isra.21 (gmain.c:3946)
==71171==    by 0x818B041: g_main_loop_run (gmain.c:4142)
==71171==    by 0x110DE2: crmd_init (pacemaker-controld.c:160)
==71171==    by 0x110BFD: main (pacemaker-controld.c:130)
==71171==  Address 0x12446e50 is 0 bytes inside a block of size 56 free'd
==71171==    at 0x4C3610C: free (vg_replace_malloc.c:538)
==71171==    by 0x57506E0: lrmd__set_result (lrmd_client.c:2374)
==71171==    by 0x50AEC61: pcmk__create_history_xml (pcmk_sched_utils.c:616)
==71171==    by 0x119DF9: build_operation_update.part.13 (controld_execd.c:697)
==71171==    by 0x11C012: build_operation_update (controld_execd.c:2127)
==71171==    by 0x11C012: controld_ack_event_directly (controld_execd.c:2126)
==71171==    by 0x11D3CC: process_lrm_event (controld_execd.c:2877)
==71171==    by 0x574B524: lrmd_dispatch_internal (lrmd_client.c:323)
==71171==    by 0x574CA9A: lrmd_ipc_dispatch (lrmd_client.c:340)
==71171==    by 0x574CA9A: lrmd_ipc_dispatch (lrmd_client.c:332)
==71171==    by 0x5DA7753: mainloop_gio_callback (mainloop.c:770)
==71171==    by 0x818A95C: g_main_dispatch (gmain.c:3193)
==71171==    by 0x818A95C: g_main_context_dispatch (gmain.c:3873)
==71171==    by 0x818AD17: g_main_context_iterate.isra.21 (gmain.c:3946)
==71171==    by 0x818B041: g_main_loop_run (gmain.c:4142)
==71171==    by 0x110DE2: crmd_init (pacemaker-controld.c:160)
==71171==    by 0x110BFD: main (pacemaker-controld.c:130)
==71171==  Block was alloc'd at
==71171==    at 0x4C34F0B: malloc (vg_replace_malloc.c:307)
==71171==    by 0x850495B: xmlBufResize (buf.c:827)
==71171==    by 0x850495B: xmlBufResize (buf.c:745)
==71171==    by 0x8504A1B: xmlBufAdd (buf.c:902)
==71171==    by 0x8504A1B: xmlBufAdd (buf.c:868)
==71171==    by 0x84B07D5: xmlStringLenGetNodeList (tree.c:1452)
==71171==    by 0x85562AB: xmlSAX2AttributeNs (SAX2.c:2051)
==71171==    by 0x8556995: xmlSAX2StartElementNs (SAX2.c:2416)
==71171==    by 0x849FA0D: xmlParseStartTag2 (parser.c:9637)
==71171==    by 0x84A370E: xmlParseElement (parser.c:9982)
==71171==    by 0x84A3D29: xmlParseDocument (parser.c:10765)
==71171==    by 0x84A4136: xmlDoRead (parser.c:15237)
==71171==    by 0x5DCABEA: string2xml (xml.c:887)
==71171==    by 0x574CA8C: lrmd_ipc_dispatch (lrmd_client.c:338)
==71171==    by 0x574CA8C: lrmd_ipc_dispatch (lrmd_client.c:332)
==71171==    by 0x5DA7753: mainloop_gio_callback (mainloop.c:770)
==71171==    by 0x818A95C: g_main_dispatch (gmain.c:3193)
==71171==    by 0x818A95C: g_main_context_dispatch (gmain.c:3873)
==71171==    by 0x818AD17: g_main_context_iterate.isra.21 (gmain.c:3946)
==71171==    by 0x818B041: g_main_loop_run (gmain.c:4142)
==71171==    by 0x110DE2: crmd_init (pacemaker-controld.c:160)
==71171==    by 0x110BFD: main (pacemaker-controld.c:130)
==71171== 
{
   <insert_a_suppression_name_here>
   Memcheck:Free
   fun:free
   fun:xmlFreeNodeList__internal_alias
   fun:xmlFreeNodeList
   fun:xmlFreeProp
   fun:xmlFreePropList__internal_alias
   fun:xmlFreePropList
   fun:xmlFreeNodeList__internal_alias
   fun:xmlFreeNodeList
   fun:xmlFreeDoc
   fun:free_xml_with_position
   fun:lrmd_ipc_dispatch
   fun:lrmd_ipc_dispatch
   fun:mainloop_gio_callback
   fun:g_main_dispatch
   fun:g_main_context_dispatch
   fun:g_main_context_iterate.isra.21
   fun:g_main_loop_run
   fun:crmd_init
   fun:main
}
==71171== 
==71171== HEAP SUMMARY:
==71171==     in use at exit: 23,621 bytes in 70 blocks
==71171==   total heap usage: 211,342 allocs, 211,276 frees, 43,953,549 bytes allocated
==71171== 
==71171== LEAK SUMMARY:
==71171==    definitely lost: 0 bytes in 0 blocks
==71171==    indirectly lost: 0 bytes in 0 blocks
==71171==      possibly lost: 0 bytes in 0 blocks
==71171==    still reachable: 23,069 bytes in 62 blocks
==71171==         suppressed: 552 bytes in 8 blocks
==71171== Reachable blocks (those to which a pointer was found) are not shown.
==71171== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==71171== 
==71171== For lists of detected and suppressed errors, rerun with: -s
==71171== ERROR SUMMARY: 4 errors from 1 contexts (suppressed: 0 from 0)

Comment 3 Reid Wahl 2022-01-13 02:45:15 UTC
The coredumps start with https://github.com/ClusterLabs/pacemaker/pull/2506/commits/31c7fa8a.

Comment 4 Reid Wahl 2022-01-13 03:27:52 UTC
To sum up:
  - An event string gets converted to an XML node in lrmd_ipc_dispatch.
  - lrmd_dispatch_internal uses the value of the XML node's F_LRMD_RSC_EXIT_REASON attribute directly when building an lrmd_event_data_t object. The string is not copied.
  - Via a callback, this event makes its way down to pcmk__create_history_xml (call trace in comment 2). 
  - pcmk__create_history_xml frees the exit-reason for a notify operation as part of its way of ignoring the result.

    if (pcmk__str_eq(task, CRMD_ACTION_NOTIFY, pcmk__str_none)) {
        const char *n_type = crm_meta_value(op->params, "notify_type");
        const char *n_task = crm_meta_value(op->params, "notify_operation");

        CRM_LOG_ASSERT(n_type != NULL);
        CRM_LOG_ASSERT(n_task != NULL);
        op_id = pcmk__notify_key(op->rsc_id, n_type, n_task);

        if (op->op_status != PCMK_EXEC_PENDING) {
            /* Ignore notify errors.
             *
             * @TODO It might be better to keep the correct result here, and
             * ignore it in process_graph_event().
             */
            lrmd__set_result(op, 0, PCMK_EXEC_DONE, NULL);
        }

  - lrmd_ipc_dispatch later frees the whole XML node that it created. That XML node contains the exit-reason string that's already been freed.

Comment 7 Ken Gaillot 2022-01-13 17:19:48 UTC
Fixed upstream as of commit ed8b2c86a thanks to nwahl

Comment 14 errata-xmlrpc 2022-05-10 14:10:03 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (pacemaker bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:1885


Note You need to log in before you can comment on or make changes to this bug.