Bug 2240042

Summary: Update to pmix breaks OpenMPI
Product: [Fedora] Fedora Reporter: Shane Hart <glanzick>
Component: pmix Assignee: Philip Kovacs <pkfed>
Status: CLOSED ERRATA QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 39 CC: dledford, hladky.jiri, orion, pkfed
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: pmix-4.1.3-1.fc39 pmix-4.1.3-1.fc38 pmix-4.1.3-1.fc37 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2023-10-03 13:58:41 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Shane Hart 2023-09-21 13:29:31 UTC
Description of problem:

The pmix package was updated recently to fix a CVE.  DNF history:

    Upgrade       pmix-4.1.3-1.fc39.x86_64                                     @updates-testing
    Upgraded      pmix-4.1.2-5.fc39.x86_64                                     @@System

This breaks OpenMPI because symbols it needs are missing from the updated pmix libraries:

[vmuser@fedora39-vm build]$ mpirun hostname
[fedora39-vm:45599] mca_base_component_repository_open: unable to open mca_pmix_ext3x: /usr/lib64/openmpi/lib/openmpi/mca_pmix_ext3x.so: undefined symbol: pmix_value_load (ignored)
[fedora39-vm:45599] [[30451,0],0] ORTE_ERROR_LOG: Not found in file ess_hnp_module.c at line 320
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  opal_pmix_base_select failed
  --> Returned value Not found (-13) instead of ORTE_SUCCESS
--------------------------------------------------------------------------
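
The "undefined symbol" message above can be cross-checked by asking the dynamic loader whether a library still resolves a given name. The following is a minimal sketch of that check, not something taken from the original report: the helper name `missing_symbols` is hypothetical, and the soname `libpmix.so.2` mentioned in the note after it is an assumption about how the Fedora package names the library.

```python
import ctypes


def missing_symbols(library, names):
    """Return the entries of `names` that cannot be resolved through `library`.

    `library` is a path or soname handed to dlopen(); None means the
    symbols already visible in the current process.
    """
    handle = ctypes.CDLL(library)
    # ctypes looks attributes up with dlsym(); hasattr() is False when the
    # lookup fails. Note that dlsym() also searches the dependencies of the
    # library, so a hit does not prove the library itself defines the symbol.
    return [name for name in names if not hasattr(handle, name)]
```

On an affected system, `missing_symbols("libpmix.so.2", ["pmix_value_load"])` would be expected to return the symbol name if the updated library no longer provides it, and an empty list otherwise.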

Version-Release number of selected component (if applicable):

pmix was updated from 4.1.2-5 to 4.1.3-1 on Fedora 39.  OpenMPI might just need to be rebuilt.

How reproducible:

Every time.


Steps to Reproduce:
1. Run a simple MPI program, or even just "mpirun hostname".

Actual results:

See above.

Expected results:

For "mpirun -n 4 hostname", the hostname should be printed 4 times.
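
The expected result can be turned into a small automated check. This is only a sketch under assumptions: the function name `printed_n_times` is hypothetical, and it assumes each rank prints the hostname on its own line with nothing else on stdout.

```python
def printed_n_times(output, expected, n):
    """Return True if `output` is exactly `n` non-empty lines equal to `expected`."""
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    return len(lines) == n and all(line == expected for line in lines)
```

One way to use it would be to capture the stdout of `mpirun -n 4 hostname` (for example with `subprocess.run(..., capture_output=True, text=True)`) and compare it against the machine's hostname with `n=4`. With the broken pmix update the captured stdout would not contain four hostname lines, so the check would fail.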

Additional info:

Might just need a rebuild?

Comment 1 Orion Poplawski 2023-09-21 14:43:46 UTC
So, we may need to just rebuild openmpi and other pmix users, but pmix shouldn't be breaking ABI with a minor update with no soname bump.
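
As a rough illustration of the rule being invoked here: by common shared-library convention, removing an exported symbol is a backward-incompatible change and is normally accompanied by a soname bump. The sketch below encodes only that one rule; the function names are hypothetical, and it does not detect other ABI breaks such as changed signatures or struct layouts.

```python
def removed_exports(old_exports, new_exports):
    """Return, sorted, the symbols exported by the old library but not the new one."""
    return sorted(set(old_exports) - set(new_exports))


def soname_bump_expected(old_exports, new_exports):
    """Return True if any previously exported symbol has disappeared."""
    return bool(removed_exports(old_exports, new_exports))
```

The two export lists could come from any symbol dump of the old and new library builds. Added symbols alone do not trigger the check, since existing users keep working.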

Comment 2 Shane Hart 2023-09-21 16:53:51 UTC
(In reply to Orion Poplawski from comment #1)
> So, we may need to just rebuild openmpi and other pmix users, but pmix
> shouldn't be breaking ABI with a minor update with no soname bump.

It's possible. I was just taking a stab at it, since the missing symbol was a PMIx one.

Comment 3 Orion Poplawski 2023-09-23 03:15:01 UTC
$ abipkgdiff --d1 pmix-debuginfo-4.1.2-5.fc39.x86_64.rpm --d2 pmix-debuginfo-4.1.3-1.fc40.x86_64.rpm --devel1 pmix-devel-4.1.2-5.fc39.x86_64.rpm --devel2 pmix-devel-4.1.3-1.fc40.x86_64.rpm pmix-4.1.2-5.fc39.x86_64.rpm pmix-4.1.3-1.fc40.x86_64.rpm
================ changes of 'libpmix.so.2.5.2'===============
  Functions changes summary: 34 Removed, 6 Changed (313 filtered out), 36 Added functions
  Variables changes summary: 2 Removed, 47 Changed (73 filtered out), 16 Added variables

  34 Removed functions:

    [D] 'function pmix_status_t pmix_argv_append_nosize(char***, const char*)'    {pmix_argv_append_nosize}
    [D] 'function pmix_status_t pmix_argv_append_unique_nosize(char***, const char*)'    {pmix_argv_append_unique_nosize}
    [D] 'function char** pmix_argv_copy(char**)'    {pmix_argv_copy}
    [D] 'function int pmix_argv_count(char**)'    {pmix_argv_count}
    [D] 'function void pmix_argv_free(char**)'    {pmix_argv_free}
    [D] 'function char* pmix_argv_join(char**, int)'    {pmix_argv_join}
    [D] 'function pmix_status_t pmix_argv_prepend_nosize(char***, const char*)'    {pmix_argv_prepend_nosize}
    [D] 'function char** pmix_argv_split(const char*, int)'    {pmix_argv_split}
    [D] 'function char** pmix_argv_split_with_empty(const char*, int)'    {pmix_argv_split_with_empty}
    [D] 'function int pmix_cmd_line_add(pmix_cmd_line_t*, pmix_cmd_line_init_t*)'    {pmix_cmd_line_add}
    [D] 'function int pmix_cmd_line_create(pmix_cmd_line_t*, pmix_cmd_line_init_t*)'    {pmix_cmd_line_create}
    [D] 'function int pmix_cmd_line_get_argc(pmix_cmd_line_t*)'    {pmix_cmd_line_get_argc}
    [D] 'function char* pmix_cmd_line_get_argv(pmix_cmd_line_t*, int)'    {pmix_cmd_line_get_argv}
    [D] 'function int pmix_cmd_line_get_ninsts(pmix_cmd_line_t*, const char*)'    {pmix_cmd_line_get_ninsts}
    [D] 'function char* pmix_cmd_line_get_param(pmix_cmd_line_t*, const char*, int, int)'    {pmix_cmd_line_get_param}
    [D] 'function int pmix_cmd_line_get_tail(pmix_cmd_line_t*, int*, char***)'    {pmix_cmd_line_get_tail}
    [D] 'function char* pmix_cmd_line_get_usage_msg(pmix_cmd_line_t*)'    {pmix_cmd_line_get_usage_msg}
    [D] 'function bool pmix_cmd_line_is_taken(pmix_cmd_line_t*, const char*)'    {pmix_cmd_line_is_taken}
    [D] 'function int pmix_cmd_line_make_opt3(pmix_cmd_line_t*, char, const char*, const char*, int, const char*)'    {pmix_cmd_line_make_opt3}
    [D] 'function int pmix_cmd_line_make_opt_mca(pmix_cmd_line_t*, pmix_cmd_line_init_t)'    {pmix_cmd_line_make_opt_mca}
    [D] 'function pmix_status_t pmix_info_list_add(void*, const char*, void*, pmix_data_type_t)'    {pmix_info_list_add}
    [D] 'function pmix_status_t pmix_info_list_convert(void*, pmix_data_array_t*)'    {pmix_info_list_convert}
    [D] 'function void pmix_info_list_release(void*)'    {pmix_info_list_release}
    [D] 'function void* pmix_info_list_start()'    {pmix_info_list_start}
    [D] 'function pmix_status_t pmix_info_list_xfer(void*, const pmix_info_t*)'    {pmix_info_list_xfer}
    [D] 'function int pmix_mca_base_cmd_line_process_args(pmix_cmd_line_t*, char***, char***)'    {pmix_mca_base_cmd_line_process_args}
    [D] 'function int pmix_mca_base_cmd_line_setup(pmix_cmd_line_t*)'    {pmix_mca_base_cmd_line_setup}
    [D] 'function void pmix_mca_base_cmd_line_wrap_args(char**)'    {pmix_mca_base_cmd_line_wrap_args}
    [D] 'function pmix_status_t pmix_setenv(const char*, const char*, bool, char***)'    {pmix_setenv}
    [D] 'function int pmix_sync_wait_mt(pmix_wait_sync_t*)'    {pmix_sync_wait_mt}
    [D] 'function void pmix_value_load(pmix_value_t*, void*, pmix_data_type_t)'    {pmix_value_load}
    [D] 'function pmix_status_t pmix_value_unload(pmix_value_t*, void**, size_t*)'    {pmix_value_unload}
    [D] 'function pmix_status_t pmix_value_xfer(pmix_value_t*, const pmix_value_t*)'    {pmix_value_xfer}
    [D] 'function int pmix_vpmix_snprintf(char*, size_t, const char*, __va_list_tag*)'    {pmix_vpmix_snprintf}

I think I complained about this a while back but upstream didn't seem particularly concerned.  I've complained again: https://github.com/openpmix/openpmix/issues/3163
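
For anyone who wants to check other pmix users against this list, the removed-function names can be pulled out of the abipkgdiff report mechanically. The following is a minimal sketch, assuming the report lines look exactly like the `[D] '...' {name}` lines quoted above; the function name `removed_functions` and the regular expression are illustrative and not part of libabigail.

```python
import re

# Matches lines such as:
#   [D] 'function int pmix_argv_count(char**)'    {pmix_argv_count}
# and captures the symbol name inside the braces.
REMOVED_LINE = re.compile(r"^\s*\[D\]\s+'function .*'\s+\{(\w+)\}\s*$")


def removed_functions(report):
    """Return the symbol names of all '[D]' function lines, in report order."""
    names = []
    for line in report.splitlines():
        match = REMOVED_LINE.match(line)
        if match:
            names.append(match.group(1))
    return names
```

Run over the output above, the result should include `pmix_value_load`, the symbol that `mca_pmix_ext3x.so` fails to resolve in the original description.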

Comment 4 Fedora Update System 2023-09-25 19:24:21 UTC
FEDORA-2023-1185eca900 has been submitted as an update to Fedora 39. https://bodhi.fedoraproject.org/updates/FEDORA-2023-1185eca900

Comment 5 Fedora Update System 2023-09-25 22:34:01 UTC
FEDORA-2023-d6dbdf62ad has been submitted as an update to Fedora 38. https://bodhi.fedoraproject.org/updates/FEDORA-2023-d6dbdf62ad

Comment 6 Fedora Update System 2023-09-25 22:34:56 UTC
FEDORA-2023-155d2f22f1 has been submitted as an update to Fedora 37. https://bodhi.fedoraproject.org/updates/FEDORA-2023-155d2f22f1

Comment 7 Fedora Update System 2023-09-26 01:25:54 UTC
FEDORA-2023-1185eca900 has been pushed to the Fedora 39 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2023-1185eca900`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2023-1185eca900

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 8 Fedora Update System 2023-09-26 02:22:23 UTC
FEDORA-2023-155d2f22f1 has been pushed to the Fedora 37 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2023-155d2f22f1`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2023-155d2f22f1

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 9 Fedora Update System 2023-09-26 02:29:13 UTC
FEDORA-2023-d6dbdf62ad has been pushed to the Fedora 38 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2023-d6dbdf62ad`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2023-d6dbdf62ad

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 10 Shane Hart 2023-09-26 12:06:06 UTC
I can confirm that the update fixes the issue.

Comment 11 Fedora Update System 2023-10-03 13:58:41 UTC
FEDORA-2023-1185eca900 has been pushed to the Fedora 39 stable repository.
If the problem still persists, please make a note of it in this bug report.

Comment 12 Fedora Update System 2023-10-04 02:33:40 UTC
FEDORA-2023-d6dbdf62ad has been pushed to the Fedora 38 stable repository.
If the problem still persists, please make a note of it in this bug report.

Comment 13 Fedora Update System 2023-10-04 02:59:44 UTC
FEDORA-2023-155d2f22f1 has been pushed to the Fedora 37 stable repository.
If the problem still persists, please make a note of it in this bug report.