Description of problem:
libfabric crashes with SIGILL on an old CPU (details under "Additional info").

Version-Release number of selected component (if applicable):
libfabric-1.12.1-1.fc33.x86_64

How reproducible:
Always, on an old CPU.

Steps to Reproduce:
1. Call MPI_Init from openmpi.
OR:
1. Compile: main() { fi_getinfo(); return 0; }
2. Build with: gcc <file> -lfabric
3. Run ./a.out

Actual results:
SIGILL

Thread 1 "python3" received signal SIGILL, Illegal instruction.
0x00007fffe542a270 in fi_psm3_ini () from /lib64/libfabric.so.1
(gdb) bt
#0 0x00007fffe542a270 in fi_psm3_ini () from /lib64/libfabric.so.1
#1 0x00007fffe531fc27 in fi_ini () from /lib64/libfabric.so.1
#2 0x00007fffe532357d in fi_getinfo () from /lib64/libfabric.so.1
#3 0x00007fffe554c330 in usnic_component_init () from /usr/lib64/openmpi/lib/openmpi/mca_btl_usnic.so
#4 0x00007fffe69d5989 in mca_btl_base_select () from /usr/lib64/openmpi/lib/libopen-pal.so.40
#5 0x00007fffe56ca178 in mca_bml_r2_component_init () from /usr/lib64/openmpi/lib/openmpi/mca_bml_r2.so

Expected results:
Fall back to safe instructions and run without error.

Additional info:
CPU Info:
model name : Pentium(R) Dual-Core CPU E5300 @ 2.60GHz
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good nopl cpuid aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm xsave lahf_lm pti tpr_shadow vnmi flexpriority vpid dtherm
vmx flags : vnmi flexpriority tsc_offset vtpr vapic
bugs : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit
(In reply to david08741 from comment #0)
> model name : Pentium(R) Dual-Core CPU E5300 @ 2.60GHz
               ^^^^^^^^^^^^^^^^^^^^^^^^^
Well, it is a really old CPU.

> flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat
> pse36 clflush dts acpi mmx fxsr sse sse2 ht tm pbe syscall nx lm
> constant_tsc arch_perfmon pebs bts rep_good nopl cpuid aperfmperf pni dtes64
> monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm xsave lahf_lm pti tpr_shadow
> vnmi flexpriority vpid dtherm

I did not look into the backtrace dump, but the flags list shows no avx or avx2 entry, so the CPU does not appear to support AVX/AVX2 instructions. This is likely a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=1659852 .
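For anyone who wants to confirm the missing-AVX observation mechanically, here is a minimal sketch in C. It only does a whole-token search over a flags string such as the "flags" line of /proc/cpuinfo. The names has_cpu_flag and REPORTED_FLAGS are made up for this illustration and are not part of libfabric or any other library. The flags string is copied from comment #0.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Flags line copied from comment #0 (Pentium Dual-Core E5300). */
const char REPORTED_FLAGS[] =
    "fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat "
    "pse36 clflush dts acpi mmx fxsr sse sse2 ht tm pbe syscall nx lm "
    "constant_tsc arch_perfmon pebs bts rep_good nopl cpuid aperfmperf pni "
    "dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm xsave lahf_lm "
    "pti tpr_shadow vnmi flexpriority vpid dtherm";

/*
 * Hypothetical helper: return true if `name` occurs in `flags` as a whole
 * whitespace-separated token, so "sse3" does not match inside "ssse3" and
 * "avx" does not match inside "avx2".
 */
bool has_cpu_flag(const char *flags, const char *name)
{
    assert(flags != NULL && name != NULL);

    size_t len = strlen(name);
    if (len == 0)
        return false;

    const char *p = flags;
    while ((p = strstr(p, name)) != NULL) {
        /* The match must begin at the start of the string or after a blank. */
        bool starts = (p == flags) || p[-1] == ' ' || p[-1] == '\t';
        /* The match must end at the end of the string or before a blank. */
        char end = p[len];
        bool ends = end == '\0' || end == ' ' || end == '\t' || end == '\n';
        if (starts && ends)
            return true;
        p += 1; /* keep scanning after a partial match */
    }
    return false;
}
```

Calling has_cpu_flag(REPORTED_FLAGS, "avx") and has_cpu_flag(REPORTED_FLAGS, "avx2") both return false for the reported CPU, while "ssse3" and "xsave" are found. This checks only the text of the flags line; it says nothing about which instruction actually triggered the SIGILL in fi_psm3_ini.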
Anton, could you please have a look?
At first pass, I would agree with Hongang. I've added Adam G. to take a closer look. Keeping NEED INFO flag.
Agreed with Anton and Hongang: PSM3 is only meant to be run on CPUs that support AVX or higher.
Clearing need info flag for anton.bodner
Sorry, 2nd attempt
So is this a bug in openmpi, as it shouldn't call `fi_getinfo` on non-AVX CPUs, or is this a bug in libfabric? I assume the fix for https://bugzilla.redhat.com/show_bug.cgi?id=1659852 for the PSM2 could also be applied for PSM3?
(In reply to david08741 from comment #7)
> So is this a bug in openmpi, as it shouldn't call `fi_getinfo` on non-AVX
> CPUs, or is this a bug in libfabric?

I think it is a libfabric bug.

> I assume the fix for https://bugzilla.redhat.com/show_bug.cgi?id=1659852 for
> the PSM2 could also be applied for PSM3?

Yes, you are right. But it is unlikely to be fixed for old CPUs. As you can see, bz1659852 has been open for 2.5+ years without a fix.
This message is a reminder that Fedora 33 is nearing its end of life. Fedora will stop maintaining and issuing updates for Fedora 33 on 2021-11-30. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a Fedora 'version' of '33'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora 33 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.
Still present in F34
This message is a reminder that Fedora Linux 34 is nearing its end of life. Fedora will stop maintaining and issuing updates for Fedora Linux 34 on 2022-06-07. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a 'version' of '34'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, change the 'version' to a later Fedora Linux version.

Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora Linux 34 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora Linux, you are encouraged to change the 'version' to a later version prior to this bug being closed.
Fedora Linux 34 entered end-of-life (EOL) status on 2022-06-07. Fedora Linux 34 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. If you are unable to reopen this bug, please file a new report against the current release. Thank you for reporting this bug and we are sorry it could not be fixed.