Bug 617766

Summary: openmpi build failure against python 2.7, possibly rawhide also
Product: [Fedora] Fedora
Reporter: Dave Malcolm <dmalcolm>
Component: openmpi
Assignee: Jay Fenlason <fenlason>
Status: CLOSED CURRENTRELEASE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Priority: low
Version: 14
CC: dledford, fenlason, jfeeney, oget.fedora
Target Milestone: ---
Target Release: ---
Hardware: All
OS: Linux
Doc Type: Bug Fix
Last Closed: 2012-01-20 19:02:03 UTC

Description Dave Malcolm 2010-07-23 21:32:07 UTC
Description of problem:
Attempted to rebuild openmpi against python 2.7, but it failed:
http://koji.fedoraproject.org/koji/getfile?taskID=2344896&name=build.log
shows:
dt_module.c:177:64: error: expected expression before ')' token
dt_module.c:182:64: error: expected expression before ')' token
dt_module.c:187:64: error: expected expression before ')' token
dt_module.c:192:64: error: expected expression before ')' token
dt_module.c:203:61: error: expected expression before ')' token
dt_module.c:208:61: error: expected expression before ')' token
dt_module.c:219:64: error: expected expression before ')' token
dt_module.c:224:64: error: expected expression before ')' token
dt_module.c:229:64: error: expected expression before ')' token
dt_module.c:234:64: error: expected expression before ')' token
dt_module.c:250:65: error: expected expression before ')' token
within openmpi-1.4.1/ompi/datatype

I've had a report that it also fails to build in rawhide.

Line 177 is the OMPI_DECLSPEC line below:
#if OMPI_HAVE_FORTRAN_LOGICAL1
OMPI_DECLSPEC ompi_predefined_datatype_t ompi_mpi_logical1 = { INIT_BASIC_FORTRAN_TYPE( DT_LOGIC, LOGICAL1, OMPI_SIZEOF_FORTRAN_LOGICAL1, OMPI_ALIGNMENT_FORTRAN_LOGICAL1, 0) };
#else

Version-Release number of selected component (if applicable):
1.4.1-5.fc14 in CVS devel

How reproducible:
100%

Steps to Reproduce:
koji build dist-f14-py27-rebuild cvs://cvs.fedoraproject.org/cvs/pkgs?rpms/openmpi/devel#openmpi-1_4_1-5_fc14

Comment 1 Dave Malcolm 2010-07-23 21:39:03 UTC
Adding -save-temps to the gcc invocation, I see that the line becomes:

__attribute__((__visibility__("default"))) ompi_predefined_datatype_t ompi_mpi_logical1 = { { { (&(ompi_datatype_t_class)), 1 }, (0x0040 | 0x0004 | 0x0080 | 
 0x0100 | 0x0002) | 0xC000 | (0), (0x17), 1, 0 , 1 , 0 , 1 , (), 1, (((uint64_t)1)<<(0x17)), ((void *)0), 0, "MPI_" "LOGICAL1", {0, 0, ((void *)0)}, {0, 0, 
((void *)0)}, ((void *)0), ((void *)0), { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0 } } };

Looks to me like "1 , (), 1 " is the problem

Comment 2 Dave Malcolm 2010-07-23 21:47:24 UTC
Looking near top of file:
#define INIT_BASIC_FORTRAN_TYPE( TYPE, NAME, SIZE, ALIGN, FLAGS )             \
    { BASEOBJ_DATA, DT_FLAG_BASIC | DT_FLAG_DATA_FORTRAN | (FLAGS),           \
            (TYPE), SIZE, 0/*true_lb*/, SIZE/*true_ub*/, 0/*lb*/, SIZE/*ub*/, \
            (ALIGN), 1, (((uint64_t)1)<<(TYPE)), EMPTY_DATA(NAME) }
suggests that the "()" is the ALIGN from that macro

Comment 3 Dave Malcolm 2010-07-23 21:52:02 UTC
"config.h" has this empty macro definition, leading to the compilation failure:
/* Alignment of Fortran 77 LOGICAL*1 */
#define OMPI_ALIGNMENT_FORTRAN_LOGICAL1

"configure" generates it thus:
cat >>confdefs.h <<_ACEOF
#define OMPI_ALIGNMENT_FORTRAN_LOGICAL1 $ofc_type_alignment
_ACEOF

(wordwrapped by bugzilla)
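
A minimal sketch of how an empty shell variable turns into the empty macro body seen in config.h. The helper name emit_define is hypothetical and only stands in for the here-document in the generated configure script; the value 1 is an assumed example of a successfully detected alignment.

```shell
#!/bin/sh
# Sketch only: emit_define is a hypothetical helper, not part of Open MPI's configure.
emit_define() {
    # $1: the value configure would have stored in ofc_type_alignment
    ofc_type_alignment=$1
    cat <<_ACEOF
#define OMPI_ALIGNMENT_FORTRAN_LOGICAL1 $ofc_type_alignment
_ACEOF
}

emit_define 1     # alignment detected: macro expands to 1
emit_define ""    # alignment lost: macro body is empty, so (ALIGN) becomes ()
```

With an empty value the here-document still emits a syntactically valid #define, which is why the failure only surfaces later, when the C compiler sees "()" in the initializer.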

Comment 4 Dave Malcolm 2010-07-23 22:01:25 UTC
The relevant fragment from configure.ac appears to be:
OMPI_F77_CHECK([LOGICAL*1], [yes],
               [char, int8_t, short, int32_t, int, int64_t, long long, long], [1])

Comment 5 Orcan Ogetbil 2010-07-23 23:01:43 UTC
(In reply to comment #3)
> "config.h" has this empty macro definition, leading to the compilation failure:
> /* Alignment of Fortran 77 LOGICAL*1 */
> #define OMPI_ALIGNMENT_FORTRAN_LOGICAL1
> 
> "configure" generates it thus:
> cat >>confdefs.h <<_ACEOF
> #define OMPI_ALIGNMENT_FORTRAN_LOGICAL1 $ofc_type_alignment
> _ACEOF
>

About 30 lines before this, $ofc_type_alignment is set as
   ofc_type_alignment=`eval 'as_val=${'type_var'};$as_echo "$as_val"'`

whereas in the F-13 build at the same place, we have
   ofc_type_alignment=$ompi_cv_f77_alignment_LOGICALp1

autotools is playing games with us.
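
A small sketch of why the two assignments behave differently. It assumes that the cache variable holds 1 and that as_echo is defined as printf, as autoconf-generated scripts commonly do; neither value is taken from the build log. The point is only that the eval form reads a shell variable literally named type_var, which nothing sets.

```shell
#!/bin/sh
# Assumption: as_echo mirrors the usual definition in generated configure scripts.
as_echo='printf %s\n'

# Assumption: the alignment test already cached its result as 1.
ompi_cv_f77_alignment_LOGICALp1=1

# F-14 form: after quote removal, eval sees  as_val=${type_var};$as_echo "$as_val"
# so it reads the shell variable literally named "type_var", which is unset.
unset type_var
broken=`eval 'as_val=${'type_var'};$as_echo "$as_val"'`

# F-13 form: reads the cache variable directly.
working=$ompi_cv_f77_alignment_LOGICALp1

echo "broken=[$broken] working=[$working]"   # prints: broken=[] working=[1]
```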

Comment 6 Orcan Ogetbil 2010-07-24 01:46:18 UTC
Very very ugly hack. But doing a

sed -i 's|ofc_type_alignment=`eval.*|ofc_type_alignment=$ompi_cv_f77_alignment_LOGICALp1|' configure

right before the %configure makes the build finish:
   http://koji.fedoraproject.org/koji/taskinfo?taskID=2347485

Can we use this hack until the real solution is found so that we can finish the python builds?

Comment 7 Dave Malcolm 2010-07-24 16:14:12 UTC
(Note to openmpi devs: this is blocking "boost" in the python 2.7 rebuild, and thus quite a few other packages).

Thanks for the workaround.

As you say, it's an ugly hack.  I was a bit worried that it might reuse the alignment of LOGICAL*1 for every type. I noticed that the captured value appears in $ac_res in each test, so I tried this alternate workaround:
sed -i \
  's|ofc_type_alignment=`eval.*`|ofc_type_alignment=$ac_res|g' \
  configure

This worked in a scratch build:
  http://koji.fedoraproject.org/koji/taskinfo?taskID=2348248
so I committed it to CVS and am trying this build (1.4.1-6):
  http://koji.fedoraproject.org/koji/taskinfo?taskID=2348272

Having said that, I checked the logs of your scratch build, and it does look like it got the various alignments correct.
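
For comparison, a sketch that applies both sed expressions to a single stand-in line rather than to a real configure script. The helper names sample_line, fix_hardcoded and fix_ac_res are hypothetical, and the sketch writes to stdout instead of editing in place with -i.

```shell
#!/bin/sh
# Stand-in for the problematic line in the generated configure (quoted here-doc,
# so nothing inside it is expanded).
sample_line() {
    cat <<'EOF'
ofc_type_alignment=`eval 'as_val=${'type_var'};$as_echo "$as_val"'`
EOF
}

# Comment 6 variant: every match is rewritten to the LOGICAL*1 cache variable.
fix_hardcoded() {
    sed 's|ofc_type_alignment=`eval.*|ofc_type_alignment=$ompi_cv_f77_alignment_LOGICALp1|'
}

# Comment 7 variant: every match is rewritten to the per-test result in $ac_res.
fix_ac_res() {
    sed 's|ofc_type_alignment=`eval.*`|ofc_type_alignment=$ac_res|g'
}

sample_line | fix_hardcoded   # ofc_type_alignment=$ompi_cv_f77_alignment_LOGICALp1
sample_line | fix_ac_res      # ofc_type_alignment=$ac_res
```

The difference is purely textual: the first form names one specific cache variable at every call site, while the second defers to whatever $ac_res holds when that line of configure runs.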

Some more notes:

The generated header file is: opal/include/opal_config.h

The code that generates the relevant part of the configure script seems to be:

config/f77_check.m4 contains:
AC_DEFUN([OMPI_F77_CHECK], [

    (snip)
  ofc_type_alignment=$ac_cv_sizeof_int

    (snip)

            # Get the alignment of the type
            if test "$ofc_have_type" = "1"; then
                OMPI_F77_GET_ALIGNMENT([$1], [ofc_type_alignment])
            fi

    (snip)

  [OMPI_ALIGNMENT_FORTRAN_]m4_bpatsubst(m4_bpatsubst([$1], [*], []), [[^a-zA-Z0-9_]], [_])[=$ofc_type_alignment]

    (snip)
])dnl
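
As a rough shell equivalent of the two nested m4_bpatsubst calls above, the following sketch shows how a Fortran type name could map to the macro suffix. define_suffix is a hypothetical helper, and 'DOUBLE PRECISION' is just an illustrative second input.

```shell
#!/bin/sh
# Sketch only: define_suffix is a hypothetical helper, not part of the build.
define_suffix() {
    # Inner substitution drops "*"; outer one maps any remaining character
    # outside [a-zA-Z0-9_] to "_".
    printf '%s\n' "$1" | sed -e 's/\*//g' -e 's/[^a-zA-Z0-9_]/_/g'
}

echo "OMPI_ALIGNMENT_FORTRAN_$(define_suffix 'LOGICAL*1')"         # OMPI_ALIGNMENT_FORTRAN_LOGICAL1
echo "OMPI_ALIGNMENT_FORTRAN_$(define_suffix 'DOUBLE PRECISION')"  # OMPI_ALIGNMENT_FORTRAN_DOUBLE_PRECISION
```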

config/f77_get_alignment.m4 has the definition:
# OMPI_F77_GET_ALIGNMENT(type, shell variable to set)
# ----------------------------------------------------
AC_DEFUN([OMPI_F77_GET_ALIGNMENT],[

            OMPI_F77_GET_ALIGNMENT([$1], [ofc_type_alignment])
which appears to be where the code in comment #5 is generated from


The body of that macro appears to set "type_var" from conftestval:
           [AS_VAR_SET(type_var, [`cat conftestval`])],
(with various error handling conditions)

Comment 8 Dave Malcolm 2010-07-24 16:23:14 UTC
Rebuild was successful, which (when it hits the buildroot) will unblock parts of the stack.

However, given that this is a workaround, I'm going to leave this bug open: could the openmpi maintainers please review this and find the root cause and a more robust fix?

Comment 9 Bug Zapper 2010-07-30 12:48:24 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 14 development cycle.
Changing version to '14'.

More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 10 Doug Ledford 2012-01-20 19:02:03 UTC
This doesn't appear to be a problem with recent openmpi versions: the workaround is no longer present, yet builds work.  This is likely because we no longer run the autogen.sh script, so the configure scripts are not regenerated and the shipped script works.  Closing this out.