Bug 1474774

Summary: Latest ceph is not built on ppc64
Product: [Fedora] Fedora Reporter: Kaleb KEITHLEY <kkeithle>
Component: ceph Assignee: Boris Ranto <branto>
Status: CLOSED RAWHIDE QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high Docs Contact:
Priority: unspecified    
Version: rawhide CC: ahs3+donotuse, ahs3, berrange, bhubbard, branto, david, esandeen, extras-qa, fedora, kdreyer, loic, ramkrsna, rjones, steve, yselkowi
Target Milestone: --- Keywords: Reopened
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1474743 Environment:
Last Closed: 2017-08-03 02:58:01 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1474743    
Bug Blocks: 1474772, 1474773    

Description Kaleb KEITHLEY 2017-07-25 11:35:07 UTC
+++ This bug was initially created as a clone of Bug #1474743 +++

Description of problem:
The latest update of Ceph in rawhide just excluded several Fedora architectures

 ExcludeArch:   i686 armv7hl ppc64

As a result this has broken downstream packages that depend on ceph like libvirt & QEMU.

Please re-enable these architectures asap.

If there are build problems then please create blocker bugs against the arch exclude trackers, as described here:

https://fedoraproject.org/wiki/Packaging:Guidelines#Architecture_Build_Failures

Version-Release number of selected component (if applicable):
12.1.1-1

--- Additional comment from Kaleb KEITHLEY on 2017-07-25 07:22:38 EDT ---

It doesn't build on those platforms.

File a bug against Ceph.

--- Additional comment from Daniel Berrange on 2017-07-25 07:29:19 EDT ---

That's not the way Fedora works. As per the packaging guidelines link above, maintainers need to explicitly track any build problems on architectures & mark them as blocking the arch trackers so the problem can be resolved.

Comment 1 Kaleb KEITHLEY 2017-07-25 11:41:39 UTC
See the build log at:

https://koji.fedoraproject.org/koji/taskinfo?taskID=20683233

Comment 2 Daniel Berrangé 2017-07-31 09:32:22 UTC
It looks like the core problem on ppc64 is that HAVE_POWER8 is not set in the cmake rules. The source code uses #ifdef __powerpc__ to enable ppc-specific code, but the cmake if(HAVE_POWER8) rules all evaluate to false, so the files that code depends on are never built and we get link errors.

Adding something like the following to  cmake/modules/SIMDExt.cmake 


+elseif(CMAKE_SYSTEM_PROCESSOR MATCHES "(powerpc|ppc)64")
+  set(HAVE_PPC64 1)
+  message(STATUS " we are ppc64")
+  CHECK_C_COMPILER_FLAG("-mcpu=power8" HAVE_POWER8)
+  if(HAVE_POWER8)
+    message(STATUS " HAVE_POWER8 yes")
+  endif()

should get it closer to building successfully, but I've not tested this - this is just from observation of what's failed in that link above
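
As a rough illustration of which processor names the proposed branch would select: CMake's MATCHES operator does an unanchored regular-expression match, so the pattern "(powerpc|ppc)64" accepts ppc64le as well as ppc64. The sketch below uses Python's re.search as a stand-in for CMake's matcher (an assumption that holds for a pattern this simple); the function name is hypothetical and not part of Ceph.

```python
import re

# Pattern copied from the proposed SIMDExt.cmake branch in this comment.
PROPOSED_PATTERN = re.compile(r"(powerpc|ppc)64")


def proposed_branch_matches(processor: str) -> bool:
    """Return True if the proposed elseif() branch would be taken.

    re.search is used because CMake's MATCHES is unanchored: the pattern
    may match anywhere inside the processor string.
    """
    return PROPOSED_PATTERN.search(processor) is not None


if __name__ == "__main__":
    for name in ("ppc64", "ppc64le", "powerpc64", "x86_64", "aarch64"):
        print(f"{name:10s} -> {proposed_branch_matches(name)}")
```

If the branch were meant to apply to big-endian ppc64 only, the pattern would need an end anchor, for example "(powerpc|ppc)64$".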

Comment 3 Ken Dreyer (Red Hat) 2017-07-31 17:21:09 UTC
Al, are you familiar with this build failure?

Comment 4 Al Stone 2017-07-31 19:22:45 UTC
(In reply to Ken Dreyer (Red Hat) from comment #3)
> Al, are you familiar with this build failure?

It looks familiar, but may not be exactly what I've seen.  The first thing I would try is fixing the spec file; these lines in the log --

CMake Warning:
  Manually-specified variables were not used by the project:
    WTIH_BABELTRACE

are from a typo, so s/WTIH_BABELTRACE/WITH_BABELTRACE/g.  Even if it doesn't fix the problem, it's at least removing an annoying message that doesn't even affect anything since the defaults are correct.
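
The substitution above can be sketched as a plain string replacement. The spec-file fragment here is hypothetical (the actual line in ceph.spec, including the option's value, is not shown in this report), and the function name is made up for illustration.

```python
# Hypothetical fragment: the real ceph.spec line and its value may differ.
SPEC_FRAGMENT = "    -DWTIH_BABELTRACE=OFF \\\n"


def fix_babeltrace_typo(spec_text: str) -> str:
    """Equivalent of s/WTIH_BABELTRACE/WITH_BABELTRACE/g on the spec text."""
    return spec_text.replace("WTIH_BABELTRACE", "WITH_BABELTRACE")


if __name__ == "__main__":
    print(fix_babeltrace_typo(SPEC_FRAGMENT), end="")
```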

I don't think the version of GCC being used has the problem (it may be GCC 4.9 or so that does), but another thing I'm looking at is a bug where GCC memory use grows dramatically.  In that case, removing '-pipe' from the compile options and trying just 'make' (aka, 'make -j1') might allow the compile to complete.  This one shows up like a race condition so the symptoms pop up all over the place during compile and link -- it all depends on what is being done in parallel at that point in time.  In this case, 'make -j2' may not be that big a deal, but I don't know how big the machine (or VM) is that's doing the compiling.

Comment 5 Kaleb KEITHLEY 2017-08-01 14:57:37 UTC
Here are the results with Dan's suggested fix from Comment 2 above

   https://kojipkgs.fedoraproject.org//work/tasks/6524/20946524/build.log

And no, it's not running out of memory; `make -j1` is not a cure. Also, the -DWITH_BABELTRACE typo has already been addressed and wasn't the cause of the build failure.

Comment 6 Boris Ranto 2017-08-01 19:45:56 UTC
I played with this a bit. Daniel seems to be right that we are missing cmake definitions for ppc64. However, adding that won't help as it looks like we have recently dropped the support for ppc in favor of ppc64le in some places. If you simply add the cmake support for ppc64, the code won't compile because the instructions for crc32c_ppc (the asm files) are written for ppc64le, not ppc64 (the code also requires altivec extension btw). I am currently running a scratch build that will build only a naive implementation (i.e. like it did on pre-power8 machines) of the crc32c_ppc module on ppc64. I'll keep you posted.
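
For readers unfamiliar with what a "naive" crc32c looks like next to the hand-written asm: below is a minimal bit-at-a-time CRC-32C (Castagnoli polynomial) sketch. It is not Ceph's code. It follows the common convention of inverting the CRC on entry and exit, which may differ from the seed handling in Ceph's own routine, and the function name is hypothetical.

```python
# Reflected form of the Castagnoli polynomial used by CRC-32C.
CASTAGNOLI_REFLECTED = 0x82F63B78


def crc32c_bitwise(data: bytes, crc: int = 0) -> int:
    """Compute CRC-32C one bit at a time (slow, but portable everywhere).

    Pass the previous result as `crc` to continue over a further chunk.
    """
    crc ^= 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ CASTAGNOLI_REFLECTED
            else:
                crc >>= 1
    return crc ^ 0xFFFFFFFF


if __name__ == "__main__":
    # Standard check value for CRC-32C over the ASCII string "123456789".
    print(hex(crc32c_bitwise(b"123456789")))  # 0xe3069283
```

An optimized implementation computes the same function, only faster, which is why falling back to a generic routine costs speed rather than correctness.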

Comment 7 Boris Ranto 2017-08-02 06:58:13 UTC
This should be fixed by the following commit (my scratch build succeeded, and I'm running a regular build now):

http://pkgs.fedoraproject.org/cgit/rpms/ceph.git/commit/?id=13a18359e9f68423b1b1f0c62b9fe3f75d4e0094

In the end, it required a few more changes, specifically fixing the macros for the optimized crc32c code so that the optimized version is built on ppc64le only. A naive implementation is built for other ppc platforms but not actually used; on non-ppc64le platforms we have to use the sctp crc32 function instead, at least for now.
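
The resulting selection can be summarized in a small sketch. This reflects one reading of the comment above, not the actual macros in the commit; the type, function, and label names are all hypothetical.

```python
from typing import NamedTuple


class PpcCrcPlan(NamedTuple):
    built: str  # which crc32c_ppc variant is compiled
    used: str   # which routine is actually called at run time


def plan_for_ppc(machine: str) -> PpcCrcPlan:
    """Map a PowerPC machine name to the crc32c choice described above."""
    if machine == "ppc64le":
        # Only ppc64le gets the optimized code (assumed to be used there too).
        return PpcCrcPlan(built="optimized", used="optimized")
    if machine.startswith(("ppc", "powerpc")):
        # Other ppc platforms build the naive variant but call sctp crc32.
        return PpcCrcPlan(built="naive", used="sctp")
    raise ValueError(f"not a PowerPC machine name: {machine!r}")


if __name__ == "__main__":
    for name in ("ppc64le", "ppc64", "ppc"):
        print(name, plan_for_ppc(name))
```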

Comment 8 Boris Ranto 2017-08-03 02:58:01 UTC
The build completed successfully:

https://koji.fedoraproject.org/koji/taskinfo?taskID=20961346