Bug 961007

Summary: FTBFS: builds get stuck during self checks
Product: [Fedora] Fedora
Reporter: Karsten Hopp <karsten>
Component: hdf
Assignee: Orion Poplawski <orion>
Status: CLOSED RAWHIDE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high
Docs Contact:
Priority: medium
Version: 19
CC: dan, dwa, orion, pertusus, volker27
Target Milestone: ---   
Target Release: ---   
Hardware: powerpc   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2014-05-22 17:38:11 UTC
Type: Bug

Description Karsten Hopp 2013-05-08 13:59:12 UTC
Description of problem:
Builds of hdf-4.2.9-1.fc19 on PPC64 and PPC32 get stuck during the self checks without any indication of what's going on.

make  check-TESTS
make[3]: Entering directory `/builddir/build/BUILD/hdf-4.2.9/hdf/test'
make[4]: Entering directory `/builddir/build/BUILD/hdf-4.2.9/hdf/test'
===Serial tests in test begin Tue May  7 14:50:03 MST 2013===
make[5]: Entering directory `/builddir/build/BUILD/hdf-4.2.9/hdf/test'
============================
Testing testhdf 

At this point it waits until koji gets a timeout and kills it.

A test with 'make check Verbosity=9' didn't provide any further output either.


Version-Release number of selected component (if applicable):
hdf-4.2.9-1.fc19

How reproducible:
always

Steps to Reproduce:
1. ppc-koji build --scratch f19 hdf-4.2.9-1.fc19.src.rpm
  
Actual results:

http://ppc.koji.fedoraproject.org/koji/taskinfo?taskID=1092238

Verbosity=9
http://ppc.koji.fedoraproject.org/koji/taskinfo?taskID=1093011

Comment 1 Orion Poplawski 2013-05-09 16:56:52 UTC
I'm at a complete loss.  It seems like even "strace -f testhdf" produces no output.  I have no ppc machines myself to test with.

Comment 2 Karsten Hopp 2013-05-13 17:06:17 UTC
It hangs in test_mgr_interlace(1) when dimsize[0] = 4 and dimsize[1] = 5
 (around line 2293 in mgr.c)

gdb shows more, mcache_look seems to be looping here:

0x00000000100bf100 in mcache_look (pgno=1, mp=0x102eb6e0) at mcache.c:1183
1183        for (bp = head->cqh_first; bp != (VOID *)head; bp = bp->hq.cqe_next)

(gdb) p *bp
$12 = {hq = {cqe_next = 0x1022ff20, cqe_prev = 0x1022ff20}, q = {cqe_next = 0x10284f50, cqe_prev = 0x102eb6e0}, page = 0x1022ff50, pgno = 4, 
  flags = 0 '\000'}

(gdb) p pgno
$13 = 1

(gdb) p bp
$14 = (BKT *) 0x1022ff20

(gdb) p bp->hq.cqe_next
$15 = (struct _bkt *) 0x1022ff20

(gdb) p head
$16 = (struct _hqh *) 0x102eb6f0

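This looks to me like there is no way to get out of this loop: bp->hq.cqe_next points back to bp itself (0x1022ff20), so bp never reaches head (0x102eb6f0), and bp->pgno (4) never matches pgno (1):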
    for (bp = head->cqh_first; bp != (VOID *)head; bp = bp->hq.cqe_next)
        if (bp->pgno == pgno)
          { /* hit....found page in cache */
#ifdef STATISTICS
              ++mp->cachehit;
#endif
              /* done */
              ret_value = RET_SUCCESS;
              goto done;
          }
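
To make the failure mode concrete, here is a minimal sketch of the same kind of walk with a step limit added, so that a bucket whose next pointer refers to itself is reported instead of spinning forever. This is not the HDF4 code: struct bkt, bounded_look, the WALK_* results, and both demo functions are hypothetical names, and the list is simplified to a single next pointer with a sentinel head of the same type. The self-loop demo only mirrors the state seen in the gdb session above (one bucket with pgno 4 whose next pointer is its own address).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a cache bucket (not the HDF4 BKT). */
struct bkt {
    struct bkt *next; /* plays the role of hq.cqe_next */
    int pgno;
};

enum walk_result { WALK_FOUND, WALK_NOT_FOUND, WALK_CORRUPT };

/*
 * Walk a circular list that terminates at the sentinel `head`.
 * Gives up after max_steps iterations instead of looping forever
 * when the chain never leads back to head.
 */
enum walk_result bounded_look(const struct bkt *head, int pgno, size_t max_steps)
{
    const struct bkt *bp;
    size_t steps;

    assert(head != NULL);
    bp = head->next;
    for (steps = 0; steps < max_steps; ++steps) {
        if (bp == head)
            return WALK_NOT_FOUND; /* wrapped around: normal loop exit */
        if (bp->pgno == pgno)
            return WALK_FOUND;     /* hit: page is in the cache */
        bp = bp->next;
    }
    return WALK_CORRUPT;           /* never reached head: broken chain */
}

/* Healthy chain: head -> a (pgno 4) -> b (pgno 1) -> head. */
enum walk_result demo_healthy(int pgno)
{
    struct bkt head, a, b;

    head.next = &a;  head.pgno = -1;
    a.next = &b;     a.pgno = 4;
    b.next = &head;  b.pgno = 1;
    return bounded_look(&head, pgno, 16);
}

/* Broken chain modelled on the gdb output: the only bucket points to itself. */
enum walk_result demo_self_loop(int pgno)
{
    struct bkt head, a;

    head.next = &a;  head.pgno = -1;
    a.next = &a;     a.pgno = 4;
    return bounded_look(&head, pgno, 16);
}
```

With the healthy chain, a lookup for page 1 is a hit and a lookup for an absent page ends when the walk returns to head. With the self-referencing bucket, a lookup for page 1 can only end through the step limit, which is the situation the unbounded loop in mcache_look appears to be stuck in. The limit of 16 is arbitrary and only serves the demonstration.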

Comment 3 Dan Horák 2014-05-22 11:15:44 UTC
sounds like a gcc 4.8 issue, as the test suite passes in f21 with gcc 4.9