Bug 724937 - hwloc-1.2-0.fc16 fails xmlbuffer self check on PPC, but passes on PPC64
Status: CLOSED ERRATA
Product: Fedora
Classification: Fedora
Component: hwloc
Version: rawhide
Hardware: powerpc Unspecified
Priority: unspecified
Severity: unspecified
Assigned To: Jiri Hladky
QA Contact: Fedora Extras Quality Assurance
Reported: 2011-07-22 07:16 EDT by Karsten Hopp
Modified: 2011-12-12 16:55 EST (History)
Fixed In Version: hwloc-1.3-1.fc16
Doc Type: Bug Fix
Last Closed: 2011-11-24 21:16:01 EST
Description Karsten Hopp 2011-07-22 07:16:50 EDT
Description of problem:
the xmlbuffer self check fails on ppc, see 
http://ppc.koji.fedoraproject.org/koji/getfile?taskID=256586&name=build.log

The difference between the first exported buffer and the second exported buffer is in the lines
<page_type size="17179869184" count="0"/>
vs.
<page_type size="4294967295" count="0"/>
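
A quick arithmetic check on the two values, as a minimal sketch: 17179869184 is 2^34 (16 GiB), while 4294967295 is 2^32 - 1, the largest value a 32-bit unsigned integer can hold. That would be consistent with the size being clamped to a 32-bit range somewhere between the first and the second export. The macro and function names below are hypothetical and are not part of hwloc.

```c
#include <stdbool.h>
#include <stdint.h>

/* Page size seen in the first exported buffer: 16 GiB, i.e. 2^34 bytes. */
#define FIRST_EXPORT_SIZE  UINT64_C(17179869184)
/* Page size seen in the second exported buffer: 2^32 - 1. */
#define SECOND_EXPORT_SIZE UINT64_C(4294967295)

/* True when the value can be stored in a 32-bit unsigned integer. */
bool fits_in_u32(uint64_t value)
{
    return value <= UINT32_MAX;
}

/* Hypothetical helper: clamp a 64-bit value to the 32-bit unsigned range. */
uint64_t clamp_to_u32(uint64_t value)
{
    return fits_in_u32(value) ? value : (uint64_t)UINT32_MAX;
}
```

Under these assumptions, clamp_to_u32(FIRST_EXPORT_SIZE) yields exactly SECOND_EXPORT_SIZE.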

Version-Release number of selected component (if applicable):
hwloc-1.2-0.fc16

How reproducible:
always

Steps to Reproduce:
1. ppc-koji build --scratch dist-f16 hwloc-1.2-0.fc16.src.rpm
  
Actual results:
http://ppc.koji.fedoraproject.org/koji/taskinfo?taskID=256555
Comment 1 Jiri Hladky 2011-09-21 17:56:18 EDT
I just tested hwloc-1.2.1 and the bug is still there. I am contacting the hwloc developers.

ppc-koji build --scratch dist-f16 rpmbuild/SRPMS/hwloc-1.2.1-0.fc14.src.rpm

Please see a complete build log at
http://ppc.koji.fedoraproject.org/koji/getfile?taskID=285892&name=build.log

Thanks
Jirka

PASS: glibc-sched
exported to buffer 0x10568a30 length 1835
re-exported to buffer 0x1056d118 length 1834
### First exported buffer is:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE topology SYSTEM "hwloc.dtd">
<topology>
  <object type="Machine" os_level="-1" os_index="0" cpuset="0x00000003"
complete_cpuset="0x00000003" online_cpuset="0x00000003"
allowed_cpuset="0x00000003" local_memory="16091512832">
    <page_type size="17179869184" count="0"/>
    <page_type size="65536" count="245537"/>
    <page_type size="16777216" count="0"/>
    <info name="Backend" value="Linux"/>
    <info name="OSName" value="Linux"/>
    <info name="OSRelease" value="2.6.32-131.6.1.el6.ppc64"/>
    <info name="OSVersion" value="#1 SMP Mon Jun 20 14:15:43 EDT 2011"/>
    <info name="HostName" value="ppc-comm01"/>
    <info name="Architecture" value="ppc"/>
    <object type="Socket" os_level="-1" cpuset="0x00000003"
complete_cpuset="0x00000003" online_cpuset="0x00000003"
allowed_cpuset="0x00000003">
      <object type="Cache" os_level="-1" cpuset="0x00000003"
complete_cpuset="0x00000003" online_cpuset="0x00000003"
allowed_cpuset="0x00000003" cache_size="4194304" depth="2"
cache_linesize="128">
        <object type="Cache" os_level="-1" cpuset="0x00000003"
complete_cpuset="0x00000003" online_cpuset="0x00000003"
allowed_cpuset="0x00000003" cache_size="65536" depth="1" cache_linesize="128">
          <object type="Core" os_level="-1" os_index="0" cpuset="0x00000003"
complete_cpuset="0x00000003" online_cpuset="0x00000003"
allowed_cpuset="0x00000003">
            <object type="PU" os_level="-1" os_index="0" cpuset="0x00000001"
complete_cpuset="0x00000001" online_cpuset="0x00000001"
allowed_cpuset="0x00000001"/>
            <object type="PU" os_level="-1" os_index="1" cpuset="0x00000002"
complete_cpuset="0x00000002" online_cpuset="0x00000002"
allowed_cpuset="0x00000002"/>
          </object>
        </object>
      </object>
    </object>
  </object>
</topology>
### End of first export buffer
### Second exported buffer is:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE topology SYSTEM "hwloc.dtd">
<topology>
  <object type="Machine" os_level="-1" os_index="0" cpuset="0x00000003"
complete_cpuset="0x00000003" online_cpuset="0x00000003"
allowed_cpuset="0x00000003" local_memory="16091512832">
    <page_type size="4294967295" count="0"/>
    <page_type size="65536" count="245537"/>
    <page_type size="16777216" count="0"/>
    <info name="Backend" value="Linux"/>
    <info name="OSName" value="Linux"/>
    <info name="OSRelease" value="2.6.32-131.6.1.el6.ppc64"/>
    <info name="OSVersion" value="#1 SMP Mon Jun 20 14:15:43 EDT 2011"/>
    <info name="HostName" value="ppc-comm01"/>
    <info name="Architecture" value="ppc"/>
    <object type="Socket" os_level="-1" cpuset="0x00000003"
complete_cpuset="0x00000003" online_cpuset="0x00000003"
allowed_cpuset="0x00000003">
      <object type="Cache" os_level="-1" cpuset="0x00000003"
complete_cpuset="0x00000003" online_cpuset="0x00000003"
allowed_cpuset="0x00000003" cache_size="4194304" depth="2"
cache_linesize="128">
        <object type="Cache" os_level="-1" cpuset="0x00000003"
complete_cpuset="0x00000003" online_cpuset="0x00000003"
allowed_cpuset="0x00000003" cache_size="65536" depth="1" cache_linesize="128">
          <object type="Core" os_level="-1" os_index="0" cpuset="0x00000003"
complete_cpuset="0x00000003" online_cpuset="0x00000003"
allowed_cpuset="0x00000003">
            <object type="PU" os_level="-1" os_index="0" cpuset="0x00000001"
complete_cpuset="0x00000001" online_cpuset="0x00000001"
allowed_cpuset="0x00000001"/>
            <object type="PU" os_level="-1" os_index="1" cpuset="0x00000002"
complete_cpuset="0x00000002" online_cpuset="0x00000002"
allowed_cpuset="0x00000002"/>
          </object>
        </object>
      </object>
    </object>
  </object>
</topology>
### End of second export buffer
FAIL: xmlbuffer
========================================================
1 of 26 tests failed
Please report to http://www.open-mpi.org/community/help/
========================================================
Comment 2 Brice Goglin 2011-09-22 00:38:13 EDT
It looks like we cast the page sizes to unsigned long during XML import and export. Please try this patch. It should work with your 16 GB pages :)
Thanks!
Brice


Index: src/topology-xml.c
===================================================================
--- src/topology-xml.c	(revision 3812)
+++ src/topology-xml.c	(working copy)
@@ -280,9 +280,9 @@
       const xmlChar *value = hwloc__xml_import_attr_value(attr);
       if (value) {
 	if (!strcmp((char *) attr->name, "size"))
-	  size = strtoul((char *) value, NULL, 10);
+	  size = strtoull((char *) value, NULL, 10);
 	else if (!strcmp((char *) attr->name, "count"))
-	  count = strtoul((char *) value, NULL, 10);
+	  count = strtoull((char *) value, NULL, 10);
 	else
 	  fprintf(stderr, "ignoring unknown pagetype attribute %s\n", (char *) attr->name);
       }
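
The effect of the change above can be sketched in isolation. The C standard specifies that strtoul returns ULONG_MAX and sets errno to ERANGE when the parsed value does not fit in unsigned long, which is 32 bits wide on 32-bit targets such as ppc; strtoull parses into unsigned long long, which is at least 64 bits. The wrapper names and the out_of_range flag below are hypothetical and only serve the illustration; they are not hwloc code.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Parse the way the pre-patch code did: through strtoul.
 * Where unsigned long is 32 bits wide, "17179869184" overflows,
 * so strtoul returns ULONG_MAX and sets errno to ERANGE. */
unsigned long long parse_size_strtoul(const char *text, int *out_of_range)
{
    unsigned long value;

    errno = 0;
    value = strtoul(text, NULL, 10);
    *out_of_range = (errno == ERANGE);
    return value;
}

/* Parse the way the patched code does: through strtoull.
 * unsigned long long is at least 64 bits, so a 16 GiB size fits. */
unsigned long long parse_size_strtoull(const char *text, int *out_of_range)
{
    unsigned long long value;

    errno = 0;
    value = strtoull(text, NULL, 10);
    *out_of_range = (errno == ERANGE);
    return value;
}
```

On a platform with a 64-bit unsigned long both wrappers return the same value, which would match the report that the test passes on PPC64.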
Comment 3 Brice Goglin 2011-09-22 01:03:01 EDT
Oh, you'll need this too, otherwise the lines would be misordered. I reproduced and fixed the problem on x86_32, so I assume it will work for you too.

Index: src/topology.c
===================================================================
--- src/topology.c	(revision 3828)
+++ src/topology.c	(working copy)
@@ -889,7 +889,12 @@
   const struct hwloc_obj_memory_page_type_s *a = _a;
   const struct hwloc_obj_memory_page_type_s *b = _b;
   /* consider 0 as larger so that 0-size page_type go to the end */
-  return b->size ? (int)(a->size - b->size) : -1;
+  if (!b->size)
+    return -1;
+  /* don't cast a-b in int since those are ullongs */
+  if (b->size == a->size)
+    return 0;
+  return a->size < b->size ? -1 : 1;
 }
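
A minimal sketch of why the subtraction-based comparison misorders large sizes, and how the explicit comparison from the patch behaves under qsort. The struct page_type below is a hypothetical stand-in for hwloc's page-type record, and the function names are made up for the illustration. With a 16 GiB entry and a 64 KiB entry, the 64-bit difference does not fit in an int; the result of that conversion is implementation-defined, and on common compilers it wraps to a negative value, so the larger page would sort first.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the page-type record; only size is compared. */
struct page_type {
    uint64_t size;
    uint64_t count;
};

/* Pre-patch logic: the 64-bit difference is converted to int, so the
 * sign of the result is unreliable once sizes differ by 2^31 or more. */
int compare_truncating(const void *_a, const void *_b)
{
    const struct page_type *a = _a;
    const struct page_type *b = _b;
    return b->size ? (int)(a->size - b->size) : -1;
}

/* Patched logic: compare explicitly and return only -1, 0 or 1. */
int compare_explicit(const void *_a, const void *_b)
{
    const struct page_type *a = _a;
    const struct page_type *b = _b;
    if (!b->size)
        return -1; /* treat 0 as larger so 0-size entries go to the end */
    if (b->size == a->size)
        return 0;
    return a->size < b->size ? -1 : 1;
}

/* Sort page types by increasing size with the patched comparison. */
void sort_page_types(struct page_type *types, size_t n)
{
    qsort(types, n, sizeof(*types), compare_explicit);
}
```

Sorting the three sizes from the build log (17179869184, 65536, 16777216) with sort_page_types gives 65536, 16777216, 17179869184.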
Comment 4 Jiri Hladky 2011-09-23 19:16:16 EDT
Hi Brice,

I have tried to apply your patches 
https://bugzilla.redhat.com/show_bug.cgi?id=724937#c2
https://bugzilla.redhat.com/show_bug.cgi?id=724937#c3
to both hwloc-1.2 and hwloc-1.2.1
but it's failing:

===================================================================
patching file src/topology.c
Hunk #1 FAILED at 889.

patching file src/topology-xml.c
Hunk #1 FAILED at 280.
===================================================================

Could you please provide a new complete patch using version hwloc-1.2.1 as base?

http://www.open-mpi.org/software/hwloc/v1.2/downloads/hwloc-1.2.1.tar.bz2

Thanks a lot!
Jirka
Comment 5 Brice Goglin 2011-09-24 00:56:38 EDT
The patch I backported to v1.2 is
  https://svn.open-mpi.org/trac/hwloc/changeset/3834

By the way, there's a 1.2.2rc1 online, and I will do the final 1.2.2 next week.

Brice
Comment 6 Jiri Hladky 2011-09-24 16:46:24 EDT
Hi Brice,

thanks a lot for creating 1.2.2rc1. I have tested it and the issue is fixed:-)

I will wait for 1.2.2 to submit a new rpm for Fedora.

Thanks
Jiri
Comment 7 Fedora Update System 2011-10-04 19:21:34 EDT
hwloc-1.2.2-0.fc16 has been submitted as an update for Fedora 16.
https://admin.fedoraproject.org/updates/hwloc-1.2.2-0.fc16
Comment 8 Fedora Update System 2011-10-04 19:44:59 EDT
hwloc-1.2.2-0.fc15 has been submitted as an update for Fedora 15.
https://admin.fedoraproject.org/updates/hwloc-1.2.2-0.fc15
Comment 9 Fedora Update System 2011-10-05 13:16:40 EDT
Package hwloc-1.2.2-0.fc16:
* should fix your issue,
* was pushed to the Fedora 16 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing hwloc-1.2.2-0.fc16'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/hwloc-1.2.2-0.fc16
then log in and leave karma (feedback).
Comment 10 Fedora Update System 2011-10-06 20:53:31 EDT
hwloc-1.2.2-1.fc15 has been submitted as an update for Fedora 15.
https://admin.fedoraproject.org/updates/hwloc-1.2.2-1.fc15
Comment 11 Fedora Update System 2011-10-06 21:03:30 EDT
hwloc-1.2.2-1.fc16 has been submitted as an update for Fedora 16.
https://admin.fedoraproject.org/updates/hwloc-1.2.2-1.fc16
Comment 12 Fedora Update System 2011-10-15 19:25:04 EDT
hwloc-1.3-0.fc15 has been submitted as an update for Fedora 15.
https://admin.fedoraproject.org/updates/hwloc-1.3-0.fc15
Comment 13 Fedora Update System 2011-11-14 19:26:37 EST
hwloc-1.3-1.fc16 has been submitted as an update for Fedora 16.
https://admin.fedoraproject.org/updates/hwloc-1.3-1.fc16
Comment 14 Fedora Update System 2011-11-24 21:16:01 EST
hwloc-1.3-0.fc15 has been pushed to the Fedora 15 stable repository.  If problems still persist, please make note of it in this bug report.
Comment 15 Fedora Update System 2011-12-12 16:55:31 EST
hwloc-1.3-1.fc16 has been pushed to the Fedora 16 stable repository.  If problems still persist, please make note of it in this bug report.
