Bug 1284451

Summary: gcc on x86_64 has bug in __builtin_clz()
Product: Red Hat Enterprise Linux 6 Reporter: Håkon Bugge <Haakon.Bugge>
Component: gcc Assignee: Jakub Jelinek <jakub>
Status: CLOSED NOTABUG QA Contact: qe-baseos-tools-bugs
Severity: medium Docs Contact:
Priority: unspecified    
Version: 6.6 CC: mfranc, mpolacek
Target Milestone: rc   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2015-11-23 12:17:33 UTC Type: Bug

Description Håkon Bugge 2015-11-23 11:33:55 UTC
Description of problem:

I used __builtin_clz() to implement a roundup_pwr2() function. I get three different results when changing the optimization level from -O0 to -O2, and none of them is correct.

I also tried this on fc22: again three different results, none of them correct.

Across the six combinations (-O0 through -O2, on 6.6 and on fc22), I get five different results.

Running on SPARC64 with gcc 4.5.2, all three optimization levels return the correct result.

Version-Release number of selected component (if applicable):

On 6.6, gcc 4.4.7

On fc22, gcc 5.1.1

On SPARC64 (which works correctly), gcc 4.5.2. 

How reproducible:

Program:
-----------------------
#include <stdio.h>

unsigned roundup_pwr2(unsigned x) {
  int leading_zero = __builtin_clz((x ? x - 1 : 0));

  return 1UL << (32UL - leading_zero);
}

int main () {
  unsigned n;

  for (n = 0; n < 10; ++n) {
    printf("0x%08x %2d %2d 0x%08x\n", n, __builtin_clz(n), (n ? n - 1 : 0), roundup_pwr2(n));
  }

  return 0;
}
--------------------------

Steps to Reproduce:
1. Build and run at each optimization level (change the gcc version and architecture in the output file name as appropriate):

for O in 0 1 2; do gcc -Wall -Wextra -O$O  roundup_pwr2.c; ./a.out > gcc_4.4.7_x86_64_bug_O${O}.txt; done

2. Compare the checksums of the output files:

md5sum gcc*txt


Actual results:

gcc 4.4.7_x86_64:

0b973118faeb09f7a3af99b7df9b19e1  gcc_4.4.7_x86_64_bug_O0.txt
e8c2515621198862b49be891f95d1c26  gcc_4.4.7_x86_64_bug_O1.txt
4baff5266b4593189591133962913c72  gcc_4.4.7_x86_64_bug_O2.txt

gcc 5.1.1_x86_64 (fc22):

0b973118faeb09f7a3af99b7df9b19e1  gcc_5.1.1_x86_64_bug_O0.txt
158b9254067ef5ca35f870fb485ced66  gcc_5.1.1_x86_64_bug_O1.txt
111d0183e49181a3556c3907eb795851  gcc_5.1.1_x86_64_bug_O2.txt


Expected results:

gcc 4.5.2_SPARC64 (SunOS 5.11):

0ff9225a380dc978cd6aa30a02ef922f  gcc_4.5.2_SPARC64_bug_O0.txt
0ff9225a380dc978cd6aa30a02ef922f  gcc_4.5.2_SPARC64_bug_O1.txt
0ff9225a380dc978cd6aa30a02ef922f  gcc_4.5.2_SPARC64_bug_O2.txt

Additional info:

Comment 2 Jakub Jelinek 2015-11-23 11:43:25 UTC
 -- Built-in Function: int __builtin_clz (unsigned int x)
     Returns the number of leading 0-bits in X, starting at the most
     significant bit position.  If X is 0, the result is undefined.

So, if the only differences you are seeing are on __builtin_clz (0), then the problem is in the testcase.

Comment 3 Håkon Bugge 2015-11-23 12:16:41 UTC
Just realized I did not read the documentation properly. __builtin_clz() has undefined output when the input is zero. My bad.

Please close as not a bug.
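
For reference, a minimal sketch of how the test case could avoid the undefined call. In the program above, both n = 0 and n = 1 reach __builtin_clz(0), the first directly in main() and both through the (x ? x - 1 : 0) expression. The name roundup_pwr2_safe() is hypothetical, and two choices here are assumptions rather than anything stated in this report: inputs 0 and 1 both round up to 1, and an input larger than the highest representable power of two returns 0.

```c
#include <limits.h>

/* Hypothetical corrected variant of roundup_pwr2(): it never passes 0
 * to __builtin_clz(), whose result is undefined for a zero argument. */
unsigned roundup_pwr2_safe(unsigned x)
{
    const unsigned bits = sizeof(unsigned) * CHAR_BIT;
    unsigned leading_zero;

    /* Assumption: 0 and 1 both round up to 1. This branch also keeps
     * x - 1 from being 0 in the __builtin_clz() call below. */
    if (x <= 1)
        return 1;

    leading_zero = (unsigned)__builtin_clz(x - 1); /* x - 1 >= 1 here */

    /* Assumption: signal overflow with 0 when the next power of two
     * does not fit in an unsigned. This also avoids a shift by the
     * full type width, which would itself be undefined. */
    if (leading_zero == 0)
        return 0;

    return 1u << (bits - leading_zero);
}
```

With the guard in place, the output should no longer depend on the optimization level, since every __builtin_clz() argument is nonzero. The __builtin_clz(n) call in the printf() of main() would need the same guard for n = 0.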