733372 – three-dotted range using floats doesn't work properly on i386

RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "RHEL project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023. Bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September as per pre-agreed dates. Bugs against components "kernel", "kernel-rt", and "kpatch" are only migrated if still in "NEW" or "ASSIGNED". If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "RHEL project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.

Bug 733372 - three-dotted range using floats doesn't work properly on i386

Summary: three-dotted range using floats doesn't work properly on i386

Keywords:
Status:	CLOSED WONTFIX
Alias:	None
Product:	Red Hat Enterprise Linux 6
Classification:	Red Hat
Component:	ruby
Sub Component:
Version:	6.2
Hardware:	i686
OS:	Unspecified
Priority:	unspecified
Severity:	medium
Target Milestone:	rc
Target Release:	---
Assignee:	Vít Ondruch
QA Contact:	BaseOS QE Security Team
Docs Contact:
URL:
Whiteboard:
Duplicates (1):	984653 (view as bug list)
Depends On:
Blocks:
TreeView+	depends on / blocked

Reported:	2011-08-25 15:08 UTC by Aleš Mareček
Modified:	2016-07-26 15:02 UTC (History)
CC List:	2 users (show)
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2016-07-26 15:02:18 UTC
Target Upstream Version:
Embargoed:

Attachments	(Terms of Use)
C reproducer (753 bytes, text/plain) 2011-08-26 09:10 UTC, Vít Ondruch	no flags	Details
updated Vit's reproducer - more generic solution (2.77 KB, text/plain) 2011-08-26 13:14 UTC, Aleš Mareček	no flags	Details
Proposed fix for reproducer (642 bytes, patch) 2011-08-30 08:20 UTC, Jaroslav Škarvada	no flags	Details \| Diff
Show Obsolete (1) View All

Description Aleš Mareček 2011-08-25 15:08:17 UTC

Description of problem:
Three-dotted range should return all values, including start value, but not the last one.
Example 1: (1...5).step(1) should return 1, 2, 3, 4
Example 2: (1.0...9.4).step(1.2) should return 1.0, 2.2, 3.4, 4.6, 5.8, 7.0, 8.2
On i386 I got the last value in some cases, it doesn't happen on 64-bit architectures.

Version-Release number of selected component (if applicable):
ruby-1.8.7.352-3

How reproducible:
Always

Steps to Reproduce:
1.
A) ruby -e 'p (1.0...9.4).step(1.2).to_a'
B) ruby -e 'p (1.0...6.4).step(1.8).to_a'
C) ruby -e 'p (1.0...7.3).step(2.1).to_a'
D) ruby -e 'p (1.0...7.6).step(2.2).to_a'
E) ruby -e 'p (1.0...128.4).step(18.2).to_a'
F) ruby -e 'p (1.0...146.6).step(18.2).to_a'

  
Actual results:
A) [1.0, 2.2, 3.4, 4.6, 5.8, 7.0, 8.2, 9.4] - FAIL
B) [1.0, 2.8, 4.6, 6.4] - FAIL
C) [1.0, 3.1, 5.2]
D) [1.0, 3.2, 5.4]
E) [1.0, 19.2, 37.4, 55.6, 73.8, 92.0, 110.2, 128.4] - FAIL
F) [1.0, 19.2, 37.4, 55.6, 73.8, 92.0, 110.2, 128.4]

Expected results:
A) [1.0, 2.2, 3.4, 4.6, 5.8, 7.0, 8.2]
B) [1.0, 2.8, 4.6]
C) [1.0, 3.1, 5.2]
D) [1.0, 3.2, 5.4]
E) [1.0, 19.2, 37.4, 55.6, 73.8, 92.0, 110.2]
F) [1.0, 19.2, 37.4, 55.6, 73.8, 92.0, 110.2, 128.4]

Additional info:

Comment 1 Vít Ondruch 2011-08-26 09:10:01 UTC

Created attachment 520039 [details]
C reproducer

Hello, I have tried reproducer in C and the behavior is wrong on some systems. Even though the last condition should not be fulfilled, the n is incremented. This seems to be related to platform, not a Ruby fault. The  -ffloat-store fixes this issue however it is not clear what is the performance impact.

Comment 2 Aleš Mareček 2011-08-26 13:14:18 UTC

Created attachment 520087 [details]
updated Vit's reproducer - more generic solution

Comment 3 Jaroslav Škarvada 2011-08-28 22:15:22 UTC

AFAIK by default on 32 bit machines GCC uses FPU instructions for better compatibility with older machines, while on 64 bit machines it uses SSE instructions for better performance. 

The SSE offers more registers and consistency - the value always retain 64 bit, while FPU offers better precision - it uses 80 bit intermediate values when possible. And that's the source of inconsistency.

The core from your stripped down reproducer:

...
  double beg = 1.0;
  double unit = 1.2;
  double end = 9.4;

  printf("%d\n", 7 * unit + beg < end);
...

On 32 bit it returns 1, while on 64 bit it returns 0. Why? Analysis:

FPU instructions (on 32 bit):
7 * 1.199999999999999956 = 8.399999999999999689
8.399999999999999689 + 1 = 9.399999999999999689
9.399999999999999689 < 9.400000000000000355

SSE instructions (on 64 bit):
7 * 1.2 = 8.4000000000000004
8.4000000000000004 + 1 =  9.4000000000000004
9.4000000000000004 == 9.4000000000000004

Please note that the intermediate number 9.400000000000000355 can be rounded to 9.4000000000000004, but the comparison is done on FPU stack with the full precision. And that's the problem.

You can force usage of the FPU on 64 bit, by compiling with -mfpmath=387 and then the results will be the same on both arches. But please note the floating points are tricky and it shouldn't be relied on internal rounding as in the reproducer above.

Comment 4 Aleš Mareček 2011-08-30 07:38:26 UTC

Jaroslav, thank you for great explanation.
What would be your suggestion to "repair" this issue?
...There is an gcc option in test "float-store" which should not store those number into registers, which solves the problem, but is it safe in case of performance / other issues?
Ruby provides some tests that can be run like "make check" and it is the problem that it fails. Also some cases IRL, I don't know any, could create some functionality issues if we didn't fix it.
Should we recompile it with -ffloat-store for i386 machines OR delete the tests in testsuite and leave it alone OR anything else?
Vitek, your suggestions?
Thanks and Regards!

Comment 5 Jaroslav Škarvada 2011-08-30 08:20:43 UTC

Created attachment 520552 [details]
Proposed fix for reproducer

Something similar to the proposed fix, will force the comparison to be done outside the FPU stack after rounding the number. But there could be more problems in the original code (e.g. more FPU operations with bigger errors that neednt' be fixed by rounding), so the revisit of the original code would be good idea.

Comment 6 Jaroslav Škarvada 2011-08-30 08:40:36 UTC

(In reply to comment #5)
> Created attachment 520552 [details]
> Proposed fix for reproducer
> 
> Something similar to the proposed fix, will force the comparison to be done
> outside the FPU stack after rounding the number.

To be precise, the comparison is probably still done on the FPU stack but after the rounding and reloading the number. Please note that the presented hack will fix only small errors and that the high optimization compiler settings (-OX) could break the order (not tested).

Comment 7 Vít Ondruch 2011-08-30 08:54:42 UTC

Thank you Jaroslav. I have prepared patch for Ruby 1.9.3 [1] and asked upstream to comment, apply and backport it for 1.8.7 [2].

[1] http://redmine.ruby-lang.org/attachments/2039/0001-Fix-the-ronding-error-causing-wrong-evaluation-of-ra.patch
[2] http://redmine.ruby-lang.org/issues/4576

Comment 8 Jaroslav Škarvada 2011-09-26 12:16:46 UTC

(In reply to comment #4)
> ...There is an gcc option in test "float-store" which should not store those
> number into registers, which solves the problem, but is it safe in case of
> performance / other issues?

I ran fbench (http://www.fourmilab.ch/fbench/fbench.html) on RHEL-6.2@T500, the results follows (avgs of 10 runs):

arch/ver        x86_64 def      x86_64 fs       i386 def        i386 fs
Tavg [ms]       191.440         193.500         211.170         211.730
stddev [ms]       4.459           6.230           2.761           1.876

The performance seems to be the same (if stddev is taken into acccount).

Comment 10 Suzanne Logcher 2012-02-14 23:14:33 UTC

This request was evaluated by Red Hat Product Management for
inclusion in the current release of Red Hat Enterprise Linux.
Because the affected component is not scheduled to be updated
in the current release, Red Hat is unfortunately unable to
address this request at this time. Red Hat invites you to
ask your support representative to propose this request, if
appropriate and relevant, in the next release of Red Hat
Enterprise Linux. If you would like it considered as an
exception in the current release, please ask your support
representative.

Comment 11 Vít Ondruch 2013-07-16 07:26:44 UTC

*** Bug 984653 has been marked as a duplicate of this bug. ***

Comment 12 Vít Ondruch 2015-10-21 07:59:34 UTC

Hm, I never noticed that this was fixed/improved upstream:

https://bugs.ruby-lang.org/projects/ruby-trunk/repository/revisions/33285
https://bugs.ruby-lang.org/projects/ruby-trunk/repository/revisions/33407
https://bugs.ruby-lang.org/projects/ruby-trunk/repository/revisions/33811

But not sure it is worth of fixing at this point.

Comment 13 Vít Ondruch 2016-07-26 15:02:18 UTC

This issue was discovered during testing but it appears, that during the years, it did not caused any real issues to our users. Since the situation was improved in upstream and later Ruby releases, I think it's about the time to close this ticket, leaving Ruby in RHEL 6 without the fix.

Note You need to log in before you can comment on or make changes to this bug.