Bug 437701

Summary: (gcc-4.3) memtest86+ fails during test 2
Product: [Fedora] Fedora
Reporter: Jimmy Rentz <jb17bsome>
Component: memtest86+
Assignee: Warren Togami <wtogami>
Status: CLOSED RAWHIDE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Priority: low
Version: rawhide
CC: covex, cra, michal, pekane52, petrosyan, robatino, wtogami, wwoods
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2008-04-05 19:57:24 UTC
Bug Blocks: 235706    

Description Jimmy Rentz 2008-03-16 18:12:37 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9b5pre) Gecko/2008031220 Fedora/3.0-0.40.cvs20080312.fc9 Minefield/3.0b5pre

Description of problem:
Memtest fails during test 2 (Moving Inversions Ones And Zeros) after 21% is complete.


Version-Release number of selected component (if applicable):
memtest86+-2.01-2

How reproducible:
Always


Steps to Reproduce:
1. Run memtest

Actual Results:
It fails during test 2 after 21% is complete.  On top of that, you end up getting an "Unexpected Interrupt - Halting" message after a bit.

Expected Results:
It should have passed with no errors.

Additional info:
I have seen this issue on two separate AMD x86_64 machines (with both the 64-bit and 32-bit versions).

I tried the memtest86+-1.70-4 version from Fedora 8 and that works fine on both systems.

Comment 1 Michal Jaegermann 2008-03-30 22:10:56 UTC
The above was also true for the memtest included on the 20080329 images.
On my hardware, the memtest binaries from the x86_64 and i386 images
(no idea what the real difference is, but the files are definitely not the
same) produce around 70 thousand errors right away and terminate with
an "Unexpected Interrupt  Type: Divide".

Moreover, if replaced with binaries from www.memtest.org (an unpacked
http://www.memtest.org/download/2.01/memtest86+-2.01.bin.gz
will do just fine) memtest-2.01 runs without any issues.  These
executables are still different.  Not the same compiler?

Comment 2 Pat Kane 2008-04-01 13:23:14 UTC
I just tried memtest (v2.01) from the F9 beta DVD; it fails within the first
20 seconds on every machine that I have tried it on.  I grabbed memtest v2.01
from the web, burned a CD, and it runs without problems.



Comment 3 Warren Togami 2008-04-01 15:43:09 UTC
Sounds like the gcc-4.3 might be miscompiling.

http://people.redhat.com/wtogami/temp/memtest86+-2.01-2.fc8.i386.rpm
Could you please test this one, which is the same version, built using F8's
compiler?

Comment 4 Michal Jaegermann 2008-04-01 16:33:54 UTC
> http://people.redhat.com/wtogami/temp/memtest86+-2.01-2.fc8.i386.rpm

I let this one run through one full pass on the same hardware
as previously used for comment #1.  No issues were reported.

That memtest binary is still different from any other one I collected
so far. :-)

Comment 5 Pat Kane 2008-04-01 18:50:23 UTC
That version (memtest86+-2.01-2) works fine, thanks.


Comment 6 Jimmy Rentz 2008-04-02 03:17:08 UTC
(In reply to comment #3)
> Sounds like the gcc-4.3 might be miscompiling.
> 
> http://people.redhat.com/wtogami/temp/memtest86+-2.01-2.fc8.i386.rpm
> Could you please test this one, which is the same version, built using F8's
> compiler?

Yes, this version works fine.  No errors through a full test.



Comment 7 Warren Togami 2008-04-02 04:50:17 UTC
Reassigning to gcc.  If they are unable to fix this before Fedora 9 then we'll
use one of the alternative compilers to build this for release.


Comment 8 Jakub Jelinek 2008-04-02 08:27:38 UTC
That's premature.  It is far more likely a bug in the package than in GCC,
so before reassigning to GCC you need to prove the bug is in GCC rather than the
package, or at least cut it down to a small self-contained testcase.
Try building it with different options (e.g. replace -O2 with -O2
-fno-strict-aliasing, or with -O0).  If one of these works, do a binary search
between the -O2 and other-option .o files to see which compilation unit is
problematic, then try to narrow it down to a particular routine, gather what
arguments it is being called with, and create a self-contained testcase where
main prepares whatever is necessary, calls that routine with the right
arguments, and stubs any functions the problematic function calls.
The fact that it happens to work with gcc 4.1 and doesn't with 4.3 is not
proof that there is a compiler bug; more probably it is a package bug which just
didn't show up with the older gcc (think of e.g. an aliasing violation, use of
uninitialized memory, etc.).
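
The binary search over .o files described above could be sketched roughly as
follows.  This is only an illustration, not the procedure anyone in this bug
actually ran: the function name find_bad_unit, the callback fails_with_o2, and
the object file names are all hypothetical, and the sketch assumes that exactly
one compilation unit is responsible for the failure.

```python
from typing import Callable, Sequence, Set


def find_bad_unit(units: Sequence[str],
                  fails_with_o2: Callable[[Set[str]], bool]) -> str:
    """Return the single unit whose -O2 object makes the binary fail.

    fails_with_o2(subset) is a hypothetical callback supplied by the caller.
    It should link a binary that uses the -O2 objects for `subset` and the
    known-good objects (e.g. built with -O0) for everything else, run it,
    and return True if the failure reproduces.
    """
    candidates = list(units)
    # Sanity check: with every unit built at -O2 the failure must show up.
    if not fails_with_o2(set(candidates)):
        raise ValueError("failure does not reproduce with all units at -O2")
    while len(candidates) > 1:
        middle = len(candidates) // 2
        first_half = candidates[:middle]
        # Assumption: one culprit only, so it is in exactly one of the halves.
        if fails_with_o2(set(first_half)):
            candidates = first_half
        else:
            candidates = candidates[middle:]
    return candidates[0]


# Example with a stand-in callback that pretends "d.o" is the culprit.
# In real use the callback would rebuild and boot the binary (e.g. in qemu).
example_units = ["a.o", "b.o", "c.o", "d.o", "e.o"]
culprit = find_bad_unit(example_units, lambda subset: "d.o" in subset)
```

Each step halves the candidate list, so even a package with a few dozen object
files needs only a handful of rebuild-and-boot cycles.  If more than one unit
is involved the single-culprit assumption breaks, and the result should be
double-checked by building only the reported unit with -O2.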

Comment 9 Pat Kane 2008-04-02 23:07:07 UTC
I can reproduce the problem using qemu, which makes testing the
problem _much_ easier.  Because of an apparent floppy emulation bug
in qemu I had to use the cdrom iso image:

    qemu -cdrom mt201.iso

Let me know if you need help finding the bug.

Pat
---




Comment 10 Jesse Keating 2008-04-03 19:20:58 UTC
Warren, are you going to get to the things Jakub suggested before the final
freeze?  It would be nice to have a working memtest and avoid the bugspam for F9...

Comment 11 Warren Togami 2008-04-03 21:20:56 UTC
http://koji.fedoraproject.org/packages/memtest86+/2.01/3.fc9/
Please test this build.


Comment 12 Pat Kane 2008-04-03 23:15:24 UTC
Works okay.


Comment 13 Jimmy Rentz 2008-04-03 23:45:09 UTC
(In reply to comment #11)
> http://koji.fedoraproject.org/packages/memtest86+/2.01/3.fc9/
> Please test this build.
> 
Works fine.


Comment 14 Michal Jaegermann 2008-04-05 18:36:01 UTC
memtest86+-2.01-3.fc9 binaries, both i386 and x86_64, are working AFAICT.

Comment 15 Pat Kane 2008-04-05 22:57:08 UTC
Works okay with the Rawhide 20080404 snapshot.  Thank you.


Comment 16 Charles R. Anderson 2008-04-06 16:00:01 UTC
I noticed on my ThinkPad T61 that I was getting tons of errors from the memtest.
 Eventually, memtest walked all over some video memory or perhaps other mmio
regions and corrupted the display, hung the computer, etc.  I'll give the above
build a try.