Bug 106685

Summary: kernel prevents proper CD checking
Product: [Fedora] Fedora
Reporter: Alexandre Oliva <oliva>
Component: kernel
Assignee: Dave Jones <davej>
Status: CLOSED RAWHIDE
Severity: high
Priority: medium
Version: 4
CC: mharris, pfrields, robatino, stefan.hoelldampf
Hardware: i386
OS: Linux
Fixed In Version: 2.6.14-1.1632_FC5
Doc Type: Bug Fix
Last Closed: 2005-10-28 16:48:28 UTC
Bug Depends On: 109462
Bug Blocks: 100644

Description Alexandre Oliva 2003-10-09 16:04:11 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4.1) Gecko/20031003

Description of problem:
Burning rescuecd.iso with cdrecord I get:
Track 01: Total bytes read/written: 71106560/71106560 (34720 sectors).

So far so good.  The CD seems to have been burned correctly.  But when I attempt
to use checkisomd5 to check it, it fails.  So I burn it again, on a different
CD-RW media, and the results are the same.  I try a different CD-RW drive, no
difference.  Both machines were running Severn.

So I reboot one of the boxes into Shrike, re-burn the CD, and then the
verification passes.  Just to be sure, I go back, burn the CD on Severn again,
then check it on Shrike, and it passes.

Just to have even more assurance the problem is in the kernel, while booted on
Severn, I chroot to the Shrike root and run checkisomd5, and it fails.  Booted
on Shrike, I chroot to the Severn root, and it passes.

So the media and the recording are good and the check program behaves
consistently; the kernel affects the result of the verification, but not that of
the recording.  When verifying, while running the 2087 kernel, /var/log/messages gets:

Oct  9 12:39:14 free kernel:  I/O error: dev 0b:00, sector 138728

No such errors when the kernel is kernel-2.4.20-20.9.
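
For orientation, the numbers quoted above can be related to each other with a little arithmetic. This is only a sketch: it assumes the kernel message counts 512-byte units (the report does not say), and the two helper function names are made up for illustration. Under that assumption the failing read lands a few dozen sectors before the end of the track, rather than somewhere in the middle of the disc.

```shell
# A CD data sector is 2048 bytes, so the track length follows from the
# byte count that cdrecord printed.
bytes_to_cd_sectors() { echo $(( $1 / 2048 )); }

# ASSUMPTION: the "sector" in the kernel message is in 512-byte units,
# i.e. four of them per 2048-byte CD sector.
kernel_to_cd_sector() { echo $(( $1 / 4 )); }

track=$(bytes_to_cd_sectors 71106560)
failed=$(kernel_to_cd_sector 138728)

echo "track length:   $track sectors"
echo "failing sector: $failed ($(( track - failed )) sectors before the end)"
```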

Version-Release number of selected component (if applicable):
kernel-2.4.22-1.2087.nptl

How reproducible:
Always

Steps to Reproduce:
1. Burn a 71106560-byte rescuecd.iso onto 4x CD-RW media (I don't know
whether the exact size or the exact kind of media matters)
2. Run /usr/lib/anaconda-runtime/checkisomd5 /dev/cdrom && echo ok while running
the Shrike kernel
3. Ditto, while running the Severn kernel
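
Since the same check is repeated under two kernels, it may help to record which kernel produced which result. A minimal sketch, where `check_media` is a hypothetical helper name and the checker command is passed in as arguments:

```shell
# Run the given check command and print the running kernel release next
# to the outcome, so results from different boots can be compared later.
check_media() {
    if "$@"; then
        outcome=pass
    else
        outcome=fail
    fi
    printf '%s %s\n' "$(uname -r)" "$outcome"
}

# Example (needs a disc in the drive):
#   check_media /usr/lib/anaconda-runtime/checkisomd5 /dev/cdrom
```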


Actual Results:  Step 2 passes, step 3 fails

Expected Results:  Both should pass (assuming no errors in the recording :-)

Additional info:

CD recorders I've tried are an LG DVD/CD-RW 16x48x24x28 combo and an older LG CD-RW
32x4x8.  CD drives I used for the verification were both of these, plus an LG
52x CD unit, a laptop's CD drive and another laptop's DVD drive.  ide-scsi
emulation didn't make any difference as far as the verification went.  I didn't
try recording without ide-scsi emulation, but recording is working, so I didn't
bother.

Comment 2 Dave Jones 2003-10-09 16:46:46 UTC
Try booting the kernel without the hdc=scsi parameter before you do the check.
It passes fine here. (I remove that param on every system I have, I've no idea
why anaconda insists my IDE drives need it).


Comment 3 Alexandre Oliva 2003-10-09 17:27:51 UTC
I have tried both with and without ide-scsi, no difference.

Comment 4 Alexandre Oliva 2003-10-09 17:47:28 UTC
hdparm -d 0 /dev/cdrom seems to work around it for me.  However, DMA was enabled
by default, and it has always worked for me.  In fact, it does work 

Here are some of the drives that trigger this problem:

 Model=TOSHIBA DVD-ROM SD-C2302, FwRev=1315, SerialNo=
 Config={ Fixed Removeable DTR<=5Mbs DTR>10Mbs nonMagnetic }
 
 Model=TEAC CD-ROM CD-224E, FwRev=3.7C, SerialNo=
 Config={ Fixed Removeable DTR<=5Mbs DTR>10Mbs nonMagnetic }
 
 Model=LG CD-ROM CRD-8521B, FwRev=1.02, SerialNo=
 Config={ Fixed Removeable DTR<=5Mbs DTR>10Mbs nonMagnetic }
 
I can't get the exact info on the 2 CD burners because of ide-scsi (I'd rather
not reboot my desktop atm)

Tests failed on both athlon and i686 boxes, all of them UP.
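
The hdparm workaround mentioned in this comment could be wrapped so that DMA is only off while the check runs. This is a sketch, not a tested procedure: `with_dma_off` is a hypothetical helper, it needs root, and it assumes DMA was on beforehand (as it was by default here), since it switches DMA back on unconditionally.

```shell
# Disable DMA on a drive, run a command, then re-enable DMA.
# "hdparm -d 0" and "hdparm -d 1" switch the drive's using_dma flag off and on.
with_dma_off() {
    dma_dev=$1
    shift
    hdparm -d 0 "$dma_dev" || return 1
    "$@"
    rc=$?
    # ASSUMPTION: DMA was enabled before; restore it regardless of the result.
    hdparm -d 1 "$dma_dev"
    return "$rc"
}

# Example (needs real hardware):
#   with_dma_off /dev/cdrom /usr/lib/anaconda-runtime/checkisomd5 /dev/cdrom && echo ok
```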


Comment 5 Dave Jones 2003-10-11 12:47:07 UTC
Did you burn these with -pad ?


Comment 6 Alexandre Oliva 2003-10-11 21:26:29 UTC
I don't know whether gtoaster and xcdroast use -pad by default, but I certainly
didn't use it when burning the CD with cdrecord on the command line.

Comment 7 Dave Jones 2003-10-12 19:12:24 UTC
You should have.  See if you can repeat it with a CD burned this way.
I'll bet the problem goes away.

Comment 8 Alexandre Oliva 2003-10-13 15:09:57 UTC
The problem does indeed go away with -pad.  However, gtoaster doesn't use this
option, and xcdroast doesn't work with the current cdrecord, so people would
have to (i) know to use cdrecord and/or (ii) add the -pad option by hand.  The
CD recorded without -pad works perfectly well with Shrike's kernel, so this
looks like a regression to me.  At the very least, one that should be widely
advertised in release notes.

Comment 9 Andre Robatino 2003-10-15 23:10:51 UTC
  I started experiencing this problem with test2.  It didn't occur with the
mediacheck in RH 8.0, RH 9, or test1.  It still happens with test3.  In both
test2 and test3, using cdrecord without -pad (on a Sun Solaris machine with
Cdrecord 1.9) results in disks 1, 2, 3 failing, passing, and failing, resp.  I
reburned the test3 disks using cdrecord with -pad, and the result is pass, pass,
and fail, resp.  So it may be necessary not only to use -pad, but to use it with
some minimum amount.  In all cases I tested all disks with the dd command before
running the mediacheck and the md5 sums were correct.  Despite the bad
mediacheck, the disks work fine as far as I can tell.
  If this problem isn't fixed by the final release, it has the potential to piss
a lot of people off.  Imagine 3rd party CD vendors having thousands of Fedora
CDs returned because the mediacheck falsely reports that the disks are bad. 
Ideally it shouldn't be necessary to use -pad at all.  If for some strange
reason it is, then the necessary amount should be clearly documented, along with
the procedure for achieving this with common burning tools, since it doesn't
happen by default.
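
For readers unfamiliar with the dd check mentioned in this comment: the idea is to read back exactly as many 2048-byte sectors as the ISO image contains and compare checksums, so that any padding or run-out sectors after the image are ignored. A minimal sketch, assuming the image size is a whole number of 2048-byte sectors; the function names are made up for illustration.

```shell
# MD5 of an ISO image file.
iso_md5() { md5sum < "$1" | cut -d ' ' -f 1; }

# MD5 of the first N sectors of a device, where N is the length of the
# reference ISO in 2048-byte sectors.
# ASSUMPTION: the ISO size is an exact multiple of 2048 bytes.
disc_md5() {
    disc_dev=$1
    iso=$2
    bytes=$(wc -c < "$iso")
    sectors=$(( bytes / 2048 ))
    dd if="$disc_dev" bs=2048 count="$sectors" 2>/dev/null | md5sum | cut -d ' ' -f 1
}

# Example (needs a burned disc and the original image):
#   [ "$(disc_md5 /dev/cdrom rescuecd.iso)" = "$(iso_md5 rescuecd.iso)" ] && echo ok
```

Note that this reads through the same kernel and drive as the mediacheck does, so a match only shows that those sectors were readable at that moment.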

Comment 10 Alexandre Oliva 2003-10-18 20:42:07 UTC
I've just confirmed that severn-i386-disc3.iso from Severn test3 does indeed
fail to verify, even if cdrecord -pad is used for burning.

Comment 11 Michael K. Johnson 2003-10-27 16:47:58 UTC
It's not yet clear what is happening here, but we've ended up turning off DMA
for CDROMs in the installer and that is reported to work around the problem,
so we shouldn't have the mediacheck problem.

We're leaving this open as a bug, but I don't think that it is a blocker
because of this.

Comment 12 Andre Robatino 2003-11-08 08:48:20 UTC
  I burned the FC1 CDs without -pad, and verified each of them with
the dd trick.  When I then checked them with the mediacheck during
install, all three passed, so turning off DMA did the trick.  However,
when I use checkisomd5 in anaconda-runtime to check them after
installing, I get pass, fail, and fail, resp.

Comment 13 Dave Jones 2003-11-20 18:17:22 UTC

*** This bug has been marked as a duplicate of 109462 ***

Comment 14 Alexandre Oliva 2004-02-18 02:17:23 UTC
This problem is still present in FC2test1.  BTW, I don't see that this
bug is actually a dup of 109462.  The descriptions and symptoms are
*so* different!

Comment 15 Alexandre Oliva 2004-04-21 13:05:59 UTC
FWIW, this problem is still present in kernel-2.6.5-1.327.

Comment 16 Andre Robatino 2004-05-17 11:39:41 UTC
  This problem exists again in FC2.  I downloaded the ISOs for FC2
install discs 1-4 and the rescue CD, burned them, checked the
signature of the MD5SUM file, checked the MD5 sums of the CDs with dd,
everything OK.  When doing the mediacheck, disc 2 failed.  When I
started the installer with "linux cddma=off", all 5 discs passed. 
Please bump this bug up to FC2.
  P.S.  I don't know if this issue would affect the actual install - I
haven't done it yet.  I also didn't check the source CDs.

Comment 17 Andre Robatino 2004-05-17 19:29:38 UTC
  This may be relevant.  My father has a fully updated FC1 machine
(except for OpenOffice) which is severely affected by not being able
to read files correctly off the install CDs.  If he puts in one of the
FC1 install CDs and does

cd /mnt/cdrom/Fedora/RPMS
su
rpm --checksig * | less

roughly half of the files fail the signature check, and there seems to
be no correlation between failing the test vs. file size.  These CDs
passed the mediacheck when he installed and he experienced this
problem shortly after, so the CDs are almost certainly good.  If he
reboots with the kernel option "cddma=off", and tries the same thing,
the command

rpm --checksig *

fails with a segmentation fault (I was expecting all the files to
pass).  If it would be helpful, he could make a copy of some of the
corrupted files (without the kernel option) and email them.  One way
to debug this problem is to compare a corrupted CD image to the
correct one to study the corruption, and use a debugging kernel to
find out where the corruption happens.
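
The comparison suggested at the end of this comment could start with something like the following, which lists the 2048-byte sectors in which a copy read from the CD differs from the known-good image. It is only a sketch: `diff_sectors` is a hypothetical helper and both arguments are ordinary files (for example, an image dumped from the disc with dd).

```shell
# Print the zero-based index of every 2048-byte sector in which two files
# differ. "cmp -l" lists each differing byte with its 1-based offset.
diff_sectors() {
    cmp -l "$1" "$2" | awk '
        BEGIN { last = -1 }
        {
            sector = int(($1 - 1) / 2048)
            if (sector != last) { print sector; last = sector }
        }'
}

# Example:
#   diff_sectors good.iso read-back.iso | head
```

Whether the differing sectors cluster at particular offsets or are spread evenly would say something about where the corruption comes from.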

Comment 18 Dave Jones 2005-07-15 20:16:55 UTC
An update has been released for Fedora Core 3 (kernel-2.6.12-1.1372_FC3) which
may contain a fix for your problem.   Please update to this new kernel, and
report whether or not it fixes your problem.

If you have updated to Fedora Core 4 since this bug was opened, and the problem
still occurs with the latest updates for that release, please change the version
field of this bug to 'fc4'.

Thank you.

Comment 19 Alexandre Oliva 2005-07-23 15:10:38 UTC
I haven't observed this problem myself for quite some time, since I started
using padding consistently, but I still see people having trouble with
mediacheck on FC4 installs, so I'm moving this to FC4.

Comment 20 Andre Robatino 2005-07-23 15:24:10 UTC
  I had this problem with the mediacheck on one of my two machines.  The only
way I can do an install on that box is to use the boot option ide=nodma, and
remove it from grub.conf when done.  If I try to do the mediacheck with
ide=nodma, in both FC3 and FC4 the mediacheck hangs on one of the CDs, so I
can't verify it, but the install works anyway.
  My understanding is that this is due to a bug in the ide-scsi driver.  Is this
the sort of thing that could cause data corruption?  Normally people are advised
to only use ide=nodma during the install, and turn it off afterwards.

Comment 21 Dave Jones 2005-09-30 06:51:47 UTC
Mass update to all FC4 bugs:

An update has been released (2.6.13-1.1526_FC4) which rebases to a new upstream
kernel (2.6.13.2). As there were ~3500 changes upstream between this and the
previous kernel, it's possible your bug has been fixed already.

Please retest with this update, and update this bug if necessary.

Thanks.


Comment 22 Alexandre Oliva 2005-10-28 16:48:28 UTC
AFAICT, this bug is fixed.  Just tried it with rawhide kernels (today's and
yesterday's, just to be sure, on x86_64, i686 and athlon), and an ISO image with
the very same size as the rescuecd.iso image from the beginning of this report,
burnt without any (explicit?) padding whatsoever, passed mediacheck with DMA
enabled.  Yay!

Comment 23 Andre Robatino 2005-12-08 23:07:36 UTC
  In the following fedora-test-list message, Alan Cox seems to be unaware of any
such fix.  Can you check with him to see if there is in fact any evidence of
such a fix?  It's possible your discs passed just by luck.  Considering this bug
goes back at least to the first version of RH which included the mediacheck,
which was during the 2.4 kernel, I think that's more likely.

http://www.redhat.com/archives/fedora-test-list/2005-November/msg00596.html

Comment 24 Andre Robatino 2006-03-19 11:44:45 UTC
  This bug still exists in FC5, which I downloaded early from a mirror site
which was open (and I checked that the SHA1SUM file has a valid signature, and
that the ISOs agree with it).  Please reopen this and bump it up to FC5 (on
Monday, anyway).

[andre@localhost FC5]$ gpg --verify SHA1SUM
gpg: Signature made Wed 15 Mar 2006 12:38:22 AM EST using DSA key ID 4F2A6FD2
gpg: Good signature from "Fedora Project <fedora>"
gpg: WARNING: This key is not certified with a trusted signature!
gpg:          There is no indication that the signature belongs to the owner.
Primary key fingerprint: CAB4 4B99 6F27 744E 8612  7CDF B442 69D0 4F2A 6FD2

Comment 25 Mike A. Harris 2006-03-20 19:35:34 UTC
Since Alexandre Oliva filed this report originally, and has claimed that it
is fixed now for him, if you are experiencing a similar issue, it is best
to file a new bug report for it, unless Alexandre can also reproduce the
problem he initially reported here.



Comment 26 Andre Robatino 2006-03-20 23:47:51 UTC
  I have 2 different model machines and 3 different model CD drives.  Based on
some limited swapping of drives, it appears to me that whether the bug manifests
depends only on the CD drive and not which machine it's in, and also that there
are 3 categories of CD drives (I have one of each):
1) Never affected.
2) Affected unless one passes ide=nodma to the kernel.
3) Always affected.
  So a check should use the same CD drive known to have been affected before.
My 5 FC5 discs all pass in a category 1) drive.  In a category 2) drive only
some pass, unless I use ide=nodma, in which case they all pass (which ones pass
without ide=nodma can depend on exactly which peripherals are plugged into the
PC; it does NOT depend only on the disc).  In a category 3) drive they always
fail.  If I'm not mistaken, the fact that they all pass in a category 1) drive
is pretty conclusive proof that the discs themselves are good.