Bug 379771 - [pata_serverworks] F8 install media won't boot on Dell 2500
Status: CLOSED NEXTRELEASE
Product: Fedora
Classification: Fedora
Component: kernel
Version: 8
Hardware: All
OS: Linux
Priority: low
Severity: medium
Assigned To: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
Reported: 2007-11-13 07:11 EST by Simon Andrews
Modified: 2008-05-19 10:54 EDT (History)
Doc Type: Bug Fix
Last Closed: 2008-05-19 10:54:44 EDT


Attachments
/var/log/messages output from 2.6.23.1-21 (3.34 KB, text/plain)
2007-11-14 07:20 EST, Simon Andrews
Output of modprobe pata_serverworks (756 bytes, text/plain)
2007-11-14 07:21 EST, Simon Andrews
Log of errors when reading CD under 2.6.23.8-63 (2.75 KB, text/plain)
2007-12-07 04:39 EST, Simon Andrews

Description Simon Andrews 2007-11-13 07:11:20 EST
Description of problem:
The F8 rescue CD won't boot on our Dell PowerEdge 2500.  This machine is
currently running F7 fine, but can't now be upgraded to F8.

Version-Release number of selected component (if applicable):
F8 release kernel.


How reproducible:
Always (on this hardware).

Steps to Reproduce:
1. Boot the rescue CD.
2. Select NFS install.
3. Start installation.

Actual results:
Installation hangs indefinitely.  Console shows a loop of ATA errors.


Additional info:
The errors produced include:

SQUASHFS error sb_bread failed reading block 0x1736
SQUASHFS error unable to read page, block 0x2266
Buffer I/O error on device sr0, logical block 9070
ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
ata1.00: cmd a0/01:00:00:00:00/00:00:00:00:00/a0 tag 0

Upon rebooting, the umount of /dev/loop throws an error.

I noticed that someone else with the same model of server reported the same
problem on the Ubuntu forums, but no progress has been made on that bug:

https://bugs.launchpad.net/dru/+bug/148466

That bug report has a better log capture than I do.  The errors and symptoms
look exactly the same.
Comment 1 Simon Andrews 2007-11-13 10:59:52 EST
If it helps, the smolt ID of this system is:

8b099aa1-0b9b-4ad5-b8ca-f8b50101085e
Comment 2 Simon Andrews 2007-11-14 07:19:05 EST
I've tracked this down some more.  This is definitely a kernel problem, probably
in the pata_serverworks module, and was introduced with kernel 2.6.23.

I tried to read the F8 CD in the current F7 install and it failed.  If I revert
back to 2.6.22.9-91 it works, but 2.6.23.1-21 fails.

The drive seems to be identified the same way under both kernels (modprobe
pata_serverworks output is the same), but under 2.6.23 I can't read any data
from it.  I'm assuming that the F8 install media uses the newer kernel and
therefore won't boot once it moves to using its own driver.
Comment 3 Simon Andrews 2007-11-14 07:20:04 EST
Created attachment 257961 [details]
/var/log/messages output from 2.6.23.1-21
Comment 4 Simon Andrews 2007-11-14 07:21:09 EST
Created attachment 257971 [details]
Output of modprobe pata_serverworks
Comment 5 Simon Andrews 2007-11-19 08:27:31 EST
I made a bit of progress on this.

I did a BIOS update on the server to the latest version (A07).

When I rebooted into F7 2.6.23.1-21 I could use the CD drive (which I couldn't
before).

However, when I tried to use the drive to do the upgrade to F8 using the rescue
CD it failed in the same way as before.

There would therefore appear to be something which changed between 2.6.23.1-21
and 2.6.23.1-42 which causes this drive to stop working.
Comment 6 Simon Andrews 2007-11-20 04:00:14 EST
I've found a workaround which allows this machine to boot from the F8 media.
Appending the kernel option:

libata.pata_dma=1

throws up a ton of "Unknown symbol: ata_[something]" errors, but does allow the
boot to proceed.

It seems therefore that this bug is a variant of BZ#242956, although it
manifests itself in a different way.

Given that Alan Cox said that 242956 was only to be used as a general tracker
bug, I'll leave this bug open so it doesn't get lost among the large number of
different issues in the more general bug.
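
For anyone trying the same workaround, it is worth confirming that the option
really reached the kernel command line. The following is only a minimal sketch
in Python: the helper name get_boot_option and the sample command line are made
up for illustration, and it does not interpret what the value 1 means to the
driver.

```python
def get_boot_option(cmdline, name):
    """Return the value of `name` from a kernel command line string.

    Returns "" for a bare flag without '=', and None when the option is
    absent. Hypothetical helper for illustration, not part of any tool.
    """
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key == name:
            return value if sep else ""
    return None


# Assumed sample command line; on a running system the real one can be
# read from /proc/cmdline and passed in instead.
sample = "ro quiet libata.pata_dma=1"
print(get_boot_option(sample, "libata.pata_dma"))  # prints 1
```

A result of None would mean the option never made it onto the command line.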
Comment 7 Chuck Ebbert 2007-11-20 15:27:43 EST
There is a fix for pata_serverworks in kernel 2.6.23.8-62:

36beb82390235236c60eb97ca526b1cad97e2df3
pata_serverworks: Fix problem with some drive combinations
Comment 8 Simon Andrews 2007-12-07 04:39:35 EST
Created attachment 280821 [details]
Log of errors when reading CD under 2.6.23.8-63

I've tried the CD drive under the newly released 2.6.23.8-63 kernel and it's
still broken despite the ATA patches.  I think the errors are the same as
before, but I've attached the log from inserting the F8 rescue CD into this
machine.  The CD is known good (it's the one I installed from once I used the
libata command line option).
Comment 9 Chuck Ebbert 2007-12-07 19:06:41 EST
MMCONFIG was disabled by default in kernel -42.

Try adding

  pci=mmconf

to the boot options...
Comment 10 Simon Andrews 2007-12-10 07:20:35 EST
(In reply to comment #9)
> Try adding
>   pci=mmconf 
> to the boot options...

I've tried that but with no success.  Same errors as in comment #8.

Comment 11 Simon Andrews 2008-02-05 05:08:54 EST
Since there have been a couple of kernel updates lately I tried this again and
got a different and if anything slightly more serious failure:

Feb  5 10:04:43 bilin1 kernel: hald[2007]: segfault at b7ea0000 eip 080571b6 esp bfacde10 error 4

This occurred when I inserted a CD into the drive.  HAL has been running fine
for ages before that.
Comment 12 Simon Andrews 2008-02-05 05:27:09 EST
Something odd is going on with this.  After the HAL crash in the last note I
restarted HAL and the drive was working again.  I mounted and ejected several
CDs in different formats and everything worked as normal.  A sample log entry was:

Feb  5 10:08:27 bilin1 kernel: UDF-fs: Partition marked readonly; forcing readonly mount
Feb  5 10:08:27 bilin1 kernel: UDF-fs INFO UDF 0.9.8.1 (2004/29/09) Mounting volume 'Roxio4', timestamp 2006/11/30 11:27 (1000)
Feb  5 10:08:27 bilin1 gnome-keyring-daemon[13094]: adding removable location: volume_label_Roxio4 at /media/Roxio4
Feb  5 10:08:27 bilin1 hald: mounted /dev/sr0 on behalf of uid 13779

I then restarted the server and inserted a CD, and am back to the errors like:

Feb  5 10:21:47 bilin1 kernel: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
Feb  5 10:21:47 bilin1 kernel: ata1.00: cmd a0/01:00:00:00:00/00:00:00:00:00/a0 tag 0 cdb 0x28 data 131072 in
Feb  5 10:21:47 bilin1 kernel:          res 40/00:02:00:0c:00/00:00:00:00:00/a0 Emask 0x4 (timeout)
Feb  5 10:21:47 bilin1 kernel: ata1: soft resetting port
Feb  5 10:21:47 bilin1 kernel: ata1.00: configured for UDMA/25
Feb  5 10:21:47 bilin1 kernel: ata1: EH complete
Feb  5 10:22:17 bilin1 kernel: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
Feb  5 10:22:17 bilin1 kernel: ata1.00: cmd a0/01:00:00:00:00/00:00:00:00:00/a0 tag 0 cdb 0x28 data 131072 in
Feb  5 10:22:17 bilin1 kernel:          res 40/00:02:00:0c:00/00:00:00:00:00/a0 Emask 0x4 (timeout)
Feb  5 10:22:17 bilin1 kernel: ata1: soft resetting port
Feb  5 10:22:18 bilin1 kernel: ata1.00: configured for UDMA/25
Feb  5 10:22:18 bilin1 kernel: ata1: EH complete
Feb  5 10:22:48 bilin1 kernel: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
Feb  5 10:22:48 bilin1 kernel: ata1.00: cmd a0/01:00:00:00:00/00:00:00:00:00/a0 tag 0 cdb 0x28 data 131072 in
Feb  5 10:22:48 bilin1 kernel:          res 40/00:02:00:0c:00/00:00:00:00:00/a0 Emask 0x4 (timeout)
Feb  5 10:22:48 bilin1 kernel: ata1: soft resetting port
Feb  5 10:22:48 bilin1 kernel: ata1.00: configured for UDMA/25
Feb  5 10:22:48 bilin1 kernel: ata1: EH complete
Feb  5 10:23:18 bilin1 kernel: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
Feb  5 10:23:18 bilin1 kernel: ata1.00: cmd a0/01:00:00:00:00/00:00:00:00:00/a0 tag 0 cdb 0x28 data 131072 in
Feb  5 10:23:18 bilin1 kernel:          res 40/00:02:00:0c:00/00:00:00:00:00/a0 Emask 0x4 (timeout)
Feb  5 10:23:18 bilin1 kernel: ata1: soft resetting port
Feb  5 10:23:19 bilin1 kernel: ata1.00: configured for UDMA/25
Feb  5 10:23:19 bilin1 kernel: sr 2:0:0:0: [sr0] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE,SUGGEST_OK
Feb  5 10:23:19 bilin1 kernel: sr 2:0:0:0: [sr0] Sense Key : Aborted Command [current] [descriptor]
Feb  5 10:23:19 bilin1 kernel: Descriptor sense data with sense descriptors (in hex):
Feb  5 10:23:19 bilin1 kernel:         72 0b 00 00 00 00 00 0e 09 0c 00 00 00 02 00 00
Feb  5 10:23:19 bilin1 kernel:         00 0c 00 00 a0 40
Feb  5 10:23:19 bilin1 kernel: sr 2:0:0:0: [sr0] Add. Sense: No additional sense information
Feb  5 10:23:19 bilin1 kernel: end_request: I/O error, dev sr0, sector 232
Feb  5 10:23:19 bilin1 kernel: printk: 30 messages suppressed.
Feb  5 10:23:19 bilin1 kernel: Buffer I/O error on device sr0, logical block 29
Feb  5 10:23:19 bilin1 kernel: Buffer I/O error on device sr0, logical block 30
Feb  5 10:23:19 bilin1 kernel: Buffer I/O error on device sr0, logical block 31
Feb  5 10:23:19 bilin1 kernel: Buffer I/O error on device sr0, logical block 32
Feb  5 10:23:19 bilin1 kernel: Buffer I/O error on device sr0, logical block 33
Feb  5 10:23:19 bilin1 kernel: Buffer I/O error on device sr0, logical block 34
Feb  5 10:23:19 bilin1 kernel: Buffer I/O error on device sr0, logical block 35
Feb  5 10:23:19 bilin1 kernel: Buffer I/O error on device sr0, logical block 36
Feb  5 10:23:19 bilin1 kernel: Buffer I/O error on device sr0, logical block 37
Feb  5 10:23:19 bilin1 kernel: Buffer I/O error on device sr0, logical block 38
Feb  5 10:23:19 bilin1 kernel: ata1: EH complete

I tried restarting HAL again but it made no difference.

Looking at my yum.log there have been no HAL or kernel changes since the machine
was last rebooted so I'm at a loss to explain what was different this time.
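
Going by the timestamps in the log in this comment, the exceptions recur about
every 30 seconds, each one ending in "Emask 0x4 (timeout)" followed by a soft
reset. A rough Python sketch for pulling those intervals out of
/var/log/messages is below. The function name exception_intervals is
hypothetical, the year is an assumption (syslog timestamps carry none), and the
month parsing assumes an English/C locale.

```python
import re
from datetime import datetime

# Matches the start of a libata exception line as it appears in syslog output.
EXCEPTION_RE = re.compile(
    r"^(\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) \S+ kernel: (ata\d+\.\d+): exception"
)


def exception_intervals(lines, year=2008):
    """Return the seconds between consecutive exceptions, per ATA device."""
    last_seen = {}
    intervals = {}
    for line in lines:
        m = EXCEPTION_RE.match(line)
        if not m:
            continue
        # Collapse the padded day ("Feb  5") before parsing the timestamp.
        stamp_text = " ".join(m.group(1).split())
        stamp = datetime.strptime(f"{year} {stamp_text}", "%Y %b %d %H:%M:%S")
        dev = m.group(2)
        if dev in last_seen:
            delta = int((stamp - last_seen[dev]).total_seconds())
            intervals.setdefault(dev, []).append(delta)
        last_seen[dev] = stamp
    return intervals


# The four exception lines from the log in this comment.
sample = [
    "Feb  5 10:21:47 bilin1 kernel: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0",
    "Feb  5 10:21:47 bilin1 kernel: ata1: soft resetting port",
    "Feb  5 10:22:17 bilin1 kernel: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0",
    "Feb  5 10:22:48 bilin1 kernel: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0",
    "Feb  5 10:23:18 bilin1 kernel: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0",
]
print(exception_intervals(sample))  # prints {'ata1.00': [30, 31, 30]}
```

A steady interval like this suggests each read command is running into the
same timeout rather than failing at random.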
Comment 13 Simon Andrews 2008-05-19 10:54:44 EDT
I've just upgraded this server to Fedora 9 and everything is working again.  The
install media boots without the need for additional kernel options and CDs mount
reliably in Gnome once the machine was upgraded.  I haven't checked back to see
if any of the updates to F8 made the drive work there as well, but since I'm not
going to be able to test an older version now I'll close the bug as resolved.
