131254 – Only first 137GB of large disk recognized.

Summary: Only first 137GB of large disk recognized.

Keywords:
Status:	CLOSED RAWHIDE
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	kernel
Sub Component:
Version:	2
Hardware:	i586
OS:	Linux
Priority:	medium
Severity:	medium
Target Milestone:	---
Assignee:	Alan Cox
QA Contact:
Docs Contact:
URL:	http://groups.google.com/groups?q=137...
Whiteboard:
Depends On:
Blocks:

Reported:	2004-08-30 15:32 UTC by Derek Price
Modified:	2007-11-30 22:10 UTC
CC List:	1 user
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2004-10-15 15:54:00 UTC
Type:	---
Embargoed:
Dependent Products:

Attachments
Ported patch forward. (6.96 KB, patch) 2004-09-08 14:38 UTC, Derek Price

Description Derek Price 2004-08-30 15:32:02 UTC

From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7) Gecko/20040616

Description of problem:
With a large disk (>137GB) only the first 137GB is recognized by the
kernel.

I found this link which suggests that this is a problem with an old
IDE controller which does not support DMA transfer modes when LBA48 is
used:
<http://groups.google.com/groups?q=137GB+Linux+Bartlomiej+Maxtor&hl=en&lr=&ie=UTF-8&c2coff=1&selm=1xgvp-1pm-5%40gated-at.bofh.it&rnum=1>.
 This link also suggests a patch.
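
For context, the 137GB figure matches the ceiling of 28-bit LBA addressing. A minimal sketch of the arithmetic, assuming 512-byte sectors (the constant names are illustrative and do not come from the kernel source):

```python
# Where the 137GB ceiling comes from: 28-bit LBA with 512-byte sectors.
# Assumption: 512-byte sectors, the usual size for IDE disks of this era.
SECTOR_BYTES = 512
LBA28_SECTORS = 2 ** 28  # largest sector count addressable with 28 bits

lba28_bytes = LBA28_SECTORS * SECTOR_BYTES

print(LBA28_SECTORS)           # 268435456 sectors
print(lba28_bytes)             # 137438953472 bytes
print(lba28_bytes // 10 ** 9)  # 137 (decimal GB)
print(lba28_bytes // 2 ** 30)  # 128 (GiB)
```

A disk larger than this needs 48-bit addressing (LBA48) to be fully visible.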

Version-Release number of selected component (if applicable):
2.6.8-1.521

How reproducible:
Always

Steps to Reproduce:
1. Attach a new, large disk.
2. Boot.
3. Run fdisk to see only 137GB of disk available for partitioning.
    

Actual Results:  Saw only 137GB of disk.

Expected Results:  Should have seen entire disk.

Additional info:

Comment 1 Alan Cox 2004-08-30 15:40:40 UTC

Please supply hardware information for the problem setup. That way I
can tell if it's a bug or a feature.

Alan

Comment 2 Derek Price 2004-08-31 15:11:29 UTC

The machine is remote (an 8-hour drive away) at the moment, but I
pulled this out of /proc/ide:

$ head -1 /proc/ide/ide0/config
pci bus 00 device 78 vendor 10b9 device 5229 channel 0
$ head -1 /proc/ide/ide1/config
pci bus 00 device 78 vendor 10b9 device 5229 channel 1
$ cat /proc/ide/ali
 
                                Ali M15x3 Chipset.
                                ------------------
PCI Clock: 33.
CD_ROM FIFO:No , CD_ROM DMA:Yes
FIFO Status: contains 0 Words, runs.
 
-------------------primary channel-------------------secondary channel---------
 
channel status:       Off                               Off
both channels togth:  Yes                               Yes
Channel state:        OK                                OK
Add. Setup Timing:    8T                                8T
Command Act. Count:   8T                                8T
Command Rec. Count:   16T                               16T
 
----------------drive0-----------drive1------------drive0-----------drive1------

DMA enabled:      Yes              Yes               Yes              No
FIFO threshold:    8 Words          8 Words           4 Words          4 Words
FIFO mode:        FIFO On          FIFO On           FIFO Off         FIFO Off
Dt RW act. Cnt     3T               3T                3T               8T
Dt RW rec. Cnt     1T               1T                1T              16T
 
-----------------------------------UDMA Timings--------------------------------
 
UDMA:             OK               No                No               No
UDMA timings:     2.5T             2.5T              2.5T             1.5T
 
$ cat /proc/ide/hdb/model
WDC WD2500JB-00EVA0
$

I'm not sure what else to cat out for you.  I know that the last one is a new
Western Digital 250GB drive.  Let me know if there's anything else I
can do.
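
For readers who want to identify the controller from the config lines above, here is a rough sketch of a parser. It assumes every numeric field except the channel is hexadecimal and that the first "device" field is a PCI devfn value; neither is confirmed in this report. The function name `parse_ide_config` is hypothetical.

```python
def parse_ide_config(line):
    """Parse the first line of /proc/ide/ideN/config into a dict.

    Assumption: every numeric field except the channel is hexadecimal,
    and the first "device" field is a PCI devfn (slot and function).
    """
    words = line.split()
    # Expected layout: pci bus BB device DD vendor VVVV device IIII channel C
    devfn = int(words[4], 16)
    return {
        "bus": int(words[2], 16),
        "slot": devfn >> 3,       # standard PCI devfn encoding
        "function": devfn & 0x7,
        "vendor_id": int(words[6], 16),
        "device_id": int(words[8], 16),
        "channel": int(words[10]),
    }

info = parse_ide_config("pci bus 00 device 78 vendor 10b9 device 5229 channel 0")
print(info)
```

Vendor ID 0x10b9 is the one assigned to ALi, which is consistent with the "Ali M15x3 Chipset" banner in the /proc/ide/ali output.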

Comment 3 Derek Price 2004-08-31 15:16:56 UTC

Oh, the error message in /var/log/dmesg is:

...
hdb: max request size: 128KiB
hdb: cannot use LBA48 - full capacity 488397168 sectors (250059 MB)
hdb: 268435456 sectors (137438 MB) w/8192KiB Cache, CHS=16709/255/63, (U)DMA
 hdb: hdb1
...
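
The two sector counts in that log can be checked against each other. The sketch below extracts them and reproduces the megabyte figures, assuming 512-byte sectors and decimal megabytes rounded down (the helper name `sectors_to_mb` is illustrative):

```python
import re

# The two kernel messages quoted above, each on a single line.
DMESG = """\
hdb: cannot use LBA48 - full capacity 488397168 sectors (250059 MB)
hdb: 268435456 sectors (137438 MB) w/8192KiB Cache, CHS=16709/255/63, (U)DMA
"""

SECTOR_BYTES = 512  # assumption: 512-byte sectors


def sectors_to_mb(sectors):
    """Convert a sector count to whole decimal megabytes."""
    return sectors * SECTOR_BYTES // 10 ** 6


full = int(re.search(r"full capacity (\d+) sectors", DMESG).group(1))
used = int(re.search(r"^hdb: (\d+) sectors", DMESG, re.MULTILINE).group(1))

print(sectors_to_mb(full))         # 250059
print(sectors_to_mb(used))         # 137438
print(used == 2 ** 28)             # True: clamped at the 28-bit LBA limit
print(sectors_to_mb(full - used))  # 112620 MB left unreachable
```

The usable count is exactly 2^28 sectors, so the kernel has fallen back to 28-bit addressing for this drive.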

Comment 4 Alan Cox 2004-08-31 15:55:26 UTC

Confirmed - more 2.4-ac code that never made it into 2.6, sigh. Will queue for
Bartlomiej.

Comment 5 Derek Price 2004-09-08 14:38:14 UTC

Created attachment 103586
Ported patch forward.

I've ported forward Bartlomiej's patch from the link above and attached it.

I've confirmed that the attached patch fixes the problem, at least as far as
fdisk is concerned.  After applying this patch to Red Hat's 2.6.8-1.521 kernel
SRPM, rebuilding the RPM, installing, and rebooting, fdisk reports seeing a
250GB disk.

I expect this is a separate issue but, after this patch, mke2fs still has a
problem "automagically [figuring] the file system size" (quote from the man
page, not any error message).  It seems to get stuck at the same 137GB maximum
the original bug hit.  I worked around this by requesting a specific number of
blocks from mke2fs based on fdisk's report of the disk size and my requested
block size.  With this information, mke2fs did its job correctly and I
subsequently managed to mount the usable 222GB of the disk.
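
The workaround above amounts to one division: partition size in bytes over the file system block size. A rough sketch follows, assuming 512-byte sectors and a 4096-byte block size. The comment does not state the block size or partition size actually used, so the inputs here are illustrative, and `mke2fs_block_count` is a hypothetical helper.

```python
SECTOR_BYTES = 512  # assumption: 512-byte sectors


def mke2fs_block_count(partition_sectors, block_size=4096):
    """Number of whole file system blocks that fit in the partition."""
    return partition_sectors * SECTOR_BYTES // block_size


# Illustrative input: the full-capacity sector count from the dmesg output.
# A real partition is slightly smaller; use the sector count fdisk reports.
blocks = mke2fs_block_count(488397168)
print(blocks)  # 61049646
print(f"mke2fs -b 4096 /dev/hdb1 {blocks}")
```

The printed command passes the block count as the optional trailing argument to mke2fs, so it does not have to work out the size itself.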

Comment 6 Alan Cox 2004-09-08 14:52:25 UTC

I'd guess Bartlomiej's changes don't fix up the capacity mangling.
I've got the original RH patch in my tree at the moment, although Bart's
has some performance improvements for the earlier disk segments.
You might want to poke Bart, tell him his patch mostly works, and ask
him what the merge plan is.

Comment 7 Derek Price 2004-09-08 19:25:29 UTC

Sent an email to Bart, cross-posted to linux.kernel.

Comment 8 Derek Price 2004-09-24 17:05:05 UTC

Never heard from Bart.

Worth noting that the 250GB disk in question crashed using the kernel
with my patch applied.  I'm not certain this was related, but it is my
first suspicion.

I used the disk without errors for a day or two, then rebooted.  The
computer stopped booting, reporting bad superblocks on said disk.  I
tried to fsck using the backup superblocks with no luck.

Reformatting appears to have repaired the damage, but my remote access
is still broken, so it will most likely be at least a month before I
can retrieve any further information from the broken machine.

Comment 9 Alan Cox 2004-10-01 22:16:39 UTC

Bart may not have replied, but he has propagated LBA48 fixes for PIO
only into 2.6.9-rc3, so I guess he was listening.

Comment 10 Derek Price 2004-12-29 17:43:32 UTC

Bart's fix appeared to work for some weeks, then the disk crashed
hard.  It looks like a hardware failure still - the disk is no longer
even visible from the BIOS.  I hope this is not related, but I thought
I'd note it.  If it happens again after I replace the disk I'll be
back.  :)
