Bug 80298

Summary: mke2fs can't format huge disk(over 1TB).
Product: [Retired] Red Hat Linux Reporter: Shinya Narahara <naraha_s>
Component: kernel Assignee: Arjan van de Ven <arjanv>
Status: CLOSED NOTABUG QA Contact: Brian Brock <bbrock>
Severity: medium Docs Contact:
Priority: medium    
Version: 7.2   
Target Milestone: ---   
Target Release: ---   
Hardware: ia64   
OS: Linux   
Last Closed: 2002-12-24 09:46:27 UTC Type: ---

Description Shinya Narahara 2002-12-24 08:38:37 UTC
From Bugzilla Helper:
User-Agent: Mozilla/4.75 [ja] (WinNT; U)

Description of problem:
mke2fs can't format a huge SCSI disk (over 1TB).
It can, however, format a SCSI disk whose capacity is below 1TB.
Judging from the mke2fs output, drivers/scsi/sd.c may be at fault.

We tested this with qla2200.o and a fibre-connected SCSI disk (1.6TB Logical Unit).


Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. Create a partition larger than 1TB.
2. Run mke2fs on it.


Actual Results:  mke2fs appears to complete normally, but dmesg shows many I/O errors.

 I/O error: dev 08:30, sector 3435330528
 I/O error: dev 08:30, sector 3435330528
 I/O error: dev 08:30, sector 3435330528
 I/O error: dev 08:30, sector 3435330528
 I/O error: dev 08:30, sector 3435330526
...

Expected Results:  mke2fs finishes normally, without I/O errors.


Additional info:

We suspect this is a bug in drivers/scsi/sd.c. The mke2fs log shows a
strange block count once the filesystem size crosses 1TB.

************ normal case ************************************
[fdisk for the partition less than 1TB]---------------------------
Disk /dev/sdd: 255 heads, 63 sectors, 213839 cylinders
Units = cylinders of 16065 * 512 bytes

   Device Boot    Start       End    Blocks   Id  System
/dev/sdd1             1    133674 1073736373+  83  Linux

[mke2fs for the above partition]---------------------------
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
134217728 inodes, 268434093 blocks   
13421704 blocks (5.00%) reserved for the super user
First data block=0
8192 block groups
32768 blocks per group, 32768 fragments per group
16384 inodes per group
Superblock backups stored on blocks: 
	32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208, 
	4096000, 7962624, 11239424, 20480000, 23887872, 71663616, 78675968, 
	102400000, 214990848

************ error case *************************************
[fdisk for the partition over 1TB]---------------------------
Disk /dev/sdd: 255 heads, 63 sectors, 213839 cylinders
Units = cylinders of 16065 * 512 bytes

   Device Boot    Start       End    Blocks   Id  System
/dev/sdd1             1    133675 1073744406   83  Linux

[mke2fs for the above partition]---------------------------
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
403439616 inodes, 805306368 blocks
40265350 blocks (5.00%) reserved for the super user
First data block=0
24576 block groups
32768 blocks per group, 32768 fragments per group
16416 inodes per group
Superblock backups stored on blocks: 
	32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208, 
	4096000, 7962624, 11239424, 20480000, 23887872, 71663616, 78675968, 
	102400000, 214990848, 512000000, 550731776, 644972544


We suspect drivers/scsi/sd.c has many bugs with large disks (over 1TB),
because it does many sector/capacity calculations in 31-bit (signed int)
arithmetic. On IA-64 a "long int" is 8 bytes, so values declared "int"
should be at least "long", and preferably "long long".

ex1.)
The initial message is wrong because of a bug in drivers/scsi/sd.c:
the capacity and the sector count are displayed as negative numbers.

SCSI device sdc: -859636736 512-byte hdwr sectors (-440133 MB)
 sdd: I/O error: dev 08:20, sector 3435330528
 I/O error: dev 08:20, sector 3435330528
 sdd1

drivers/scsi/sd.c: sd_init_onedisk():
  printk( "SCSI device %s: "
-         "%d %d-byte hdwr sectors (%d MB)\n",
+         "%d %d-byte hdwr sectors (%u MB)\n",
          nbuff, rscsi_disks[i].capacity,
          hard_sector, (sz/2 - sz/1250 + 974)/1950);

sz is declared as int above this printk(); it should be "long long",
or at least "unsigned int".
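
The negative numbers in the boot message quoted in ex1 can be reproduced in user space. This is only a sketch: the function names are hypothetical, and it assumes that sd.c computes sz as capacity * (sector size / 256), which the excerpt in this report does not show. Under that assumption, a LUN of 3435330560 sectors gives the logged values.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: a sector count as it looks once stored in a
 * signed 32-bit field. The out-of-range conversion is
 * implementation-defined in C; gcc and clang wrap modulo 2^32. */
int32_t capacity_as_int(uint32_t sectors)
{
    return (int32_t)sectors;
}

/* Hypothetical helper mirroring the MB expression in the printk().
 * ASSUMPTION: sz = capacity * (sector_size / 256); not in the excerpt. */
int32_t reported_mb(int32_t capacity, int32_t sector_size)
{
    assert(sector_size >= 256);
    /* Multiply in 64 bits, then narrow, so the sketch itself never
     * relies on signed overflow. */
    int32_t sz = (int32_t)((int64_t)capacity * (sector_size / 256));
    return (sz / 2 - sz / 1250 + 974) / 1950;
}

/* The same expression carried out in 64-bit arithmetic throughout. */
int64_t reported_mb_64(uint64_t sectors, int64_t sector_size)
{
    assert(sector_size >= 256);
    int64_t sz = (int64_t)sectors * (sector_size / 256);
    return (sz / 2 - sz / 1250 + 974) / 1950;
}
```

capacity_as_int(3435330560u) gives -859636736 and reported_mb(-859636736, 512) gives -440133, the two numbers in the log line, while reported_mb_64(3435330560, 512) gives 1758889.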

ex2.)
drivers/scsi/sd.c: sd_init_command():

int dev, devm, block, this_count;
block = SCpnt->request.sector;

However, SCpnt->request.sector is declared as "long" in include/linux/blkdev.h.


At 1TB, the sector number (512 bytes/sector) reaches 2^31 and the block number
(1024 bytes/block) reaches 2^30, so a value declared "int" easily overflows.
The specification "Documentation/filesystems/ext2.txt" says that the
filesystem size limit is 2TB, but we cannot reach it.
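
As a rough check of the arithmetic above, the following sketch computes the largest device size whose sector count fits in a given number of value bits, and models assigning a 64-bit sector number to an "int" as in ex2. The function names are hypothetical and the code is not taken from the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Largest size in bytes whose sector count still fits in value_bits
 * bits: 31 for a signed 32-bit int, 32 for an unsigned one. */
uint64_t max_bytes_for_bits(unsigned value_bits, uint64_t sector_size)
{
    assert(value_bits < 64);
    return ((UINT64_C(1) << value_bits) - 1) * sector_size;
}

/* Models "int block = SCpnt->request.sector;" on a 64-bit machine:
 * only the low 32 bits survive and are then read as signed.
 * The narrowing conversion is implementation-defined in C;
 * gcc and clang wrap modulo 2^32. */
int32_t sector_to_int(uint64_t sector)
{
    return (int32_t)(uint32_t)sector;
}
```

With 512-byte sectors, max_bytes_for_bits(31, 512) is 1099511627264 (just under 1TB) and max_bytes_for_bits(32, 512) is 2199023255040 (just under 2TB). sector_to_int(3435330528), the sector from the I/O errors, comes out as -859636768.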

Comment 1 Arjan van de Ven 2002-12-24 09:46:27 UTC
Red Hat Linux 7.2 explicitly does not support disks/block devices over 1TB.


Comment 2 Shinya Narahara 2002-12-24 12:00:04 UTC
Is there any Red Hat kernel that can handle a partition over 1TB?
The RH8.0 and RHAS2.1 kernels don't seem able to format such a partition...
How about kernel 2.5?


Comment 3 Arjan van de Ven 2002-12-24 12:08:12 UTC
Well, mke2fs is not part of the kernel.

The RHL 8.0 kernel can work with 2TB storage on certain systems (not with
software RAID or LVM, for one); only testing will tell.

Comment 4 Shinya Narahara 2002-12-25 00:29:46 UTC
Thank you for your advice; we'll test the RHL8.0 kernel.

We know mke2fs is not part of the kernel, but the problem is that
the kernel ioctl(dev,BLKGETSIZE,&size) returns the wrong size to
mke2fs for a SCSI disk over 1TB.
Anyway, we'll look through the kernel-2.4.18-18.8.0 source...

Thanks,


Comment 5 Arjan van de Ven 2002-12-25 10:34:12 UTC
There's a reason all newer apps use the BLKGETSIZE64 ioctl: THAT one does
support bigger disks...
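
For reference, a user-space program can query a block device's size with BLKGETSIZE64 roughly as follows. This is a minimal sketch, not code from mke2fs: the function names are hypothetical, and the fallback to BLKGETSIZE is an assumption about how an application might support older kernels.

```c
#include <fcntl.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/fs.h>   /* BLKGETSIZE, BLKGETSIZE64 */

/* Stores the device size in bytes and returns 0, or returns -1. */
int device_size_bytes(int fd, uint64_t *bytes)
{
    uint64_t size64;
    unsigned long sectors;

    /* BLKGETSIZE64 reports the size in bytes as a 64-bit value. */
    if (ioctl(fd, BLKGETSIZE64, &size64) == 0) {
        *bytes = size64;
        return 0;
    }
    /* BLKGETSIZE reports a count of 512-byte sectors in an unsigned long. */
    if (ioctl(fd, BLKGETSIZE, &sectors) == 0) {
        *bytes = (uint64_t)sectors * 512;
        return 0;
    }
    return -1;
}

/* Convenience wrapper: open the path read-only, query, close. */
int device_size_of_path(const char *path, uint64_t *bytes)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    int rc = device_size_bytes(fd, bytes);
    close(fd);
    return rc;
}
```

Calling device_size_of_path() on something that is not a block device fails with -1, because both ioctls are rejected.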

Comment 6 Shinya Narahara 2002-12-25 14:04:48 UTC
Unfortunately, it seems we can't use a huge disk over 1TB even on
kernel-2.4.18-18.8.0, although we have not tested it yet.
The BLKGETSIZE64 ioctl uses the same value as BLKGETSIZE,
so BLKGETSIZE64 returns a wrong value whenever BLKGETSIZE
does (that is, when the partition is over 1TB).

drivers/block/blkpg.c:
case BLKGETSIZE:
case BLKGETSIZE64:
    g = get_gendisk(dev);
    if (g)
        ullval = g->part[MINOR(dev)].nr_sects;
    if (cmd == BLKGETSIZE)
        return put_user((unsigned long)ullval, (unsigned long *)arg);
    else
        return put_user(ullval << 9, (u64 *)arg);
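
The excerpt above shows both ioctls reading the same nr_sects field. The user-space model below illustrates why BLKGETSIZE64 cannot correct a sector count that is already wrong. The names are hypothetical, and it is an assumption, not something shown in this report, that nr_sects is filled from a sign-extended 32-bit capacity.

```c
#include <stdint.h>

struct size_result {
    uint64_t blkgetsize;    /* as BLKGETSIZE: 512-byte sectors */
    uint64_t blkgetsize64;  /* as BLKGETSIZE64: bytes */
};

/* Hypothetical model of the two ioctl branches: both results are
 * derived from the one nr_sects value. */
struct size_result model_getsize(uint64_t nr_sects)
{
    struct size_result r;
    r.blkgetsize = nr_sects;
    r.blkgetsize64 = nr_sects << 9;   /* sectors * 512 */
    return r;
}

/* ASSUMPTION: nr_sects is populated from a signed 32-bit capacity on
 * a 64-bit machine, so a negative capacity is sign-extended. */
uint64_t nr_sects_from_int(int32_t capacity)
{
    return (uint64_t)(int64_t)capacity;
}
```

With the true sector count 3435330560, the model returns 1758889246720 bytes. With nr_sects_from_int(-859636736), both fields are wrong, since the 64-bit branch only shifts the same bad value.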