From Bugzilla Helper:
User-Agent: Mozilla/4.77 [en] (X11; U; Linux 2.4.2-2 i686; Nav)

Description of problem:
I have installed 3 copies of RedHat 7.1 in the last few months, and each time badblocks has reported 1 or 3 bad blocks just before the last block on the "/" partition (but on no others). It must be an end-condition bug, since it has happened on 3 different computers, 2 of which are EXACTLY the same except for the CPU.

Version-Release number of selected component (if applicable):

How reproducible:
Always

Steps to Reproduce:
1. Install RH 7.1 (/boot=50Mb, /=most of the rest of a large drive); in each case / is >= 4G
2. Log in after the install and run "badblocks -v /dev/hda?" for the "/" partition

Actual Results:
Results from 3 machines (output of "df" followed by the output of each "badblocks" run):

1)
**** Fri Oct 5 05:22:01 EST 2001
Filesystem           1k-blocks      Used Available Use% Mounted on
/dev/hda5             18034128  12774808   4343216  75% /
/dev/hda2                54447      3489     48147   7% /boot
/dev/hda1              1027856      6976   1020880   1% /dos
**** Fri Oct 5 05:35:06 EST 2001 badblocks /=/dev/hda5
Checking for bad blocks in read-only mode
From block 0 to 18322101
18322100
Pass completed, 1 bad blocks found.
**** Fri Oct 5 05:55:38 EST 2001 badblocks /boot=/dev/hda2
Checking for bad blocks in read-only mode
From block 0 to 56227
Pass completed, 0 bad blocks found.

2)
**** Fri Oct 5 05:12:00 EST 2001
Filesystem           1k-blocks      Used Available Use% Mounted on
/dev/hda5             18034128  13764700   3353324  81% /
/dev/hda2                54447      3489     48147   7% /boot
/dev/hda1              1027856      7088   1020768   1% /dos
**** Fri Oct 5 05:21:38 EST 2001 badblocks /=/dev/hda5
Checking for bad blocks in read-only mode
From block 0 to 18322101
18322100
Pass completed, 1 bad blocks found.
**** Fri Oct 5 05:39:31 EST 2001 badblocks /boot=/dev/hda2
Checking for bad blocks in read-only mode
From block 0 to 56227
Pass completed, 0 bad blocks found.

3)
**** Fri Oct 5 05:32:00 EST 2001
Filesystem           1k-blocks      Used Available Use% Mounted on
/dev/hda5              3897800   2583584   1116220  70% /
/dev/hda2                54447      3849     47787   8% /boot
/dev/hda1              2096160   2002656     93504  96% /dos
/dev/hdb1             20063360  17683392   2379968  89% /video
**** Fri Oct 5 05:36:30 EST 2001 badblocks /=/dev/hda5
Checking for bad blocks in read-only mode
From block 0 to 3959991
3959988
3959989
3959990
Pass completed, 3 bad blocks found.
**** Fri Oct 5 05:41:08 EST 2001 badblocks /boot=/dev/hda2
Checking for bad blocks in read-only mode
From block 0 to 56227
Pass completed, 0 bad blocks found.

Expected Results:
No bad blocks

Additional info:
As stated above, this appears to be an end-condition error in mke2fs rather than a hard drive problem, since it has happened on 3 completely separate computers with no other errors showing up. I run badblocks each day on all 3 computers and no other errors have shown up in the last 3 months on the first 2 computers. The third computer I installed yesterday, and its first run of badblocks gave the error shown (3 bad blocks rather than only 1, but in the exact same place). The coincidence is too high for these to be hard drive errors.
OK, just to lend more weight to the end-condition bug idea: I have just installed ANOTHER RedHat 7.1 machine, and AGAIN it has a few bad blocks at the end of the / partition:

**** Sun Oct 14 05:42:00 EST 2001
Filesystem           1k-blocks      Used Available Use% Mounted on
/dev/hdc5              3637796   2441168   1011836  71% /
/dev/hdc1                49838      3832     43433   9% /boot
/dev/hda1              4208784    261284   3947500   7% /dos
**** Sun Oct 14 05:53:14 EST 2001 badblocks /=/dev/hdc5
Checking for bad blocks in read-only mode
From block 0 to 3695863
3695860
3695861
3695862
Pass completed, 3 bad blocks found.
**** Sun Oct 14 06:02:45 EST 2001 badblocks /boot=/dev/hdc1
Checking for bad blocks in read-only mode
From block 0 to 51471
Pass completed, 0 bad blocks found.
Compaq Deskpro, Fuji 17G drive. If you need more info let me know. sarsenault.

[root@... /root]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/hda6             5.6G  1.1G  4.2G  20% /
/dev/hda7             1.8G   33M  1.7G   2% /backup
/dev/hda1              53M  9.1M   41M  18% /boot
/dev/hda5             7.7G  751M  6.5G  11% /home
none                  109M     0  108M   0% /dev/shm
[root@... /root]# badblocks -v /dev/hda1
Checking for bad blocks in read-only mode
From block 0 to 56196
Pass completed, 0 bad blocks found.
[root@... /root]# badblocks -v /dev/hda5
Checking for bad blocks in read-only mode
From block 0 to 8193118
8193116
8193117
Pass completed, 2 bad blocks found.
[root@... /root]# badblocks -v /dev/hda6
Checking for bad blocks in read-only mode
From block 0 to 5919921
5919920
Pass completed, 1 bad blocks found.
[root@... /root]# badblocks -v /dev/hda7
Checking for bad blocks in read-only mode
From block 0 to 1951866
1951864
1951865
Pass completed, 2 bad blocks found.

This is not reassuring, but given this bug report I don't believe the results are real hard drive errors. I sure hope someone comes up with an answer.
This is a kernel problem. I think a partial fix is in our current errata kernel; a really clean fix can only go into the development kernel 2.5.x.

cu,

Florian La Roche
Can you check if your partition is an odd number of sectors in size?
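A couple of ways to check the raw partition size, assuming the standard util-linux tools are installed (device name taken from the reports above):

sfdisk -l -uS /dev/hda    # list partition sizes in 512-byte sectors
cat /proc/partitions      # sizes in 1k blocks (sectors = blocks * 2)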
Which kernel are you using, exactly? This sounds to me as if there's a block size problem manifesting on filesystems using a 1k blocksize. If the buffered IO during the badblocks test happens to use a 4k blocksize by default, you'd get exactly these symptoms.
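A quick way to test that theory against the numbers already posted: if the kernel is doing 4k IO while badblocks counts 1k blocks, the number of phantom bad blocks should equal the partition's 1k-block count modulo 4. Simple arithmetic (expr is just one way to compute it):

$ expr 18322101 % 4      # machines 1 and 2
1
$ expr 3959991 % 4       # machine 3
3

That matches the 1 and 3 bad blocks reported above exactly.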
OK - the partitions were created with the standard 7.1 install, so I'd guess my current kernel is not going to explain anything, but anyway it is: 2.4.9-12 (from the redhat updates)

Having looked at this again, I see two actual problems:

1) The number of blocks on each partition (reported by df) is quite a bit smaller than the number used by badblocks to check the whole partition - is this expected behaviour? Should df report the true size of the partition, or does ext2 not use the whole partition and thus waste about 1% or 2% of it? Or is this extra space used for something else? (The actual gap for hda5 is computed after the /proc/partitions output below.)

2) Badblocks is checking past the size specified by df

Output of /proc/partitions (for machine 1 or 2 at top - they are the same):

major minor  #blocks  name     rio rmerge rsect ruse wio wmerge wsect wuse running use aveq

   3     0  19938240 hda 4414849 15753092 160234604 7482097 324656 2487693 22511544 5061971 -20 15050850 9646918
   3     1   1028128 hda1 294 0 304 210 0 0 0 0 0 210 210
   3     2     56227 hda2 10907 173918 369650 10170 5 8 32 510 0 9890 10680
   3     3         1 hda3 0 0 0 0 0 0 0 0 0 0 0
   3     5  18322101 hda5 4403538 15578289 159856690 7469557 324595 2486748 22503568 5057831 0 14774260 12797228
   3     6    530113 hda6 110 885 7960 2160 56 937 7944 3630 0 3250 5790
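For what it's worth, the gap in question 1 is easy to put a number on, taking hda5's /proc/partitions size and df's 1k-blocks figure from the outputs above:

$ expr 18322101 - 18034128
287973

i.e. roughly 1.6% of the partition is unaccounted for by df.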
"df" counts usable blocks, but there are reserved blocks over and above that count for the inode tables. So you would expect "df" to show smaller than the partition size. "tune2fs -l" will list the superblock on a device, and will tell you the true total block size that the filesystem has been created with. /dev/hda5 is exactly 18322101 blocks long. That's quite large, so hda5 is going to have a blocksize of 4k. You have hda5 mounted, so the kernel has already been forced into using that 4k blocksize for all IO access to that partition. You have not given "badblocks" a block size argument, so it has assumed the smallest possible, 1k, to give the greatest possible coverage on the device. Badblocks has then tried to access the 1k blocks beyond the last complete 4k block in the partition (because the partition has an odd size), and because the kernel is already using a 4k blocksize, it tries to pad the 1k read out to a complete 4k block and fails because that extends beyond the end of the device. Solution: use the "-b 4096" option to badblocks to tell it what the blocksize really is. This is not strictly a bug, more a restriction on the kernel's ability to deal with multiple blocksizes at once on a device. Please reopen this bug report if the "-b 4096" doesn't fix things for you.