Description of problem: When attempting to upgrade an existing 7.0 system to 7.1, the installer gets as far as the questions before auto or manual partitioning and starts complaining that 2 out of 3 drives have partitions that do not begin or end on cylinder boundaries. It suggests passing the HD parameters on the LILO command line, but that does not seem to work.

How reproducible: Always

Steps to Reproduce:
1. Boot Linux 7.1 CD1
2. Run install as text

Actual Results: Data corruption on 2 out of 3 hard drives

Expected Results: Completed install/upgrade of the existing 7.0 machine

Additional info: During install, it suggests passing the HD parameters on the LILO command line, but that does not seem to work. If I delete all partitions using fdisk through the setup program (and also in VC2) and repartition manually, it seems to be OK, but if I go back into fdisk on the drive I just finished with, the partition table is all messed up. When upgrading from a fresh 7.0 install, the system can read all 3 drives but then fails after formatting the volumes, when it tries to mount any partition that it formatted. The 2 drives affected by this problem are hda and hde; hdb seems not to be affected.

System: Cyrix 133 with 48 MB of RAM, FIC PA-2002 motherboard, Linksys LNE100TX v4.0 Ethernet controller using tulip drivers.
Hard drives:
/dev/hda Maxtor 85250D6
/dev/hdb JTS Corp. CHAMP Model C1300-2AF
/dev/hde WDC WD600AB-32BVA0
hda and hdb are installed on the internal IDE controller; hde is installed on a CMD6xx ATA 100 controller.
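For context on the installer's complaint: the "does not begin or end on a cylinder boundary" check is just arithmetic on whatever geometry the kernel reports, which is why a geometry change between releases can make previously valid partitions suddenly look misaligned. A minimal sketch of that check, assuming the classic layout where a cylinder holds heads * sectors-per-track sectors and track 0 is reserved for the MBR (the function name, geometry, and sector numbers are illustrative, not taken from this machine):

```python
# Sketch: is a partition cylinder-aligned under a given CHS geometry?
# (Hypothetical helper; real fdisk does essentially this arithmetic.)

def on_cylinder_boundaries(start_sector, end_sector, heads, sectors_per_track):
    """A partition is cylinder-aligned in the classic fdisk sense when it
    starts on the first sector of a cylinder (or, for the first partition,
    right after the reserved MBR track) and ends on the last sector of a
    cylinder."""
    sectors_per_cylinder = heads * sectors_per_track
    starts_ok = (start_sector % sectors_per_cylinder == 0
                 or start_sector == sectors_per_track)  # track 0 holds the MBR
    ends_ok = (end_sector + 1) % sectors_per_cylinder == 0
    return starts_ok and ends_ok

# Geometry 255 heads, 63 sectors/track -> 16065 sectors per cylinder.
print(on_cylinder_boundaries(63, 16064, 255, 63))      # True: whole cylinder 0
print(on_cylinder_boundaries(16065, 32129, 255, 63))   # True: whole cylinder 1
print(on_cylinder_boundaries(16065, 32000, 255, 63))   # False: ends mid-cylinder
```

The same absolute sectors that pass this check under one geometry can fail it under another, which matches the "geometry changed between betas" theory in the comments below.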
Try booting with 'linux noprobe'. Does this help?
No. I tried hda=noprobe hde=noprobe as well as 'linux noprobe', and neither seems to work right. If I use hda and hde noprobe, it starts complaining that I do not have enough space or inodes on my / partition for all the packages to be upgraded. If I use 'linux noprobe' on a clean install over RH 7.0, then it complains about the partitions on hda and hdb. Is this something caused by the installer? I have had the 2.4.4 kernel on here without a problem. I was just hoping to get to a regular distro so I didn't have to compile the kernel to get my NIC working or the like. Greg
Can you look on VC3 and VC4 and see if there are any kernel error messages about reading the drives?
Took a look, and the only thing I saw was a mention on VC4 that there were too many inodes. I saved the exact message to a drive to retrieve on reboot, but alas, the system is corrupted again. This time, when it reboots into RH 7.0 with the 2.4.4 kernel and attempts to mount the filesystems, it states '/home: Corruption found in superblock (inodes_per_group = 1850786)'. This is one of the many filesystems that fail e2fsck, so it drops me to maintenance mode. If I try to run e2fsck on the drive manually, it reports 'group descriptors look bad... trying backup blocks. Bad magic number in superblock while trying to open /dev/hda5'. Could this be related to the large drives? Extended partitions? Also, I noticed in one of the VC screens that it insmod'ed what appeared to be an ext3 driver, as for a new filesystem. Could this be causing the problem? Any help would be appreciated, because it gets old having to reload RH 7.0 from scratch. I am attaching a file with some 7.0 drive information from before the upgrade occurred.
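On the 'trying backup blocks' message: for an ext2 filesystem with a 1 KB block size, the first backup superblock normally sits at block 8193, which is why 'e2fsck -b 8193 /dev/hda5' is the usual next step when the primary superblock is bad. A minimal sketch of where those backups land, assuming the default blocks-per-group of 8 * block size (one block-sized bitmap per group) and no sparse_super feature (the helper function is hypothetical, not part of e2fsprogs):

```python
# Sketch: block numbers of ext2 backup superblocks, assuming default
# blocks_per_group = 8 * block_size and a copy at the start of every group.

def backup_superblocks(block_size, group_count):
    blocks_per_group = 8 * block_size  # one bit per block in a one-block bitmap
    # With 1 KB blocks the superblock lives in block 1 (offset 1024 bytes);
    # with larger blocks it shares block 0.
    first_data_block = 1 if block_size == 1024 else 0
    return [first_data_block + g * blocks_per_group for g in range(1, group_count)]

print(backup_superblocks(1024, 4))  # [8193, 16385, 24577]
print(backup_superblocks(4096, 3))  # [32768, 65536]
```

So if /dev/hda5 was made with 1 KB blocks, 'e2fsck -b 8193 /dev/hda5' would try the first backup copy; 'e2fsck -b 32768' is the equivalent for 4 KB blocks.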
Created attachment 20393 [details] Disk layout of machine in RH 7 prior to upgrade
Well, we saw a problem very similar to this in our beta cycles. The way the kernel handles disk geometry changed from one beta to another, and we were seeing this problem with people who had previously used one of the 7.1 betas. We have since seen this problem with a few people who never used any of the 7.1 betas, and I'm not sure why it's happening. I think that if you used the 7.1 installer to create the partitions, things would work. I know that may not be an option if you have data on the drive that you haven't backed up.
Tried using the 7.1 installer's version of fdisk. I deleted all the partitions on hda and recreated them per the previous attachment. I had gone on to work on hdb and realized I had forgotten to set a partition to the swap type, so I re-entered hda in fdisk and got the following messages:

Warning: ignoring extra data in partition table 5
Warning: ignoring extra data in partition table 5
Warning: ignoring extra data in partition table 5
Warning: invalid flag 0x2020 of partition table 5 will be corrected by w(rite)

After entering w to write the partition table, I re-entered the hda drive and found the same errors but without the last line. The partition table is all mangled, with the first entry having an id of bf and an indication that partition 1 has different physical/logical beginnings (non-Linux?): phys=(0,1,63) logical=(0,1,1). Partitions 2 and 5 are messed up similarly. Is there any other way you would suggest to 'use the 7.1 installer' to partition the drives? The data that was on these drives was long gone after the first install; thank goodness I can hopefully recover most of it. I would like to settle this issue if possible so no one else suffers the same fate, but I would also like to get the system back into a running state. Thanks in advance, Greg
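A note on the 'different physical/logical beginnings' warning: fdisk prints it when the C/H/S triple stored in the partition table does not match the triple it recomputes from the geometry it currently assumes for the disk, so a geometry change alone can trigger it even on an intact table. A short sketch of that recomputation (the helper is mine, not fdisk's code, and the geometries are illustrative):

```python
# Sketch: the same absolute (LBA) sector maps to different C/H/S triples
# under different assumed geometries, which is how a geometry change makes
# fdisk report mismatched "physical" vs "logical" beginnings.

def lba_to_chs(lba, heads, sectors_per_track):
    cyl, rem = divmod(lba, heads * sectors_per_track)
    head, sec = divmod(rem, sectors_per_track)
    return (cyl, head, sec + 1)  # CHS sector numbers start at 1

print(lba_to_chs(63, 255, 63))     # (0, 1, 1): classic first-partition start
print(lba_to_chs(16065, 255, 63))  # (1, 0, 1) under a 255-head translation
print(lba_to_chs(16065, 16, 63))   # (15, 15, 1) under a 16-head geometry
```

If the table was written under one translation and read back under another, every partition start recomputes to a different triple, matching the mismatches reported above.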
I don't know what the problem is. It sounds like there could be hardware problems with the drive. From the errors that you are seeing in fdisk, it looks like there could be bad sectors on the drive. If the disk can't be partitioned properly, the installer doesn't have much hope of working.
But if the partitions can be formatted and installed to from RH 7.0, then why is the 7.1 installer trashing the drives? The partitions were valid when I first entered them, and it was only after I modified them with the fdisk used by the 7.1 installer that the data was erased. Why do the drives work fine with 7.0, yet if I install 7.1 they are destroyed? hda and hdb are older drives, but hde, the 60 GB, is brand new. And it works fine in 7.0 as long as I recompile the kernel to include CMD640 controller support. Everything even works fine under the 2.4.4 kernel... Any more suggestions? Greg
I'm running out of ideas. I can't explain why this would happen, and I haven't seen it happen on other machines. My guess is that there's something weird about the cmd640 driver (or the controller itself). If you look at the comments in the header of /usr/src/linux-2.4/drivers/ide/cmd640.c, the feeling seems to be that the controller is not quite up to par.
I'm not sure either, because the problem is manifesting itself on hda, which is on the internal interface. I could see the CMD640 as the problem if that were the only place the problem appeared, but the internal IDE interface is where the problem is. hde, on the CMD interface, is still having the problem, though I am not sure whether merely having the interface in the machine is causing it. The funny thing is that the drive works fine under RH 7.0 with a 2.4.4 kernel and the CMD640 driver compiled in. Hopefully this will be fixed in 7.2.
The entire partitioning section of the installer is being rewritten, so this bug should not be a problem in future releases.