Description of problem: Trying to install RHEL4U5 beta1 on an HP DL585g2 system
(8 Opteron cores, 32 GB of memory, installing on a 72 GB CCISS drive, no RAID,
booting from a boot.iso and installing over NFS, chose Everything under
packages). During the QA period, when partitioning the drive, I got a lot of
Assertion (cyl_size <= 255*63) at disk_dos.c:556 in
function probe_partition_for_geom() failed.
Assertion (heads < 256) at disk_dos.c:576 in ...
Assertion ((C*heads + H)*sectors +S == A) at disk_dos.c:582 in ...
I clicked [Ignore] on all of them. (The first time I went through the
exercise, clicking [Cancel] did not seem to have any effect the few times
I tried it; the second time, I ignored every assertion.)
Finally, after the interactive part of the installation was over, I got
more assertions, and anaconda eventually died with an unhandled exception.
I'll create an attachment with the traceback.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Described above
Actual results: Installation failed.
Expected results: Success.
Created attachment 149479 [details]
I tried the install with RHEL4.5snap1 - I get the exact same failure.
Red Hat Enterprise Linux 4
Some more information:
o I tried installing on an (older) Opteron blade with a CCISS drive: the
installation succeeded, the system is running fine.
o I copied parted and some of the libraries that it depends on from the above
system to the DL585g2 (which was running an older, non-RHEL4.5snap1
installation) and tried looking at the disks with it. There was no problem
(although I didn't push it very far).
o I copied the kernel from the RHEL4.5snap1 installation on the blade to an
older RHEL4U4 installation on the DL585g2, and rebooted it. There was no problem.
o The RHEL4U4 installation was using a kernel parameter "pci=nommconf", so
I tried adding it to the kernel command line when installing RHEL4.5: no joy -
the assertions still failed and anaconda died.
I was hoping that one of the experiments above would pinpoint the culprit
unambiguously, but it's still a mystery. It seems specific to the DL585g2 at
this point. I'd appreciate any suggestions of how to get past this.
Did this occur with earlier updates of RHEL4 such as U4?
No - RHEL4U4 installed and works fine.
Adding Regression keyword.
I've patched disk_dos.c with what I think will work, but without having that
particular system to reproduce the problem on and test, I'm going to post my fix
here and ask you to test it.
You will find a patch, a new SRPM for parted on RHEL4U5, and a binary RPM for
this patched parted on x86_64. Can you test out this build of parted(8) on
RHEL4U5 and see if it solves your problem? I realize it's probably difficult to
get U5 installed on the target system, but whatever you can do to test this
build on that platform on U5 would help. I would suggest installing RHEL4U4 and
then doing an upgrade to U5. Anaconda does not partition in those cases, so you
would be fine.
Let me know if this parted solves the problem or breaks in new and interesting ways.
Created attachment 149968 [details]
Patch to probe_partition_for_geom() for DL585g systems
Created attachment 149969 [details]
New parted source RPM
Created attachment 149970 [details]
New parted rpm for x86_64
Created attachment 149971 [details]
New parted-devel rpm for x86_64
Created attachment 149972 [details]
New parted-debuginfo rpm for x86_64
I have not been able to try the patch but I have new information that may make
it unnecessary. I said in comment #5 that RHEL4U4 installed and ran with no
problem: that's true but it's not the whole story. I tried installing it again
on the new disk with the intention of doing an update install to RHEL4.5snap1
and smashed into the same brick wall.
It turns out that we had added more volumes to the RAID array and there seems to
be a threshold: the original installation was done with one volume configured
(no problem there) and I've been trying to install with eight volumes configured
(problems galore here). We deleted six volumes, recreated one and tried to
install RHEL4U4 on that (the third configured volume) - that was successful. I
have not tried RHEL4.5snap1 yet and I have not tried to find exactly where the
threshold is, but I'll do that tomorrow and let you know.
BTW, RHEL5 does not hit this problem at all (most of the deleted volumes had
versions of RHEL5 on them).
I added a fourth volume to the RAID array and installed RHEL4.5snap1 with no
problem. I have not determined the threshold yet, but it is clear that the
problem is not parted: it just gets bum information. We'll try to interpose
a newer device driver at installation time, once we determine where the failure
lies.
That is good to hear [that it's not parted], but I'm interested to know what is
happening. Thanks for the feedback.
We've gone all the way back to eight volumes, step by step (i.e. adding one
volume at a time and installing RHEL4.5snap1 on each newly added volume) and I
*still* don't see the former failure - so there is no "threshold".
The only explanation that we can think of is the following:
o before blowing away (almost) all the volumes, we updated the firmware on
the box and then on the P400 controller. Neither of these, on its own, solved
the problem at that point.
o we also blew away the one volume that I was trying to install on and recreated
it (with the new firmware in place). That also was unsuccessful.
o but now that we've recreated six of the volumes from scratch with the new
firmware in place, the problem has disappeared.
The situation is deeply unsatisfying but there it is.
For the record, the new firmware on the box says:
and the firmware on the P400 controller says:
The previous version of the controller firmware was v1.18.
In your opinion do you think this can be closed then?
I think so - it's almost certainly *not* a parted problem,
probably a firmware issue with the P400.
Ok closing then. Please reopen if you acquire additional information about this.
Thanks for the report.