|Summary:||Grub hangs during serial console boot|
|Product:||Red Hat Enterprise Linux 3||Reporter:||Trevin Beattie <tbeattie>|
|Component:||grub||Assignee:||Peter Jones <pjones>|
|Status:||CLOSED NOTABUG||QA Contact:|
|Version:||3.0||CC:||brian, hsuttong, kloczek, tbeattie|
|Fixed In Version:||Doc Type:||Bug Fix|
|Doc Text:||Story Points:||---|
|Last Closed:||2005-03-30 20:41:57 UTC||Type:||---|
Description Trevin Beattie 2004-07-07 18:39:53 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20031210

Description of problem:
I've encountered a strange problem that is mostly a minor annoyance. I have a PowerEdge 1750 server that is primarily accessed remotely, so the BIOS is configured to use the serial port as a console. Red Hat WS3 Update 2 has been installed via NFS, again using the serial port as the primary console during setup.

When the machine is booted, Grub writes out "GRUB Loading stage2..." followed by "Press any key to continue." on both the serial port and the VGA console. Pressing a key brings up the Grub boot menu, from which point you can continue loading. If no key is pressed within a short period of time, the prompt is repeated.

A few times when I wasn't paying attention, I ended up with about 7 of these prompts, then some blank lines. When I try pressing a key at that point, nothing happens. It doesn't work from either the serial port or the VGA console. Usually I would give up at that point, walk down the hall to the server room, and power-cycle the box. This last time, I just left the machine sitting while looking for info on this problem. After several minutes, it suddenly decided to start booting.

Version-Release number of selected component (if applicable):
grub-0.93-4

How reproducible:
Sometimes

Steps to Reproduce:
1. Install RHEL using the serial port as the primary console.
2. Reboot, and wait for "GRUB Loading stage2..."
3. Wait for a bunch of "Press any key to continue." messages, followed by blank lines and a pause.
4. Now try to press any key.

Actual Results:
Nothing happens. Loading does not continue; at least not for a few minutes.

Expected Results:
Should have brought up the Grub boot menu. Actually, I would much prefer that stage2 just time out and automatically boot the default kernel if no key is pressed.

Additional info:
Dell PowerEdge 1750, dual Broadcom BCM5704 NetXtreme ethernet controllers. RHEL WS3 Update 2.
/boot/grub/grub.conf contains the following extra parameters:

    serial --unit=0 --speed=9600
    terminal --timeout=10 serial console

and the kernel command line includes "console=ttyS0,9600" at the end.
Comment 1 Trevin Beattie 2004-07-19 17:01:11 UTC
I recently tried this with RHEL WS3 Update 1. The problem exists there as well. It's fairly consistent, but I haven't determined exactly how long a wait is required before grub stops responding.
Comment 2 Brian Crumrine 2004-08-06 06:52:47 UTC
We just set up RHEL Update 2 on a new 1750 and experienced the same problem until we had the idea to match the bit rate between the Dell console redirection and the grub configuration (and everything else). We evidently had some kind of sync-up problem or dual-console conflict going on. Once everything was using the same bit rate, we didn't have to touch a thing: boot happened perfectly and the consoles worked as they should (there was a little glitch with kudzu not accepting keyboard input). Of course, you may have set the bit rate in the Dell BIOS; we just went with the default there.

So our grub configuration looks like this:

    serial --unit=0 --speed=115200
    terminal --timeout=10 serial console

with the kernel line addition being:

    console=ttyS0,115200

One thing you didn't mention, which is also required because those two changes alone won't direct the login tty to the serial port, is a change to the inittab file, adding something like:

    0:12345:respawn:/sbin/agetty ttyS0 115200

We also added ttyS0 to the /etc/securetty file to allow root to log in on the serial console. That is not required, but our machine is in a locked cabinet.

Hope this helps.
Brian
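The bit-rate matching described in this comment can be checked mechanically. The following is a minimal sketch, not part of the original report, that extracts the speed from the grub "serial" line, the kernel "console=" argument, and the inittab agetty entry, and reports whether they agree. The function names are hypothetical, and the sample kernel line is a placeholder: the report only states that the command line ends with "console=ttyS0,9600". The script cannot see the BIOS console redirection settings, so a matching result only rules out a mismatch among these three files.

```python
import re

def grub_serial_speed(grub_conf):
    """Return the --speed value from a GRUB legacy 'serial' line, or None."""
    m = re.search(r"^\s*serial\b.*--speed=(\d+)", grub_conf, re.MULTILINE)
    return int(m.group(1)) if m else None

def kernel_console_speed(grub_conf, port="ttyS0"):
    """Return the baud rate from a console=<port>,<baud> argument, or None."""
    m = re.search(r"console=" + re.escape(port) + r",(\d+)", grub_conf)
    return int(m.group(1)) if m else None

def agetty_speed(inittab, port="ttyS0"):
    """Return the baud rate of the agetty entry for <port>, or None."""
    for line in inittab.splitlines():
        if line.lstrip().startswith("#") or "agetty" not in line:
            continue
        fields = line.split(":", 3)  # id:runlevels:action:process
        if len(fields) < 4:
            continue
        args = fields[3].split()
        if port in args:
            for arg in args:
                if arg.isdigit():
                    return int(arg)
    return None

def collect_speeds(grub_conf, inittab, port="ttyS0"):
    """Gather the serial speeds configured at each boot stage."""
    return {
        "grub": grub_serial_speed(grub_conf),
        "kernel": kernel_console_speed(grub_conf, port),
        "agetty": agetty_speed(inittab, port),
    }

def is_consistent(speeds):
    """True when every speed that was found has the same value."""
    found = {v for v in speeds.values() if v is not None}
    return len(found) <= 1

# Sample input: the serial/terminal lines are from the report; the kernel
# line is a hypothetical placeholder ending in the reported console argument.
SAMPLE_GRUB = """\
serial --unit=0 --speed=9600
terminal --timeout=10 serial console
kernel /vmlinuz ro console=ttyS0,9600
"""
SAMPLE_INITTAB = "co:2345:respawn:/sbin/agetty ttyS0 9600 vt100\n"

if __name__ == "__main__":
    speeds = collect_speeds(SAMPLE_GRUB, SAMPLE_INITTAB)
    print(speeds, "consistent" if is_consistent(speeds) else "MISMATCH")
```

On a real system the two strings would be read from /boot/grub/grub.conf and /etc/inittab instead of the inline samples.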
Comment 3 Trevin Beattie 2004-08-09 23:00:59 UTC
There are a couple of problems with that suggestion:

1. I don't see any option in the Dell 1750 BIOS to change the baud rate. Since we are getting valid characters at 9600 baud, I would assume that is the rate at which the BIOS is set.
2. The hang occurs before loading the kernel, so whatever we have in inittab (which, BTW, is "co:2345:respawn:/sbin/agetty ttyS0 9600 vt100") is irrelevant at that point.
Comment 4 Karl Burkett 2004-10-06 14:40:50 UTC
Eureka: I too was having the same problem. My grub.conf configuration (to keep things simple) is much like Trevin Beattie's. Brian's comments, though correct as far as they go, do not have anything to do with the problem. He's just matching the baud rates at given times in the boot process, so I agree with Trevin's last comment that the baud rates are not part of the problem. I'd suggest, if I may, that the serial terminal connected to Brian's serial port may be configured to listen at 115200, and that is why it works for him.

So, I went back to first principles and examined the BIOS settings for console redirection. On the 1750, there are three settings:

1. Enable console redirection to serial port 1.
2. Pick the terminal type.
3. (Most important) "Redirect after boot" should be "Disabled".

I did have it enabled and was having the described problems.

Why did this work? I suspect that there was a contention for control going on in the hardware. "Boot", in this instance, refers to the end of the BIOS bootup. At that point grub tries to take command of the serial resource, but the BIOS won't let it take control, so a long argument begins over who is going to control the serial resource. This lasts until some timeout happens and the system continues to boot as expected.

Hope this helps.
From: burkett
Comment 5 Brian Crumrine 2004-10-06 15:14:44 UTC
We did have what sounds like the same problem as originally described by Trevin until we matched the bit rates. Since the kernel option was different than the Dell BIOS setting, we would not see the kernel boot, and kudzu, etc. because of the mismatch, so it looked like it would lock up for a while (while it was booting and trying to configure a new-found serial console). We moved everything to 115200 early on in our setup and only used 9600 briefly, so that could have been it. I don't know if we ever touched the redirect after boot option - I suspect it was left at the default which was probably Disabled. We are setting up another 1750 in a couple days, I will be able to try a 9600 boot and some of the Dell BIOS options and see if we get anything different.
Comment 6 Trevin Beattie 2004-10-06 16:27:32 UTC
Confirmed. Turning off "redirect after boot" in the BIOS solved the conflict. Grub no longer hangs on our systems now.
Comment 7 Hugh Sutton-Gee 2005-03-09 21:34:25 UTC
Also confirmed. This was happening on our Sun v20zs. Grub would just hang at: "stage2 ..." Turning off the console redirect setting in the BIOS solved the problem.
Comment 8 Peter Jones 2005-03-30 20:41:57 UTC
Thanks for working this out, and providing the solution.
Comment 9 kloczek 2006-02-01 15:34:03 UTC
(In reply to comment #8)
> Thanks for working this out, and providing the solution.

But disabling serial console redirection on the SP on the v20z isn't a solution :> Disabling redirection disallows remote control of the grub boot process. I also have the same problem as Hugh Sutton-Gee on a v20z, but with grub from Fedora devel (grub-0.97-2).