Bug 828092 - grub chain loading kills serial port in child boot loader
Summary: grub chain loading kills serial port in child boot loader
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Fedora
Classification: Fedora
Component: grub2
Version: 17
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Peter Jones
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2012-06-04 08:43 UTC by Jes Sorensen
Modified: 2013-08-01 08:55 UTC
CC List: 7 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-08-01 08:55:01 UTC
Type: Bug
Embargoed:


Attachments
Cleaned up patch (2.48 KB, patch)
2012-06-19 09:39 UTC, Vladimir Serbinenko
Patch to make console_input_test module (4.33 KB, patch)
2012-06-20 10:55 UTC, Vladimir Serbinenko
console_input_test module binary (2.45 KB, application/octet-stream)
2012-06-20 10:56 UTC, Vladimir Serbinenko

Description Jes Sorensen 2012-06-04 08:43:40 UTC
Description of problem:
I have multiple distributions installed on one system, say Fedora 17 with
Rawhide as the secondary install. I use a chain loaded grub entry for the
second distribution because grub2-mkconfig doesn't pick up all the command
line options from the secondary install's config file.

In this case, once the secondary grub loader is launched, serial port
support dies, making it impossible to navigate the grub menu as if one
had booted directly into it from the BIOS. Instead it just boots the first
entry in the boot menu.

This behavior doesn't change whether the secondary grub is grub2 or grub1,
so it seems to be an issue with grub2 killing the serial ports in the
process.

This is a major showstopper for anyone using SOL to manage servers and
development/lab machines.

Version-Release number of selected component (if applicable):
grub2-2.0-0.25.beta4.fc17.x86_64

How reproducible:
Every time

Comment 1 Vladimir Serbinenko 2012-06-04 14:01:14 UTC
Does the serial port work in Linux? Is it a real serial port or some kind of management card? Does it still happen if you comment out the init sequence in ns8250.c, as per the patch in this message?

=== modified file 'grub-core/term/ns8250.c'
--- grub-core/term/ns8250.c	2012-02-12 14:25:25 +0000
+++ grub-core/term/ns8250.c	2012-06-04 14:00:10 +0000
@@ -100,6 +100,7 @@ do_real_config (struct grub_serial_port
 
   divisor = serial_get_divisor (port, &port->config);
 
+#if 0
   /* Turn off the interrupt.  */
   grub_outb (0, port->port + UART_IER);
 
@@ -130,7 +131,7 @@ do_real_config (struct grub_serial_port
   /* Turn on DTR, RTS, and OUT2.  */
   grub_outb (UART_ENABLE_DTRRTS | UART_ENABLE_OUT2, port->port + UART_MCR);
 #endif
-
+#endif
   /* Drain the input buffer.  */
   while (grub_inb (port->port + UART_LSR) & UART_DATA_READY)
     grub_inb (port->port + UART_RX);
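
For readers unfamiliar with what the disabled block does: a conventional 8250/16550 setup is a short series of register writes. The Python sketch below only models that sequence as data and touches no hardware. The register offsets and bit values are the standard 8250 ones, but the order of the middle steps, the helper names divisor_for and init_sequence, and the 0x3F8 base address are assumptions for illustration, not taken from grub-core/term/ns8250.c.

```python
# Sketch only: models the register writes of a conventional 8250/16550 setup.
# The step order between "turn off the interrupt" and "turn on DTR, RTS, and
# OUT2" is an assumption, not copied from grub-core/term/ns8250.c.

UART_TX = 0   # transmit register; divisor low byte (DLL) while DLAB is set
UART_IER = 1  # interrupt enable; divisor high byte (DLH) while DLAB is set
UART_FCR = 2  # FIFO control
UART_LCR = 3  # line control
UART_MCR = 4  # modem control

UART_DLAB = 0x80                   # LCR bit that exposes the divisor latch
UART_8N1 = 0x03                    # 8 data bits, no parity, 1 stop bit
UART_FIFO_ENABLE_AND_CLEAR = 0x07  # enable FIFO, clear RX and TX FIFOs
UART_ENABLE_DTRRTS = 0x03
UART_ENABLE_OUT2 = 0x08

BASE_BAUD = 115200  # 1.8432 MHz reference clock divided by 16


def divisor_for(speed):
    """Return the divisor latch value for a baud rate."""
    if speed <= 0 or BASE_BAUD % speed != 0:
        raise ValueError(f"unsupported speed: {speed}")
    return BASE_BAUD // speed


def init_sequence(base, speed):
    """Return the (port, value) writes a typical 8250 init would issue."""
    div = divisor_for(speed)
    return [
        (base + UART_IER, 0x00),        # turn off the interrupt
        (base + UART_LCR, UART_DLAB),   # expose the divisor latch
        (base + UART_TX, div & 0xFF),   # divisor low byte
        (base + UART_IER, div >> 8),    # divisor high byte
        (base + UART_LCR, UART_8N1),    # set line parameters, clear DLAB
        (base + UART_FCR, UART_FIFO_ENABLE_AND_CLEAR),
        (base + UART_MCR, UART_ENABLE_DTRRTS | UART_ENABLE_OUT2),
    ]


# 0x3F8 is the conventional COM1 base address (assumed here).
for port, value in init_sequence(0x3F8, 115200):
    print(f"outb(0x{value:02x}, 0x{port:03x})")
```

If an emulated UART such as an SOL port reacts badly to any of these writes, skipping the block may leave the configuration set up by the firmware untouched, which is what the test patch is meant to check.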

Comment 2 Jes Sorensen 2012-06-04 14:06:03 UTC
The serial port works fine once I get to the Linux kernel, it is only
while running the chain loaded boot loader I get no output.

I wouldn't know how to go about building grub - do I need a special compiler
setup for it? If you can provide a test rpm for Fedora 17 I can test it
easily.

Comment 3 Vladimir Serbinenko 2012-06-04 14:18:38 UTC
It doesn't need any special compiler. I don't know enough about Fedora packaging yet. If you really need a test RPM, I can make one this evening.

Is it a real serial port or some network serial on a management card? Can you supply lspci output in the latter case? Network serials are known to have quirks in their emulation, which we have to work around.

Comment 4 Jes Sorensen 2012-06-04 14:26:10 UTC
Ah sorry, it's a 'fake' serial port running SOL (serial over LAN). Looking
at the lspci output, I see nothing obvious:

[root@monkeybay ~]# lspci
00:00.0 Host bridge: Intel Corporation Xeon E3-1200 Processor Family DRAM Controller (rev 09)
00:01.0 PCI bridge: Intel Corporation Xeon E3-1200/2nd Generation Core Processor Family PCI Express Root Port (rev 09)
00:01.1 PCI bridge: Intel Corporation Xeon E3-1200/2nd Generation Core Processor Family PCI Express Root Port (rev 09)
00:1a.0 USB controller: Intel Corporation 6 Series/C200 Series Chipset Family USB Enhanced Host Controller #2 (rev 04)
00:1c.0 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 1 (rev b4)
00:1c.4 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 5 (rev b4)
00:1d.0 USB controller: Intel Corporation 6 Series/C200 Series Chipset Family USB Enhanced Host Controller #1 (rev 04)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev a4)
00:1f.0 ISA bridge: Intel Corporation C202 Chipset Family LPC Controller (rev 04)
00:1f.2 SATA controller: Intel Corporation 6 Series/C200 Series Chipset Family SATA AHCI Controller (rev 04)
00:1f.3 SMBus: Intel Corporation 6 Series/C200 Series Chipset Family SMBus Controller (rev 04)
02:00.0 VGA compatible controller: Advanced Micro Devices [AMD] nee ATI Cedar PRO [Radeon HD 5450]
02:00.1 Audio device: Advanced Micro Devices [AMD] nee ATI Cedar HDMI Audio [Radeon HD 5400/6300 Series]
03:00.0 SCSI storage controller: Marvell Technology Group Ltd. 88SX7042 PCI-e 4-port SATA-II (rev 02)
04:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5722 Gigabit Ethernet PCI Express
05:03.0 VGA compatible controller: Matrox Graphics, Inc. MGA G200eW WPCM450 (rev 0a)

Comment 5 Vladimir Serbinenko 2012-06-19 09:38:46 UTC
Have you tested the patch I gave you the other day? I am attaching a cleaned-up version here.

Comment 6 Vladimir Serbinenko 2012-06-19 09:39:21 UTC
Created attachment 592889
Cleaned up patch

Comment 7 Jes Sorensen 2012-06-19 15:23:38 UTC
Sorry I didn't get to it before, been swamped.

I am trying to test it now, but I forgot the process. Can you outline it for
me again?

Thanks,
Jes

Comment 8 Vladimir Serbinenko 2012-06-19 17:39:37 UTC
1) Unpack the tarball I sent you to the root of any partition.
2) Put the grub.cfg you want to use in test/grub2
3) Load it with:
root=<PUT RIGHT ONE HERE>
multiboot /test/grub2/i386-pc/core.img

Comment 9 Jes Sorensen 2012-06-20 07:57:52 UTC
I think I finally got it right, using the version you emailed me on
Friday, but it still does the same thing when I hit enter :(

Comment 10 Vladimir Serbinenko 2012-06-20 10:54:34 UTC
It may send the reverse sequence (\n\r) or even something weirder. I am attaching a patch and a module to determine exactly what it sends.
Usage:
Put console_input_test.mod to /boot/grub2/i386-pc
In GRUB console:
insmod console_input_test
console_input_test
<enter>
See what it reports; the timings are important as well.
quit

Comment 11 Vladimir Serbinenko 2012-06-20 10:55:30 UTC
Created attachment 593177
Patch to make console_input_test module

Comment 12 Vladimir Serbinenko 2012-06-20 10:56:18 UTC
Created attachment 593178
console_input_test module binary

Comment 13 Jes Sorensen 2012-06-20 15:29:05 UTC
Output of me hitting enter a couple of times:

grub> console_input_test                                                        
Type `quit' to quit                                                             
1.705 s: 0000000d Enter (\r)
3.738 s: 0000000d Enter (\r)
5.112 s: 0000000d Enter (\r)
6.267 s: 0000000d Enter (\r)

If I hold down the enter key briefly:

grub> console_input_test                                                        
Type `quit' to quit                                                             
1.593 s: 0000000d Enter (\r)                                                    
2.088 s: 0000000d Enter (\r)
2.253 s: 0000000d Enter (\r)
2.308 s: 0000000d Enter (\r)
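
The lines above follow a "<seconds> s: <8-digit hex key code> <name>" pattern. As a rough user-space illustration (not the console_input_test module itself), the Python sketch below formats events the same way and checks for the case raised in Comment 10, where a CR and an LF arrive back to back in either order. The helper names, the "Newline" label, and the 50 ms pairing window are assumptions made for this example.

```python
# Hypothetical helpers; this is not the console_input_test module.
KEY_NAMES = {0x0D: r"Enter (\r)", 0x0A: r"Newline (\n)"}


def format_event(ms, code):
    """Format one key event as '<seconds> s: <hex code> <name>'."""
    name = KEY_NAMES.get(code, "")
    return f"{ms // 1000}.{ms % 1000:03d} s: {code:08x} {name}".rstrip()


def paired_line_endings(events, window_ms=50):
    """Return index pairs where CR and LF (either order) arrive within window_ms.

    events is a list of (milliseconds, key_code) tuples in arrival order.
    The 50 ms default is an arbitrary assumption.
    """
    pairs = []
    for i in range(len(events) - 1):
        (t0, c0), (t1, c1) = events[i], events[i + 1]
        if {c0, c1} == {0x0D, 0x0A} and t1 - t0 <= window_ms:
            pairs.append((i, i + 1))
    return pairs


# Timestamps taken from the first run reported above.
events = [(1705, 0x0D), (3738, 0x0D), (5112, 0x0D), (6267, 0x0D)]
for ms, code in events:
    print(format_event(ms, code))
print("paired CR/LF events:", paired_line_endings(events))
```

Applied to the timestamps above, the check finds no CR/LF pairs, since only \r is ever reported.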

Comment 14 Vladimir Serbinenko 2012-06-20 16:56:28 UTC
Hm, the output is what I'd expect from a working system. But for some reason the child still gets this spurious entry. Could you:
1) Try pressing unrelated no-action keys before pressing enter? Something like 20 presses of 'd' should do the job. (A crazy idea: the management card may emulate a keyboard as well.)
2) Add console_input_test in the child after the serial setup block.
3) Add console_input_test in the parent, in the menu entry.
For 2 and 3, be sure to copy the module to the right folder and don't forget insmod.

Comment 15 Jes Sorensen 2012-06-20 18:46:41 UTC
I am a little confused here - can you provide me an example?

Thanks,
Jes

Comment 16 Vladimir Serbinenko 2012-06-20 18:50:25 UTC
1, 2 and 3 are separate tests.

1) Just boot, press 'd' twenty times, then press enter.

2) Change grub.cfg in child to read:

< some terminal_input serial >
insmod console_input_test
console_input_test

3)

Add after initrd:

insmod console_input_test
console_input_test

Needless to say you have to do it in grub.cfg since we've already established that going to edit mode won't trigger the problem.

Comment 17 Fedora End Of Life 2013-07-04 02:37:52 UTC
This message is a reminder that Fedora 17 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 17. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '17'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 17's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that
we may not be able to fix it before Fedora 17 is end of life. If you
would still like to see this bug fixed and are able to reproduce it
against a later version of Fedora, you are encouraged to change the
'version' to a later Fedora version prior to Fedora 17's end of life.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 18 Fedora End Of Life 2013-08-01 08:55:17 UTC
Fedora 17 changed to end-of-life (EOL) status on 2013-07-30. Fedora 17 is 
no longer maintained, which means that it will not receive any further 
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of 
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.

