Bug 621988 - Hardware flow control on serial console is not functional
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.5
Hardware: All
OS: Linux
Priority: low
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Prarit Bhargava
QA Contact: Red Hat Kernel QE team
Reported: 2010-08-06 17:24 UTC by Tom Marshall
Modified: 2013-11-04 16:52 UTC (History)

Doc Type: Bug Fix
Last Closed: 2013-11-04 16:52:31 UTC


External Trackers
Red Hat Knowledge Base (Legacy): 23122
Red Hat Bugzilla: 553675

Description Tom Marshall 2010-08-06 17:24:18 UTC
Description of problem:

We have a hardware platform that requires hardware flow control (CTS/RTS) on the serial console.  The serial subsystem does not seem to support this.  The issues are:

1. Using "r" for flow control does not set UPF_CONS_FLOW in uart_8250_port.flags, which is necessary for wait_for_xmitr() to do flow control.

2. Red Hat patch linux-2.6-serial-8250-support-for-dtr-dsr-hardware-flow-control.patch for RH bug 445215 dereferences uart_8250_port.port.info, which is NULL early in the boot process, so if UPF_CONS_FLOW does get set, the kernel oopses.
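
To make item 1 concrete, here is a minimal user-space model in C of how a console option string such as 115200n8r splits into baud, parity, bits and flow. It is not kernel code: every model_ name and flag value is a hypothetical stand-in, and the assumption that "r" only reaches a termios-level setting while the port flag stays clear reflects what this report describes, not a verified reading of the driver.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for kernel flag bits; values are illustrative only. */
#define MODEL_CRTSCTS       0x1u  /* termios-level RTS/CTS request */
#define MODEL_UPF_CONS_FLOW 0x2u  /* port flag that the console wait loop checks */

struct model_console {
    int baud;
    int parity;              /* 'n', 'o' or 'e' */
    int bits;
    int flow;                /* 'r' requests RTS/CTS, 0 means none */
    unsigned int cflag;      /* models termios c_cflag */
    unsigned int port_flags; /* models the UART port flags */
};

/* Parse "<baud><parity><bits><flow>", e.g. "115200n8r". */
static void model_parse_options(const char *opts, struct model_console *c)
{
    char *end;

    assert(opts != NULL && c != NULL);
    c->baud = (int)strtol(opts, &end, 10);
    c->parity = 'n';
    c->bits = 8;
    c->flow = 0;
    if (*end)
        c->parity = *end++;
    if (*end)
        c->bits = *end++ - '0';
    if (*end)
        c->flow = *end;
}

/*
 * Apply the parsed options. With set_cons_flow == 0 this models the
 * behavior described in the report: "r" never reaches the port flag.
 * With set_cons_flow != 0 it models one possible form of a workaround.
 */
static void model_apply_options(struct model_console *c, int set_cons_flow)
{
    assert(c != NULL);
    c->cflag = 0;
    c->port_flags = 0;
    if (c->flow == 'r') {
        c->cflag |= MODEL_CRTSCTS;
        if (set_cons_flow)
            c->port_flags |= MODEL_UPF_CONS_FLOW;
    }
}
```

Parsing 115200n8r and applying it with set_cons_flow = 0 leaves the port flag clear, which is the condition under which the console write path would skip its flow-control wait.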

How reproducible:

Always

Steps to Reproduce:
1. Boot with console=ttyS0,115200n8r (and without quiet, of course).
  
Actual results:

Bits dropped all over the datacenter floor during boot.

Expected results:

Readable boot text that we all know and love.

Additional info:

We have worked around this issue by reverting the hunk in linux-2.6-serial-8250-support-for-dtr-dsr-hardware-flow-control.patch that changes wait_for_xmitr() and adding UPF_CONS_FLOW to the port flags in serial8250_console_setup().  Obviously this would not work for RH, as it would reopen bug 445215.  Unfortunately, I am not familiar enough with the inner workings of the serial and console drivers to find a better solution.

Comment 3 Aristeu Rozanski 2010-09-09 16:04:34 UTC
Can't reproduce this bug with 2.6.32-71.el6. Still investigating.

Comment 4 Aristeu Rozanski 2010-09-09 16:05:56 UTC
RHEL5 bug, that's why :)

Comment 5 Aristeu Rozanski 2010-09-09 18:52:33 UTC
Tom, after checking the code that is in RHEL5 I couldn't see any reference to
info, only flags. What kernel version are you using?

Comment 6 Tom Marshall 2010-09-20 20:32:37 UTC
We are using kernel 2.6.18-194.11.3.

In the patched drivers/serial/8250.c, uart_8250_port.port.info is dereferenced at lines 1639 and 1642:


     1631         if (up->port.flags & UPF_CONS_FLOW) {
     1632                 struct uart_info *info = up->port.info;
     1633                 unsigned int msr;
     1634
     1635                 tmout = 1000000;
     1636                 while (--tmout) {
     1637                         msr = serial_in(up, UART_MSR);   
     1638 
     1639                         if ((info->flags & UIF_CTS_FLOW) &&   
     1640                             (msr & UART_MSR_CTS))
     1641                                 break;
     1642                         else if ((info->flags & UIF_DSR_FLOW) &&
     1643                                  (msr & UART_MSR_DSR))
     1644                                 break;
     1645 
     1646                         udelay(1);
     1647                         touch_nmi_watchdog();
     1648                 }
     1649         }

But, as noted, this can never execute because UPF_CONS_FLOW cannot be set.  It is only set by the code in some ARM architectures (arch/arm/mach-s3c2410/mach-*.c).  This is true even with the vanilla 2.6.18 sources.
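
For illustration only, the check inside that loop can be modeled in user-space C with a guard for a NULL info pointer. This is a sketch, not the Red Hat patch and not a proposed fix: the model_ names, the UIF flag values, the scripted MSR source, and the choice to fall back to CTS when info is NULL are all assumptions. Only the MSR bit positions (0x10 for CTS, 0x20 for DSR) follow the standard 8250 register layout.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MSR_CTS      0x10u /* CTS bit in the 8250 modem status register */
#define MODEL_MSR_DSR      0x20u /* DSR bit in the 8250 modem status register */
#define MODEL_UIF_CTS_FLOW 0x1u  /* illustrative value, not the kernel's */
#define MODEL_UIF_DSR_FLOW 0x2u  /* illustrative value, not the kernel's */

struct model_info {
    unsigned int flags;
};

/* Scripted MSR readings standing in for serial_in(up, UART_MSR). */
struct model_script {
    const unsigned int *samples;
    size_t len;
    size_t pos;
};

/* Return the next scripted sample; repeat the last one once exhausted. */
static unsigned int model_script_read(struct model_script *s)
{
    assert(s != NULL && s->len > 0);
    if (s->pos < s->len - 1)
        return s->samples[s->pos++];
    return s->samples[s->len - 1];
}

/* Nonzero if the peer signals readiness for this MSR sample. */
static int model_clear_to_send(const struct model_info *info, unsigned int msr)
{
    if (info == NULL) /* assumed early-boot fallback: honor CTS only */
        return (msr & MODEL_MSR_CTS) != 0;
    if ((info->flags & MODEL_UIF_CTS_FLOW) && (msr & MODEL_MSR_CTS))
        return 1;
    if ((info->flags & MODEL_UIF_DSR_FLOW) && (msr & MODEL_MSR_DSR))
        return 1;
    return 0;
}

/* Poll up to tmout times; return 1 once the peer is ready, 0 on timeout. */
static int model_wait_for_peer(const struct model_info *info,
                               struct model_script *s, unsigned int tmout)
{
    while (tmout-- > 0) {
        if (model_clear_to_send(info, model_script_read(s)))
            return 1;
    }
    return 0;
}
```

The point of the model is the first branch: when info is NULL, the loop still has a defined condition to wait on instead of dereferencing the pointer.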

Again, I am not terribly familiar with the serial and console drivers, so it is quite possible that I'm missing something.  But it appears this particular code path is only really critical during boot, so perhaps it's just not normally exercised and viewed at higher speeds.

Comment 7 Aristeu Rozanski 2010-09-23 16:34:05 UTC
Tom, the serial console requires hardware support in order to have flow control
and it seems yours doesn't.

Comment 8 Tom Marshall 2010-09-24 15:39:06 UTC
Aristeu, I am confused by your statement.  The (custom) hardware that we ship most certainly does have functional CTS/RTS flow control.  It is an absolute requirement for our systems, which run at 115200 baud.

Without custom patches to the RHEL 5.5 kernel, it is painfully obvious that hardware flow control in the kernel is not being used for the console at boot time.  But with the patches described above, it works great.

The only questions are whether Red Hat will fix the issue and how that would be accomplished without reopening bug 445215.

Comment 9 Aristeu Rozanski 2010-09-24 15:56:04 UTC
Hm, it works great here with standard RS232 ports. May I have more details on the
hardware to find the proper fix, which probably will need to be pushed upstream
too.

Comment 10 Tom Marshall 2010-09-24 19:01:32 UTC
I just set up a whitebox with a serial console at 115200, connected it to another machine, and booted without the quiet flag.  No characters were dropped.  This seemed strange to me, given that the issue is clearly reproducible on our custom hardware.

So I talked to a hardware engineer and he explained how the custom console works.  We have a custom processor that has two serial ports.  One is attached to the PC's physical serial port, and the other is attached to the physical serial port on the front of the box.  This processor is responsible for adapting to varying baud rates on the front of the box even though the PC is fixed at 115200.  Therefore, it will need to perform flow control as a part of its duties when the baud rate on the front of the box is less than 115200.

This means that when both serial ports are operating at 115200 and the machine connected to the serial port (say, a laptop) is fast enough, no flow control is typically needed.  But when either (1) the port on the front of the box is set to less than 115200, or (2) the machine connected to the serial port is not fast enough to keep up, flow control is needed.
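
A rough back-of-the-envelope calculation in C shows why the rate mismatch matters. The helper names and the 1024-byte buffer are hypothetical (the adapter's real buffer size is not stated here), and 8N1 framing is assumed, so each character costs 10 bit times.

```c
#include <assert.h>

/*
 * Characters per second on an asynchronous serial link.
 * Each character is framed as 1 start bit + data bits + parity bits + stop bits.
 */
static unsigned int model_bytes_per_sec(unsigned int baud, unsigned int data_bits,
                                        unsigned int parity_bits,
                                        unsigned int stop_bits)
{
    assert(data_bits >= 5 && data_bits <= 8);
    return baud / (1 + data_bits + parity_bits + stop_bits);
}

/*
 * Milliseconds until a buffer of buf_bytes overflows when it is filled at
 * in_rate bytes/s and drained at out_rate bytes/s. Returns 0 if the drain
 * keeps up, meaning the buffer never overflows.
 */
static unsigned long model_ms_to_overflow(unsigned int buf_bytes,
                                          unsigned int in_rate,
                                          unsigned int out_rate)
{
    if (in_rate <= out_rate)
        return 0;
    return (unsigned long)buf_bytes * 1000UL / (in_rate - out_rate);
}
```

Under these assumptions the PC side delivers 11520 bytes/s at 115200 baud while a 9600 baud front port drains 960 bytes/s, so a 1024-byte buffer would fill in roughly 96 ms of sustained console output unless the sender honors flow control. With both sides at 115200 the function returns 0, which fits the observation that no characters are dropped in that configuration.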

In summary, it is unlikely that you will be able to reproduce this issue unless you can find an old machine that cannot keep up with the kernel's console output at boot.

However, that does not mean the problem does not exist.  Code inspection clearly shows that hardware flow control is not honored during boot and we have hardware that demonstrates this.

Comment 11 Prarit Bhargava 2013-11-04 16:52:31 UTC
This Bugzilla has been reviewed by Red Hat and is not planned on being
addressed in Red Hat Enterprise Linux 5, and therefore is being closed.
If this bug is critical to production systems, please contact your Red
Hat support representative and provide a sufficient business justification
in order to re-open it.

