|Summary:||Slow serial I/O response|
|Product:||Red Hat Enterprise Linux 3||Reporter:||Scott Weathers <sweathers>|
|Component:||kernel||Assignee:||Arjan van de Ven <arjanv>|
|Status:||CLOSED NOTABUG||QA Contact:||Brian Brock <bbrock>|
|Fixed In Version:||Doc Type:||Bug Fix|
|Doc Text:||Story Points:||---|
|Last Closed:||2003-10-22 12:24:13 UTC||Type:||---|
Description Scott Weathers 2003-08-28 13:46:52 UTC
From Bugzilla Helper:
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.0.3705)

Description of problem:
We have observed significant serial port slowness under Red Hat when communicating with serial devices (not modems). We are currently porting our application from QNX 4.25 to Red Hat. Our application depends heavily on high-level-protocol serial communications to control one or many serial devices on one or more serial ports. This control can be as simple as reading or writing a hardware register, or polling a device to determine its state (i.e. a state machine) so our software can perform an action on the device being polled or on another device. When we run our software under Red Hat we don't get the same serial throughput or device response that we have seen in the past from QNX.

Version-Release number of selected component (if applicable):
Kernel 2.4.21-1.1931.2.411.entsmp

How reproducible:
Always

Steps to Reproduce:
1. Send text data out a serial port
2. Observe the speed at which it displays on the device LCD

Actual Results:
Text displays slowly on the LCD screen

Expected Results:
There should be no observed delay of the text being displayed on the screen

Additional info:
Please feel free to contact me directly regarding this issue.

Scott Weathers
Project Lead/Senior Software Engineer
Toptech Systems, Inc
280 Hunt Park Cove
Longwood, FL 32750
Fax: (407) 332-1802
Phone: (407) 332-1774 x208
E-mail: firstname.lastname@example.org
www.toptech.com
Comment 1 Arjan van de Ven 2003-08-28 13:50:03 UTC
Red Hat Enterprise Linux is not a hard realtime operating system like QNX is. In several places latency has been traded off in favor of throughput. It also greatly depends on the exact ways you drive the UART and what the userspace application writing to the serial port does.
Comment 2 Scott Weathers 2003-08-28 15:31:54 UTC
It appears, based on the quick response above, that the bug has not been researched fully. Are you suggesting that Red Hat Enterprise should not be used as the platform for a process control system? Can you please explain your statement: "In several places latency has been traded off in favor of throughput"? Our system relies heavily on this type of serial communication. In some cases the device we are communicating with requires us to use 9600 baud, and the distance to the device could be 500 to 1000 feet. Will this bug be researched further? If not, I need to know ASAP so I can begin to look at another platform for our system. I find it hard to believe that Microsoft Windows does a better job with its serial I/O than Red Hat Linux. We have almost 1000 existing systems plus a similarly large number of systems set to roll out in Europe and Asia; this is a very serious issue for our company and I would hope that it can be resolved quickly so we can continue to use Red Hat in the future.
Comment 3 Arjan van de Ven 2003-08-28 15:38:12 UTC
I assume you have investigated all serial port settings in great detail (e.g. made sure all FIFO settings are optimal, etc.). Red Hat *Enterprise* Linux, which is different from Red Hat Linux, does not currently have "low latency" as a requirement. Your description of your application makes it sound as though it basically wants soft real-time behavior (or a good approximation thereof), which mostly comes down to having good latency. QNX is an operating system with a pretty high focus on real-time behavior, while RHEL has a focus on server performance. Red Hat Linux has a focus on the consumer market, for which latency is important again. Can you try a recent RHL kernel to see if that performs better than RHEL in your application?
Comment 4 Jennifer E. Lamb 2003-08-28 15:44:36 UTC
Customer has tried RHL 9 and RHEL 3 Beta 1 latest kernels. RHEL 3 Beta 1 has worked the best so far.
Comment 5 David Woodhouse 2003-08-29 15:04:21 UTC
Please could you confirm whether you are having problems with throughput, latency, or both? Your 'steps to reproduce' imply only throughput -- which is odd, since in the absence of flow control we should just spew characters out the serial port at 9600 baud unconditionally -- assuming 8N1 that'll be 872 characters per second, or one character about every 1.15ms. Can you double-check that your UART is detected correctly (presumably as a 16550A) and hence that we're using the FIFO? What is the output of the command "grep ttyS /var/log/dmesg"? What serial port are you using?
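Editorial aside: the character-rate arithmetic in the comment above can be checked with a short sketch. It assumes the usual asynchronous framing of one start bit plus data, optional parity, and stop bits; the function names are illustrative only. With strict 8N1 framing (10 bits per character) 9600 baud works out to 960 characters per second, while the quoted figure of 872 matches an 11-bit frame such as 8E1 or 8N2.

```python
def chars_per_second(baud, data_bits=8, parity_bits=0, stop_bits=1):
    """Upper bound on characters per second for an asynchronous serial link."""
    # Each character costs one start bit plus its data, parity and stop bits.
    bits_per_char = 1 + data_bits + parity_bits + stop_bits
    return baud / bits_per_char


def ms_per_char(baud, **framing):
    """Time on the wire for one character, in milliseconds."""
    return 1000.0 / chars_per_second(baud, **framing)


if __name__ == "__main__":
    cases = (("8N1, 10-bit frame", {}), ("8E1, 11-bit frame", {"parity_bits": 1}))
    for baud in (9600, 38400):
        for label, framing in cases:
            rate = chars_per_second(baud, **framing)
            print(f"{baud} baud, {label}: {rate:.1f} chars/s, "
                  f"{ms_per_char(baud, **framing):.3f} ms per char")
```

Either way, a display that updates visibly slower than roughly a thousand characters per second at 9600 baud suggests gaps between characters rather than a slow line rate.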
Comment 6 Scott Weathers 2003-08-29 15:49:10 UTC
throughput and latency are very closely related. We are currently trying to use ttyS0 in our testing.

Output from grep ttyS /var/log/dmesg:
ttyS0 at 0x03f8 (irq = 4) is a 16550A
ttyS1 at 0x02f8 (irq = 3) is a 16550A

FYI FROM PREVIOUS E-MAIL:
Please communicate to the kernel engineer that we don't only run at 9600 baud; most of our testing has been at 38400 baud, but there are cases where we will run as slow as 9600 baud. It all depends on the distance from the device to the PC. Most if not all of the devices we communicate with in our industry use RS-232 or RS-485/422 serial I/O with no flow control. Is the Linux serial driver expecting to see hardware or software flow control signals on the serial port (i.e. CTS, DSR, DTR, etc.)?
Comment 7 Scott Weathers 2003-08-29 15:52:37 UTC
Sorry, the first line is missing from the previous comment. It should read: I would consider the problem we are seeing a throughput issue; however, throughput and latency are very closely related.
Comment 8 David Woodhouse 2003-08-29 16:50:45 UTC
Throughput and latency are often related -- in the case where you're just spewing out data with no flow control, there should be no latency involved except the time it takes to go from an interrupt caused by the UART FIFO getting low to the kernel refilling the FIFO. Since the kernel keeps its own internal flip buffer in addition to the hardware FIFO, this shouldn't even require going back to userspace each time -- it just shouldn't run out of data to send. How slow _is_ it going? If you've configured the software to not use flow control, the Linux serial driver will ignore all flow control signals. I could understand latency on unblocking output becoming a problem if the receiving side is repeatedly throttling and then unthrottling -- but with no flow control I can't see how that's happening.
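Editorial aside: "configured the software to not use flow control" comes down to a few termios flags on Linux. The sketch below uses Python's standard `termios` module and assumes a Linux system; the helper names and the `/dev/ttyS0` default are illustrative and not taken from the reporter's application. It clears software (XON/XOFF) and hardware (RTS/CTS) flow control and sets `CLOCAL` so modem-control lines are ignored.

```python
import os
import termios

# Positions of the flag words in the list returned by termios.tcgetattr().
IFLAG, OFLAG, CFLAG, LFLAG = 0, 1, 2, 3


def without_flow_control(attrs):
    """Return a copy of tcgetattr()-style attributes with flow control off."""
    new = list(attrs)
    # Software flow control: XON/XOFF handling on output and input.
    new[IFLAG] &= ~(termios.IXON | termios.IXOFF | termios.IXANY)
    # Hardware flow control: RTS/CTS handshaking.
    new[CFLAG] &= ~termios.CRTSCTS
    # Ignore modem-control lines (e.g. DCD) and keep the receiver enabled.
    new[CFLAG] |= termios.CLOCAL | termios.CREAD
    return new


def apply_to_port(path="/dev/ttyS0"):
    """Apply the settings to a real port (hypothetical default path)."""
    fd = os.open(path, os.O_RDWR | os.O_NOCTTY)
    try:
        attrs = termios.tcgetattr(fd)
        termios.tcsetattr(fd, termios.TCSANOW, without_flow_control(attrs))
    finally:
        os.close(fd)
```

Printing the flag words before and after calling `apply_to_port()` is a quick way to confirm whether the application had flow control enabled without intending to.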
Comment 9 Scott Weathers 2003-08-29 17:39:53 UTC
I am sure you would agree that slow is a relative term, based on the observer, so here is how we have determined that we have an issue. We have connected the same serial device to a term server, which allows us to communicate with the device over a network card. When connected this way, we find the device displays data at the same rate as on the QNX platform. But when the same device is connected to the serial port, the visual display of data on the device screen is noticeably slower (i.e. waiting for user prompts to display). If you are not familiar with a term server, here is a quick definition: a term server is a network device that has multiple serial ports on it. We send the same protocol message to the device on the term server as we do when the device is connected to the serial port; the only addition is that the message is wrapped in an IP packet. If you were to put a loopback device on the term server port and telnet to the term server on that port (2001, for example), everything typed should be echoed on the screen just as it would be with a serial port loopback.
Comment 10 David Woodhouse 2003-08-29 21:56:50 UTC
I'd be very interested in getting more quantitative data, if possible. In the case where you're sending bulk data at a fixed speed with no flow control, 'slow' is very much an objective measurement, not subjective. I'd like to see if you're seeing bursts of, say, 16 or 256 characters at a time at 'full' speed interspersed with idle periods, or if it's a different pattern of sending. This would give clues as to what the problem is. Are you using IDE disk drives in PIO mode? Anything else which might disable interrupts for long periods of time?
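Editorial aside: one way to collect the quantitative data requested above is to timestamp each character as it arrives at the far end (or on a loopback) and group the timestamps into bursts. The sketch below is only the analysis step under that assumption; `find_bursts` and its threshold are hypothetical names, and how the timestamps are captured is left open. Bursts of exactly 16 characters would be consistent with the 16-byte FIFO of a 16550A draining and waiting for a refill.

```python
def find_bursts(timestamps, gap_threshold):
    """Group arrival times into bursts separated by idle gaps.

    timestamps: ascending arrival times, one per character (any unit).
    gap_threshold: a gap longer than this starts a new burst (same unit).
    Returns a list of (first_time, last_time, character_count) tuples.
    """
    bursts = []
    if not timestamps:
        return bursts
    start = prev = timestamps[0]
    count = 1
    for t in timestamps[1:]:
        if t - prev > gap_threshold:
            # The idle period ended the previous burst; record it.
            bursts.append((start, prev, count))
            start, count = t, 0
        count += 1
        prev = t
    bursts.append((start, prev, count))
    return bursts


if __name__ == "__main__":
    # Synthetic example in milliseconds: two 16-character bursts at about
    # line rate for 9600 baud, separated by an 85 ms idle period.
    sample = list(range(0, 16)) + list(range(100, 116))
    for first, last, n in find_bursts(sample, gap_threshold=5):
        print(f"{n} chars from {first} ms to {last} ms")
```

A threshold of a few character times (about 1 ms each at 9600 baud) separates normal inter-character spacing from genuine stalls.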
Comment 11 Scott Weathers 2003-09-02 15:16:58 UTC
I am in the process of modifying one of our serial test programs to log some times, so I can confirm whether we are seeing bursts of data. I am not familiar with PIO mode on a hard drive; based on my research it looks like we are in PIO mode. Is there a way to disable this? CMOS or kernel?

Boot log:
Uniform Multi-Platform E-IDE driver Revision: 7.00beta4-2.4
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
ICH4: IDE controller at PCI slot 00:1f.1
PCI: Found IRQ 10 for device 00:1f.1
PCI: Sharing IRQ 10 with 00:1d.2
ICH4: chipset revision 1
ICH4: not 100% native mode: will probe irqs later
ide0: BM-DMA at 0xf000-0xf007, BIOS settings: hda:DMA, hdb:pio
ide1: BM-DMA at 0xf008-0xf00f, BIOS settings: hdc:DMA, hdd:pio
hda: WDC WD200BB-75AUA1, ATA DISK drive
blk: queue c0415e80, I/O limit 4095Mb (mask 0xffffffff)
hdc: SAMSUNG CD-ROM SC-152L, ATAPI CD/DVD-ROM drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
ide1 at 0x170-0x177,0x376 on irq 15
hda: attached ide-disk driver.
hda: host protected area => 1
hda: 39102336 sectors (20020 MB) w/2048KiB Cache, CHS=2434/255/63, UDMA(100)
Comment 12 David Woodhouse 2003-09-02 15:24:23 UTC
Your hard drive is doing DMA. You could try running 'hdparm -u1 /dev/hda' to enable interrupt unmasking -- but that is most effective when the kernel was doing byte-at-a-time PIO transfers with interrupts disabled; I'm not sure how much it helps with DMA.
Comment 13 Arjan van de Ven 2003-09-02 16:10:13 UTC
That doesn't do a thing with DMA, afaik. 'hdparm -I /dev/hda' is also a nice way to get drive (settings) info.