Bug 103275
Summary: | Slow serial I/O response | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 3 | Reporter: | Scott Weathers <sweathers> |
Component: | kernel | Assignee: | Arjan van de Ven <arjanv> |
Status: | CLOSED NOTABUG | QA Contact: | Brian Brock <bbrock> |
Severity: | high | Docs Contact: | |
Priority: | high | ||
Version: | 3.0 | CC: | dwmw2, jlamb |
Hardware: | i686 | ||
OS: | Linux | ||
Doc Type: | Bug Fix | ||
Last Closed: | 2003-10-22 12:24:13 UTC | Type: | --- |
Bug Blocks: | 103278 |
Description
Scott Weathers
2003-08-28 13:46:52 UTC
Red Hat Enterprise Linux is not a hard realtime operating system like QNX is. In several places latency has been traded off in favor of throughput. It also greatly depends on the exact ways you drive the UART and what the userspace application writing to the serial port does.

It appears, based on the above quick response, that the bug has not been researched fully. Are you suggesting that Red Hat Enterprise should not be used as the platform for a process control system? Can you please explain your statement: "In several places latency has been traded off in favor of throughput"? Our system relies heavily on this type of serial communication. In some cases the device we are communicating with requires us to use 9600 baud, and the distance to the device could be 500 to 1000 feet. Will this bug be researched further? If not, I need to know ASAP, so I can begin to look at another platform for our system. I find it hard to believe that Microsoft Windows does a better job with its serial I/O than Red Hat Linux. We have almost 1000 existing systems plus a similarly large number of systems set to roll out in Europe and Asia; this is a very serious issue to our company and I would hope that it could be resolved quickly so we can continue to use Red Hat in the future.

I assume you have investigated all serial port settings in great detail (e.g. made sure all FIFO settings are optimal, etc.). Red Hat *Enterprise* Linux, which is different from Red Hat Linux, does not currently have "low latency" as a requirement. Your description of your application makes it sound as if it basically wants soft real time behavior (or a good approximation thereof), which mostly comes down to having good latency. QNX is an operating system that has a pretty high focus on real time behavior, while RHEL has a focus on server performance. Red Hat Linux has a focus on the consumer market, for which latency is important again.
Can you try a recent RHL kernel to see if that performs better than RHEL in your application?

Customer has tried RHL 9 and RHEL 3 Beta 1 latest kernels. RHEL 3 Beta 1 has worked the best so far.

Please could you confirm whether you are having problems with throughput, latency, or both? Your 'steps to reproduce' imply only throughput -- which is odd, since in the absence of flow control we should just spew characters out the serial port at 9600 baud unconditionally -- assuming 8N1 that'll be 872 characters per second, or one character about every 1.15ms. Can you double-check that your UART is detected correctly (presumably as a 16550A) and hence that we're using the FIFO? What is the output of the command "grep ttyS /var/log/dmesg"? What serial port are you using?

throughput and latency are very closely related. We are currently trying to use ttyS0 in our testing. Output from grep ttyS /var/log/dmesg:

ttyS0 at 0x03f8 (irq = 4) is a 16550A
ttyS1 at 0x02f8 (irq = 3) is a 16550A

FYI FROM PREVIOUS E-MAIL: Please communicate to the kernel engineer that we don't only run at 9600 baud; most of our testing has been at 38400 baud, but there are cases where we will run as slow as 9600 baud. It all depends on the distance from the device to the PC. Most if not all of the devices we communicate with in our industry use RS-232 or RS-485/422 serial I/O with no flow control. Is the Linux serial driver expecting to see hardware or software flow control signals on the serial port (i.e. CTS, DSR, DTR etc)?

Sorry, the first line is missing from the previous comment; it should read: I would consider the problem we are seeing a throughput issue; however, throughput and latency are very closely related.

Throughput and latency are often related -- in the case where you're just spewing out data with no flow control, there should be no latency involved except the time it takes to go from an interrupt caused by the UART FIFO getting low to the kernel refilling the FIFO.
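The character-rate figure quoted above can be checked with a little arithmetic. The sketch below is illustrative only: the function names are made up for this example, and it assumes standard asynchronous framing (one start bit, the data bits, an optional parity bit, then the stop bits) and the 16-byte FIFO of a 16550A. Under those assumptions a 10-bit 8N1 frame at 9600 baud works out to 960 characters per second, about 1.04 ms each; the quoted 872 characters per second matches an 11-bit frame instead.

```python
# Illustrative arithmetic only; the function names are hypothetical and not
# taken from the kernel or from the reporter's software.

def frame_bits(data_bits=8, parity=False, stop_bits=1):
    """Bits on the wire per character: start + data + optional parity + stop."""
    return 1 + data_bits + (1 if parity else 0) + stop_bits


def chars_per_second(baud, bits_per_frame):
    """Maximum character rate on a line that is never idle."""
    return baud / bits_per_frame


def fifo_drain_ms(baud, bits_per_frame, fifo_bytes=16):
    """Milliseconds for a full transmit FIFO to empty.

    fifo_bytes=16 assumes a 16550A-class UART.
    """
    return 1000.0 * fifo_bytes * bits_per_frame / baud


if __name__ == "__main__":
    bits_8n1 = frame_bits()  # 1 start + 8 data + 1 stop
    for baud in (9600, 38400):
        rate = chars_per_second(baud, bits_8n1)
        print(f"{baud} baud, 8N1: {rate:.0f} chars/s, "
              f"{1000.0 / rate:.2f} ms per char, "
              f"full FIFO drains in {fifo_drain_ms(baud, bits_8n1):.2f} ms")
    # An 11-bit frame (for example 8 data bits plus parity) reproduces
    # the figure quoted in the thread.
    print(f"9600 baud, 11-bit frame: {chars_per_second(9600, 11):.0f} chars/s")
```

With these assumptions a full FIFO drains in about 16.7 ms at 9600 baud and about 4.2 ms at 38400 baud.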
Since the kernel keeps its own internal flip buffer in addition to the hardware FIFO, this shouldn't even require going back to userspace each time -- it just shouldn't run out of data to send. How slow _is_ it going?

If you've configured the software to not use flow control, the Linux serial driver will ignore all flow control signals. I could understand latency on unblocking output becoming a problem if the receiving side is repeatedly throttling and then unthrottling -- but with no flow control I can't see how that's happening.

I am sure you would agree that slow is a relative term based on the observer, so here is how we have determined we have an issue. We have connected the same serial device to a term server, which allows us to communicate with the device over a network card; when connected this way, we find the device displays data at the same rate as on the QNX platform. But when the same device is connected to the serial port, the visual display of data on the device screen is noticeably slower (i.e. waiting for user prompts to display).

If you are not familiar with a term server, here is a quick definition: a term server is a network device that has multiple serial ports on it. We send the same protocoled message to the device on the term server as we do when the device is connected to the serial port; the only addition is that the message is wrapped in an IP packet. If you were to put a loopback device on the term server port and telnet to the term server on that port (2001, for example), everything typed should be echoed on the screen just as it would with a serial port loopback.

I'd be very interested in getting more quantitative data, if possible. In the case where you're sending bulk data at a fixed speed with no flow control, 'slow' is very much an objective measurement, not subjective. I'd like to see if you're seeing bursts of, say, 16 or 256 characters at a time at 'full' speed interspersed with idle periods, or if it's a different pattern of sending.
This would give clues as to what the problem is. Are you using IDE disk drives in PIO mode? Anything else which might disable interrupts for long periods of time?

I am in the process of modifying one of our serial test programs to log some times, so I can confirm that we are seeing bursts of data. I am not familiar with PIO mode on a hard drive; based on my research it looks like we are in PIO mode. Is there a way to disable this? CMOS or kernel?

Boot log:

Uniform Multi-Platform E-IDE driver Revision: 7.00beta4-2.4
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
ICH4: IDE controller at PCI slot 00:1f.1
PCI: Found IRQ 10 for device 00:1f.1
PCI: Sharing IRQ 10 with 00:1d.2
ICH4: chipset revision 1
ICH4: not 100% native mode: will probe irqs later
ide0: BM-DMA at 0xf000-0xf007, BIOS settings: hda:DMA, hdb:pio
ide1: BM-DMA at 0xf008-0xf00f, BIOS settings: hdc:DMA, hdd:pio
hda: WDC WD200BB-75AUA1, ATA DISK drive
blk: queue c0415e80, I/O limit 4095Mb (mask 0xffffffff)
hdc: SAMSUNG CD-ROM SC-152L, ATAPI CD/DVD-ROM drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
ide1 at 0x170-0x177,0x376 on irq 15
hda: attached ide-disk driver.
hda: host protected area => 1
hda: 39102336 sectors (20020 MB) w/2048KiB Cache, CHS=2434/255/63, UDMA(100)

Your hard drive is doing DMA. You could try running 'hdparm -u1 /dev/hda' to enable interrupt unmasking -- but that is most effective when the kernel was doing byte-at-a-time PIO transfers with interrupts disabled; I'm not sure how much it helps with DMA.

doesn't do a thing with DMA afaik. hdparm -I /dev/hda is also a nice way to get drive (settings) info.
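The burst pattern asked about earlier in the thread (runs of 16 or 256 characters at full speed separated by idle periods) could be looked for in logged timestamps roughly as follows. This is a minimal sketch, not the reporter's test program: `find_bursts` and `gap_threshold` are hypothetical names, and it assumes one timestamp in seconds per received character, captured on the receiving side. Timestamps taken in userspace are themselves affected by buffering, so a logic analyser or scope on the line would give cleaner data.

```python
# Hypothetical helper for analysing logged per-character arrival times.

def find_bursts(timestamps, gap_threshold):
    """Group sorted timestamps (seconds) into bursts.

    A new burst starts whenever the gap to the previous character exceeds
    gap_threshold. Returns a list of (start_time, end_time, char_count).
    """
    bursts = []
    start = prev = None
    count = 0
    for t in timestamps:
        if prev is not None and t - prev > gap_threshold:
            # Idle period detected: close the current burst.
            bursts.append((start, prev, count))
            start, count = t, 0
        if start is None:
            start = t
        prev = t
        count += 1
    if count:
        bursts.append((start, prev, count))
    return bursts


if __name__ == "__main__":
    # Synthetic data: three bursts of 16 characters at 9600 baud 8N1
    # (10 bits per character), each burst starting 50 ms after the last.
    char_time = 10 / 9600
    stamps = [b * 0.050 + i * char_time for b in range(3) for i in range(16)]
    # Treat any gap longer than two character times as an idle period.
    for start, end, n in find_bursts(stamps, gap_threshold=2 * char_time):
        print(f"burst of {n} chars from {start * 1000:.1f} ms to {end * 1000:.1f} ms")
```

A steady stream with no idle periods would come back as a single long burst, while FIFO-sized groups separated by gaps would point at delays in refilling the transmit FIFO.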