Digi has discovered a potential race condition in the RHEL 2.1/3
kernel tty layer. They are working with a customer with an extremely
large multiport serial configuration, and they think the TTY layer is
hitting this race.
Digi describes the race as follows:
"For example, lets say the tty isn't "real_raw", but is "raw",
and the data coming in is say 10 bytes of regular data...
We will go into the "else" clause of that real_raw statement,
which unlocks the read_lock, then it walks the 10 bytes passing each
byte to receive_char.
In receive_char, if the tty is "raw", it calls put_tty_queue().
Finally, in put_tty_queue() it locks read_lock, dumps the byte, and
But there is no check in put_tty_queue (or in the dump of the byte),
that the read_buf didn't get set to NULL by the close() routine
during the time the read_lock was not held.
(ie, the byte for byte walking of those 10 bytes)."
We need to deal with this race. :)
Tom, please open a separate bug for RHEL 3. This one
will apply only to RHEL 2.1. Thanks. -ernie
Opened BZ 131674 for RHEL 3.
*** Bug 122729 has been marked as a duplicate of this bug. ***
Created attachment 103431 [details]
clean up of digi's patch to fix tty race condition
revision on digi's patch to clean up missing race points outside of added locks
This patch still has nothing to do with the real bug. You have to fix
this at the ldisc level not within n_tty.
(joining in from Bug 122729)
So you _do_ think it is a bug in the tty/line discipline layer?
This is an area of the kernel where I severely lack experience.
Can you point to any resources where this is discussed e.g. in LKML?
Is it fixed in EL3.0, or kernel 2.6.x?
There were multiple bugs in the driver code I looked at which would
cause crashes and were driver side bugs. There is also a slightly odd
corner case in the case of a line discipline change occurring at the
exact momemt data is delivered by interrupt to the ldisc handler. In
that situation the following can occur
ldisc->close is called
While running or just after ldisc->receive.. is called
The bug appears to be in both 2.4 and 2.6
The concept for a fix I'd propose is that we create a "null" ldisc
that way we can fix the race by setting the tty ldisc to the
null_ldisc before we called ->close on the previous ldisc. [We can't
open the new one first without fairly major changes]
I am reposting my analyis from Bug 122729 here again because it seems
to fit very well what you said. I am still wondering why there seems
to be no locking code whatsoever in n_tty_close(); but that may be due
to it assuming that locking happens in the ldisc layer.
Btw our customer also uses heavy serial multiplexing.
Here is my old Oops analysis:
The crash occurs at line 730 of drivers/char/n_tty.c in the
function n_tty_receive_buf(), because the "read_buf" element
of the controlling tty structure is NULL (see detailed
The read_buf element is checked in the same function in line 723,
7 lines earlier. Thus, it must have been NULLed in the meantime.
This is possible because there is no locking at least until
the spin_lock_irqsave(&tty->read_lock) in l. 727.
However, the only function (AFAICS) that sets read_buf to NULL,
n_tty_close(), does not seem to do any locking, so even if
the spin_lock_irqsave() in n_tty_receive_buf() happened earlier,
it would apparently not help.
There are two cases you get this crash. One of them is a driver level
bug when a driver calls ->receive in parallel with or after having
notified the serial layer of a hangup. That seems to be the one people
see in the real world with buggy drivers.
The second is the ldisc switch case which will probably only bite
people doing a lot of PPP sessions at a time on SMP machines.
n_tty can't do any locking because the race is bigger than n_tty
itself and requires the locking is higher up. Which would be all well
and good if ldisc did that locking. Similar race effects exist in the
switch to n_tty from ppp and so on also.
At the point you make "->receive is never called after ->close" an
invariant for ldisc then the n_tty code and the other ldiscs I've
looked at work. This is already meant to be an invariant but the one
case got missed.
Arjan probably cares about this for 2.6...
No ppp is involved in our case. It is some specific proprietary
protocol talked through the serial line with some food balances. But
the protocol is impemented in user space and uses the normal ttySX
Are you setting line discplines - if not then I don't see how this can
be the problem you are seeing
I'll inquire, thank you.
I don't quite understand why you think we wouldn't hit this bug
without switching the line discipline. The "driver level" problem you
describe could just a swell happen with the default line discipline,
If drivers make random wrong calls to the tty layer they will crash.
This has always been the case with operating systems. The ldisc switch
race is a very very specific situation and a very narrow window when
changing between tty and ppp for example. If you don't change ldisc
you can't hit the race as the racy code isnt ever called.
> If drivers make random wrong calls to the tty layer they will crash.
Does that mean, then, that the standard serial driver is buggy?
AFAICS, in our case, n_tty_receive_buf() was called by
the serial interupt handler.
In your situation is that occurring after a hangup triggered by the
serial driver itself. It's possible I guess. Instrumenting for ldisc
close before ldisc receive (ie n_tty_receive_buf) would be interesting
in that situation.
Ok seem to be at least three bugs kernel side
1. Some drivers call ldisc->receive_buf() directly. That's a nono but
is done in the kernel with offenders including pty and rocketport.
Could be out of tree offenders
2. Ldisc switching locks. I have a test fix I'm playing with in 2.6
for cases not in #1
3. Possibly raw/nonraw switching locks although I think that may
purely be a symptom of #1 but this needs verification. The n_tty layer
believes that if it sets TTY_NO_FLIP then it won't get input. Drivers
bypassing the flip queues break this assumption.
4. TIOCSTI ioctls break the assumption too
The bypassing seems to be appropriate for some drivers so we probably
need a function for "try_to_queue_to_ldisc()" or some similar helper
for these. Pty is trickier as it wants to queue and sleep if need be.
1. Don't call ldisc.receive_buf directly, if you must call it then
only call it if TTY_DONT_FLIP is clear. Preferably switch to using the
tty flip buffer logic as that will avoid this one
2. No workaround (RHEL fix may eventually require workaround 1 anyway)
3. Maybe workaround 1.
Alan, this seems to conflict with what Ted Ts'o said back in 2000?
Also, I checked with Digi, and their driver can be operated in two modes:
If the DGRP driver global variable called "rawreadok" is on,
(which is on by default) it will call ldisc->receive_buf() directly.
If that variable is off, the driver will fill up a flip buffer,
and schedule it to be delivered later with a call to
*** Bug 133109 has been marked as a duplicate of this bug. ***
Any news on this topic?
For reference the fix that got committed upstream into 2.4.29:
The fix has been deemed too invasive for a RHEL2.1 update at this time. Closing