Red Hat Bugzilla – Bug 164002
a write to a disconnected RS-232 serial device can cause process to hang in uninterruptable state on exit()
Last modified: 2007-11-30 17:07:07 EST
Description of problem:
If a process writes to an RS-232 serial device that is switched off,
and then calls exit(), it can enter an uninterruptible state where
it cannot be kill()-ed or ptrace()-ed.
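The sequence described above can be sketched in C roughly as follows. This is only an illustration, not the code mgetty actually runs: the helper name queue_output_on_serial is hypothetical, and enabling RTS/CTS flow control (CRTSCTS) is an assumption about why the output stays queued while the remote end is switched off.

```c
#define _DEFAULT_SOURCE   /* exposes CRTSCTS under strict -std= modes on glibc */
#include <fcntl.h>
#include <string.h>
#include <termios.h>
#include <unistd.h>

/*
 * Hypothetical helper: open a serial device, enable hardware flow control,
 * and queue some output on it. Returns the open descriptor, or -1 on error.
 */
int queue_output_on_serial(const char *device, const char *text)
{
    struct termios tio;

    /* O_NONBLOCK: do not wait for carrier; O_NOCTTY: do not adopt the line. */
    int fd = open(device, O_RDWR | O_NOCTTY | O_NONBLOCK);
    if (fd < 0)
        return -1;

    if (tcgetattr(fd, &tio) < 0) {
        close(fd);
        return -1;
    }

    /* Assumption: with RTS/CTS on and nothing asserting CTS, output stalls. */
    tio.c_cflag |= CRTSCTS | CLOCAL;
    if (tcsetattr(fd, TCSANOW, &tio) < 0) {
        close(fd);
        return -1;
    }

    /* Queue data that the driver cannot transmit while the peer is off. */
    if (write(fd, text, strlen(text)) < 0) {
        close(fd);
        return -1;
    }

    return fd;
}
```

A caller would invoke, say, queue_output_on_serial("/dev/ttyS0", "login: ") and then call exit(0); the implicit close of the descriptor during exit is the step where the process is reported to hang.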
For instance, mgetty (bug 162174), when configured to serve e.g. a serial
console on an RS-232 null modem cable connected to ttyS0, will be unable to
exit if the cable is not connected. It then prevents normal system
shutdown, because its open lock file prevents /var from being unmounted.
With mgetty configured for a "direct" line on /dev/ttyS0 with
and /etc/inittab containing:
after mgetty has run for more than two minutes with the cable disconnected
or the external modem switched off, it will time out and call exit(),
after it has written a login prompt to the device. Its exit() call
never completes and it cannot be killed or ptraced.
"ps" shows it with the process name in brackets and it cannot be
killed or traced:
# ps -ef | grep mgetty
root 5680 1 0 09:48 ttyS0 00:00:00 [mgetty]
# kill -9 5680
# gcore 5680
ptrace: Operation not permitted.
I then generated this sysrq-trigger output:
mgetty S 00000000 0 5680 1 5824 5679 (L-TLB)
Call Trace: [<e004ccc8>] do_get_write_access [jbd] 0x328 (0xde6b5d74)
[<c0124144>] schedule [kernel] 0x2f4 (0xde6b5d88)
[<c013522c>] schedule_timeout [kernel] 0xbc (0xde6b5dcc)
[<c015ff04>] __pte_chain_free [kernel] 0x24 (0xde6b5dec)
[<c01b806a>] tty_wait_until_sent [kernel] 0x9a (0xde6b5e04)
[<c01ca73c>] rs_close [kernel] 0x14c (0xde6b5e60)
[<c01b355e>] release_dev [kernel] 0x6ce (0xde6b5e84)
[<c0143ca1>] handle_mm_fault [kernel] 0xd1 (0xde6b5ec0)
[<c013f2af>] free_one_pmd [kernel] 0x8f (0xde6b5eec)
[<c013f202>] __free_pte [kernel] 0x52 (0xde6b5ef8)
[<c01b39f2>] tty_release [kernel] 0x32 (0xde6b5f30)
[<c016587a>] __fput [kernel] 0xea (0xde6b5f3c)
[<c01639ee>] filp_close [kernel] 0x8e (0xde6b5f58)
[<c012d01c>] put_files_struct [kernel] 0x6c (0xde6b5f74)
[<c012d90a>] do_exit [kernel] 0x1ba (0xde6b5f90)
[<c012dc6b>] do_group_exit [kernel] 0x8b (0xde6b5fac)
The kernel should not hang the process on exit in an uninterruptible state
when closing the tty device if the device is not switched on.
Can't the kernel detect whether the device on the other end of the cable
is switched on / listening with the RS-232 protocol? If so, it should
not hang the process on a close() with unwritten data to a device that
is not switched on.
When the device is switched on, then the kernel is able to write the data,
and the mgetty process is able to exit (so the kernel must have known that
it was switched off and should not have hung the process when it was trying
to do an exit).
Yes, I can probably fix this in mgetty by doing a tcflush(1,TCOFLUSH) before
the exit(). But I think it is wrong for the kernel to block the exit() and
leave a process where it cannot be kill()-ed or ptrace()-ed, and where it can
potentially prevent the whole system from shutting down cleanly, just because
it has written to a device that is disconnected. If this can be fixed in the
kernel, it should be.
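For reference, the user-space workaround mentioned above could look roughly like this. It is a minimal sketch, not a tested mgetty patch: the helper names are hypothetical, and whether discarding the queued output actually avoids the hang on this kernel is an assumption.

```c
#include <stdio.h>
#include <stdlib.h>
#include <termios.h>

/*
 * Hypothetical helper: discard output that was written to the tty but not
 * yet transmitted. Returns 0 on success, -1 if tcflush() fails (for example
 * because fd is not a terminal).
 */
int discard_pending_output(int fd)
{
    if (tcflush(fd, TCOFLUSH) < 0) {
        perror("tcflush");
        return -1;
    }
    return 0;
}

/*
 * Hypothetical helper: drop any unsent output on fd, then exit, so that the
 * close performed during exit has nothing left to wait for.
 */
void exit_without_draining(int fd, int status)
{
    discard_pending_output(fd);   /* best effort; exit regardless */
    exit(status);
}
```

In the mgetty case this would be called as exit_without_draining(1, status), matching the tcflush(1,TCOFLUSH) suggested above. The cost is that a login prompt still sitting in the output queue is silently dropped.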
Version-Release number of selected component (if applicable):
Steps to Reproduce:
Run mgetty in direct mode for a disconnected tty
Actual results:
After 2 minutes, mgetty cannot be killed and the system cannot be shut down
cleanly (/var cannot be unmounted).
Expected results:
mgetty should be able to exit and the system should shut down cleanly.
Fixing this probably requires modifying every serial driver, and someone
needs to look at the spec in detail regarding close delay behaviour (the
same problem occurs with the ldisc switch in 2.6).
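As a side note on close delay: Linux serial drivers expose a per-port closing_wait setting through the TIOCGSERIAL/TIOCSSERIAL ioctls (the same setting setserial adjusts). The sketch below shows how a privileged process might turn the wait off for one port. The helper name is hypothetical, and it is an assumption, not something verified in this report, that the driver in question honours the setting on the close path shown in the trace.

```c
#include <linux/serial.h>
#include <sys/ioctl.h>

/*
 * Hypothetical helper: ask the serial driver not to wait for queued output
 * when the port is closed. Returns 0 on success, -1 on failure (errno is
 * left as set by ioctl()). Changing closing_wait normally requires root.
 */
int disable_closing_wait(int fd)
{
    struct serial_struct ss;

    /* Read the current port settings so only one field is changed. */
    if (ioctl(fd, TIOCGSERIAL, &ss) < 0)
        return -1;

    /* ASYNC_CLOSING_WAIT_NONE: do not wait for unsent data on close. */
    ss.closing_wait = ASYNC_CLOSING_WAIT_NONE;

    if (ioctl(fd, TIOCSSERIAL, &ss) < 0)
        return -1;

    return 0;
}
```

This only changes the behaviour of one port from user space. It does not address the underlying point that an unbounded, uninterruptible wait on close may need to be fixed driver by driver.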
This bug is filed against RHEL 3, which is in maintenance phase.
During the maintenance phase, only security errata and select mission
critical bug fixes will be released for enterprise products. Since
this bug does not meet those criteria, it is now being closed.
For more information on the RHEL errata support policy, please visit:
If you feel this bug is indeed mission critical, please contact your
support representative. You may be asked to provide detailed
information on how this bug is affecting you.