Bug 73376 - /var/lock files detection by apps using serial lines
Status: CLOSED CURRENTRELEASE
Product: Red Hat Linux
Classification: Retired
Component: kernel
Version: 7.3
Hardware: i386 Linux
Priority: medium, Severity: medium
Assigned To: Arjan van de Ven
QA Contact: Brian Brock
Duplicates: 81025
Reported: 2002-09-03 14:24 EDT by giulioo
Modified: 2007-04-18 12:46 EDT
Doc Type: Bug Fix
Last Closed: 2003-05-14 14:20:18 EDT
Description giulioo 2002-09-03 14:24:24 EDT
glibc-2.2.5-39
kernel 2.4.18-10
uucp-1.06.1-33.7.2
mgetty-1.1.28-3
HylaFAX-4.1.3

I filed this under glibc, but feel free to move it to kernel or whatever :-)
I didn't file it under mgetty or uucp because the problem also occurs with HylaFAX.

1)
On Red Hat 6.x if you had mgetty on the line and did

cu -l ttySX
type some char

you'd see from the mgetty log that it did notice something took over the line,

09/03 16:27:05 yS3  waiting...
09/03 16:27:22 yS3    select returned 1
09/03 16:27:22 yS3   checking lockfiles, locking the line
09/03 16:27:22 yS3   makelock(ttyS3) called
09/03 16:27:22 yS3   do_makelock: lock='/var/lock/LCK..ttyS3'
09/03 16:27:22 yS3  lock not made: lock file exists (pid=16872)
09/03 16:27:22 yS3   lock file exists (dialout)!

and, after you exited "cu", mgetty would re-init the modem using init-chat.

On Red Hat 7.3 mgetty does not notice something else is using the line and when
you exit "cu" the modem is not reset with mgetty's init-chat.

2)
On Red Hat 6.x if you had faxgetty (HylaFAX) on the line and submitted an
outbound job, faxgetty would notice this and print a message to the system log
about the modem device being locked by something else; when the outgoing job was
finished faxgetty would re-init the modem.

On Red Hat 7.3 faxgetty does not notice something else is using the line and,
when the fax job has been sent, faxgetty does not re-init the modem and does not
answer subsequent fax calls. You have to kill it so that it respawns and works
again.

Is there an explanation for this?
I read about the /var/lock group stuff, but it seems that the lock files are
created ok, it's just that apps don't notice...
Comment 1 Need Real Name 2002-09-04 04:40:06 EDT
I have noticed the same effect between Red Hat versions 7.2 and 7.3, but my fax
answers incoming calls anyway, despite the fact that it has the wrong status and
is not reset. I have not figured out why this is working so far ...
Comment 2 Nicola Migliorini 2002-09-06 05:06:26 EDT
I have the same problem between Red Hat versions 7.2 and 7.3, and my modem
doesn't answer incoming calls because it isn't initialized.
This situation prevents us from using HylaFAX.
How can we resolve the problem?
Comment 3 Nicola Migliorini 2002-09-14 11:49:51 EDT
I have recompiled kernel 2.4.19 with sources from www.kernel.org, and that
resolved the problem.
Comment 4 giulioo 2002-09-14 13:57:33 EDT
I can confirm the problem seems to be in the Red Hat kernel (so I changed
bugzilla component).
vanilla 2.4.18 and 2.4.19 do not cause the problem.
Comment 5 giulioo 2002-09-15 11:23:49 EDT
Red Hat kernel
== strace /sbin/mgetty ttyS0
...
write(3, "\n09/15 13:49:22 yS0   waiting fo"..., 63) = 63
ioctl(0, SNDCTL_TMR_TIMEBASE, {B38400 -opost -isig -icanon -echo ...}) = 0
ioctl(0, SNDCTL_TMR_START, {B38400 -opost -isig -icanon -echo ...}) = 0
ioctl(0, SNDCTL_TMR_TIMEBASE, {B38400 -opost -isig -icanon -echo ...}) = 0
read(0, "\r", 1)                        = 1
write(3, "[0d]", 4)                     = 4
read(0, "\n", 1)                        = 1
write(3, "[0a]", 4)                     = 4
read(0, "", 1)                          = 0
ioctl(0, SNDCTL_TMR_START, {B38400 -opost -isig -icanon -echo ...}) = 0
ioctl(0, SNDCTL_TMR_TIMEBASE, {B38400 -opost -isig -icanon -echo ...}) = 0
time(NULL)                              = 1032090563
write(3, "\n09/15 13:49:23 yS0   removing l"..., 40) = 40
unlink("/var/lock/LCK..ttyS0")          = 0
time(NULL)                              = 1032090563
write(3, "\n09/15 13:49:23 yS0  waiting...", 31) = 31
rt_sigaction(SIGHUP, {SIG_IGN}, {SIG_IGN}, 8) = 0
select(1024, [0], NULL, NULL, {3600, 0}
<nothing more even when "cu -l ttyS0" and type some char>
==

== vanilla
...
...
write(3, "\n09/15 17:20:23 yS0   waiting fo"..., 63) = 63
ioctl(0, SNDCTL_TMR_TIMEBASE, {B38400 -opost -isig -icanon -echo ...}) = 0
ioctl(0, SNDCTL_TMR_START, {B38400 -opost -isig -icanon -echo ...}) = 0
ioctl(0, SNDCTL_TMR_TIMEBASE, {B38400 -opost -isig -icanon -echo ...}) = 0
read(0, "\r", 1)                        = 1
write(3, "[0d]", 4)                     = 4
read(0, "\n", 1)                        = 1
write(3, "[0a]", 4)                     = 4
read(0, "", 1)                          = 0
ioctl(0, SNDCTL_TMR_START, {B38400 -opost -isig -icanon -echo ...}) = 0
ioctl(0, SNDCTL_TMR_TIMEBASE, {B38400 -opost -isig -icanon -echo ...}) = 0
time(NULL)                              = 1032103223
write(3, "\n09/15 17:20:23 yS0   removing l"..., 40) = 40
unlink("/var/lock/LCK..ttyS0")          = 0
time(NULL)                              = 1032103223
write(3, "\n09/15 17:20:23 yS0  waiting...", 31) = 31
rt_sigaction(SIGHUP, {SIG_IGN}, {SIG_IGN}, 8) = 0
select(1024, [0], NULL, NULL, {3600, 0}

< cu -l ttyS0 and type some char>  <=============================

) = 1 (in [0], left {3544, 850000})
time(NULL)                              = 1032103278
write(3, "\n09/15 17:21:18 yS0    select re"..., 40) = 40
close(3)                                = 0
munmap(0x40013000, 4096)                = 0
open("/var/log/mgetty.log.ttyS0", O_WRONLY|O_APPEND|O_CREAT, 0666) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=10855, ...}) = 0
old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40013000
fstat64(3, {st_mode=S_IFREG|0644, st_size=10855, ...}) = 0
_llseek(3, 10855, [10855], SEEK_SET)    = 0
fcntl64(3, F_SETFD, FD_CLOEXEC)         = 0
time(NULL)                              = 1032103278
write(3, "\n09/15 17:21:18 yS0   checking l"..., 58) = 58
time(NULL)                              = 1032103278
write(3, "\n09/15 17:21:18 yS0   makelock(t"..., 44) = 44
time(NULL)                              = 1032103278
write(3, "\n09/15 17:21:18 yS0   do_makeloc"..., 62) = 62
gettimeofday({1032103278, 645728}, NULL) = 0
getpid()                                = 2073
open("/var/lock/LCK..TM.bgpsov", O_RDWR|O_CREAT|O_EXCL, 0600) = 4
chmod("/var/lock/LCK..TM.bgpsov", 0644) = 0
...
...
==
Comment 6 Need Real Name 2002-09-16 10:21:34 EDT
Hi,

this sounds less like a lock file issue and more like a kernel select() thing -
when sitting in state "waiting...", mgetty does a select() on the tty device.
Only if that call returns (due to some characters being in the read queue) does
mgetty check whether there's a tty lock file.  If there is, it's an outgoing
call; if there is no lock file, it's a modem RING.
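
The decision described above can be sketched roughly as follows. This is a minimal Python illustration, not mgetty's actual code (mgetty is written in C). The function names, the return values and the `lock_dir` parameter are assumptions made up for the example; only the `/var/lock/LCK..ttyS3` naming and the 3600 second timeout come from the log and strace output in this report.

```python
import os
import select

# Assumption: UUCP-style lock directory, as seen in the mgetty log above.
LOCK_DIR = "/var/lock"


def lock_path(tty_name, lock_dir=LOCK_DIR):
    """Return the UUCP-style lock file path, e.g. /var/lock/LCK..ttyS3."""
    return os.path.join(lock_dir, "LCK.." + tty_name)


def classify_activity(tty_fd, tty_name, timeout=3600.0, lock_dir=LOCK_DIR):
    """Wait for input on tty_fd, then decide what caused it.

    Returns "timeout" if nothing became readable, "outgoing" if a lock
    file exists (another program owns the line), or "ring" if there is
    no lock file (treat the data as an incoming call).
    """
    readable, _, _ = select.select([tty_fd], [], [], timeout)
    if not readable:
        # select() never woke up: the lock file is not even looked at.
        return "timeout"
    if os.path.exists(lock_path(tty_name, lock_dir)):
        return "outgoing"
    return "ring"
```

The point of the sketch is the ordering: the lock file is only consulted after select() returns, so if select() never reports activity, the getty stays in the waiting state no matter how the lock file was created.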

It might seem like a good idea to "hide" the outgoing call from mgetty (so
no more problems with programs that have broken tty lock files), but it will
also mean that mgetty won't notice that there was activity, and has no 
chance to re-initialize the modem - and this is BAD BAD BAD BAD.

gert
Comment 7 Jeff Johnson 2002-09-17 08:18:35 EDT
Red Hat 7.3 and later use baudboy from the lockdev
package to move the group permission needed to access
/var/lock into a helper binary.

Hylafax and mgetty need to be changed to use lockdev.

Off to mgetty for a fix.
Comment 8 Need Real Name 2002-09-17 16:29:12 EDT
Device locking is not the point here.  It doesn't even get to the point
where the lock file is *checked*, no matter what kind of locking library
you are using.

Please read my comment and try to understand what's going on.  This is NOT
a mgetty issue (and it WORKS with the Linus standard kernel).

Please don't start breaking mgetty again - we have been through that loop
before.  Don't fix what isn't broken, and don't break what you do not
understand.

gert
Comment 9 Lee Howard 2002-10-07 14:52:52 EDT
HylaFAX CVS (to become 4.1.4 someday) has provided two workarounds for this
issue (which also surfaces with other OSes' and some serial drivers' select()
bugs).  First, it is important to realize that on any system it is possible to
sneak data in and out underneath a select()'s nose... if the retrieval system
pulls the data out fast enough.

So, HylaFAX faxgetty monitors the UUCP lock directory for lockfiles - AND -
HylaFAX faxsend sends a message to HylaFAX faxgetty to go to the "LOCKWAIT"
state just before it uses the modem (to be doubly-sure).
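
The first workaround (watching the lock directory instead of relying only on select()) could look roughly like this. It is a hedged Python sketch, not HylaFAX's implementation. The function names are hypothetical, and it assumes the lock file holds the owner's PID as ASCII text, which is one common UUCP lock format but is not confirmed by this report.

```python
import os


def read_lock_pid(path):
    """Return the PID stored in an ASCII lock file, or None if the file
    is missing, empty, or not parsable as a positive integer."""
    try:
        with open(path, "rb") as f:
            data = f.read(32)
        pid = int(data.split()[0])
    except (OSError, ValueError, IndexError):
        return None
    return pid if pid > 0 else None


def line_is_locked(tty_name, lock_dir="/var/lock"):
    """Return True if a live process appears to hold the lock for tty_name."""
    pid = read_lock_pid(os.path.join(lock_dir, "LCK.." + tty_name))
    if pid is None:
        return False
    try:
        os.kill(pid, 0)  # signal 0: existence check only, nothing is sent
    except (ProcessLookupError, OverflowError):
        return False  # stale lock: the owning process is gone
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True
```

A getty-like daemon could call `line_is_locked("ttyS0")` periodically while idle and re-initialize the modem once the result goes from True back to False.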

This isn't to say, of course, that Red Hat shouldn't fix their broken kernel. ;-)
Comment 10 Aidan Van Dyk 2002-10-07 15:00:07 EDT
HylaFAX guy (one of them) here...

Agreeing with Gert, device locking is not the point here.

We are reading from the device, using select to see if something goes on.  If we
see any activity on the device (read set or exception set), we then know that it
might be time to lock the device (and start locking/ processing).

But we can sit in the select, something can open the device under us, and even
close it again, and select NEVER notifies us.  Even if Red Hat's kernel mods
deliberately don't notify us of data because someone else reads it first, the
kernel should at least give us an exception when the other program closes the
descriptor.

a.
Comment 11 giulioo 2002-10-24 10:28:05 EDT
Latest kernel errata didn't help.

For jbj@redhat.com:
I think that moving this to mgetty was a bad idea. 
So I'm changing component to kernel again (changing assignee too), since vanilla
kernels work.
If you like, you can open a bug against mgetty for the baudboy thing, but you
didn't explain how baudboy would affect the behavior this bug is about.

I emailed nalin@redhat.com (this bug's assignee) 4 days ago to hear his opinion
but didn't get a response.

If this change is bad, please undo :-)

Thanks
Comment 12 giulioo 2003-05-14 14:20:18 EDT
Red Hat 9 kernel 2.4.20-6 restored correct behavior.
Hope future errata will not break it again :)

It would be nice if you (Red Hat) could say what the cause was so that the fix
could be backported by those who need to stay with old Red Hat kernels.
Comment 13 giulioo 2003-05-14 14:21:57 EDT
*** Bug 81025 has been marked as a duplicate of this bug. ***
Comment 14 Arjan van de Ven 2003-05-14 14:22:31 EDT
the supported kernel for all releases is now 2.4.20, as of a few minutes ago..
so no need.
