Bug 76959 (keyrepeat)

Summary: applications receive too many keyboard events from X
Product: [Retired] Red Hat Linux
Component: XFree86
Version: 8.0
Hardware: i386
OS: Linux
Status: CLOSED ERRATA
Severity: high
Priority: medium
Reporter: Gordon Messmer <gordon.messmer>
Assignee: Mike A. Harris <mharris>
QA Contact: David Lawrence <dkl>
CC: aoliva, bostjan, dab0816, jifl-bugzilla, jonkv, k.georgiou, mitr, tao
Doc Type: Bug Fix
Last Closed: 2004-02-13 08:46:51 UTC

Description Gordon Messmer 2002-10-30 06:12:45 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 Galeon/1.2.5 (X11; Linux i686; U;) Gecko/20020809

Description of problem:
When the system is busy, and particularly when applications are slow to respond,
applications will receive multiple keyboard events from the X server for a
single key press.  This bug
has been reported to the GNOME bugzilla here:
http://bugzilla.gnome.org/show_bug.cgi?id=90996

Some additional information is there.  The problem was originally reported
against the sawfish component, but I observe it using metacity.  I observe this
problem most often when using VMware on my Dell PC, which causes applications to
respond slowly, and also on a Dell Inspiron 7000 running only software installed
from Red Hat Linux 8.0 (where the CPU causes applications to respond slowly  ;)

The problem affects at least GTK+-1.2 and GTK+-2.0 applications.  I get
erroneous keyboard repeats in Evolution's mail composer (character key repeats),
Evolution's message list (ctrl+d key combo repeats), gnome terminal (tab repeats
after ALT+TAB), and sawfish, where several key combos seem to be received many
times.

It seems like a number of years ago I read that the X server would re-send
events to applications that didn't acknowledge the initial event, but I can't
find that documented anywhere, so I could be very mistaken... just speculating.

This problem should probably be considered serious.  Evolution for me has
deleted a number of messages that it should not have, though I just undeleted
them.  I have a report from a coworker that while using 'vi' in a KDE terminal,
he deleted a number of lines from his 'smbpasswd' file when he meant to delete
one, and saved the file.  He noticed later when a number of users could not log
in.  This could be the same issue, but I can't confirm that it happens with KDE.

Version-Release number of selected component (if applicable):


How reproducible:
Sometimes

Steps to Reproduce:
1. Run something that keeps the load high
2. use the system for a while, as painful as that may be.

	

Additional info:

Comment 1 Mike A. Harris 2002-11-01 02:49:50 UTC
Red Hat does not support VMware as an installation target, nor do we
support systems which are running VMware as the VMware kernel modules
effectively take over the kernel.

You'll need to talk to VMware technical support for resolution of this
issue.

Comment 2 Gordon Messmer 2002-11-01 06:34:17 UTC
I also pointed out that I see this problem on a laptop running only Red Hat
Linux 8.0.  There, I'm running GNOME 2 with Metacity, usually with Evolution,
gnome-terminal, and GNU Emacs.  The open bug report at gnome.org makes no
mention of VMware, but includes many users who see this problem.  I only mentioned VMware
at all because it causes the system I run it on to slow down, and exhibit the
problem more often.  The laptop, which is slow to begin with, shows the problem
much more often.

Comment 3 Mike A. Harris 2002-11-01 06:45:13 UTC
Well, it's either a kernel bug then, or some hardware issue as _nothing_
has changed in XFree86 that would cause this behaviour to occur.  Prove
to me that it is an XFree86 bug, and I'll investigate fixing it.  However,
I'm not going to waste my time debugging something that I don't consider to
be a bug in XFree86, and which is not reproducible on any of 5
machines here.

Our kernel maintainer is CC'd currently for comment in case perhaps
there is a known kernel issue.  Either way, XFree86 bug or not (which
I severely doubt), I can't do squat about it if I can't reproduce it
or have detailed technical analysis of the problem.

Comment 4 Mike A. Harris 2002-11-02 02:03:36 UTC
Reassigning to kernel component for comment.

Comment 5 Gordon Messmer 2002-11-03 00:37:00 UTC
I've just done some more testing which may or may not be useful.

On my laptop (Dell Inspiron 7000, 333 MHz Mobile Pentium II, ATI 3D Rage LT Pro
video), which is rather slow, the problem is pretty easy to reproduce.

GNOME 2 defaults to setting the keyboard repeat rate to a 500 millisecond
delay, with a 30 keys per second repeat rate.  KDE defaults to a 660 millisecond
delay, with a 25 keys per second repeat rate.  I observe the problem under
GNOME's default settings if I busy the system and type into the gnome terminal.
I do not observe the problem if I busy the system in the same way (run
glforestfire fullscreen on a different desktop) when I type into the KDE terminal.

Given that, I tried setting the keyboard repeat rate to "rate 500 30" under KDE,
busied the system, and typed into the KDE terminal.  Keys began repeating at
random, just like they did in GNOME.  I also tried setting the rate to "rate 660
25" under GNOME, and could no longer reproduce the key repeating problem.

I think this eliminates the possibility that the problem is in the X client
software, and points strongly toward a problem in XFree86 (or the kernel, I
guess...)

The problem should be very easy to reproduce if you have a slower machine. 
Simply run glforestfire in a full screen window and type into another window,
such as a terminal or text editor.
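
As a rough illustration of why the two settings might behave differently, the sketch below counts how many auto-repeat events a key would generate if it appeared to be held for a given time. It assumes a simple model that is not taken from the XFree86 source: the first repeat fires once the initial delay has elapsed, and further repeats follow at the configured rate. The 600 ms apparent hold time is an invented figure standing in for a stall on a loaded system.

```python
def repeat_events(held_ms: int, delay_ms: int, rate_per_s: int) -> int:
    """Count auto-repeat events for a key that appears held for held_ms.

    Assumed model (not XFree86 code): the first repeat fires when the
    initial delay expires, then one more every 1000 / rate_per_s ms.
    """
    if held_ms < delay_ms:
        return 0
    # Integer arithmetic avoids floating-point rounding at exact multiples.
    return 1 + ((held_ms - delay_ms) * rate_per_s) // 1000


if __name__ == "__main__":
    apparent_hold_ms = 600  # hypothetical stall, chosen for illustration
    for name, delay_ms, rate in (("GNOME 2 default", 500, 30),
                                 ("KDE default", 660, 25)):
        count = repeat_events(apparent_hold_ms, delay_ms, rate)
        print(f"{name}: {count} repeat event(s)")
```

Under this model a 600 ms apparent hold exceeds GNOME's 500 ms delay but not KDE's 660 ms delay, which would be consistent with the observations above. It says nothing about where the extra hold time comes from.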


Comment 6 Gordon Messmer 2002-11-15 19:57:59 UTC
This bug continues to affect my laptop (particularly), making it a PITA to use.
I don't know if the problem is caused by a kernel change or not, but I'd like
to speculate about the symptoms that I see.  The problem occurs when I press a
key, and the application with focus freezes momentarily before receiving it.

So... What happens if the X server gets a keypress event, passes it on to the X
clients, but blocks in doing so for a period of time greater than the keyboard
repeat delay?  When the write() sending the keypress event off returns, what is
the state of the key?  Could the X server believe that it's still down and send
off another event before it gets the keyrelease event?
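
The scenario in these questions can be written down as a toy model. The sketch below only illustrates the speculation above and is not the X server's actual event loop: the function name, the timings, and the `drain_queue_first` switch are all hypothetical. It models a server that delivers a key press, is blocked until `wake_ms`, and then either consults its repeat timer before reading the queued release, or reads the queue first. At most one repeat is modeled.

```python
def client_sees(press_ms, release_ms, wake_ms, delay_ms, drain_queue_first):
    """Return the events a client receives for one key tap in this toy model."""
    seen = ["press"]  # the press itself is delivered before the stall
    release_queued = release_ms <= wake_ms

    if drain_queue_first and release_queued:
        # The server learns about the release before checking the timer,
        # so the key is no longer considered down and nothing repeats.
        seen.append("release")
        return seen

    # The server still believes the key is down; if the stall outlasted
    # the repeat delay, it emits a repeat the user never asked for.
    if wake_ms - press_ms >= delay_ms:
        seen.append("repeat")
    if release_queued:
        seen.append("release")
    return seen


if __name__ == "__main__":
    # Hypothetical timings: an 80 ms tap, with the server blocked for 600 ms.
    for drain_first in (False, True):
        events = client_sees(press_ms=0, release_ms=80, wake_ms=600,
                             delay_ms=500, drain_queue_first=drain_first)
        print(f"drain_queue_first={drain_first}: {events}")
```

In this model the spurious repeat appears only when the stall is longer than the repeat delay and the timer is checked before the pending release is read.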

Comment 7 Need Real Name 2002-11-26 10:29:28 UTC
There are quite a few bug reports against various parts of X and GNOME which I
believe are the same as this one. I had the same problem on my 2 laptops:
keyboards on both were locking up after using shortcut keys for switching
workspaces. After I prepared a metacity log file for Havoc, he saw that extra
keyboard events were being sent. The bug reports are: 74760, 74635 and probably
others. As others have reported as well, this problem only happens with Red Hat
kernels; when using stock 2.4.19 or 2.4.20-rc1/2 the problem goes away.
I've also tried compiling a custom RH 2.4.18-18 kernel with a few things taken
out, such as low latency, etc.; that didn't help.
Any idea what I should try? I'd be willing to try patches, etc.

Comment 8 Gordon Messmer 2002-11-26 15:49:32 UTC
Thanks for the report.  I had suspected that the bug was caused by the HZ change
in Red Hat's kernel, but removing the patch entirely produced a source tree that
didn't complete building for i686.  Based on your report, I'll give reverting
that change one more try and see if the problem goes away.

Comment 9 Need Real Name 2002-11-26 16:25:47 UTC
How about just setting the HZ value to what it was before?

Comment 10 Mike A. Harris 2002-11-26 16:35:18 UTC
How about upgrading to the latest erratum kernel?



Comment 11 Need Real Name 2002-11-26 16:52:02 UTC
I have tried that (i.e. running 2.4.18-18.8.0) and I have the same problem.

Comment 12 Gordon Messmer 2002-11-30 17:47:11 UTC
I've been able to reproduce the problem using the latest erratum i686 kernel
recompiled to use a HZ value of 100.  I'll get to try an unpatched kernel in the
next week, hopefully.

Comment 13 Need Real Name 2002-12-02 13:10:43 UTC
Same here - I tried with HZ=100 as well as disabling the low latency.

Comment 14 Gordon Messmer 2002-12-04 05:58:29 UTC
I've just done a clean install of RHL 8.0 on the slow laptop, and with a new
user (all default settings), I've been able to reproduce the problem with the
default kernel, kernel-2.4.18-14.i686.rpm, as well as
kernel-2.4.18-18.8.0.i386.rpm (to test for arch-dependent problems), and
kernel-2.4.18-18.7.x.i686.rpm and kernel-2.4.18-3.i686.rpm from 7.3.

It doesn't look like this is a kernel bug.  Can we move this back to XFree86 and
ask mharris to look at it again?

Comment 15 Need Real Name 2002-12-04 08:38:57 UTC
I would still say this is a kernel bug - for me the problem goes away if I use
a non-Red Hat kernel. I've tried quite a few (2.4.19, 2.4.20-rc2/3, a few -ac
ones) and I can't reproduce the problem with any of them.

Comment 16 Need Real Name 2002-12-04 20:11:19 UTC
I also see this problem, even when typing at a tcsh shell prompt, on a fully
"up2date" Red Hat 8.0 machine with 512 MB and a 1 GHz Athlon processor when it gets
loaded down.  It is extremely pronounced when using MS Word inside of VMware on
the machine.  It is basically unusable.  Please look into this before I am
forced to abandon 8.0.

Comment 17 Mike A. Harris 2002-12-05 00:13:27 UTC
I'm unable to produce this on any kernel, with any XFree86 release.

There is nothing at all that has changed in XFree86 that could all of a
sudden cause a problem like this, however various people have noted
this problem when changing kernels.  Something is broken either in the
kernel, or in the hardware.

If someone believes this is an X problem, they'll have to convince me
by debugging it and pointing out the flaw in the X source code directly,
otherwise as far as I'm concerned it is a kernel problem.  Ultimately
someone who can actually reproduce this is going to have to debug it,
whatever the problem is.

Comment 18 Need Real Name 2002-12-05 09:29:53 UTC
I agree this is not an XFree problem - to me it seems clear it's a kernel
problem (if hardware was the problem I would imagine it would show on all the
various kernel versions used).
Any idea, though, how we could try to figure out which part of the kernel? The
problem is quite hard to reproduce, but since stock 2.4.19 or 2.4.20 works I can only
imagine that one of the RH patches, or its influence on another subsystem, must be
causing the problem.

Comment 19 Gordon Messmer 2002-12-07 21:39:09 UTC
Well, damndest of all things.  After reproducing this with kernels from both 7.3
and 8.0, including the latest erratum, I built and tested stock 2.4.18 and
2.4.20.  Stock 2.4.18 has the problem, but despite my best efforts I can't
reproduce the problem on 2.4.20.

It doesn't look like this is the result of any patches that Red Hat's applied to
the kernel, but I wonder what of relevance has changed in between 2.4.18 and
2.4.20...

Comment 20 Féliciano Matias 2002-12-11 03:43:19 UTC
Sorry for my English.

I use RH 8.0 and I have the same problem.

I see that the Red Hat kernel sets the priority of X to -10. If I renice the X
process to 0, I don't have this problem. I don't have this problem with the official
Linux kernel (2.4.19, 2.4.20).

When I switch to another console and come back to the X console, the kernel sets
the priority of the X process to -10 again.

Can someone tell me which patch does this "fun" stuff? (I don't like this.)
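
The renice workaround described above could be scripted roughly as follows. This is a sketch using Python's standard `os.setpriority` and `os.getpriority`; the helper name `renice_to` is made up for the example, and finding the X server's PID is left out. Changing another user's process, or lowering a nice value, normally requires root, so the demonstration only touches the current process.

```python
import os


def renice_to(pid: int, nice_value: int) -> int:
    """Set the nice value of process `pid` and return the value now in effect."""
    os.setpriority(os.PRIO_PROCESS, pid, nice_value)
    return os.getpriority(os.PRIO_PROCESS, pid)


if __name__ == "__main__":
    # Demonstrate on our own process by re-applying its current nice value,
    # which needs no special privileges. For the X server one would pass its
    # PID and 0 instead, as root.
    me = os.getpid()
    current = os.getpriority(os.PRIO_PROCESS, me)
    print(f"pid {me}: nice value is {renice_to(me, current)}")
```

Since the kernel is reported above to reset X to -10 after a console switch, a call like this would have to be repeated each time and is a workaround rather than a fix.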

Comment 21 Need Real Name 2003-01-03 18:09:13 UTC
I still have the same problem with Rawhide kernel 2.4.20-2 (but not with stock
2.4.20 from kernel.org).

Comment 22 Need Real Name 2003-01-10 15:26:51 UTC
After a longer testing run on both machines that had the problem with 2.4.18-???, I
can now confirm that upgrading to 2.4.20-2.2/6/9 (all from rawhide) fixes the
problem completely (what I saw in my previous post must have been something
else). 2.4.20-2.2 did lock up my machine once so far, but I guess that's to be
expected when running a rawhide kernel.

Comment 23 Jonathan Larmour 2003-01-12 04:39:50 UTC
Can I just throw my hat into the ring here with something that I believe to be
related? I'm using 2.4.18-19.7.x on a 7.3 system. I had previously been using
2.4.9-something that was the previous errata kernel for 7.3 without problems.
After updating to this latest errata kernel (2.4.18-19.7.x) I noticed oddities
with my VMware client (Windows client, Linux host).

In the Windows client I got odd keyboard behaviour: if I type, the keys repeat
of their own accord, e.g. I type "this is a test" and it comes out
"ttttttttttthis is a test"

I also noticed while VMware was running the CPU load seemed to be very very
high. When trying to do pretty much anything in the client (open a window, run a
program), the CPU load monitor showed it staying at 100% until the operation
completed, and the operation was completing *very* slowly. In fact the whole of
the system, i.e. normal linux, not just VMware, was at a deathly crawl.

I finally noticed from 'top' that one of the VMware threads was running at
priority -10 and devouring CPU. When I reniced that to 0 all of a sudden
everything worked fine as it had with 2.4.9!

I suspect a form of race condition between the threads meant that the -10 niced
portion was starving the rest of the system of CPU.

But why had it started doing this? It turns out it isn't VMware nice'ing itself
to -10, but I believe a Red Hat kernel patch to arch/i386/kernel/ioport.c.
Search for set_user_nice.

I understand this was done to improve X performance. Unfortunately it isn't just
X that is affected, and the result is that applications that rely on being close
to the hardware like VMware can also break.

Assuming I'm right, you should find some other way to improve X performance.
Since X runs as root, why don't you just adjust the XFree86 code to call nice(-10)?

Hope this helps,

Jifl


Comment 24 Mike A. Harris 2003-01-24 04:29:02 UTC
It isn't a change that was done for X, but was a change that was done
in order to give processes that do I/O more CPU to improve interactivity.

I have never liked the idea of the patch from the start, and would prefer
that users set the priority of X themselves, or have the X server (or some
other process) do it internally.

I believe that our latest kernels do not have this renicing patch any more
(good riddance IMHO), and any perceived or real problems associated with
it will be removed when the user upgrades to a newer kernel not containing
the patch.

Some users as well as developers nonetheless still believe that renicing
the X server improves interactivity, performance, or whatever, and want
the ability to easily renice the X server.   I will be investigating
adding an option to the X server itself to implement this in the future.

Arjan, if the latest 8.0 erratum kernel has this patch removed, we can
probably close this as ERRATA now.

Comment 25 Gordon Messmer 2003-02-08 08:55:56 UTC
I haven't used the laptop in a while, so it's been a while since I tested, but...

Contrary to my previous report, I can reproduce the problem on stock 2.4.20.  It
occurs less frequently, but still occurs.  I booted the same kernel that I
previously tested and ran two instances of glforestfire instead of just one to
add more load to the system, and ran gedit.  While typing, some characters came
in double.

X definitely seems more responsive, and less troublesome without the renicing
patch, but the problem still exists.

Comment 26 Jonas Kvarnström 2003-03-21 19:06:05 UTC
This happens to me on an 800 MHz P3 running Red Hat 8.0.  It started happening
around Christmas, when I upgraded from 7.3, and it's still happening using the
latest kernel (kernel-2.4.18-24.8.0).

It happens in various places, but an almost certain way of reproducing this is
creating new tabs in Mozilla.  I usually get two new tabs when I press ctrl-T --
presumably because creating a new tab is a CPU-intensive operation.  If I press
Ctrl-T when an empty tab is showing, I'm more likely to get only one new tab
(less work required to remove the old contents?).  And when I switch between
tabs I often skip ahead two or three tabs at a time.  This is not Mozilla's
fault:  It started happening when I upgraded from 7.3 to 8.0, without changing
the version of Mozilla I was using.



Comment 27 Gordon Messmer 2003-03-22 03:55:45 UTC
Based on a comment from a user on the redhat-list, I tried disabling XKEYBOARD
in the X server by uncommenting the Option "XkbDisable" in XF86Config.

When I disable that extension and test by running two full screen glforestfire
instances and typing in gedit, I can't reproduce the problem after typing the
alphabet 10 times.

When I turn the extension back on (comment out the XkbDisable option), and run
the same test, I get repeated letters every single time I type the alphabet.

This test was done on the 333 Mhz laptop described earlier, running stock Red
Hat Linux 8.0.

I think this strongly suggests that the bug is in the XKEYBOARD extension in
XFree86, and not the kernel.


Comment 28 michal maruska 2003-03-27 18:34:47 UTC
I've found and described the bug (it is indeed in XFree86) at: http://lists.eazel.com/pipermail/sawfish/2003-March/004838.html

Comment 29 Alan Cox 2003-06-05 12:31:43 UTC
Appears to be an XFree86 bug in fact.


Comment 30 Alexandre Oliva 2003-10-19 19:57:46 UTC
FWIW, I don't recall having experienced this problem with Fedora Core test3,
maybe not even with test2.

Comment 31 Gordon Messmer 2003-10-19 22:41:31 UTC
I can still reproduce it on FC test 3 (same glforestfire test I always used).  I
had hoped that Mike Harris would have pushed the patch from XFree86 4.3.99.2
into Fedora's tree, but it looks like this bug hasn't been assigned back to him.

I've asked Michal to attach the patch to this bug report, since eazel.com seems
to have disappeared not long after he posted that URL, and I can't locate the
patch itself.

Comment 32 michal maruska 2003-10-20 05:57:27 UTC
here is my post to the xfree86 mailing list:
http://www.mail-archive.com/xfree86@xfree86.org/msg04354.html
(hopefully this one will last longer than the sawfish one).

As a side note, I have another problem with XFree86 4.3.99.6 (I don't use RH):
sometimes pressing Meta, Control, then releasing Control, and then Meta, will
make X not notice the release of Meta. When I have time, I'd like to find out why.

Comment 33 Mike A. Harris 2003-12-15 13:16:58 UTC
The URL
"http://lists.eazel.com/pipermail/sawfish/2003-March/004838.html" from
comment 28 above, is invalid.

I read the email from comment #32, and don't see a patch there.  Since
XFree86 4.3.0 is in a non-developmental state of stability right now
however, I'm quite reserved about applying patches that may or may
not cause unknown breakage/regression for part of the userbase in the
name of a quick-fix for a problem reported by only a few users.

Is there a proper fix available, and if so, has it been reported
in XFree86 bugzilla upstream?  If not, please report this upstream
and attach whatever patch is deemed necessary to fix it, so that
it gets fixed for 4.4.0 if it isn't already.

I'd like to see a fix for this problem get into upstream 4.3.0
stable branch also before considering it for our 4.3.0.

Please comment.

Comment 34 Gordon Messmer 2003-12-17 20:13:53 UTC
As far as I can tell, it's been reported and fixed upstream.

http://xfree86.linuxforum.net/cvs/changes.html:
60. Fix for spontaneous repeated keyboard events during sync grab
(#A.1713, Michal Maruska).

Comment 35 Mike A. Harris 2003-12-18 16:50:14 UTC
Fantastic.  Ok, I've backported the patch to 4.3.0 and it will be
included in 4.3.0-48 and later builds in rawhide, and also in future
4.3.0 erratum, assuming no regressions are observed.

Thanks for tracking this issue down, as it would have been very
difficult if not impossible to fix this without being able to
reproduce it personally.  Since I couldn't reproduce it originally,
I can't test the fix, so please test the new build from rawhide
to make sure the problem is fixed now.

Setting bug status to MODIFIED, pending confirmation that the fix
works.  Please close bug RAWHIDE once confirmed, or set to
ASSIGNED if the problem recurs.  If this patch introduces new
regressions, please file a new bug report.

Also, if you are aware of any other bugs open in bugzilla that
are caused by this problem, please point out the bug IDs here
for me to investigate and/or dupe them.

Thanks again.

Comment 37 Mark J. Cox 2004-02-13 08:46:52 UTC
An errata has been issued which should help the problem described in this bug report. 
This report is therefore being closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files, please follow the link below. You may reopen 
this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2004-059.html