Bug 863239 - xauth locking fails with ssh X11 forwarding
Keywords:
Status: CLOSED NEXTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: xorg-x11-xauth
Version: 19
Hardware: All
OS: All
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: X/OpenGL Maintenance List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2012-10-04 19:04 UTC by Elliott Forney
Modified: 2014-07-10 19:33 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-06-23 21:17:39 UTC
Type: Bug
Embargoed:


Description Elliott Forney 2012-10-04 19:04:18 UTC
Description of problem:

xauth occasionally fails during ssh X11 forwarding with "ssh -X".  This problem is much more frequent when the disk is under high load or when using NFS shared home directories.

Version-Release number of selected component (if applicable):

xorg-x11-xauth-1.0.7-1.fc17.x86_64

How reproducible:

Can be reproduced reliably given enough attempts, but any single attempt fails only occasionally (depending on the system and load).  Looks like a race condition.

Steps to Reproduce:
1.  # make sure you have ssh keys set up for passwordless ssh

2.  # open three terminal emulators

3.  # in the first, load up the disk with something like
  dd if=/dev/zero of=zeroFile bs=1M

4.  # in the second terminal, repeatedly start X11 forwarding
  while true; do ssh -X localhost uptime; done

5.  # in the third terminal, attempt to open the display
  while true; do ssh -X localhost xeyes; done
  
Actual results:
xauth fails with a message like the following:

Warning: No xauth data; using fake authentication data for X11 forwarding.
/usr/bin/xauth:  error in locking authority file /home/idfah/.Xauthority
X11 connection rejected because of wrong authentication.
X11 connection rejected because of wrong authentication.
X11 connection rejected because of wrong authentication.
X11 connection rejected because of wrong authentication.
Error: Can't open display: localhost:13.0

Expected results:

Should be able to open X11 display.

Additional info:

1.  This problem is particularly bad when the disk is under load and/or with NFS shared home directories.  We have NFS home directories and users frequently complain about this error.

2.  This problem has been around for some time.

3.  Taking a glance at the xauth code, it looks like it attempts to do file locking by touching a file and checking if a concurrent process has also touched a file.  So, it doesn't seem surprising that there is a race condition in there.

4.  I presume it was done this way to support filesystems that do not support true file locking (e.g. NFS without a lock manager)?  I would make the argument, however, that it is currently broken everywhere, regardless of whether or not file locking is supported.  Since users should reasonably expect some concurrency issues on file systems that do not support file locking, I would propose that the current scheme be replaced with POSIX file locking via fcntl.
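
As a rough illustration of the proposal in item 4, here is a minimal Python sketch of POSIX advisory locking via fcntl.  It is not taken from xauth or libXau (both are written in C); the helper names `locked` and `increment_counter` are hypothetical, and the counter file merely stands in for a shared file such as .Xauthority.  As noted above, fcntl locks on NFS still depend on a working lock manager.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def locked(path):
    """Hold an exclusive POSIX advisory lock on `path` (hypothetical helper).

    The lock is per process, so it serializes separate processes,
    not threads inside one process.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # Blocks until the kernel grants the lock; no polling or retry loop.
        fcntl.lockf(fd, fcntl.LOCK_EX)
        try:
            yield fd
        finally:
            fcntl.lockf(fd, fcntl.LOCK_UN)
    finally:
        os.close(fd)


def increment_counter(path):
    """Read-modify-write an integer stored in `path` while holding the lock.

    Illustrative stand-in for updating a shared authority file.
    """
    with locked(path) as fd:
        raw = os.pread(fd, 64, 0)
        value = int(raw) if raw.strip() else 0
        os.ftruncate(fd, 0)
        os.pwrite(fd, str(value + 1).encode(), 0)
        return value + 1
```

Because the kernel arbitrates the lock, concurrent processes calling `increment_counter` on the same file should never lose an update, which is the property the touch-and-check scheme described in item 3 appears to lack.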

Comment 1 Fedora End Of Life 2013-01-16 21:07:24 UTC
This message is a reminder that Fedora 16 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 16. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '16'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 16's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 16 is end of life. If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora, you are encouraged to click on 
"Clone This Bug" and open it against that version of Fedora.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 2 Elliott Forney 2013-04-01 06:30:46 UTC
I have verified that this bug is still present in F18.

People seem to just be living with this.  I thought about reporting the problem upstream, but I am unsure whom to contact.

Comment 3 Fedora End Of Life 2013-12-21 09:02:37 UTC
This message is a reminder that Fedora 18 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 18. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '18'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 18's end of life.

Thank you for reporting this issue and we are sorry that we may not be 
able to fix it before Fedora 18 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora 
version prior to Fedora 18's end of life.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 4 Elliott Forney 2013-12-23 06:28:08 UTC
Still present in F19.

Comment 5 Dr. Tilmann Bubeck 2014-06-23 20:30:59 UTC
Based on my tests, I believe this has nothing to do with SSH. The problem can be triggered by running the following loop in two terminals simultaneously:

while true ; do /usr/bin/xauth  list > /dev/null ; done

I will take a look and fix this upstream, as I am one of the maintainers of xauth.
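
For anyone who prefers a single script over two terminals, the same reproduction can be approximated with the Python sketch below.  The function name `count_failures` is hypothetical.  It counts a run as failed if the command exits non-zero (an assumption about how xauth reports a locking failure) or if the "error in locking authority file" message quoted in the description shows up on stderr.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

# Message text taken from the "Actual results" section of this report.
LOCK_ERROR = "error in locking authority file"


def count_failures(cmd, workers=2, iterations=100):
    """Run `cmd` repeatedly in `workers` concurrent loops; return failed runs.

    Hypothetical helper: each worker thread mimics one terminal running
    `while true; do <cmd>; done`, but for a fixed number of iterations.
    """
    def loop(_worker):
        failures = 0
        for _ in range(iterations):
            result = subprocess.run(
                cmd,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.PIPE,
                text=True,
            )
            # Assumption: a locking failure yields a non-zero exit status
            # and/or the error message on stderr.
            if result.returncode != 0 or LOCK_ERROR in result.stderr:
                failures += 1
        return failures

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(loop, range(workers)))
```

Calling `count_failures(["/usr/bin/xauth", "list"])` on an affected system would be expected to return a non-zero count at least some of the time; on a fixed system it should stay at zero.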

Comment 6 Dr. Tilmann Bubeck 2014-06-23 21:15:06 UTC
Found the problem.

xauth uses libXau to do file writes, including locking. In all versions of libXau prior to 1.0.7 there is a bug in the locking code which causes the reported problem. The commit http://cgit.freedesktop.org/xorg/lib/libXau/commit/?id=5c01ef69eee7dfe925c97558153fcd5e116252c6 fixes the problem.

However, Fedora up to 19 includes 1.0.6, which has this bug. Fedora 20 and above has 1.0.8, which includes the fix.

So either update to Fedora 20, or rebuild the libXau and xorg-x11-xauth packages from Fedora 20 for Fedora 19 or older.

Comment 7 Hans de Goede 2014-07-04 18:50:41 UTC
(In reply to Dr. Tilmann Bubeck from comment #6)
> Found problem.
> 
> xauth uses libXau to do file writes including locking. In all version of
> libXau prior to 1.0.7 there is a bug in the locking code which causes the
> reported problem. The commit
> http://cgit.freedesktop.org/xorg/lib/libXau/commit/
> ?id=5c01ef69eee7dfe925c97558153fcd5e116252c6 fixes the problem.
> 
> However, Fedora up to 19 includes 1.0.6 which is buggy in that sense. Fedora
> 20 and above has 1.0.8 which includes the fix.

Huh, Fedora 19 has 1.0.8 too:

http://koji.fedoraproject.org/koji/buildinfo?buildID=424442

Regards,

Hans

Comment 8 Elliott Forney 2014-07-10 19:33:12 UTC
I no longer have an F19 system around to test but this appears to be fixed in F20.  Thank you!!

