Description of problem:

(courtesy Steven Rostedt)

When X writes to the xen frame buffer, there's no page currently mapped, so this is the trick that Xen tries to pull off: it maps the page and flags it as faulted in. There is also a worker thread that wakes up via a timer which gets set when we need to do a refresh. This thread walks all the mappings looking for the ones that have faulted in, calls zap_page_range on them, and then goes through the event channel to get Xen to update the buffer for what the user sees. The idea is that the next time X writes to that page, it faults again (since the mapping no longer exists) and we go through this process once more.

Now the problem is in the details. To record a fault, xenfb_vm_nopage (called by do_no_page) increments a counter associated with the mapping (map->faults++), so that the worker knows it needs to update this page and zap its pages. The worker thread has a loop that goes down all the mappings looking for any that need to be updated: if map->faults is greater than zero, it zaps the pages and clears map->faults. There were locking issues with this code that the patches by Markus and Stephen address, but they don't solve this particular problem.

The problem is that between the point where memory.c:do_no_page calls xenfb_vm_nopage and the point where it actually puts the page into X's pte, the xen worker thread can run. In that case it will zap the pages (removing a pte from X that doesn't exist yet). But the code in do_no_page then puts the page into the pte anyway, which means further writes to that page by X will not be trapped, and that information will never be transferred through xen, so the user never sees the updates. Basically we have this:

=====
X -> writes to some address in the Xen frame buffer

  Fault - call do_no_page
    grab xen page lock
    get page mapping - call xenfb_vm_nopage
      map->faults++;
    release xen page lock
    return back to do_no_page

  (preempted, or SMP)

  Xen worker thread runs.
    grabs xen page lock
    Looks at mappings
    Finds this mapping
    Zaps its pages (but page not in pte yet)
    Sets map->faults = 0
    releases xen page lock

  (back to X process)
    Put page in X's pte

Oh well, we won't be updating the writes to this page anytime soon.
=====

Stephen Tweedie adds:

I suspect that we need a much deeper think about the problem, to see if we can rely better on core VM infrastructure for this. For example, we're marking the dirty rectangle in xenfb_vm_nopage even if the area is only getting read, not written. Can we fix that?

And we should replace the zap_page_range() call, whose locking we're unsure about, with a call to unmap_mapping_range() (see the sketch below), which has several advantages:

* It is already fully exported
* It deals with locking automatically via the address_space i_mmap_lock spinlock
* It automatically iterates over all the vmas on the address_space without us having to loop over them ourselves

but that requires adding the address_space concept to xenfb, which I think is desirable but which requires a bit more work.

How reproducible:
The race has not been reproduced in the wild.
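For illustration, here is a minimal sketch of what the unmap_mapping_range() approach could look like. It assumes xenfb gains a struct address_space for its mmap()ed device node; the fb_mapping field, the simplified struct xenfb_info, and the xenfb_zap_mappings() name are made up for this sketch, and none of this is tested code:

#include <linux/fs.h>
#include <linux/mm.h>

/*
 * Hypothetical container; the real struct xenfb_info looks different.
 * The fb_mapping pointer (the address_space backing the mmap()ed
 * framebuffer node) is the piece that would have to be added.
 */
struct xenfb_info {
	struct address_space *fb_mapping;
};

static void xenfb_zap_mappings(struct xenfb_info *info)
{
	/*
	 * Drop every pte in every vma backed by this address_space.
	 * unmap_mapping_range() takes the i_mmap_lock itself and walks
	 * the vma tree for us, so xenfb no longer needs its own list of
	 * mappings, or its own lock, for this step.
	 *
	 * holebegin = 0 with holelen = 0 means "unmap to end of file",
	 * i.e. the whole framebuffer; even_cows = 1 also zaps private
	 * COW copies.
	 */
	unmap_mapping_range(info->fb_mapping, 0, 0, 1);

	/*
	 * The worker would then notify the backend over the event
	 * channel as it does today, and the next write from X faults
	 * back in through the normal ->nopage path.
	 */
}

Whether this alone closes the nopage-vs-worker window would still need checking; mainly it moves the teardown onto locking that the core VM already provides, per Stephen's comment above.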
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.
This request was previously evaluated by Red Hat Product Management for inclusion in the current Red Hat Enterprise Linux release, but Red Hat was unable to resolve it in time. This request will be reviewed for a future Red Hat Enterprise Linux release.
The best long-term option could be a conversion to deferred I/O (see Documentation/fb/deferred_io.txt). Might not be the right thing for RHEL-5, though.
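For reference, a deferred-I/O conversion would look roughly like the sketch below, following Documentation/fb/deferred_io.txt (the fbdefio API only exists upstream from 2.6.21 on, so RHEL-5's 2.6.18 kernel would need the framework backported too). The xenfb_deferred_io()/xenfb_defio names, the delay value, and the way the dirty band is computed are illustrative, not the actual upstream code:

#include <linux/fb.h>
#include <linux/kernel.h>
#include <linux/list.h>
#include <linux/mm.h>

/*
 * Called by the defio framework 'delay' jiffies after userspace (X)
 * first dirties a page of its framebuffer mmap; pagelist holds the
 * pages written in that interval, linked through page->lru.
 */
static void xenfb_deferred_io(struct fb_info *info,
			      struct list_head *pagelist)
{
	struct page *page;
	unsigned long beg, end;
	int y1, y2, miny = INT_MAX, maxy = 0;

	/* Collapse the dirtied pages into one band of scanlines. */
	list_for_each_entry(page, pagelist, lru) {
		beg = page->index << PAGE_SHIFT;
		end = beg + PAGE_SIZE - 1;
		y1 = beg / info->fix.line_length;
		y2 = end / info->fix.line_length;
		if (y1 < miny)
			miny = y1;
		if (y2 > maxy)
			maxy = y2;
	}

	/*
	 * Report rows miny..maxy to the backend over the event channel;
	 * a xenfb_refresh() helper is assumed here and not shown.
	 */
	/* xenfb_refresh(info->par, miny, maxy); */
}

static struct fb_deferred_io xenfb_defio = {
	.delay		= HZ / 20,	/* batch updates, roughly 50 ms */
	.deferred_io	= xenfb_deferred_io,
};

/*
 * In the probe path, before register_framebuffer(info):
 *	info->fbdefio = &xenfb_defio;
 *	fb_deferred_io_init(info);
 * and fb_deferred_io_cleanup(info) on teardown.  The defio framework
 * then does the nopage/mkwrite bookkeeping itself, which is what gets
 * rid of the hand-rolled map->faults tracking and the race above.
 */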
The upstream kernel went with the solution mentioned in comment #4, in 2.6.26. We could replace RHEL-5's xenfb with a backport of upstream's. However, taking a new, untried backport to fix a race condition that hasn't been reproduced in the lab and hasn't generated any user complaints so far seems like a really bad idea to me. Therefore, I'm closing this WONTFIX. If you think we should fix it, please reopen the bug.