594436 – UML + kernel < 2.6.16 + x86_64 = random segfaults / unusable

Summary: UML + kernel < 2.6.16 + x86_64 = random segfaults / unusable

Keywords:
Status:	CLOSED WONTFIX
Alias:	None
Product:	Red Hat Enterprise Linux 4
Classification:	Red Hat
Component:	kernel
Sub Component:
Version:	4.8
Hardware:	x86_64
OS:	Linux
Priority:	low
Severity:	medium
Target Milestone:	rc
Target Release:	---
Assignee:	Red Hat Kernel Manager
QA Contact:	Red Hat Kernel QE team
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2010-05-20 16:51 UTC by nightstrike
Modified:	2012-06-20 16:17 UTC
CC List:	0 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2012-06-20 16:17:30 UTC
Target Upstream Version:
Embargoed:

Description nightstrike 2010-05-20 16:51:58 UTC

Description of problem:
From http://user-mode-linux.sourceforge.net/problems.html, under "On x86_64, processes randomly segfault":
This is also a host bug, fixed during the 2.6.16 series. It appears that these segfaults were caused by ptraced system calls on x86_64 returning via sysret, rather than iret. sysret requires that the process preserve at least %RCX because the instruction uses that as the userspace address to resume. In the case of sigreturn, that's impossible since signals happen asynchronously to the process. So, ptracing sigreturn will cause %RCX to be changed, and that seems to be the cause of the random process segfaults.
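The register argument above can be illustrated with a toy model. This is only a sketch in Python, not kernel code: the `Regs` class and both `return_via_*` functions are hypothetical names made up for the illustration. The one hardware fact it relies on is that sysret loads RIP from %RCX and RFLAGS from %R11, whereas iret restores a complete saved frame.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Regs:
    """Hypothetical, heavily simplified user register state."""
    rip: int
    rcx: int
    r11: int
    rflags: int


def return_via_iret(saved: Regs) -> Regs:
    # iret restores the whole saved frame, so every register comes back
    # exactly as it was saved.
    return saved


def return_via_sysret(saved: Regs) -> Regs:
    # sysret takes the resume address from RCX and the flags from R11,
    # so userspace sees RCX == RIP and R11 == RFLAGS afterwards.
    return replace(saved, rcx=saved.rip, r11=saved.rflags)


# Context that sigreturn is supposed to restore: the process was interrupted
# asynchronously, so RCX holds an arbitrary live value (made-up numbers).
interrupted = Regs(rip=0x400123, rcx=0xDEADBEEF, r11=0x1234, rflags=0x246)

print(return_via_iret(interrupted) == interrupted)   # full state preserved
print(hex(return_via_sysret(interrupted).rcx))       # RCX overwritten with RIP
```

After an ordinary system call the sysret path is harmless, because the syscall instruction has already overwritten %RCX and %R11. After sigreturn it is not: the interrupted code may still depend on the old %RCX value, which would be consistent with the random segfaults described above.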

Version-Release number of selected component (if applicable):
Version: 2.6.9
Release: 89.0.25.EL

How reproducible:
Trivial

Steps to Reproduce:
1. Run a User Mode Linux guest on an x86_64 host whose kernel is older than 2.6.16
2. Run any significant program (like mandb -c)
3. Watch the program segfault
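
To check whether a given host falls in the affected range before trying the steps above, something like the following could be used. It is a rough heuristic only: `parse_kernel_release` and `host_predates_fix` are hypothetical helper names, the comparison looks only at the leading numeric version, it ignores `-rc` suffixes, and it cannot know whether a vendor kernel carries a backported fix.

```python
import platform
import re

# Upstream series in which the report says the segfault bug was fixed.
FIXED_IN = (2, 6, 16)


def parse_kernel_release(release: str) -> tuple:
    """Extract the leading numeric version, e.g. '2.6.9-89.0.25.EL' -> (2, 6, 9)."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    # A missing third component (e.g. '3.10') is treated as 0.
    return tuple(int(part) if part else 0 for part in match.groups())


def host_predates_fix(release: str) -> bool:
    """True if the upstream version number is older than 2.6.16.

    Assumption: vendor backports and -rc suffixes are not considered.
    """
    return parse_kernel_release(release) < FIXED_IN


if __name__ == "__main__":
    host = platform.release()
    print(host, "predates 2.6.16:", host_predates_fix(host))
```

For the kernel in this report, `host_predates_fix("2.6.9-89.0.25.EL")` returns `True`.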
  
Actual results:
Programs in the guest OS segfault

Expected results:
Programs should not segfault

Additional info:
UML = user mode linux
Note that the fix for this problem caused another, noted on the same page:
'handle_trap - failed to wait at end of syscall'
The full panic is
Kernel panic - not syncing: handle_trap - failed to wait at end of syscall, errno = 0, status = 2943
This is a host bug introduced during the 2.6.16 series which broke ptrace. In the course of fixing a different bug (see the random segfault problem below), ptrace returned two system call return notifications rather than the one it's supposed to. It's fixed in 2.6.16-rc6, so upgrade the host to at least that.
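
The `status = 2943` value in the panic message can be unpacked by hand. The sketch below assumes the usual Linux wait-status encoding (low byte 0x7f means "stopped", with the stop signal in the next byte); `decode_wait_status` is a hypothetical helper written for this note, not part of UML.

```python
def decode_wait_status(status: int) -> dict:
    """Decode a wait status, assuming the conventional Linux bit layout."""
    if status == 0xFFFF:
        return {"state": "continued"}
    if (status & 0xFF) == 0x7F:
        # Stopped (for example under ptrace): signal number is in bits 8-15.
        return {"state": "stopped", "signal": (status >> 8) & 0xFF}
    if (status & 0x7F) == 0:
        # Normal exit: exit code is in bits 8-15.
        return {"state": "exited", "code": (status >> 8) & 0xFF}
    # Killed by a signal: number in the low 7 bits, core flag in bit 7.
    return {
        "state": "signaled",
        "signal": status & 0x7F,
        "core_dumped": bool(status & 0x80),
    }


print(decode_wait_status(2943))
```

Under that assumption, 2943 is 0x0B7F: a stopped child whose stop signal is 11, which is SIGSEGV on x86 Linux. The report does not say how that relates to the duplicate notification, so treat this as a decoding aid only.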

A proper backport should fix both problems, or at least fix the first without introducing the second.

Comment 1 Jiri Pallich 2012-06-20 16:17:30 UTC

Thank you for submitting this issue for consideration in Red Hat Enterprise Linux. The release you asked us to review has now reached End of Life.
Please see https://access.redhat.com/support/policy/updates/errata/

If you would like Red Hat to reconsider your request for an active release, please re-open it via the appropriate support channels and provide additional supporting details about the importance of this issue.
