Bug 929339 - unable to handle kernel paging request
Summary: unable to handle kernel paging request
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 17
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2013-03-29 18:30 UTC by Stephen Rondeau
Modified: 2013-06-03 15:15 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2013-04-01 13:05:30 UTC
Type: Bug
Embargoed:


Attachments
/var/log/messages extract of trace (4.48 KB, application/octet-stream)
2013-03-29 18:30 UTC, Stephen Rondeau

Description Stephen Rondeau 2013-03-29 18:30:36 UTC
Created attachment 718160
/var/log/messages extract of trace

Description of problem:

VirtualBox 4.2.10 or 4.1.24 freezes while writing to the virtual disk file of a Fedora Linux 17 VM hosted on a Fedora Linux 17 host (kernel 3.8.3-103). An extract from the host's /var/log/messages is attached.

Version-Release number of selected component (if applicable):

Linux cn9 3.8.3-103.fc17.x86_64 #1 SMP Mon Mar 18 15:46:01 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

How reproducible:

Varies as to when the bug will appear.

With the host booted into Fedora kernel 3.6.10-2, there are no problems writing to the VM's disk via VirtualBox. Tried earlier

Steps to Reproduce:
1. Create a VM for Fedora Linux
2. Create a 15GB dynamically expanding virtual disk (.vdi)
3. Boot up Fedora 17 Live CD (x86)
4. Click on "Install to Hard Disk"
5. Configure
6. Installation fails at some point while installing packages to the virtual disk
  
Actual results:

VM hangs.

Expected results:

VM to continue installation to hard disk.

Additional info:

See attached.

Comment 1 Josh Boyer 2013-04-01 13:05:30 UTC
Bug in virtualbox.  Please contact them for a fix.

Comment 2 Chris Caudle 2013-05-21 00:48:57 UTC
How was this determined to be a VirtualBox bug?  VirtualBox works with 3.6 and 3.7 kernels, and breaks with 3.8 kernels.  Since the change is in the kernel, it seems that the first assumption should be that the kernel change is what broke the system until proven otherwise (i.e. that a latent bug in VB just happened to not cause problems with the 3.6 and 3.7 kernels, but did cause a problem with the 3.8 kernel).

Comment 3 Chris Caudle 2013-05-21 01:18:42 UTC
Not to belabor the point, but the extract from messages has this:
 kernel: [270435.100247] BUG: unable to handle kernel paging request at 00007ffb860dfc04 
...
kernel: [270435.100561] Oops: 0000 [#1] SM

I think that saying a "BUG" message and an "Oops" from the kernel is the fault of another package at least requires some evidence backing up the claim.
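The two log lines quoted in comment 3 carry some information that can be read mechanically. The following is a minimal Python sketch, not part of the original report, that extracts the faulting address and decodes the page-fault error code. It assumes the usual x86_64 layout with 4-level paging (user-space addresses lie below 0x0000800000000000) and the standard x86 page-fault error-code bits; the function names are hypothetical and chosen only for illustration.

```python
import re

# Assumption: x86_64 with 4-level paging, where the lower canonical half
# (below 2**47) is user space and kernel addresses live in the upper half.
USER_SPACE_LIMIT = 0x0000_8000_0000_0000

BUG_RE = re.compile(r"BUG: unable to handle kernel paging request at ([0-9a-fA-F]+)")
OOPS_RE = re.compile(r"Oops: ([0-9a-fA-F]{4})")


def fault_address(line):
    """Return the faulting address from a 'BUG: unable to handle ...' line, or None."""
    match = BUG_RE.search(line)
    return int(match.group(1), 16) if match else None


def is_user_space_address(addr):
    """True if addr falls in the lower (user-space) canonical half."""
    return addr < USER_SPACE_LIMIT


def decode_oops_error_code(line):
    """Decode the x86 page-fault error code from an 'Oops: NNNN' line, or None."""
    match = OOPS_RE.search(line)
    if match is None:
        return None
    code = int(match.group(1), 16)
    return {
        "page_present": bool(code & 0x01),       # 0: no page found, 1: protection fault
        "write": bool(code & 0x02),              # 0: read access, 1: write access
        "user_mode": bool(code & 0x04),          # 0: fault in kernel mode, 1: user mode
        "reserved_bit": bool(code & 0x08),       # 1: reserved bit set in a page-table entry
        "instruction_fetch": bool(code & 0x10),  # 1: fault on an instruction fetch
    }


bug_line = "kernel: [270435.100247] BUG: unable to handle kernel paging request at 00007ffb860dfc04"
oops_line = "kernel: [270435.100561] Oops: 0000 [#1] SM"

addr = fault_address(bug_line)
print(hex(addr), "user-space range:", is_user_space_address(addr))
print(decode_oops_error_code(oops_line))
```

Under those assumptions, the address in the trace lies in the user-space range, and error code 0000 describes a read, in kernel mode, of a page that was not present. This does not by itself say which component is at fault; it only narrows down what kind of access failed.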

Comment 4 Chris Caudle 2013-05-21 01:19:48 UTC
This seems to be a related bug report at virtualbox.org.  The last message there is that they have not been able to reproduce so far.
https://www.virtualbox.org/ticket/11610

Comment 5 Stephen Rondeau 2013-05-21 15:14:54 UTC
(In reply to Chris Caudle from comment #4)
> This seems to be a related bug report at virtualbox.org.  The last message
> there is that they have not been able to reproduce so far.
> https://www.virtualbox.org/ticket/11610

Yes, I reported the bug to VirtualBox since Red Hat said it wasn't theirs and they won't fix it. VirtualBox, as you say, can't reproduce it, but it was happening consistently for me. I had to revert to an earlier version of VirtualBox. When this stuff happens, I have to spend my time getting systems to work again vs. providing a reproducible test case for the various parties. All I know is that it wasn't something I did... something changed, or was sensitive to a change.

Comment 6 Chris Caudle 2013-06-03 15:15:18 UTC
(In reply to Stephen Rondeau from comment #5)
> Yes, I reported the bug to VirtualBox since Red Hat said it wasn't theirs
> and they won't fix it. 

The latest I see on the VirtualBox site is that the kernel oops looks like it is related to the CONFIG_NUMA_BALANCING option added in kernel 3.8.  Looks like some pages are being migrated between nodes, then when VirtualBox needs those pages it creates a page fault.

I would be happy to be educated if I am off base here, but I still think that when working software stops working because of a new kernel version, the onus is on the component that changed to show that the fault lies elsewhere. VirtualBox worked with kernels 3.4, 3.5, 3.6, and 3.7 but has problems with 3.8, and the problem now appears tied specifically to a feature added in 3.8. The assumption should therefore be that kernel 3.8 has a regression and broke something it should not have, unless it can be shown that VirtualBox has been doing something egregiously wrong that just happened to slip through without causing problems for the last few years.
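One way to test the CONFIG_NUMA_BALANCING hypothesis from comment 6 on a given host is to check whether the running kernel was built with that option. The following Python sketch is an illustration only: it assumes the build configuration is installed at /boot/config-<release>, which is the usual Fedora location but is not confirmed anywhere in this report, and both function names are hypothetical.

```python
import platform
import re


def numa_balancing_setting(config_text):
    """Return the value of CONFIG_NUMA_BALANCING in a kernel config.

    Returns 'y' (or whatever value is set), 'n' if the option is explicitly
    marked as not set, or None if the option is not mentioned at all
    (for example on kernels older than 3.8).
    """
    for raw_line in config_text.splitlines():
        line = raw_line.strip()
        if line == "# CONFIG_NUMA_BALANCING is not set":
            return "n"
        # fullmatch avoids confusing this option with longer names such as
        # CONFIG_NUMA_BALANCING_DEFAULT_ENABLED.
        match = re.fullmatch(r"CONFIG_NUMA_BALANCING=(\S+)", line)
        if match:
            return match.group(1)
    return None


def read_running_kernel_config():
    """Read the config of the running kernel.

    Assumption: the distribution installs it as /boot/config-<release>.
    """
    path = "/boot/config-" + platform.release()
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        return handle.read()
```

Calling numa_balancing_setting(read_running_kernel_config()) on the affected host and on a host running a 3.6 or 3.7 kernel would show whether the option differs between the working and failing setups.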

