Bug 510304 - kernel oops/panic: IP: [<c048a9f8>] __bounce_end_io_read+0x88/0xf8
Status: CLOSED ERRATA
Product: Fedora
Classification: Fedora
Component: kernel
Version: 10
Hardware: All Linux
Priority: high
Severity: medium
Assigned To: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance

Reported: 2009-07-08 12:06 EDT by Jan ONDREJ
Modified: 2009-08-17 07:28 EDT
CC List: 7 users

Doc Type: Bug Fix
Last Closed: 2009-08-17 07:28:43 EDT

Attachments
Kernel oops (5.32 KB, text/plain), 2009-07-13 00:45 EDT, Jan ONDREJ
2nd server with kernel 2.6.29.5-84.fc10.i686.PAE (8.61 KB, text/plain), 2009-07-14 01:33 EDT, Jan ONDREJ
This should be applied to kernel-2.6.29.5-84, or may be newer too. (81 bytes, text/plain), 2009-07-23 05:47 EDT, Jan ONDREJ
This should be applied to kernel-2.6.29.5-84, or may be newer too. (1.10 KB, patch), 2009-07-23 05:59 EDT, Jan ONDREJ


External Trackers
Tracker       ID     Priority  Status  Summary  Last Updated
Linux Kernel  12405  None      None    None     Never

Description Jan ONDREJ 2009-07-08 12:06:17 EDT
Description of problem:
My virtual server had been running for several days when, today, one of my virtual machines stopped responding. I was only able to log the following:

Many of these messages in the dmesg output:
vcpu not ready for apic_round_robin
vcpu not ready for apic_round_robin
vcpu not ready for apic_round_robin

The affected virtual machine (qemu-kvm process) was using 100% CPU. A second virtual machine works without problems.

The affected virtual machine stopped responding to everything: no ping, no communication over the serial console, ...

Searching the internet, this message is often mentioned in connection with CPU frequency scaling, but I have neither cpuspeed nor cpufreq installed on the host or the guest.

Version-Release number of selected component (if applicable):
Host is Fedora 11:
kernel-2.6.29.5-191.fc11.x86_64
qemu-kvm-0.10.5-3.fc11.x86_64
Guest is Fedora 10.

How reproducible:
I can't reproduce this.
Comment 1 Jan ONDREJ 2009-07-13 00:44:11 EDT
This looks like a problem inside the virtual machine: the guest kernel hits an oops and then panics. It happens quite often, approximately once per day.

Here is part of the output from the virtual serial console (full log attached):

BUG: unable to handle kernel paging request at fff82000
IP: [<c048a9f8>] __bounce_end_io_read+0x88/0xf8
Oops: 0002 [#1] SMP
Modules linked in: ipv6 nf_conntrack_netbios_ns virtio_balloon floppy
virtio_net pcspkr joydev i2c_piix4 i2c_core virtio_pci virtio_ring
virtio_blk virtio [last unloaded: scsi_wait_scan]

Pid: 27956, comm: httpd Not tainted (2.6.27.25-170.2.72.fc10.i686.PAE #1)
EIP: 0060:[<c048a9f8>] EFLAGS: 00210086 CPU: 2
EIP is at __bounce_end_io_read+0x88/0xf8
EAX: fff82000 EBX: e936ae00 ECX: 00000400 EDX: 00001000
ESI: ea808000 EDI: fff82000 EBP: c08c5f00 ESP: c08c5edc
 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process httpd (pid: 27956, ti=c08c5000 task=e9940000 task.ti=e9571000)
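
For readers triaging similar reports, the key fields of the oops above can be pulled apart mechanically. The following is a minimal sketch, not part of the original report: the helper names `parse_oops_ip` and `decode_error_code` are hypothetical, and the error-code decoding assumes the standard x86 page-fault error bits (bit 0 = page present, bit 1 = write access, bit 2 = user mode).

```python
import re

# Matches "[<address>] symbol+0xOFFSET/0xSIZE" as printed on an oops IP/EIP line.
OOPS_IP = re.compile(r"\[<([0-9a-f]+)>\]\s+(\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)")


def parse_oops_ip(line):
    """Split an oops IP line into address, symbol, offset and function size."""
    match = OOPS_IP.search(line)
    if match is None:
        return None
    addr, symbol, offset, size = match.groups()
    addr, offset, size = int(addr, 16), int(offset, 16), int(size, 16)
    return {
        "address": addr,
        "symbol": symbol,
        "offset": offset,                 # bytes into the function
        "size": size,                     # total size of the function
        "function_start": addr - offset,  # where the function begins
    }


def decode_error_code(code):
    """Decode the low three bits of an x86 page-fault error code."""
    return {
        "page_present": bool(code & 0x1),
        "write": bool(code & 0x2),
        "user_mode": bool(code & 0x4),
    }


info = parse_oops_ip("IP: [<c048a9f8>] __bounce_end_io_read+0x88/0xf8")
fault = decode_error_code(int("0002", 16))  # from "Oops: 0002 [#1] SMP"

print(info["symbol"], hex(info["function_start"]))  # __bounce_end_io_read 0xc048a970
print(fault)
```

Under those assumptions, error code 0002 reads as a kernel-mode write to a page that is not present, which matches the "unable to handle kernel paging request at fff82000" line and the destination register EDI holding fff82000.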

I am now trying the latest fedora-testing kernel. If this still fails, I will try switching to another disk backend (not virtio-blk).

This looks like the following bug: http://bugzilla.kernel.org/show_bug.cgi?id=12405
Comment 2 Jan ONDREJ 2009-07-13 00:45:18 EDT
Created attachment 351427 [details]
Kernel oops
Comment 3 Jan ONDREJ 2009-07-14 01:33:48 EDT
Created attachment 351552 [details]
2nd server with kernel 2.6.29.5-84.fc10.i686.PAE

Same problem with 2.6.29.5-84.fc10.i686.PAE. Kernel oops attached.
Comment 4 Jan ONDREJ 2009-07-14 10:15:05 EDT
Here is a possible solution:
  http://marc.info/?l=kvm&m=124757839015712&w=2

I think Christoph is referring to this patch:
  http://kerneltrap.org/mailarchive/linux-kvm/2009/6/20/6063133

2.6.27 does not contain the following calls:
 	blk_queue_max_phys_segments(vblk->disk->queue, vblk->sg_elems-2);
 	blk_queue_max_hw_segments(vblk->disk->queue, vblk->sg_elems-2);
or
 	blk_queue_max_sectors(vblk->disk->queue, -1U);
and I am not sure whether we should add them too. Can anybody with experience in this area comment?
Comment 5 Jan ONDREJ 2009-07-16 03:20:13 EDT
A similar problem occurs with only 4 IDE and 1 SCSI virtual drives, except that nothing is printed on the serial console. The server simply stops responding to ping; it is dead and all I can do is destroy it.

Current KVM command line:
/usr/bin/qemu-kvm -S -M pc -m 4096 -smp 4 -name www -uuid f3c8e927-cda6-af7a-5ba5-388e5871c601 -monitor pty -pidfile /var/run/libvirt/qemu//www.pid -boot c -drive file=/dev/vg1/www_root,if=ide,index=0,boot=on -drive file=/dev/vg1/www_swap,if=ide,index=1 -drive file=/dev/vg1/www_home,if=ide,index=2 -drive file=/dev/vg1/www_log,if=ide,index=3 -drive file=/dev/vg1/www_git,if=scsi,index=4 -net nic,macaddr=00:16:3e:23:eb:23,vlan=0,model=virtio -net tap,fd=18,vlan=0 -net nic,macaddr=00:16:3e:07:c3:fe,vlan=1,model=virtio -net tap,fd=20,vlan=1 -serial pty -parallel none -usb -usbdevice tablet -vnc 127.0.0.1:1
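
To confirm which disk bus each guest drive uses in a command line like the one above, the `-drive` options can be listed with a short script. This is only a sketch: `list_drives` is a hypothetical helper, the command line is shortened to two of the drives for readability, and the parsing assumes simple `key=value` options without escaped commas.

```python
import shlex


def list_drives(cmdline):
    """Return (file, bus) pairs for every -drive option on a qemu command line."""
    args = shlex.split(cmdline)
    drives = []
    # Pair every argument with the one that follows it to find "-drive VALUE".
    for flag, value in zip(args, args[1:]):
        if flag == "-drive":
            opts = dict(item.split("=", 1) for item in value.split(",") if "=" in item)
            drives.append((opts.get("file"), opts.get("if")))
    return drives


# Shortened copy of the command line from this comment.
cmdline = (
    "/usr/bin/qemu-kvm -S -M pc -m 4096 -smp 4 -name www "
    "-drive file=/dev/vg1/www_root,if=ide,index=0,boot=on "
    "-drive file=/dev/vg1/www_git,if=scsi,index=4"
)

for path, bus in list_drives(cmdline):
    print(bus, path)
```

On the full command line this would report four `ide` drives and one `scsi` drive, with no virtio disks, while both network interfaces still use `model=virtio`.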

Curiously, this happens mostly around midnight, between 23:55 and 00:05 local time. No special job runs locally at that time; perhaps some job is run over the network.
Comment 6 Jan ONDREJ 2009-07-19 14:02:32 EDT
The last failure, with IDE drives, was caused by disabled swap space on the host. After enabling swap, I still have problems with the virtio driver. I am now trying a patched 2.6.29 kernel.
Comment 7 Jan ONDREJ 2009-07-23 05:47:58 EDT
Created attachment 354832 [details]
This should be applied to kernel-2.6.29.5-84, or may be newer too.

This patch should be applied to kernel-2.6.29.5-84, and possibly to newer kernels too.

My system has now been running with this patch applied for more than 4 days without problems, so the patch appears to work.
Comment 8 Jan ONDREJ 2009-07-23 05:59:59 EDT
Created attachment 354834 [details]
This should be applied to kernel-2.6.29.5-84, or may be newer too.

My system has now been running with this patch applied for more than 4 days
without problems, so the patch appears to work.
Comment 9 Jarod Wilson 2009-07-23 11:17:43 EDT
Committed upstream as of a few days ago:

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=4eff3cae9c9809720c636e64bc72f212258e0bd5

Tacked onto the end of our F11 and F10 2.6.29.x kernel builds, and Chuck is working on adding it to an F10 2.6.27.x kernel build.
Comment 10 Jan ONDREJ 2009-07-24 14:08:22 EDT
Chuck, can I ask you to build this fc10 kernel from CVS?

Thank you.
Comment 11 Chuck Ebbert 2009-07-31 00:30:27 EDT
Fix went in F-11 kernel-2.6.29.6-216 and F-10 kernel-2.6.27.28-170.2.74
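
A guest can be checked against the fixed F-10 build named above with a rough comparison of the numeric fields in its kernel release string. This is a simplified sketch and not RPM's real version comparison: `numeric_fields` and `has_fix` are hypothetical helpers, and the check deliberately returns `None` for a different kernel series (for example 2.6.29.x on F-10), because the first fixed build of that series is not stated here.

```python
import re


def numeric_fields(release):
    """Leading numeric fields of a release string, e.g. '2.6.27.28-170.2.74'."""
    fields = []
    for part in re.split(r"[.-]", release):
        if not part.isdigit():
            break  # stop at the first non-numeric part such as 'fc10'
        fields.append(int(part))
    return tuple(fields)


def has_fix(release, first_fixed):
    """True/False within the same kernel series, None if the series differs."""
    running, fixed = numeric_fields(release), numeric_fields(first_fixed)
    if running[:3] != fixed[:3]:
        return None  # different series: this sketch cannot tell
    return running >= fixed


F10_FIXED = "2.6.27.28-170.2.74"  # from the comment above

print(has_fix("2.6.27.25-170.2.72.fc10.i686.PAE", F10_FIXED))  # False
print(has_fix("2.6.27.29-170.2.78.fc10.i686.PAE", F10_FIXED))  # True
print(has_fix("2.6.29.5-84.fc10.i686.PAE", F10_FIXED))         # None
```

On a live system the release string would typically come from `uname -r` or Python's `platform.release()`.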
Comment 12 Jan ONDREJ 2009-08-03 02:49:17 EDT
2.5 days of uptime on my machine; it looks like this bug has been fixed with:

[root@mail ~]# uname -a
Linux mail.inver.sk 2.6.27.29-170.2.78.fc10.i686.PAE #1 SMP Fri Jul 31 04:28:25 EDT 2009 i686 i686 i386 GNU/Linux
[root@mail ~]# uptime
 08:48:20 up 2 days, 13:40,  1 user,  load average: 0.89, 0.37, 0.18

Tested on 2 machines.
Comment 13 Fedora Update System 2009-08-03 12:58:48 EDT
kernel-2.6.27.29-170.2.78.fc10 has been submitted as an update for Fedora 10.
http://admin.fedoraproject.org/updates/kernel-2.6.27.29-170.2.78.fc10
Comment 14 Fedora Update System 2009-08-04 20:30:08 EDT
kernel-2.6.27.29-170.2.78.fc10 has been pushed to the Fedora 10 stable repository.  If problems still persist, please make note of it in this bug report.
