510304 – kernel oops/panic: IP: [<c048a9f8>] __bounce_end_io_read+0x88/0xf8

Summary: kernel oops/panic: IP: [<c048a9f8>] __bounce_end_io_read+0x88/0xf8

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	kernel
Sub Component:
Version:	10
Hardware:	All
OS:	Linux
Priority:	high
Severity:	medium
Target Milestone:	---
Assignee:	Kernel Maintainer List
QA Contact:	Fedora Extras Quality Assurance
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2009-07-08 16:06 UTC by Jan ONDREJ
Modified:	2009-08-17 11:28 UTC
CC List:	7 users
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2009-08-17 11:28:43 UTC
Type:	---
Embargoed:
Dependent Products:

Attachments
Kernel oops (5.32 KB, text/plain) 2009-07-13 04:45 UTC, Jan ONDREJ	no flags
2nd server with kernel 2.6.29.5-84.fc10.i686.PAE (8.61 KB, text/plain) 2009-07-14 05:33 UTC, Jan ONDREJ	no flags
This should be applied to kernel-2.6.29.5-84, or may be newer too. (81 bytes, text/plain) 2009-07-23 09:47 UTC, Jan ONDREJ	no flags
This should be applied to kernel-2.6.29.5-84, or may be newer too. (1.10 KB, patch) 2009-07-23 09:59 UTC, Jan ONDREJ	no flags

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Linux Kernel	12405	0	None	None	None	Never

Description Jan ONDREJ 2009-07-08 16:06:17 UTC

Description of problem:
My virtual server had been running for some days when, today, one of my virtual machines stopped responding. I was able to log only the following:

Lots of these messages in dmesg output:
vcpu not ready for apic_round_robin
vcpu not ready for apic_round_robin
vcpu not ready for apic_round_robin

The virtual machine (qemu-kvm process) was consuming 100% CPU. A second virtual machine works without problems.

The problematic virtual machine stopped responding to any input: no ping replies, no communication over the serial console, and so on.

After some searching on the internet, I found that this message is often mentioned in connection with CPU frequency scaling, but I have neither cpuspeed nor cpufreq installed on the host or the guest.

Version-Release number of selected component (if applicable):
host is Fedora 11
kernel-2.6.29.5-191.fc11.x86_64
qemu-kvm-0.10.5-3.fc11.x86_64
guest is Fedora 10

How reproducible:
I can't reproduce this.

Comment 1 Jan ONDREJ 2009-07-13 04:44:11 UTC

This appears to be a problem in the virtual machine, which produces an oops followed by a kernel panic. It happens frequently, approximately once per day.

Here is part of the output from the virtual serial console (full log attached):

BUG: unable to handle kernel paging request at fff82000
IP: [<c048a9f8>] __bounce_end_io_read+0x88/0xf8
Oops: 0002 [#1] SMP
Modules linked in: ipv6 nf_conntrack_netbios_ns virtio_balloon floppy
virtio_net pcspkr joydev i2c_piix4 i2c_core virtio_pci virtio_ring
virtio_blk virtio [last unloaded: scsi_wait_scan]

Pid: 27956, comm: httpd Not tainted (2.6.27.25-170.2.72.fc10.i686.PAE #1)
EIP: 0060:[<c048a9f8>] EFLAGS: 00210086 CPU: 2
EIP is at __bounce_end_io_read+0x88/0xf8
EAX: fff82000 EBX: e936ae00 ECX: 00000400 EDX: 00001000
ESI: ea808000 EDI: fff82000 EBP: c08c5f00 ESP: c08c5edc
 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process httpd (pid: 27956, ti=c08c5000 task=e9940000 task.ti=e9571000)
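
For readers unfamiliar with the oops format, the following is a minimal sketch of how the "Oops:" and "IP:" lines above can be decoded. The function names `decode_oops_code` and `parse_ip_line` are hypothetical helpers written for this illustration, not part of any kernel tool. The bit meanings are those of the x86 page-fault error code (bit 0: protection violation vs. page not present, bit 1: write vs. read, bit 2: user vs. kernel mode).

```python
import re

# x86 page-fault error code bits, as printed in the "Oops: NNNN" line.
PF_PROT = 0x1   # 0: page not present, 1: protection violation
PF_WRITE = 0x2  # 0: read access,      1: write access
PF_USER = 0x4   # 0: kernel mode,      1: user mode


def decode_oops_code(code_hex):
    """Decode the hex error code from an x86 'Oops:' line."""
    code = int(code_hex, 16)
    return {
        "cause": "protection violation" if code & PF_PROT else "page not present",
        "access": "write" if code & PF_WRITE else "read",
        "mode": "user" if code & PF_USER else "kernel",
    }


def parse_ip_line(line):
    """Split '[<addr>] symbol+0xoff/0xsize' into its components."""
    m = re.search(r"\[<([0-9a-f]+)>\]\s+(\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)", line)
    if m is None:
        return None
    addr, symbol, offset, size = m.groups()
    return {
        "address": int(addr, 16),
        "symbol": symbol,
        "offset": int(offset, 16),  # bytes into the function
        "size": int(size, 16),      # total function size in bytes
    }


oops = decode_oops_code("0002")
ip = parse_ip_line("IP: [<c048a9f8>] __bounce_end_io_read+0x88/0xf8")
```

Read this way, "Oops: 0002" describes a kernel-mode write to a page that is not present, which agrees with the "unable to handle kernel paging request at fff82000" line, and the fault is 0x88 bytes into the 0xf8-byte function __bounce_end_io_read.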

I am now trying the latest fedora-testing kernel. If this still fails, I will try switching to another disk backend (not virtio-blk).

This looks like the following bug: http://bugzilla.kernel.org/show_bug.cgi?id=12405

Comment 2 Jan ONDREJ 2009-07-13 04:45:18 UTC

Created attachment 351427 [details]
Kernel oops

Comment 3 Jan ONDREJ 2009-07-14 05:33:48 UTC

Created attachment 351552 [details]
2nd server with kernel 2.6.29.5-84.fc10.i686.PAE

Same problem with 2.6.29.5-84.fc10.i686.PAE. Kernel oops attached.

Comment 4 Jan ONDREJ 2009-07-14 14:15:05 UTC

Here is a possible solution:
  http://marc.info/?l=kvm&m=124757839015712&w=2

I think Christoph is referring to this patch:
  http://kerneltrap.org/mailarchive/linux-kvm/2009/6/20/6063133

There are no calls to:
 	blk_queue_max_phys_segments(vblk->disk->queue, vblk->sg_elems-2);
 	blk_queue_max_hw_segments(vblk->disk->queue, vblk->sg_elems-2);
or
 	blk_queue_max_sectors(vblk->disk->queue, -1U);
in 2.6.27, and I am not sure whether we should add these too. Does anybody have experience with this?

Comment 5 Jan ONDREJ 2009-07-16 07:20:13 UTC

A similar problem occurs with only 4 IDE and 1 SCSI virtual drives, except that nothing is printed on the serial console. The server simply stops responding to ping; it is dead and I can only destroy it.

Current KVM command line:
/usr/bin/qemu-kvm -S -M pc -m 4096 -smp 4 -name www -uuid f3c8e927-cda6-af7a-5ba5-388e5871c601 -monitor pty -pidfile /var/run/libvirt/qemu//www.pid -boot c -drive file=/dev/vg1/www_root,if=ide,index=0,boot=on -drive file=/dev/vg1/www_swap,if=ide,index=1 -drive file=/dev/vg1/www_home,if=ide,index=2 -drive file=/dev/vg1/www_log,if=ide,index=3 -drive file=/dev/vg1/www_git,if=scsi,index=4 -net nic,macaddr=00:16:3e:23:eb:23,vlan=0,model=virtio -net tap,fd=18,vlan=0 -net nic,macaddr=00:16:3e:07:c3:fe,vlan=1,model=virtio -net tap,fd=20,vlan=1 -serial pty -parallel none -usb -usbdevice tablet -vnc 127.0.0.1:1
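
As a quick check of the drive layout in the command line above, here is a minimal sketch that extracts the `-drive` options and counts them per bus type. The helper name `list_drives` is hypothetical, and `CMDLINE` is an excerpt of the command line above with the network, serial and display options left out.

```python
import shlex
from collections import Counter


def list_drives(cmdline):
    """Return one dict of key=value options per -drive argument."""
    args = shlex.split(cmdline)
    drives = []
    # Pair each argument with the one that follows it.
    for flag, value in zip(args, args[1:]):
        if flag == "-drive":
            opts = dict(
                item.split("=", 1) for item in value.split(",") if "=" in item
            )
            drives.append(opts)
    return drives


# Excerpt of the command line from this comment (drive options only).
CMDLINE = (
    "/usr/bin/qemu-kvm -S -M pc -m 4096 -smp 4 -name www "
    "-drive file=/dev/vg1/www_root,if=ide,index=0,boot=on "
    "-drive file=/dev/vg1/www_swap,if=ide,index=1 "
    "-drive file=/dev/vg1/www_home,if=ide,index=2 "
    "-drive file=/dev/vg1/www_log,if=ide,index=3 "
    "-drive file=/dev/vg1/www_git,if=scsi,index=4"
)

drives = list_drives(CMDLINE)
bus_counts = Counter(d["if"] for d in drives)
```

On this excerpt the count comes out as four `ide` drives and one `scsi` drive, with no `if=virtio` disk, so the guest in this configuration is not using virtio-blk.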

Curiously, this happens mostly around midnight, between 23:55 and 00:05 local time. No special job runs locally at that time; maybe some job is run over the network.

Comment 6 Jan ONDREJ 2009-07-19 18:02:32 UTC

The failure with IDE drives described in my last comment was caused by disabled swap space on the host. After swap was enabled, I still have problems with the virtio driver. I am now trying a patched 2.6.29 kernel.

Comment 7 Jan ONDREJ 2009-07-23 09:47:58 UTC

Created attachment 354832 [details]
This should be applied to kernel-2.6.29.5-84, or may be newer too.

This patch should be applied to kernel-2.6.29.5-84, and possibly to newer kernels too.

My system with this patch applied has now been running for more than 4 days without problems, so it looks like the patch works.

Comment 8 Jan ONDREJ 2009-07-23 09:59:59 UTC

Created attachment 354834 [details]
This should be applied to kernel-2.6.29.5-84, or may be newer too.

My system with this patch applied has now been running for more than 4 days
without problems, so it looks like the patch works.

Comment 9 Jarod Wilson 2009-07-23 15:17:43 UTC

Committed upstream as of a few days ago:

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=4eff3cae9c9809720c636e64bc72f212258e0bd5

Tacked onto the end of our F11 and F10 2.6.29.x kernel builds, and Chuck is working on adding it to an F10 2.6.27.x kernel build.

Comment 10 Jan ONDREJ 2009-07-24 18:08:22 UTC

Chuck, can I ask you to build this fc10 kernel from CVS?

Thank you.

Comment 11 Chuck Ebbert 2009-07-31 04:30:27 UTC

Fix went in F-11 kernel-2.6.29.6-216 and F-10 kernel-2.6.27.28-170.2.74

Comment 12 Jan ONDREJ 2009-08-03 06:49:17 UTC

2.5 days of uptime on my machine; it looks like this bug has been fixed with:

[root@mail ~]# uname -a
Linux mail.inver.sk 2.6.27.29-170.2.78.fc10.i686.PAE #1 SMP Fri Jul 31 04:28:25 EDT 2009 i686 i686 i386 GNU/Linux
[root@mail ~]# uptime
 08:48:20 up 2 days, 13:40,  1 user,  load average: 0.89, 0.37, 0.18

Tested on 2 machines.

Comment 13 Fedora Update System 2009-08-03 16:58:48 UTC

kernel-2.6.27.29-170.2.78.fc10 has been submitted as an update for Fedora 10.
http://admin.fedoraproject.org/updates/kernel-2.6.27.29-170.2.78.fc10

Comment 14 Fedora Update System 2009-08-05 00:30:08 UTC

kernel-2.6.27.29-170.2.78.fc10 has been pushed to the Fedora 10 stable repository.  If problems still persist, please make note of it in this bug report.
