Bug 559717

Summary: KVM/Qemu 0.12.2 guest: WARNING: at block/blk-core.c:336 blk_start_queue+0x2e/0x47()
Product: [Fedora] Fedora
Reporter: Thomas Sjolshagen <thomas.sjolshagen>
Component: kvm
Assignee: Justin M. Forbes <jforbes>
Status: CLOSED RAWHIDE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high
Priority: low
Version: rawhide
CC: berrange, clalance, ehabkost, gcosta, kraxel, markmc, mcepl, quintela, rjones, virt-maint
Target Milestone: ---
Keywords: Triaged
Target Release: ---
Hardware: x86_64
OS: Linux
Doc Type: Bug Fix
Last Closed: 2010-02-04 16:38:29 UTC
Bug Blocks: 514890, 561908
Attachments:
Description: kernel log, Flags: none

Description Thomas Sjolshagen 2010-01-28 20:04:18 UTC
Description of problem:

The guest sees I/O errors on the swap and file systems hosted on a raw image file. Guest data disks (LVM volumes on the host, presented directly to the guest) do not see these I/O errors.

WARNING: at block/blk-core.c:336 blk_start_queue+0x2e/0x47() (Not tainted)
Hardware name: Bochs
Modules linked in: ipt_REDIRECT iptable_nat nf_nat ip_vs_rr ip_vs gfs2 dlm
configfs sunrpc ipv6 dm_multipath snd_ens1370 gameport snd_rawmidi snd_seq
snd_seq_device snd_pcm joydev snd_timer snd soundcore virtio_net
snd_page_alloc i2c_piix4 i2c_core virtio_blk virtio_pci virtio_ring virtio
[last unloaded: microcode]
Pid: 10164, comm: setfiles Not tainted 2.6.31.12-174.2.3.fc12.x86_64 #1
Call Trace:
<IRQ> [<ffffffff81051710>] warn_slowpath_common+0x84/0x9c
[<ffffffff8105173c>] warn_slowpath_null+0x14/0x16
[<ffffffff811eb51d>] blk_start_queue+0x2e/0x47
[<ffffffffa001049f>] blk_done+0xba/0xd0 [virtio_blk]
[<ffffffffa00042fb>] vring_interrupt+0x6a/0x9b [virtio_ring]
[<ffffffffa00084ca>] vp_vring_interrupt+0x5b/0x97 [virtio_pci]
[<ffffffffa000854b>] vp_interrupt+0x45/0x4a [virtio_pci]
[<ffffffff81099cd1>] handle_IRQ_event+0x60/0x121
[<ffffffff81026966>] ? apic_write+0x16/0x18
[<ffffffff8109b8dc>] handle_fasteoi_irq+0x8b/0xc6
[<ffffffff8101463c>] handle_irq+0x8b/0x93
[<ffffffff8142158c>] do_IRQ+0x5c/0xbc
[<ffffffff810126d3>] ret_from_intr+0x0/0x11
<EOI> [<ffffffff81012770>] ? retint_careful+0xe/0x32
---[ end trace 5184d03d43d73d2f ]---
------------[ cut here ]------------
WARNING: at block/blk-core.c:244 blk_remove_plug+0x2e/0x96() (Tainted: G W )
Hardware name: Bochs
Modules linked in: ipt_REDIRECT iptable_nat nf_nat ip_vs_rr ip_vs gfs2 dlm
configfs sunrpc ipv6 dm_multipath snd_ens1370 gameport snd_rawmidi snd_seq
snd_seq_device snd_pcm joydev snd_timer snd soundcore virtio_net
snd_page_alloc i2c_piix4 i2c_core virtio_blk virtio_pci virtio_ring virtio
[last unloaded: microcode]
Pid: 10164, comm: setfiles Tainted: G W 2.6.31.12-174.2.3.fc12.x86_64 #1
Call Trace:
<IRQ> [<ffffffff81051710>] warn_slowpath_common+0x84/0x9c
[<ffffffff8105173c>] warn_slowpath_null+0x14/0x16
[<ffffffff811eb34d>] blk_remove_plug+0x2e/0x96
[<ffffffff811eb3cb>] __blk_run_queue+0x16/0x71
[<ffffffff811eb532>] blk_start_queue+0x43/0x47
[<ffffffffa001049f>] blk_done+0xba/0xd0 [virtio_blk]
[<ffffffffa00042fb>] vring_interrupt+0x6a/0x9b [virtio_ring]
[<ffffffffa00084ca>] vp_vring_interrupt+0x5b/0x97 [virtio_pci]
[<ffffffffa000854b>] vp_interrupt+0x45/0x4a [virtio_pci]
[<ffffffff81099cd1>] handle_IRQ_event+0x60/0x121
[<ffffffff81026966>] ? apic_write+0x16/0x18
[<ffffffff8109b8dc>] handle_fasteoi_irq+0x8b/0xc6
[<ffffffff8101463c>] handle_irq+0x8b/0x93
[<ffffffff8142158c>] do_IRQ+0x5c/0xbc
[<ffffffff810126d3>] ret_from_intr+0x0/0x11
<EOI> [<ffffffff81012770>] ? retint_careful+0xe/0x32
---[ end trace 5184d03d43d73d30 ]---
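
For triage, the WARNING headers in traces like the two above can be pulled out of a saved guest log with a few lines of Python. This is only a sketch: the regular expression is written against the header format quoted in this report, the helper name find_warnings is made up for illustration, and the sample text is a shortened copy of the log above.

```python
import re

# Matches WARNING headers in the form quoted in this report, e.g.
# "WARNING: at block/blk-core.c:336 blk_start_queue+0x2e/0x47() (Not tainted)"
WARNING_RE = re.compile(
    r"WARNING: at (?P<file>\S+):(?P<line>\d+) "
    r"(?P<func>\w+)\+0x[0-9a-f]+/0x[0-9a-f]+\(\)"
)


def find_warnings(log_text):
    """Return (source file, line number, function) for each WARNING header."""
    return [
        (m.group("file"), int(m.group("line")), m.group("func"))
        for m in WARNING_RE.finditer(log_text)
    ]


# Shortened copy of the two WARNING headers from the report.
SAMPLE = """\
WARNING: at block/blk-core.c:336 blk_start_queue+0x2e/0x47() (Not tainted)
Hardware name: Bochs
---[ end trace 5184d03d43d73d2f ]---
WARNING: at block/blk-core.c:244 blk_remove_plug+0x2e/0x96() (Tainted: G W )
"""

warnings = find_warnings(SAMPLE)
for path, line, func in warnings:
    print(f"{path}:{line} in {func}")
```

Both warnings come from block/blk-core.c and are reached through blk_done in virtio_blk, which is consistent with the problem showing up only on virtio disks.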

Version-Release number of selected component (if applicable):

qemu-system-x86-0.12.2-4.fc12.x86_64 from virt-preview (applies to all 0.12-based releases I've tested). Does _not_ happen in 0.11-based qemu releases.

How reproducible:

Depends on the -smp setting. A higher number of vCPUs appears to result in faster reproduction, but the errors themselves are pretty consistent in my guest.

Steps to Reproduce:

1. Host guest root/swap/usr file system on raw .img file in host
2. Boot qemu-kvm 0.12.2-4 (x86_64) guest with -smp 2 (more vCPUs results in faster appearance of problem)
3. Wait for I/O errors and the Warning/trace to appear. Takes about 24(ish) hours on my system.
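
For anyone reproducing this without libvirt, the steps above roughly correspond to a qemu-kvm invocation like the one built below. This is a sketch under assumptions, not the command line libvirt actually generated for this guest: the image path is a placeholder, the helper name build_qemu_command is made up, and the memory size is the guest's 1679360 KiB converted to MiB.

```python
import shlex


def build_qemu_command(image_path, vcpus=2, memory_mb=1640):
    """Build a qemu-kvm argument list for a raw image attached via virtio."""
    return [
        "/usr/bin/qemu-kvm",
        "-smp", str(vcpus),        # more vCPUs reportedly reproduce faster
        "-m", str(memory_mb),      # 1679360 KiB in the domain XML = 1640 MiB
        "-drive",
        f"file={image_path},if=virtio,format=raw,cache=writethrough",
    ]


# Placeholder path: substitute the raw image holding root/swap/usr.
cmd = build_qemu_command("/path/to/guest-root.img", vcpus=2)
print(shlex.join(cmd))
```

Raising vcpus is the only knob the report ties to faster reproduction; everything else is kept as close to the reported guest as the description allows.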
  
Actual results:

I/O errors on the swap device, reported as "Write error on swap device ..." and "Buffer I/O error, dev vda, sector XXXXXXXX" messages.
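
To see which devices are affected and how often, the error messages can be counted per device from a saved guest log. A minimal sketch, assuming the messages contain "I/O error, dev <name>, sector <number>" as quoted above; the helper name count_io_errors and the sector numbers in the sample are made up for illustration.

```python
import re
from collections import Counter

# Matches the "I/O error, dev vda, sector N" part of the messages quoted above.
IO_ERROR_RE = re.compile(r"I/O error, dev (?P<dev>\w+), sector (?P<sector>\d+)")


def count_io_errors(log_text):
    """Count I/O error messages per block device name."""
    return Counter(m.group("dev") for m in IO_ERROR_RE.finditer(log_text))


# Made-up sample lines; the sector numbers are placeholders, not real data.
SAMPLE = """\
Buffer I/O error, dev vda, sector 1234
Buffer I/O error, dev vda, sector 5678
some unrelated kernel message
"""

errors_by_device = count_io_errors(SAMPLE)
print(dict(errors_by_device))
```

If the report's observation holds, only the file-backed virtio disks (vda here) should show up in the counts, not the LVM-backed ones.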

Expected results:

No I/O error.

Additional info:

This problem has also been reported against an Ubuntu guest/host environment on the SourceForge tracker:
https://sourceforge.net/tracker/?func=detail&aid=2941282&group_id=180599&atid=893831 

The tracker report includes 2 different incidents (in the comments) from my environment.


# rpm -qa qemu*
qemu-img-0.12.2-4.fc12.x86_64
qemu-system-x86-0.12.2-4.fc12.x86_64
qemu-user-0.12.2-4.fc12.x86_64
qemu-common-0.12.2-4.fc12.x86_64

# rpm -qa libvirt*
libvirt-debuginfo-0.7.1-15.fc12.x86_64
libvirt-client-0.7.5-3.fc12.x86_64
libvirt-0.7.5-3.fc12.x86_64
libvirt-python-0.7.5-3.fc12.x86_64

# virsh dumpxml imap1-cluster
<domain type='kvm' id='22'>
  <name>imap1-cluster</name>
  <uuid>a87b1fc8-4927-223a-6271-a4b86374fbc6</uuid>
  <memory>1679360</memory>
  <currentMemory>1679360</currentMemory>
  <vcpu cpuset='0-1'>2</vcpu>
  <os>
    <type arch='x86_64' machine='pc-0.11'>hvm</type>
    <boot dev='hd'/>
  </os>
  <features>
    <acpi/>
    <apic/>
    <pae/>
  </features>
  <clock offset='utc'/>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <devices>
    <emulator>/usr/bin/qemu-kvm</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw' cache='writethrough'/>
      <source file='/cluster/kvm-guests/MUA-Store/imap1-cluster-root.img'/>
      <target dev='vda' bus='virtio'/>
    </disk>
    <disk type='block' device='disk'>
      <driver name='qemu' cache='none'/>
      <source dev='/dev/mapper/clvm--VG00-imap--spool'/>
      <target dev='vdb' bus='virtio'/>
    </disk>
    <disk type='block' device='disk'>
      <driver name='qemu' cache='none'/>
      <source dev='/dev/mapper/clvm--VG00-quorum'/>
      <target dev='vdc' bus='virtio'/>
    </disk>
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw'/>
      <source file='/cluster/kvm-guests/SharedStorage/web-server-content.img'/>
      <target dev='vdd' bus='virtio'/>
    </disk>
    <disk type='block' device='disk'>
      <driver name='qemu' cache='none'/>
      <source dev='/dev/mapper/sharedVG01-www--local'/>
      <target dev='vde' bus='virtio'/>
    </disk>
    <disk type='file' device='cdrom'>
      <driver name='qemu'/>
      <target dev='hdc' bus='ide'/>
      <readonly/>
    </disk>
    <interface type='bridge'>
      <mac address='54:52:00:54:30:3f'/>
      <source bridge='kvmbr0'/>
      <target dev='vnet3'/>
      <model type='virtio'/>
    </interface>
    <interface type='bridge'>
      <mac address='54:52:00:58:49:21'/>
      <source bridge='kvmbr1'/>
      <target dev='vnet4'/>
      <model type='virtio'/>
    </interface>
    <serial type='pty'>
      <source path='/dev/pts/4'/>
      <target port='0'/>
    </serial>
    <console type='pty' tty='/dev/pts/4'>
      <source path='/dev/pts/4'/>
      <target port='0'/>
    </console>
    <input type='tablet' bus='usb'/>
    <input type='mouse' bus='ps2'/>
    <graphics type='vnc' port='5903' autoport='yes'/>
    <sound model='es1370'/>
    <video>
      <model type='cirrus' vram='9216' heads='1'/>
    </video>
  </devices>
</domain>
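
Since the errors show up on the file-backed disk but not on the LVM-backed ones, it helps to list each disk's backing type and cache mode side by side. The sketch below does that with Python's standard xml.etree.ElementTree on a trimmed copy of the domain XML above; the helper name summarize_disks is made up, and "default" is just a label used here for a missing cache attribute.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the domain XML above: one file-backed and one block-backed disk.
DOMAIN_XML = """\
<domain type='kvm'>
  <devices>
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw' cache='writethrough'/>
      <source file='/cluster/kvm-guests/MUA-Store/imap1-cluster-root.img'/>
      <target dev='vda' bus='virtio'/>
    </disk>
    <disk type='block' device='disk'>
      <driver name='qemu' cache='none'/>
      <source dev='/dev/mapper/clvm--VG00-imap--spool'/>
      <target dev='vdb' bus='virtio'/>
    </disk>
  </devices>
</domain>
"""


def summarize_disks(domain_xml):
    """Return target, bus, backing type and cache mode for each <disk>."""
    root = ET.fromstring(domain_xml)
    disks = []
    for disk in root.findall("./devices/disk"):
        driver = disk.find("driver")
        target = disk.find("target")
        disks.append({
            "target": target.get("dev") if target is not None else None,
            "bus": target.get("bus") if target is not None else None,
            "backing": disk.get("type"),  # 'file' or 'block'
            "cache": driver.get("cache", "default") if driver is not None else "default",
        })
    return disks


disks = summarize_disks(DOMAIN_XML)
for entry in disks:
    print(entry)
```

In the full configuration, vda and vdd are file-backed raw images, while vdb, vdc and vde are block devices with cache='none'.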

Comment 1 Gerd Hoffmann 2010-02-02 09:54:52 UTC
Created attachment 388233 [details]
kernel log

I'm seeing this as well.

Host: Fedora 12 + virt-preview
Guest: Fedora 12.

Trying to install Fedora 12 triggers this for me.  I get all sorts of strange errors during package installation, which essentially boil down to I/O problems.  It often goes like this: ext4 journal commit error -> filesystem goes read-only -> anaconda bombs out.  Sometimes the install manages to complete, but the installed system doesn't behave normally afterwards.

A noteworthy data point is that I can install Fedora 11 without any problems, so this is likely connected to some new feature that only comes into play when both host (qemu 0.12) and guest (kernel > 2.6.31 ?) are new enough.

Comment 2 Gerd Hoffmann 2010-02-02 11:39:06 UTC
Zapped the virtio disk, installed an IDE disk instead, tried again.
Fedora 12 installs without problems now.  Something is seriously wrong
with virtio-blk.

Comment 3 Gerd Hoffmann 2010-02-02 11:52:32 UTC
http://patchwork.ozlabs.org/patch/43700/ ?

Comment 4 Gerd Hoffmann 2010-02-02 11:59:04 UTC
Justin, can we get the patch into rawhide and virt-preview please?

Comment 5 Justin M. Forbes 2010-02-04 16:38:29 UTC
Patch has been added to qemu-0.12.2-5 in rawhide and virt-preview repositories.
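
A quick way to check whether an installed build predates this fix is to compare the version-release from the rpm output against 0.12.2-5. The sketch below is a simplification and not real RPM version comparison: it assumes package names shaped like qemu-system-x86-0.12.2-4.fc12.x86_64, and it treats 0.11 builds as unaffected only because the report says the problem does not occur there. The helper names are made up.

```python
import re

FIXED = (0, 12, 2, 5)            # qemu-0.12.2-5, where the patch was added
FIRST_AFFECTED = (0, 12, 0, 0)   # assumption: the report only implicates 0.12 builds

# Extracts version 0.12.2 and the leading release number 4 from
# "qemu-system-x86-0.12.2-4.fc12.x86_64".
NVR_RE = re.compile(r"^qemu-system-x86-(\d+)\.(\d+)\.(\d+)-(\d+)\.")


def parse_version(package):
    """Return (major, minor, patch, release) as integers."""
    match = NVR_RE.match(package)
    if match is None:
        raise ValueError(f"unrecognized package name: {package}")
    return tuple(int(part) for part in match.groups())


def predates_fix(package):
    """True for 0.12 builds older than the patched 0.12.2-5 build."""
    version = parse_version(package)
    return FIRST_AFFECTED <= version < FIXED


# The build listed in the original description.
print(predates_fix("qemu-system-x86-0.12.2-4.fc12.x86_64"))
```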

Comment 6 Richard W.M. Jones 2010-02-10 22:28:12 UTC
In case anyone is still seeing massive numbers of I/O errors with
virtio-blk (even with this patch applied), I don't think this fix is the
whole story.  I'm tracking another issue with the alignment of
the user buffer in pread(2) calls here: bug 563103.
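
For readers unfamiliar with what buffer alignment means here, the sketch below checks whether the start address of a buffer is a multiple of a given alignment. It says nothing about the details of bug 563103: the 512-byte figure is an assumption (a common requirement for direct I/O), and the helper names are made up for illustration.

```python
import ctypes
import mmap


def buffer_address(buf, offset=0):
    """Return the memory address of a writable buffer at the given offset."""
    return ctypes.addressof(ctypes.c_char.from_buffer(buf, offset))


def is_aligned(address, alignment=512):
    """True if address is a multiple of alignment (512 is an assumed value)."""
    return address % alignment == 0


# An anonymous mapping starts on a page boundary, so its base is aligned.
page = mmap.mmap(-1, mmap.PAGESIZE)
base = buffer_address(page)

# One byte into the same mapping is no longer aligned.
shifted = buffer_address(page, 1)

print(is_aligned(base), is_aligned(shifted))
```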