Bug 873246 - guest always hits call traces when resumed from EIO after firewalling the iSCSI port with I/O in the guest
Summary: guest always hits call traces when resumed from EIO after firewalling the iSCSI port with I/O in the guest
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: qemu-kvm
Version: 6.4
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Kevin Wolf
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2012-11-05 12:38 UTC by Sibiao Luo
Modified: 2012-11-27 10:20 UTC
CC List: 15 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-11-27 10:20:56 UTC
Target Upstream Version:
Embargoed:


Attachments
guest call traces after resuming the VM; this is the guest dmesg log. (34.79 KB, text/plain)
2012-11-09 10:05 UTC, Sibiao Luo

Description Sibiao Luo 2012-11-05 12:38:40 UTC
Description of problem:
Boot a guest from a locally stored image and attach a data disk that is a block device. Create a file system on the data disk and run I/O in a loop; EIO errors are generated after I firewall the iSCSI port with iptables. When I then stop the firewall and resume the VM, the guest shows call traces.
BTW, if local files are used (for both the image and the data disk), resuming the VM fails; please refer to bug 867401.

Version-Release number of selected component (if applicable):
host info:
# uname -r && rpm -q qemu-kvm
2.6.32-335.el6.x86_64
qemu-kvm-0.12.1.2-2.331.el6.x86_64
guest info:
kernel-2.6.32-336.el6.x86_64

How reproducible:
3/7

Steps to Reproduce:
1. Prepare a guest image stored locally and a block data disk on iSCSI storage.
# lvscan 
  ACTIVE            '/dev/vg-90.100-sluo/lv-90.100-data-disk' [12.00 GiB] inherit
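For reference, a minimal sketch of how such an iSCSI-backed LV could be prepared (the portal address, target IQN, and backing device below are assumptions; only the VG/LV names come from the lvscan output above):
# iscsiadm -m discovery -t sendtargets -p 10.66.90.100    # portal address is an assumption
# iscsiadm -m node -T iqn.2012-11.com.example:target0 -p 10.66.90.100 --login
# pvcreate /dev/sdb    # the iSCSI block device name is an assumption
# vgcreate vg-90.100-sluo /dev/sdb
# lvcreate -L 12G -n lv-90.100-data-disk vg-90.100-sluo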
2. Boot a guest with an iSCSI LV data disk.
# /usr/libexec/qemu-kvm -M rhel6.4.0 -cpu SandyBridge -enable-kvm -m 4096 -smp 4,sockets=2,cores=2,threads=1 -usb -device usb-tablet,id=input0 -name sluo_test -uuid 7818b04d-aa83-4fb5-8ae5-e6024ebf6299 -rtc base=localtime,clock=host,driftfix=slew -drive file=/home/RHEL-Server-6.3-64-copy.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,cache=none,werror=stop,rerror=stop -device virtio-blk-pci,bus=pci.0,addr=0x3,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -netdev tap,id=hostnet0,vhost=on,script=/etc/qemu-ifup -device virtio-net-pci,netdev=hostnet0,id=virtio-net-pci0,mac=08:2E:5F:0A:0D:B1,bus=pci.0,addr=0x4 -spice port=5931,disable-ticketing -vga qxl -global qxl-vga.vram_size=67108864 -device intel-hda,id=sound0,bus=pci.0,addr=0x5 -device hda-duplex -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6 -drive file=/dev/vg-90.100-sluo/lv-90.100-data-disk,if=none,id=data-disk,format=qcow2,cache=none,werror=stop,rerror=stop -device virtio-blk-pci,bus=pci.0,addr=0x7,drive=data-disk,id=sluo-disk -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0 -nodefaults -serial unix:/tmp/ttyS0,server,nowait -boot menu=on -monitor stdio
3. Run mkfs.ext4 on the data disk and mount it at /mnt.
# mkfs.ext4 /dev/vdb
# mount /dev/vdb /mnt/
4. Do I/O in a loop.
# while true; do dd if=/dev/zero of=/mnt/sluo bs=1M count=100; done
5. Firewall the iSCSI port on the host.
# iptables -A OUTPUT -p tcp -d 10.66.90.0/24 --dport 3260 -j DROP
6. Stop iptables and resume the VM via 'cont' in the QEMU monitor (a verification sketch follows these steps).
# service iptables stop
(qemu) info status 
VM status: paused (io-error)
(qemu) cont
(qemu) info status 
VM status: running
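Before issuing 'cont', it may help to confirm that the iSCSI traffic is really blocked (and later restored). A possible check on the host, assuming the standard iptables and iscsiadm tools are available:
# iptables -L OUTPUT -n | grep 3260    # the DROP rule should be listed while the firewall is active
# iscsiadm -m session -P 1             # shows the current state of the iSCSI session(s)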

Actual results:
After step 5, EIO errors are generated and the VM is in paused status.
(qemu) block I/O error in device 'data-disk': Input/output error (5)
block I/O error in device 'data-disk': Input/output error (5)
(qemu) info status 
VM status: paused (io-error)
After step 6, the VM resumes successfully and the I/O script runs smoothly in the guest, but there are many call traces in the guest kernel log; I will attach the log here later.
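Before resuming, the HMP 'info block' command can be used to list the attached drives and confirm which one is affected; whether a per-device I/O status is reported depends on the QEMU version, so this is only a sketch:
(qemu) info block
(qemu) info status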

Expected results:
The VM resumes successfully without any call traces, and both the guest and host work well.

Additional info:

Comment 1 Sibiao Luo 2012-11-05 12:39:48 UTC
(In reply to comment #0)
> Actual results:
> After step 5, EIO errors are generated and the VM is in paused status.
> (qemu) block I/O error in device 'data-disk': Input/output error (5)
> block I/O error in device 'data-disk': Input/output error (5)
> (qemu) info status 
> VM status: paused (io-error)
> After step 6, the VM resumes successfully and the I/O script runs smoothly
> in the guest, but there are many call traces in the guest kernel log; I will
> attach the log here later.
# dmesg 
Clocksource tsc unstable (delta = -137439160417 ns).  Enable clocksource failover by adding clocksource_failover kernel parameter.
INFO: task flush-252:16:2394 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
flush-252:16  D 0000000000000001     0  2394      2 0x00000080
 ffff8800378312f0 ffffffff81a97840 00000000378312f0 0000000000000000
 ffff88011d7feed0 0007120000000000 0000000000000082 0000000000000010
 ffff88011cd16a00 ffff88000001dd80 ffff88011cd169e0 0000000000000000
Call Trace:
 [<ffffffff8112b753>] ? __alloc_pages_nodemask+0x113/0x8d0
 [<ffffffff812743f9>] ? cfq_set_request+0x329/0x560
 [<ffffffff8111c105>] ? mempool_alloc_slab+0x15/0x20
 [<ffffffff8103c7b8>] ? pvclock_clocksource_read+0x58/0xd0
 [<ffffffff8127d07d>] ? rb_insert_color+0x9d/0x160
 [<ffffffff8125645e>] ? elv_rb_add+0x6e/0x80
 [<ffffffff81269031>] ? blkiocg_update_io_add_stats+0x61/0x90
 [<ffffffff8116672a>] ? kmem_getpages+0xba/0x170
 [<ffffffff81268f51>] ? blkiocg_update_io_merged_stats+0x61/0x90
 [<ffffffff81255214>] ? elv_merged_request+0x84/0x90
 [<ffffffff810969e0>] ? wake_bit_function+0x0/0x50
 [<ffffffffa00cabf5>] ? ext4_mb_find_by_goal+0x175/0x2e0 [ext4]
 [<ffffffffa0077431>] ? jbd2_journal_get_write_access+0x31/0x50 [jbd2]
 [<ffffffffa00c44b8>] ? __ext4_journal_get_write_access+0x38/0x80 [ext4]
 [<ffffffffa00c62ea>] ? ext4_mb_mark_diskspace_used+0x7a/0x300 [ext4]
 [<ffffffffa00ccf81>] ? ext4_mb_new_blocks+0x301/0x580 [ext4]
 [<ffffffffa00bffeb>] ? ext4_ext_find_extent+0x2ab/0x310 [ext4]
 [<ffffffffa00c1ff2>] ? ext4_ext_get_blocks+0xab2/0x19a0 [ext4]
 [<ffffffff8127b2ea>] ? prop_norm_single+0x7a/0xc0
 [<ffffffff8127b3f6>] ? __prop_inc_single+0x46/0x60
 [<ffffffffa00a0b89>] ? ext4_get_blocks+0xf9/0x2a0 [ext4]
 [<ffffffff810652b3>] ? dequeue_entity+0x113/0x2e0
 [<ffffffffa00a5201>] ? mpage_da_map_and_submit+0xa1/0x470 [ext4]
 [<ffffffff8127c795>] ? radix_tree_gang_lookup_tag_slot+0x95/0xe0
 [<ffffffff8100bb8e>] ? apic_timer_interrupt+0xe/0x20
 [<ffffffff811193f0>] ? find_get_pages_tag+0x40/0x130
 [<ffffffffa00a563d>] ? mpage_add_bh_to_extent+0x6d/0xf0 [ext4]
 [<ffffffffa00a598f>] ? write_cache_pages_da+0x2cf/0x470 [ext4]
 [<ffffffffa00a5e02>] ? ext4_da_writepages+0x2d2/0x620 [ext4]
 [<ffffffff8112dd61>] ? do_writepages+0x21/0x40
 [<ffffffff811ac2cd>] ? writeback_single_inode+0xdd/0x290
 [<ffffffff811ac6de>] ? writeback_sb_inodes+0xce/0x180
 [<ffffffff811ac83b>] ? writeback_inodes_wb+0xab/0x1b0
 [<ffffffff811acbdb>] ? wb_writeback+0x29b/0x3f0
 [<ffffffff8150c2b0>] ? thread_return+0x4e/0x76e
 [<ffffffff81081902>] ? del_timer_sync+0x22/0x30
 [<ffffffff811acec9>] ? wb_do_writeback+0x199/0x240
 [<ffffffff811acfd3>] ? bdi_writeback_task+0x63/0x1b0
 [<ffffffff81096867>] ? bit_waitqueue+0x17/0xd0
 [<ffffffff8113c7f0>] ? bdi_start_fn+0x0/0x100
 [<ffffffff8113c876>] ? bdi_start_fn+0x86/0x100
 [<ffffffff8113c7f0>] ? bdi_start_fn+0x0/0x100
 [<ffffffff81096636>] ? kthread+0x96/0xa0
 [<ffffffff8100c0ca>] ? child_rip+0xa/0x20
 [<ffffffff810965a0>] ? kthread+0x0/0xa0
 [<ffffffff8100c0c0>] ? child_rip+0x0/0x20

Comment 2 Sibiao Luo 2012-11-05 12:40:48 UTC
my host cpu info:

processor	: 7
vendor_id	: GenuineIntel
cpu family	: 6
model		: 42
model name	: Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz
stepping	: 7
cpu MHz		: 1600.000
cache size	: 8192 KB
physical id	: 0
siblings	: 8
core id		: 3
cpu cores	: 4
apicid		: 7
initial apicid	: 7
fpu		: yes
fpu_exception	: yes
cpuid level	: 13
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx lahf_lm ida arat epb xsaveopt pln pts dts tpr_shadow vnmi flexpriority ept vpid
bogomips	: 6784.57
clflush size	: 64
cache_alignment	: 64
address sizes	: 36 bits physical, 48 bits virtual
power management:

Comment 4 Sibiao Luo 2012-11-09 10:03:30 UTC
I retested it and collected the full strace log and dmesg log for details; I will attach them later.
BTW, I did not see any non-vectored pread/pwrite or preadv/pwritev syscalls in the strace log.
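To double-check that, one way to search the strace log for those syscalls is a simple grep over the output file used below (the pattern is only an illustration and assumes single-process strace output):
# grep -E 'p(read|write)v?\(' /home/strace-log.txt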

host info:
kernel-2.6.32-339.el6.x86_64
qemu-kvm-rhev-0.12.1.2-2.334.el6.x86_64
seabios-0.6.1.2-25.el6.x86_64
guest info:
kernel-2.6.32-339.el6.x86_64

# strace -o /home/strace-log.txt /usr/libexec/qemu-kvm -M rhel6.4.0 -cpu SandyBridge -enable-kvm -m 4096 -smp 4,sockets=2,cores=2,threads=1 -usb -device usb-tablet,id=input0 -name sluo_test -uuid 7818b04d-aa83-4fb5-8ae5-e6024ebf6299 -rtc base=localtime,clock=host,driftfix=slew -drive file=/dev/vg-90.100-sluo/lv-90-100-RHEL6.4-20121019.0-Copy-x86_64,if=none,id=drive-virtio-disk0,format=qcow2,cache=none,werror=stop,rerror=stop -device virtio-blk-pci,bus=pci.0,addr=0x3,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -netdev tap,id=hostnet0,vhost=on,script=/etc/qemu-ifup -device virtio-net-pci,netdev=hostnet0,id=virtio-net-pci0,mac=08:2E:5F:0A:0D:B1,bus=pci.0,addr=0x4 -spice port=5931,disable-ticketing -vga qxl -global qxl-vga.vram_size=67108864 -device intel-hda,id=sound0,bus=pci.0,addr=0x5 -device hda-duplex -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6 -drive file=/dev/vg-90.100-sluo/lv-90.100-data-disk-test,if=none,id=data-disk,format=qcow2,cache=none,werror=stop,rerror=stop -device virtio-blk-pci,bus=pci.0,addr=0x7,drive=data-disk,id=sluo-disk -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0 -nodefaults -serial unix:/tmp/ttyS0,server,nowait -boot menu=on -monitor stdio

After I firewall the iSCSI port on the host, EIO errors are generated.
(qemu) info status 
VM status: running
(qemu) block I/O error in device 'drive-virtio-disk0': Input/output error (5)
block I/O error in device 'drive-virtio-disk0': Input/output error (5)
block I/O error in device 'drive-virtio-disk0': Input/output error (5)
block I/O error in device 'drive-virtio-disk0': Input/output error (5)
block I/O error in device 'data-disk': Input/output error (5)
block I/O error in device 'data-disk': Input/output error (5)
block I/O error in device 'drive-virtio-disk0': Input/output error (5)
(qemu) info status 
VM status: paused (io-error)

Stop iptables and resume the VM via 'cont' in the QEMU monitor.
# service iptables stop
(qemu) cont
(qemu) info status 
VM status: running

Comment 5 Sibiao Luo 2012-11-09 10:05:04 UTC
Created attachment 641435 [details]
guest call traces after resuming the VM; this is the guest dmesg log.

Comment 7 Ademar Reis 2012-11-14 03:46:59 UTC
As noted by Sibiao in comment #1, this appears to be related to bug 867401.

Comment 8 Kevin Wolf 2012-11-27 10:20:56 UTC
> INFO: task flush-252:16:2394 blocked for more than 120 seconds.

This is expected when the VM is stopped long enough because of an I/O error.
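For reference, the 120-second threshold comes from the guest's hung task watchdog. If the long pause is expected, the warning can be silenced inside the guest, as the dmesg output itself suggests; a minimal sketch (run as root in the guest):
# echo 0 > /proc/sys/kernel/hung_task_timeout_secs
# sysctl -w kernel.hung_task_timeout_secs=0    # equivalent, if sysctl is preferred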

