Bug 1252804 - Stopping the nfs service while the guest is writing to a data disk on the NFS server: qemu cannot be paused and a Call Trace appears in the guest
Status: CLOSED WONTFIX
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm-rhev
Version: 7.2
Hardware: x86_64 Linux
Priority: medium   Severity: medium
Target Milestone: rc
Target Release: ---
Assigned To: Stefan Hajnoczi
QA Contact: Virtualization Bugs
Depends On:
Blocks:
Reported: 2015-08-12 05:23 EDT by Pei Zhang
Modified: 2016-03-28 05:12 EDT (History)
7 users

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-08-12 18:05:20 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments: None
Description Pei Zhang 2015-08-12 05:23:28 EDT
Description of problem:
Boot a guest with a data disk that resides on an NFS server, then stop the nfs service while dd is writing data to the data disk inside the guest. A 'Call Trace' appears in the guest and qemu is not paused.

Version-Release number of selected component (if applicable):
Host:
Kernel: 3.10.0-304.el7.x86_64
qemu-kvm-rhev: qemu-kvm-rhev-2.3.0-16.el7.x86_64

Guest:
Kernel: 3.10.0-295.el7.x86_64


How reproducible:
100%

Steps to Reproduce:
1. On the host, mount the export from the NFS server:
# mount -o soft,timeo=60,retrans=2,nosharecache 10.66.9.120:/home /mnt
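A quick optional check (not required to reproduce) that the soft-mount options took effect; note that timeo is given in tenths of a second, so timeo=60 means a 6-second timeout per retry:
# nfsstat -m
# grep ' /mnt ' /proc/mounts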

2. Create the data disk image on the NFS mount:
# qemu-img create -f qcow2 /mnt/disk1.qcow2 10G
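Optionally, confirm the image ended up on the NFS mount as expected:
# qemu-img info /mnt/disk1.qcow2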

3. Boot a guest with the data disk (on the NFS server) attached, using werror=stop,rerror=stop:
# /usr/libexec/qemu-kvm -name rhel7.2 -machine  pc-i440fx-rhel7.2.0,accel=kvm,usb=off -cpu SandyBridge -m 2G,slots=256,maxmem=40G -numa node -smp 4,sockets=2,cores=2,threads=1 -uuid 12b1a01e-5f6c-4f5f-8d27-3855a74e4b6b \
-drive file=/home/rhel7.2.qcow2,format=qcow2,if=none,id=drive-virtio-blk-0,werror=stop,rerror=stop \
-device virtio-blk-pci,bus=pci.0,addr=0x8,drive=drive-virtio-blk-0,id=virtio-blk-0,bootindex=0 \
-drive file=/mnt/disk1.qcow2,format=qcow2,if=none,id=drive-virtio-blk-1,werror=stop,rerror=stop \
-device virtio-blk-pci,drive=drive-virtio-blk-1,id=virtio-blk-1  \
-netdev tap,id=hostnet0 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=22:54:00:5c:08:6d \
-device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vgamem_mb=16,bus=pci.0 \
-spice port=5900,addr=0.0.0.0,disable-ticketing,image-compression=off,seamless-migration=on \
-monitor stdio -serial unix:/tmp/monitor,server,nowait
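If it helps, the run state can also be observed programmatically instead of through the stdio monitor by adding a QMP socket to the command line above (the socket path here is arbitrary):

-qmp unix:/tmp/qmp.sock,server,nowait

and querying it, e.g. with socat:

# socat - unix-connect:/tmp/qmp.sock
{"execute": "qmp_capabilities"}
{"execute": "query-status"}

If werror=stop ever triggers, query-status should report "status": "io-error" instead of "running", and a BLOCK_IO_ERROR event should be emitted.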

4. While dd is writing to the data disk (/dev/vda) inside the guest, stop the nfs service on the NFS server:
# dd if=/dev/urandom of=/dev/vda bs=1M count=2048
# service nfs stop
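While waiting, it may help to watch both sides: the NFS client (the host running QEMU) normally logs the lost connection in the kernel log, and the stdio monitor shows whether the guest was stopped:

On the host:
# dmesg | grep -i 'not responding'
(qemu) info status
(qemu) info block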


Actual results:
After step 4, a 'Call Trace' appears in the guest and qemu is not paused.
In the guest:
# dmesg
......
[  600.565732] Call Trace:
[  600.565737]  [<ffffffff8162de49>] schedule+0x29/0x70
[  600.565739]  [<ffffffff8162bb39>] schedule_timeout+0x209/0x2d0
[  600.565742]  [<ffffffff81057b7f>] ? kvm_clock_get_cycles+0x1f/0x30
[  600.565745]  [<ffffffff810d0bcc>] ? ktime_get_ts64+0x4c/0xf0
[  600.565747]  [<ffffffff8162d47e>] io_schedule_timeout+0xae/0x130
[  600.565748]  [<ffffffff8162d518>] io_schedule+0x18/0x20
[  600.565751]  [<ffffffff812cb735>] bt_get+0x135/0x1c0
[  600.565761]  [<ffffffff8109ed20>] ? wake_up_atomic_t+0x30/0x30
[  600.565764]  [<ffffffff812cbb5f>] blk_mq_get_tag+0xbf/0xf0
[  600.565765]  [<ffffffff812c744b>] __blk_mq_alloc_request+0x1b/0x200
[  600.565767]  [<ffffffff812c8e11>] blk_mq_map_request+0x191/0x1f0
[  600.565769]  [<ffffffff812ca220>] blk_sq_make_request+0x80/0x380
[  600.565772]  [<ffffffff812bb8df>] ? generic_make_request_checks+0x24f/0x380
[  600.565774]  [<ffffffff81163e09>] ? mempool_alloc+0x69/0x170
[  600.565776]  [<ffffffff812bbaf2>] generic_make_request+0xe2/0x130
[  600.565779]  [<ffffffff812bbbb1>] submit_bio+0x71/0x150
[  600.565781]  [<ffffffff8120e2ed>] ? bio_alloc_bioset+0x1fd/0x350
[  600.565783]  [<ffffffff81209303>] _submit_bh+0x143/0x210
[  600.565784]  [<ffffffff8120bf52>] __block_write_full_page+0x162/0x380
[  600.565786]  [<ffffffff8120f750>] ? I_BDEV+0x10/0x10
[  600.565788]  [<ffffffff8120f750>] ? I_BDEV+0x10/0x10
[  600.565789]  [<ffffffff8120c33b>] block_write_full_page_endio+0xeb/0x100
[  600.565791]  [<ffffffff8120c365>] block_write_full_page+0x15/0x20
[  600.565793]  [<ffffffff8120fec8>] blkdev_writepage+0x18/0x20
[  600.565795]  [<ffffffff8116b923>] __writepage+0x13/0x50
[  600.565797]  [<ffffffff8116c441>] write_cache_pages+0x251/0x4d0
[  600.565799]  [<ffffffff8116b910>] ? global_dirtyable_memory+0x70/0x70
[  600.565801]  [<ffffffff8116c70d>] generic_writepages+0x4d/0x80
[  600.565803]  [<ffffffff8116d7be>] do_writepages+0x1e/0x40
[  600.565805]  [<ffffffff811fef90>] __writeback_single_inode+0x40/0x220
[  600.565806]  [<ffffffff811ff9fe>] writeback_sb_inodes+0x25e/0x420
[  600.565808]  [<ffffffff811ffc5f>] __writeback_inodes_wb+0x9f/0xd0
[  600.565810]  [<ffffffff812004a3>] wb_writeback+0x263/0x2f0
[  600.565812]  [<ffffffff8120272b>] bdi_writeback_workfn+0x2cb/0x460
[  600.565814]  [<ffffffff81095a9b>] process_one_work+0x17b/0x470
[  600.565815]  [<ffffffff8109686b>] worker_thread+0x11b/0x400
[  600.565816]  [<ffffffff81096750>] ? rescuer_thread+0x400/0x400
[  600.565818]  [<ffffffff8109dd2f>] kthread+0xcf/0xe0
[  600.565820]  [<ffffffff8109dc60>] ? kthread_create_on_node+0x140/0x140
[  600.565823]  [<ffffffff81638d58>] ret_from_fork+0x58/0x90
[  600.565824]  [<ffffffff8109dc60>] ? kthread_create_on_node+0x140/0x140
[  720.563248] INFO: task kworker/u8:0:6 blocked for more than 120 seconds.
[  720.564093] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  720.564768] kworker/u8:0    D ffff880036623610     0     6      2 0x00000000
[  720.564776] Workqueue: writeback bdi_writeback_workfn (flush-252:0)
[  720.564779]  ffff88007c08b5e0 0000000000000046 ffff88007c7f3980 ffff88007c08bfd8
[  720.564781]  ffff88007c08bfd8 ffff88007c08bfd8 ffff88007c7f3980 ffff88007fd94bc0
[  720.564784]  0000000000000000 7fffffffffffffff ffff88007fd9c980 ffff880036623610


On the host:
(qemu) info status
VM status: running


Expected results:
After step 4, there should be no 'Call Trace' in the guest, and the state of qemu should change to paused.
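For reference, when werror=stop does take effect (i.e. when the host-side write actually returns an error), the monitor typically reports something like:

(qemu) info status
VM status: paused (io-error)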

Additional info:
Comment 3 Ademar Reis 2015-08-12 18:05:20 EDT
Same as in bug 1249911: this is expected behavior, given the architecture of QEMU. Working around it would require a very complex change in the way QEMU deals with local images, which is just not feasible.

You'll see similar behavior in most Linux applications that access an NFS mount if communication with the NFS server is lost.
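As a rough illustration (with the NFS server still stopped and /mnt mounted as in step 1), a plain host-side write to the mount blocks in the same way until the soft-mount timeout expires, typically sitting in uninterruptible sleep ('D' state) rather than failing immediately:

# dd if=/dev/zero of=/mnt/testfile bs=1M count=100 oflag=direct &
# sleep 5; ps -o pid,stat,wchan:20,cmd -C dd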

Closing it as WONTFIX.
