RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "RHEL project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023. Bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September as per pre-agreed dates. Bugs against components "kernel", "kernel-rt", and "kpatch" are only migrated if still in "NEW" or "ASSIGNED". If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "RHEL project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.
Bug 1264255 - When hot-unplug a device which is doing block-commit, guest and qemu will hang until the commit finished, and call trace appears in guest
Summary: When hot-unplug a device which is doing block-commit, guest and qemu will han...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm-rhev
Version: 7.2
Hardware: x86_64
OS: Linux
medium
medium
Target Milestone: rc
: ---
Assignee: Jeff Cody
QA Contact: Qianqian Zhu
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2015-09-18 02:52 UTC by Pei Zhang
Modified: 2017-08-02 03:24 UTC (History)
8 users (show)

Fixed In Version: qemu-kvm-rhev-2.9.0-1.el7
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-08-01 23:29:42 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2017:2392 0 normal SHIPPED_LIVE Important: qemu-kvm-rhev security, bug fix, and enhancement update 2017-08-01 20:04:36 UTC

Description Pei Zhang 2015-09-18 02:52:57 UTC
Description of problem:
Boot guest with a second disk, then do live snapshot (length of chain >=3).  During block-commit from snapshot2 to snapshot1, hot-unplug this device, guest and qemu will hang until block-commit finished. And there are call trace in guest.

Version-Release number of selected component (if applicable):
Host:rhel7.2
Kernel:3.10.0-316.el7.x86_64
qemu-kvm-rhev:qemu-kvm-rhev-2.3.0-23.el7.x86_64

Guest:rhel6.7z
Kernel:2.6.32-573.7.1.el6.x86_64


How reproducible:
100%

Steps to Reproduce:
1. boot guest with a data disk
# /usr/libexec/qemu-kvm -name rhel6.7z -machine pc-i440fx-rhel7.1.0,accel=kvm \
-cpu SandyBridge -m 4G,slots=256,maxmem=40G -numa node \
-smp 4,sockets=2,cores=2,threads=1 \
-uuid 82b1a01e-5f6c-4f5f-8d27-3855a74e6b61 \
-netdev tap,id=hostnet0 \
-device virtio-net-pci,netdev=hostnet0,id=net0,mac=12:54:00:5c:88:61 \
-device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vgamem_mb=16 \
-vnc 0.0.0.0:1 \
-monitor stdio \
-qmp tcp:0:5555,server,nowait \
-serial unix:/tmp/monitor,server,nowait \
-drive file=/mnt/disk1/rhel6.7_virtio.qcow2,format=qcow2,if=none,id=drive-virtio-blk0,werror=stop,rerror=stop \
-device virtio-blk-pci,drive=drive-virtio-blk0,id=virtio-blk0 \
-drive file=/mnt/disk2/data1.qcow2,format=qcow2,if=none,id=drive-virtio-blk1,werror=stop,rerror=stop \
-device virtio-blk-pci,drive=drive-virtio-blk1,id=virtio-blk1 \

2. do live snapshot to this data disk
{ "execute": "blockdev-snapshot-sync", "arguments": { "device": "drive-virtio-blk1", "snapshot-file": "/mnt/disk4/snapshot1.qcow2", "format": "qcow2", "mode": "absolute-paths" } }
{"return": {}}
{ "execute": "blockdev-snapshot-sync", "arguments": { "device": "drive-virtio-blk1", "snapshot-file": "/mnt/disk4/snapshot2.qcow2", "format": "qcow2", "mode": "absolute-paths" } }

3. during block-commit, hot-unplug this data disk, guest and qemu both hang.
{ "execute": "block-commit", "arguments": { "device": "drive-virtio-blk1","base": "/mnt/disk4/snapshot1.qcow2", "top": "/mnt/disk4/snapshot2.qcow2", "speed": 0 } }
{"return": {}}
{"execute":"device_del","arguments":{"id":"virtio-blk1"}}
{"return": {}}


4. the device will be deleted until block-commit finished. Meanwhile, qemu and guest become responsive again. 
{"timestamp": {"seconds": 1442543658, "microseconds": 293415}, "event": "DEVICE_DELETED", "data": {"path": "/machine/peripheral/virtio-blk1/virtio-backend"}}
{"timestamp": {"seconds": 1442543658, "microseconds": 293520}, "event": "DEVICE_DELETED", "data": {"device": "virtio-blk1", "path": "/machine/peripheral/virtio-blk1"}}
{"timestamp": {"seconds": 1442543659, "microseconds": 39327}, "event": "BLOCK_JOB_READY", "data": {"device": "", "len": 4067885056, "offset": 4067885056, "speed": 0, "type": "commit"}}
{"timestamp": {"seconds": 1442543659, "microseconds": 39415}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "", "len": 4067885056, "offset": 4067885056, "speed": 0, "type": "commit"}}

5. call trace appear in guest.
# dmesg
BUG: soft lockup - CPU#3 stuck for 67s! [udevd:567]
Clocksource tsc unstable (delta = -137436967906 ns)
Modules linked in: ext2 autofs4 ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 sg microcode serio_raw virtio_net i2c_piix4 i2c_core ext4 jbd2 mbcache virtio_blk sr_mod cdrom virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod [last unloaded: speedstep_lib]
CPU 3 
Modules linked in: ext2 autofs4 ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 sg microcode serio_raw virtio_net i2c_piix4 i2c_core ext4 jbd2 mbcache virtio_blk sr_mod cdrom virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod [last unloaded: speedstep_lib]

Pid: 567, comm: udevd Not tainted 2.6.32-573.7.1.el6.x86_64 #1 Red Hat KVM
RIP: 0010:[<ffffffff8105435a>]  [<ffffffff8105435a>] flush_tlb_others_ipi+0x11a/0x130
RSP: 0000:ffff88013719bb38  EFLAGS: 00000246
RAX: 0000000000000000 RBX: ffff88013719bb78 RCX: 0000000000000004
RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffffffff81e3e8d8
RBP: ffffffff8100bc0e R08: 0000000000000000 R09: 0000000000000000
R10: ffff8801370a8100 R11: 0000000000000246 R12: ffff88013719bb28
R13: ffffffff8100bc0e R14: 0000000100000001 R15: ffff880137223400
FS:  00007f7c25b3b7a0(0000) GS:ffff880028380000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f7c275b25c8 CR3: 00000001369d0000 CR4: 00000000000406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process udevd (pid: 567, threadinfo ffff880137198000, task ffff88013726eab0)
Stack:
 00007f7c275b25c8 ffffffff81e3e8c0 ffff88013726eab0 ffff8801370a8100
<d> 00007f7c275b25c8 ffff8801370a83c8 0000000000000008 ffffea000440ace8
<d> ffff88013719bba8 ffffffff810543e6 ffff88013719bba8 ffff8801370a8100
Call Trace:
 [<ffffffff810543e6>] ? native_flush_tlb_others+0x76/0x90
 [<ffffffff8105450e>] ? flush_tlb_page+0x5e/0xb0
 [<ffffffff810534dd>] ? ptep_set_access_flags+0x6d/0x70
 [<ffffffff811519e3>] ? do_wp_page+0x493/0x920
 [<ffffffff8115266d>] ? handle_pte_fault+0x2cd/0xb20
 [<ffffffff81456004>] ? move_addr_to_kernel+0x64/0x70
 [<ffffffff81153159>] ? handle_mm_fault+0x299/0x3d0
 [<ffffffff8104f156>] ? __do_page_fault+0x146/0x500
 [<ffffffff8123e90b>] ? selinux_file_permission+0xfb/0x150
 [<ffffffff8153eb4e>] ? do_page_fault+0x3e/0xa0
 [<ffffffff8153bef5>] ? page_fault+0x25/0x30
Code: 7d f8 c9 c3 0f 1f 44 00 00 48 8b 05 69 22 bc 00 41 8d b5 f0 00 00 00 4c 89 e7 ff 90 e0 00 00 00 eb 09 0f 1f 80 00 00 00 00 f3 90 <8b> 35 20 87 bb 00 4c 89 e7 e8 58 ab 24 00 85 c0 74 ec eb 90 66 
Call Trace:
 [<ffffffff81054368>] ? flush_tlb_others_ipi+0x128/0x130
 [<ffffffff810543e6>] ? native_flush_tlb_others+0x76/0x90
 [<ffffffff8105450e>] ? flush_tlb_page+0x5e/0xb0
 [<ffffffff810534dd>] ? ptep_set_access_flags+0x6d/0x70
 [<ffffffff811519e3>] ? do_wp_page+0x493/0x920
 [<ffffffff8115266d>] ? handle_pte_fault+0x2cd/0xb20
 [<ffffffff81456004>] ? move_addr_to_kernel+0x64/0x70
 [<ffffffff81153159>] ? handle_mm_fault+0x299/0x3d0
 [<ffffffff8104f156>] ? __do_page_fault+0x146/0x500
 [<ffffffff8123e90b>] ? selinux_file_permission+0xfb/0x150
 [<ffffffff8153eb4e>] ? do_page_fault+0x3e/0xa0
 [<ffffffff8153bef5>] ? page_fault+0x25/0x30
Disabling lock debugging due to kernel taint
virtio-pci 0000:00:05.0: PCI INT A disabled


Actual results:
1. guest and qemu both hang until the block-commit is completed.
2. there are call trace in the guest


Expected results:
1. guest and qemu should always be responsive
2. the block-commit should be canceled immediately
3. no call trace in the guest

Additional info:

Comment 2 Pei Zhang 2015-09-18 03:28:26 UTC
More info:
1. With above steps, block-commit to base file(at the bottom of snapshot chain) also hit the same issue. 

....
{ "execute": "block-commit", "arguments": { "device": "drive-virtio-blk1"} }
....

2. During block-stream, then hot-unplug the device, guest and qemu will hang until stream finished.

....(Steps like above)
{ "execute": "block-stream", "arguments": { "device": "drive-virtio-blk1","base": "/mnt/disk2/data1.qcow2", "speed": 1000000000,"backing-file":"/mnt/disk2/data1.qcow2"} }
{"return": {}}
{"execute":"device_del","arguments":{"id":"virtio-blk1"}}
{"return": {}}
{"timestamp": {"seconds": 1442546589, "microseconds": 477854}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "drive-virtio-blk1", "len": 5368709120, "offset": 5368709120, "speed": 1000000000, "type": "stream"}}
{"timestamp": {"seconds": 1442546589, "microseconds": 526804}, "event": "DEVICE_DELETED", "data": {"path": "/machine/peripheral/virtio-blk1/virtio-backend"}}
{"timestamp": {"seconds": 1442546590, "microseconds": 423785}, "event": "DEVICE_DELETED", "data": {"device": "virtio-blk1", "path": "/machine/peripheral/virtio-blk1"}}

check dmesg in guest:
# dmesg
Clocksource tsc unstable (delta = -120257117794 ns)
virtio-pci 0000:00:05.0: PCI INT A disabled

Comment 5 Jeff Cody 2017-04-19 19:15:29 UTC
On device deletion, the running block job will be cancelled, so this behavior no longer exists.

Comment 8 Qianqian Zhu 2017-05-03 10:25:42 UTC
Reproduced on:
qemu-kvm-rhev-2.3.0-22.el7.x86_64
kernel-3.10.0-640.el7.x86_64

Steps:
1. Launch guest with secondary disk:
/usr/libexec/qemu-kvm -name rhel7_4-9343 -m 1G -smp 2  -drive file=/home/kvm_autotest_root/images/rhel74-64-virtio.qcow2,format=qcow2,if=none,cache=none,media=disk,werror=stop,rerror=stop,id=drive-0 -device virtio-blk-pci,drive=drive-0,id=virtio-blk-0,bootindex=0 -drive file=/home/test,format=qcow2,if=none,cache=none,aio=native,id=drive-virtio-blk1,werror=stop,rerror=stop -device virtio-blk-pci,drive=drive-virtio-blk1,id=virtio-blk1,bus=pci.0,addr=0x14,serial="QEMU-DISK2"  -monitor stdio -qmp tcp:0:5555,server,nowait -vnc :3 -netdev tap,id=hostnet0,script=/etc/qemu-ifup -device virtio-net-pci,netdev=hostnet0,id=virtio-net-pci0,mac=70:e2:84:14:0e:15

2.Live snapshot:
{ "execute": "blockdev-snapshot-sync", "arguments": { "device": "drive-virtio-blk1", "snapshot-file": "/home/data-sn1", "format": "qcow2", "mode": "absolute-paths" } }
{ "execute": "blockdev-snapshot-sync", "arguments": { "device": "drive-virtio-blk1", "snapshot-file": "/home/data-sn2", "format": "qcow2", "mode": "absolute-paths" } }

3. dd files inside guest to differ sn2 from sn1:
dd if=/dev/urandom of=/dev/vdb2 bs=1M count=2000

4. Block commit:
{ "execute": "block-commit", "arguments": { "device": "drive-virtio-blk1","base":"/home/data-sn1", "top": "/home/data-sn2", "speed": 0 } }

5. Hot unplug device:
{"execute":"device_del","arguments":{"id":"virtio-blk1"}}

Results:
Both qemu and guest hang, until block commit finishes.

Verified on:
qemu-kvm-rhev-2.9.0-1.el7.x86_64.rpm
kernel-3.10.0-640.el7.x86_64

Steps same as above.

Results:
Both qemu and guest works well and block job is cancelled after hot unplug device:
{ "execute": "block-commit", "arguments": { "device": "drive-virtio-blk1","base":"/home/data-sn1", "top": "/home/data-sn2", "speed": 0 } }
{"return": {}}
{"execute":"device_del","arguments":{"id":"virtio-blk1"}}
{"return": {}}
{"timestamp": {"seconds": 1493807017, "microseconds": 459789}, "event": "DEVICE_DELETED", "data": {"path": "/machine/peripheral/virtio-blk1/virtio-backend"}}
{"timestamp": {"seconds": 1493807017, "microseconds": 459871}, "event": "DEVICE_DELETED", "data": {"device": "virtio-blk1", "path": "/machine/peripheral/virtio-blk1"}}
{"timestamp": {"seconds": 1493807017, "microseconds": 706040}, "event": "BLOCK_JOB_CANCELLED", "data": {"device": "drive-virtio-blk1", "len": 1048576000, "offset": 134217728, "speed": 0, "type": "commit"}}

Moving to VERIFIED per above result.

Comment 10 errata-xmlrpc 2017-08-01 23:29:42 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:2392

Comment 11 errata-xmlrpc 2017-08-02 01:07:21 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:2392

Comment 12 errata-xmlrpc 2017-08-02 01:59:20 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:2392

Comment 13 errata-xmlrpc 2017-08-02 02:40:06 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:2392

Comment 14 errata-xmlrpc 2017-08-02 03:04:50 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:2392

Comment 15 errata-xmlrpc 2017-08-02 03:24:58 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:2392


Note You need to log in before you can comment on or make changes to this bug.