Bug 1556678

Summary: Hot plug usb-ccid for the 2nd time with the same ID as the 1st time failed
Product: Red Hat Enterprise Linux 7 Reporter: Gu Nini <ngu>
Component: qemu-kvm-rhevAssignee: Serhii Popovych <spopovyc>
Status: CLOSED ERRATA QA Contact: Qianqian Zhu <qizhu>
Severity: low Docs Contact:
Priority: low    
Version: 7.5CC: jsuchane, knoel, kraxel, marcandre.lureau, michen, virt-maint, wyu, xuma
Target Milestone: rcKeywords: Regression
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: qemu-kvm-rhev-2.12.0-8.el7 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-11-01 11:06:51 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1558351    

Description Gu Nini 2018-03-15 03:23:38 UTC
Description of problem:
Hot plug a usb-ccid device, hot unplug it; then hot plug the device for the 2nd time with the same id, it's a failure for 'Duplicate ID'.

Version-Release number of selected component (if applicable):
Host kernel: 3.10.0-858.el7.x86_64
Qemu-kvm-rhev: qemu-kvm-rhev-2.10.0-21.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Boot up a guest with a usb controller and a qmp monitor:

    -chardev socket,id=qmp_id_qmpmonitor1,path=/var/tmp/avocado1,server,nowait \
    -mon chardev=qmp_id_qmpmonitor1,mode=control  \
    -device nec-usb-xhci,id=usb1,bus=pci.0,addr=0x3 \


2. In qmp, hot plug a usb-ccid device, hot unplug it, then hot plug it with the same id for the 2nd time:

# nc -U /var/tmp/avocado1
{"QMP": {"version": {"qemu": {"micro": 0, "minor": 10, "major": 2}, "package": "(qemu-kvm-rhev-2.10.0-21.el7)"}, "capabilities": []}}
{"execute":"qmp_capabilities"}
{"return": {}}
{"execute":"device_add","arguments":{"driver":"usb-ccid","id":"ccid1"}}
{"return": {}}
{"timestamp": {"seconds": 1521083482, "microseconds": 198044}, "event": "NIC_RX_FILTER_CHANGED", "data": {"name": "id8e5D72", "path": "/machine/peripheral/id8e5D72/virtio-backend"}}
{"timestamp": {"seconds": 1521083482, "microseconds": 204914}, "event": "NIC_RX_FILTER_CHANGED", "data": {"name": "id8e5D73", "path": "/machine/peripheral/id8e5D73/virtio-backend"}}


{"execute":"device_del","arguments":{"id":"ccid1"}}
{"return": {}}
{"execute":"device_add","arguments":{"driver":"usb-ccid","id":"ccid1"}}
{"error": {"class": "GenericError", "desc": "Duplicate ID 'ccid1' for device"}}


Actual results:
It's a failure to hot plug the ccid device with the same id for the 2nd time

Expected results:
It's a success to hot plug the ccid device with the same id for the 2nd time


Additional info:
I have done test on qemu-kvm-rhev-2.10.0-5.el7.x86_64, there is no the bug problem; while  the test result on qemu-kvm-rhev-2.10.0-6.el7.x86_64 shows the bug issue. So the bug is regression and starts from qemu-kvm-rhev-2.10.0-6.el7.x86_64.

Although a regression, it's should be in low severity, so it can be postponed to rhel7.6 or rhel7.5z.

Comment 5 Gerd Hoffmann 2018-03-20 14:49:02 UTC
# git log --oneline qemu-kvm-rhev-2.10.0-5.el7..qemu-kvm-rhev-2.10.0-6.el7
c63dc342cc Update to qemu-kvm-ma-2.10.0-6.el7 / qemu-kvm-rhev-2.10.0-6.el7
d5959fcefc s390x/cpumodel: Disable unsupported CPU models
a009d0254c s390x: print CPU definitions in sorted order
13bceabad0 qdev: defer DEVICE_DEL event until instance_finalize()
8edc99e92c Revert "qdev: Free QemuOpts when the QOM path goes away"
a6dae62ef0 qdev: store DeviceState's canonical path to use when unparenting
062ffad793 monitor: fix dangling CPU pointer
de21908e4f multiboot: validate multiboot header address values

I put my money on the qdev patches ;)

Comment 6 Serhii Popovych 2018-03-21 19:17:51 UTC
Bisected to commit 8edc99e92c (Revert "qdev: Free QemuOpts when the QOM path goes away").

Not checked yet on upstream qemu, but quick look at hw/core/qdev.c git-log gives nothing promising.

Will check on upstream shortly.

Comment 7 Serhii Popovych 2018-05-30 14:14:53 UTC
This affects upstream as well as 2.12 for rhel7.6 too.

The only thing I investigated at the moment is that obj->ref counting isn't 1 when calling object_unref() in hw/core/qdev.c::device_unparent().

Now after downstream commit 8edc99e92c ("Revert "qdev: Free QemuOpts when the QOM path goes away" ") object_unref() called with obj->ref == 3, while object_finalize() only called when reference counter is 1.

So we have object reference miscounting at the moment that prevents qemu_opts_del() call in hw/core/qdev.c::device_finalize().

Comment 8 Serhii Popovych 2018-05-31 09:39:44 UTC
Ok, looks like commits added with bz1445460 and mentioned in comment 5:

  13bceabad0 qdev: defer DEVICE_DEL event until instance_finalize()
  8edc99e92c Revert "qdev: Free QemuOpts when the QOM path goes away"
  a6dae62ef0 qdev: store DeviceState's canonical path to use when unparenting

are all correct from object reference count management.

Previous assumption in comment 6 is wrong: things works correctly *without* commit 8edc99e92c Revert "qdev: Free QemuOpts when the QOM path goes away" only because we call  qemu_opts_del(dev->opts) from device_unparent() and with wrong reference counting device_finalize() is never called.

Additional reference counting comes from bus count: dev->num_child_bus isn't zero for added device, so it is considered bus owner (?), so reverting *upstream* commit 675f22c6d3b0 ("bus: do not unref hotplug handler") effectively fixes issue described in comment 0.

From this point I need ideas on how to solve this issue.

Comment 9 Marc-Andre Lureau 2018-05-31 21:47:11 UTC
I wrote a patch series to fix the regression: http://patchew.org/QEMU/20180531195119.22021-1-marcandre.lureau@redhat.com/

Comment 10 Serhii Popovych 2018-06-04 14:17:30 UTC
It seems changes from patchew/20180531195119.22021-1-marcandre.lureau tag from comment 9 fixes problem described in comment 0.

Change set is:
    641091735219 usb-hcd-xhci-test: add a test for ccid hotplug
    225f66dc96ee usb-ccid: fix bus leak
    7c8c7d3a9715 object: fix OBJ_PROP_LINK_UNREF_ON_RELEASE ambivalence
    038fcd2df199 bus: do not unref the added child bus on realize

Comment 11 Marc-Andre Lureau 2018-06-19 13:24:57 UTC
(In reply to Serhii Popovych from comment #10)
> It seems changes from
> patchew/20180531195119.22021-1-marcandre.lureau tag from comment
> 9 fixes problem described in comment 0.
> 
> Change set is:
>     641091735219 usb-hcd-xhci-test: add a test for ccid hotplug
>     225f66dc96ee usb-ccid: fix bus leak
>     7c8c7d3a9715 object: fix OBJ_PROP_LINK_UNREF_ON_RELEASE ambivalence
>     038fcd2df199 bus: do not unref the added child bus on realize

Those patches have been applied upstream but most got reverted.

I think a backport would only need:

9b5c2fd53fe Revert "usb: release the created buses"
265b578c584 object: fix OBJ_PROP_LINK_UNREF_ON_RELEASE ambivalence
1a3ff20e673 usb-hcd-xhci-test: add a test for ccid hotplug

Comment 14 Miroslav Rezanina 2018-07-24 14:17:55 UTC
Fix included in qemu-kvm-rhev-2.12.0-8.el7

Comment 16 Qianqian Zhu 2018-07-26 08:34:29 UTC
Reproduced with steps in comment 0:
{"QMP": {"version": {"qemu": {"micro": 0, "minor": 12, "major": 2}, "package": "qemu-kvm-rhev-2.12.0-7.el7"}, "capabilities": []}}
{"execute":"device_add","arguments":{"driver":"usb-ccid","id":"ccid1"}}
{"return": {}}
{"execute":"human-monitor-command", "arguments":{"command-line":"info usb"}}
{"return": "  Device 0.1, Port 1, Speed 12 Mb/s, Product QEMU USB CCID, ID: ccid1\r\n"}
{"execute":"device_del","arguments":{"id":"ccid1"}}
{"return": {}}
{"execute":"human-monitor-command", "arguments":{"command-line":"info usb"}}
{"return": ""}
{"execute":"device_add","arguments":{"driver":"usb-ccid","id":"ccid1"}}
{"error": {"class": "GenericError", "desc": "Duplicate ID 'ccid1' for device"}}

Verified on qemu-kvm-rhev-2.12.0-8.el7.x86_64:
Result: hot unplug succeeds with DEVICE_DELETED event, and the second hot plug succeeds. And also get current result with 'lsusb' inside guest.
{"execute":"device_add","arguments":{"driver":"usb-ccid","id":"ccid1"}}
{"return": {}}
{"execute":"human-monitor-command", "arguments":{"command-line":"info usb"}}
{"return": "  Device 0.1, Port 1, Speed 12 Mb/s, Product QEMU USB CCID, ID: ccid1\r\n"}
{"execute":"device_del","arguments":{"id":"ccid1"}}
{"timestamp": {"seconds": 1532593700, "microseconds": 126251}, "event": "DEVICE_DELETED", "data": {"device": "ccid1", "path": "/machine/peripheral/ccid1"}}
{"return": {}}
{"execute":"human-monitor-command", "arguments":{"command-line":"info usb"}}
{"return": ""}
{"execute":"device_add","arguments":{"driver":"usb-ccid","id":"ccid1"}}
{"return": {}}
{"execute":"human-monitor-command", "arguments":{"command-line":"info usb"}}
{"return": "  Device 0.1, Port 2, Speed 12 Mb/s, Product QEMU USB CCID, ID: ccid1\r\n"}

So moving to VERIFIED per above result.

Comment 18 errata-xmlrpc 2018-11-01 11:06:51 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:3443