Bug 1979951
| Summary: | nodedev API faces performance issues after defining multiple mediated devices and a ccw mdev with a wrong type | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 8 | Reporter: | bfu <bfu> |
| Component: | mdevctl | Assignee: | Virtualization Maintenance <virt-maint> |
| Status: | CLOSED DUPLICATE | QA Contact: | virt-qe-z |
| Severity: | high | Docs Contact: | |
| Priority: | high | ||
| Version: | 8.5 | CC: | alex.williamson, bfiuczyn, cohuck, dhorak, hannsj_uhl, jinzhao, jjongsma, juzhang, knoel, ngu, pbonzini, qzhang, ribarry, smitterl, thuth, tstaudt, virt-qe-z, yiwei |
| Target Milestone: | beta | Flags: | pm-rhel: mirror+ |
| Target Release: | --- | ||
| Hardware: | s390x | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2021-07-09 14:01:47 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
|
Description
bfu
2021-07-07 13:31:36 UTC
If "# virsh nodedev-create mdev.xml" is used to create the mdev, there is also a two-minute delay before the result can be checked with "virsh nodedev-list --cap mdev".

    [root@kernelqe2 bfu]# cat mdev.xml
    <device>
      <name>mdev_4c1da20a-7f13-4e01-b42a-d0705ba4ffb6</name>
      <path>/sys/devices/vfio_ap/matrix/4c1da20a-7f13-4e01-b42a-d0705ba4ffb6</path>
      <parent>ap_matrix</parent>
      <driver>
        <name>vfio_mdev</name>
      </driver>
      <capability type='mdev'>
        <type id='vfio_ap-passthrough'/>
        <attr name="assign_adapter" value="0x02"/>
        <attr name="assign_domain" value="0x0011"/>
      </capability>
    </device>

*** Bug 1979949 has been marked as a duplicate of this bug. ***

Does mdevctl print anything sane if you try to list the devices? IOW, is it libvirt that chokes on this?

(In reply to Cornelia Huck from comment #3)
> Does mdevctl print anything sane if you try to list the devices? IOW, is it
> libvirt that chokes on this?

As the additional information in comment 1 shows, it seems so; I think libvirt chokes on this. As for the mdevctl output:

    [root@kernelqe2 bfu]# mdevctl list -d --dumpjson
    [
      {
        "0.0.26ab": [
          {
            "566d63bd-8b33-4323-9f13-5d56155cd668": {
              "mdev_type": "fbq1",
              "start": "manual"
            }
          },
          {
            "254193dd-8ffc-4594-a272-28e35d87071g": {
              "mdev_type": "fbq2",
              "start": "manual"
            }
          },
          {
            "566d63bd-8b33-4323-9f13-5d56155cd669": {
              "mdev_type": "fbq1",
              "start": "manual"
            }
          }
        ],
        "matrix": [
          {
            "fcbc4814-8e59-4620-a817-92c9a7724a2e": {
              "mdev_type": "vfio_ap-passthrough",
              "start": "manual",
              "attrs": [
                { "assign_adapter": "0x02" },
                { "assign_domain": "0x0011" },
                { "assign_domain": "0x003a" },
                { "assign_domain": "0x00ab" }
              ]
            }
          }
        ]
      }
    ]

(In reply to Cornelia Huck from comment #3)
> Does mdevctl print anything sane if you try to list the devices? IOW, is it
> libvirt that chokes on this?

Yes, I think that is caused by libvirt.

1. libvirt's mdevctl polling runs into parsing errors as long as an mdev definition with an unknown mdev_type exists. I tried it out by creating a vfio-ccw-like mdev:
    # mdevctl list -d --dumpjson
    [
      {
        "0.0.0033": [
          {
            "e60cef97-3f6b-485e-ac46-0520f9f66ac2": {
              "mdev_type": "vfio_ccw-io",
              "start": "manual"
            }
          }
        ],
        "0.0.0034": [
          {
            "ffffffff-3f6b-485e-ac46-0520f9f66ac2": {
              "mdev_type": "type1",
              "start": "manual"
            }
          }
        ]
      }
    ]

Looking into the journal, I can confirm this problem.

    Jul 08 09:32:18 t46lp71.lnxne.boe libvirtd[139293]: internal error: Unexpected format for parent device object
    Jul 08 09:32:18 t46lp71.lnxne.boe libvirtd[139293]: internal error: failed to query mdevs from mdevctl:
    Jul 08 09:32:56 t46lp71.lnxne.boe libvirtd[139293]: internal error: Unexpected format for parent device object
    Jul 08 09:32:56 t46lp71.lnxne.boe libvirtd[139293]: internal error: failed to query mdevs from mdevctl:
    Jul 08 09:32:56 t46lp71.lnxne.boe libvirtd[139293]: mdevctl failed to updated mediated devices
    Jul 08 09:32:57 t46lp71.lnxne.boe libvirtd[139293]: internal error: Unexpected format for parent device object
    Jul 08 09:32:57 t46lp71.lnxne.boe libvirtd[139293]: internal error: failed to query mdevs from mdevctl:
    Jul 08 09:32:57 t46lp71.lnxne.boe libvirtd[139293]: mdevctl failed to updated mediated devices

2. Loading or unloading the vfio_ap device driver causes a long delay until the ap_matrix nodedev object is created or deleted. The device driver itself is loaded and available immediately. When trying this out with libvirt v7.5.0+, I had to wait more than 5 minutes after loading the vfio_ap device driver until the ap_matrix nodedev object appeared in the list of objects. It looks like the udev event parsing in libvirt gets locked up by the mdevctl polling errors.

I'm pretty sure this is just the same root cause as bug 1979440 -- failing to properly parse mdevs from multiple parent devices.

*** This bug has been marked as a duplicate of bug 1979440 *** |
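
For readers following the root cause named above (parsing mdevs from multiple parent devices), the nesting of the `mdevctl list -d --dumpjson` output quoted in this thread can be walked as in the minimal Python sketch below. It is an illustration only and is not libvirt's parser. The function name `flatten_mdevs` and the record field names are hypothetical; the sample data is copied from the second dump in the comments, and the layout (a top-level array of objects, each mapping a parent device to a list of single-key UUID objects) is assumed from those two dumps alone.

```python
import json

# Sample copied from the "mdevctl list -d --dumpjson" output quoted in the thread:
# one top-level object holding two parent devices.
SAMPLE = """
[
  {
    "0.0.0033": [
      {"e60cef97-3f6b-485e-ac46-0520f9f66ac2":
        {"mdev_type": "vfio_ccw-io", "start": "manual"}}
    ],
    "0.0.0034": [
      {"ffffffff-3f6b-485e-ac46-0520f9f66ac2":
        {"mdev_type": "type1", "start": "manual"}}
    ]
  }
]
"""


def flatten_mdevs(dump):
    """Return one record per defined mdev (hypothetical helper).

    Iterates over every parent key in every top-level object, so a
    dump with several parent devices is handled the same way as a
    dump with a single one.
    """
    records = []
    for parent_map in json.loads(dump):
        for parent, devices in parent_map.items():
            for device in devices:
                # Each list entry is a single-key object: {uuid: properties}.
                for uuid, props in device.items():
                    records.append({
                        "parent": parent,
                        "uuid": uuid,
                        "mdev_type": props.get("mdev_type"),
                        "start": props.get("start"),
                        # "attrs" is optional; each entry is a single-key object.
                        "attrs": [next(iter(attr.items()))
                                  for attr in props.get("attrs", [])],
                    })
    return records


if __name__ == "__main__":
    for rec in flatten_mdevs(SAMPLE):
        print(rec["parent"], rec["uuid"], rec["mdev_type"])
```

The sketch performs no validation of `mdev_type` or UUID syntax; it only shows that the structure itself is regular across parents.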