Bug 2219693
| Summary: | [OSP 16.2][neutron] slow api reply for tenant ports listing is causing instance creation to fails | ||
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Flavio Piccioni <fpiccion> |
| Component: | openstack-neutron | Assignee: | Rodolfo Alonso <ralonsoh> |
| Status: | NEW --- | QA Contact: | Eran Kuris <ekuris> |
| Severity: | high | Docs Contact: | |
| Priority: | high | ||
| Version: | 16.2 (Train) | CC: | astillma, chrisw, enothen, gregraka, ihrachys, jamsmith, ralonsoh, scohen |
| Target Milestone: | --- | Keywords: | Triaged |
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Description
Flavio Piccioni
2023-07-04 19:52:47 UTC

To confirm, there's nothing specific about the ports in question used for the scenario, right? E.g. these are not trunk ports, or ports with a large number of additional SGs / address pairs... Anything particular about the ports in question? Just want to make sure we don't miss something specific about the ports.

===

Another question I have for the scenario is - why does nova list all (?) ports as part of a VM boot? Shouldn't it list just the ports that belong to the VM? Perhaps there's some nova backport missing to optimize it from the compute side. Perhaps something to check with the Compute team.

(In reply to Ihar Hrachyshka from comment #2)
> To confirm, there's nothing specific about the ports in question used for
> the scenario, right? E.g. these are not trunk ports, or ports with a large
> number of additional SGs / address pairs... Anything particular about the
> ports in question? Just want to make sure we don't miss something specific
> about the ports.
>
From the customer reply: "all these VMs have single connection and no trunks. We have very simple VMs. Regarding security groups, we do not have more than 2-3"

>
> Another question I have for the scenario is - why does nova list all (?)
> ports as part of a VM boot? Shouldn't it list just the ports that belong to
> the VM? Perhaps there's some nova backport missing to optimize it from
> compute side. Perhaps something to check on with Compute team.

From a very quick query to DFG:compute: there is probably a historical reason, perhaps quotas, as enforcing port quotas

Hello Flavio:

Some questions about this issue:
1) To which subnets/networks do these 900 ports belong? Do these networks belong to other projects?
2) Does "tenant" (from the "port list" command) have admin powers, or is it a regular user?
3) How many RBAC rules are there in the project? To which project do these RBAC entries belong? *NOTE* I'm not asking about the target project in the RBAC rule but about the project from which these RBAC entries were created. In particular, I'm wondering whether these rules belong to the same project as "tenant". A "select * from networkrbacs" will help.
4) Did the customer enable the "slow_query_log" in the database engine? If so, the output will be very helpful. This can be enabled/disabled at runtime with [1].

Regards.

[1] https://access.redhat.com/solutions/321003

Reported https://bugzilla.redhat.com/show_bug.cgi?id=2222102 to track a nova optimization to avoid fetching all ports on VM creation.

Rodolfo, I don't think get_ports calls into ml2 drivers, does it? I believe I checked it in the code when I was looking at the 16.1 counterpart of this bz, and I couldn't find where an ml2 driver could be called by the ml2 plugin. Perhaps I'm missing something?

Hello Ihar:

I can't confirm that for the 'apic_aim' mechanism driver. In fact, using ML2/OVS or ML2/OVN and testing in a similar environment (RBACs, users, networks, ports), the Neutron API response time is around 5-6 seconds. Because we don't provide support for this driver and I don't know what extensions this mechanism driver is loading (they could affect the "_make_port_dict" method, and most probably this is what is happening), I can neither triage nor analyse this issue.

Regards.
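
For reference, the difference discussed in this thread between listing every port of a project and listing only the ports attached to one instance can be illustrated with the query parameters of the Neutron ports API. The following is only a minimal sketch: the endpoint, project ID, and instance UUID are hypothetical placeholders, the helper name `ports_url` is made up for illustration, and this is not a description of what nova actually sends.

```python
from urllib.parse import urlencode


def ports_url(endpoint, **filters):
    """Build a Neutron port-list URL; keyword arguments become query filters.

    `ports_url` is a hypothetical helper for illustration, not a Neutron or
    nova function.
    """
    base = endpoint.rstrip("/") + "/v2.0/ports"
    # Skip filters set to None so optional arguments can be passed through.
    query = {key: value for key, value in filters.items() if value is not None}
    if not query:
        return base
    # doseq=True expands list values into repeated parameters (fields=a&fields=b).
    return base + "?" + urlencode(query, doseq=True)


# Placeholder endpoint and identifiers (assumptions, not taken from the report).
ENDPOINT = "http://neutron.example:9696"

# Broad listing: every port of the project is returned and serialized.
all_ports = ports_url(ENDPOINT, project_id="PROJECT_ID")

# Narrow listing: only ports bound to one instance, with a reduced field set.
vm_ports = ports_url(ENDPOINT, device_id="INSTANCE_UUID", fields=["id", "status"])

print(all_ports)
print(vm_ports)
```

Timing the two requests against the affected deployment (for example with an authenticated HTTP client) would show whether the delay scales with the number of ports returned.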