Bug 2079834 - [OVN SCALE][ovn-northd] Inefficient load balancer logical flow generation logic
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Linux Fast Datapath
Classification: Red Hat
Component: OVN
Version: FDP 22.C
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Assignee: lorenzo bianconi
QA Contact: ying xu
Reported: 2022-04-28 10:48 UTC by Dumitru Ceara
Modified: 2023-03-13 07:04 UTC (History)

Last Closed: 2023-03-13 07:04:15 UTC


Attachments (Terms of Use)
OCP density heavy 120 node NB database. (3.23 MB, application/gzip)
2022-04-28 10:48 UTC, Dumitru Ceara


Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker FD-1920 0 None None None 2022-04-28 16:24:14 UTC

Description Dumitru Ceara 2022-04-28 10:48:58 UTC
Created attachment 1875624 [details]
OCP density heavy 120 node NB database.

Description of problem:

In a large scale deployment, e.g., during a density-heavy OpenShift
scale test running a cluster of 120 nodes, 10K pods, and 10K load
balancers applied to all node logical switches and routers, northd
spends a large amount of time processing and generating logical flows
that implement the load balancing in the logical router pipeline.

With the attached database, focusing on a single load balancer that
corresponds to an OCP service (Service_2626e963-load-cluster-preupgrade-20220405-942/cluster-density-942-5_TCP_cluster):

$ ovn-nbctl list load_balancer Service_2626e963-load-cluster-preupgrade-20220405-942/cluster-density-942-5_TCP_cluster
_uuid               : 3de1afff-e606-41d2-b3ab-43499b8690c8
external_ids        : {"k8s.ovn.org/kind"=Service, "k8s.ovn.org/owner"="2626e963-load-cluster-preupgrade-20220405-942/cluster-density-942-5"}
health_check        : []
ip_port_mappings    : {}
name                : "Service_2626e963-load-cluster-preupgrade-20220405-942/cluster-density-942-5_TCP_cluster"
options             : {event="false", reject="true", skip_snat="false"}
protocol            : tcp
selection_fields    : []
vips                : {"172.30.46.142:443"="10.153.5.214:8443,10.160.2.79:8443", "172.30.46.142:80"="10.153.5.214:8080,10.160.2.79:8080"}

Checking the flows corresponding to one of the VIPs:

  table=6 (lr_in_dnat         ), priority=120  , match=(ct.est && ip4 && reg0 == 172.30.46.142 && tcp && reg9[16..31] == 443 && ct_mark.natted == 1), action=(flags.force_snat_for_lb = 1; next;)
  table=6 (lr_in_dnat         ), priority=120  , match=(ct.new && ip4 && reg0 == 172.30.46.142 && tcp && reg9[16..31] == 443), action=(flags.force_snat_for_lb = 1; ct_lb_mark(backends=10.153.5.214:8443,10.160.2.79:8443);)

These flows are almost generic: the only per-datapath component is the
"flags.force_snat_for_lb = 1" action.  In fact, these flows eventually
get "merged" for all gateway routers and applied to a single
datapath group:

$ ovn-sbctl list logical_flow 24b39d1b
_uuid               : 24b39d1b-b069-4412-9474-a7dd0e453f0b
actions             : "flags.force_snat_for_lb = 1; next;"
controller_meter    : []
external_ids        : {source="northd.c:10066", stage-hint="3de1afff", stage-name=lr_in_dnat}
logical_datapath    : []
logical_dp_group    : 285bc43c-85f0-4b5c-803e-172e31508efd
match               : "ct.est && ip4 && reg0 == 172.30.46.142 && tcp && reg9[16..31] == 443 && ct_mark.natted == 1"
pipeline            : ingress
priority            : 120
table_id            : 6
tags                : {}
hash                : 0

$ ovn-sbctl list logical_dp_group 285bc43c-85f0-4b5c-803e-172e31508efd
_uuid               : 285bc43c-85f0-4b5c-803e-172e31508efd
datapaths           : [00723364-3e2e-46c1-9a40-1f47518807dd, 0111c64e-1d61-4d1d-a06b-56a1e8b4a75d, 016e88f0-fbf7-4bbf-95f7-6aaf17f57db6, 039ee72e-81d3-455f-bacb-abfba4d05b52, 0484aa49-b094-45d0-aa3f-892ed43bae15, 06e5650e-a0f1-48ff-a322-149061889f6a, 091e838d-2b17-4494-af51-e2f8fbb64ace, 09a781b9-3bd5-43f2-809a-39a23449a17b, 0cfe261e-4daa-46c9-bcb8-7ba16d0f94ae, 0eb81599-ce87-4e94-80de-e3ed4126b9e5, 0fa05a39-2128-46ce-ae32-439caea2bd83, 1080922b-4bd5-4880-8916-0dbacc970d6f, 154501bc-c45c-4641-9aa7-87565b8d6f13, 17602971-6f43-4746-879f-e4358e9809aa, 177a3e92-43a7-4ae1-ade3-ac254d20d590, 19a40adf-df52-4935-b260-aade13e92279, 1b018cc6-9f52-46b1-a725-fb55657d6b11, 1b8f62b6-b992-434f-8567-d64317c0a5e3, 1bd4835e-15c2-4b40-a071-8aae676c2c08, 1c95248b-2743-402a-8ab1-cf45264e7f26, 1cc46e8f-1b8b-45f6-8a50-e9b4cb46b5b2, 1df171b8-9bfc-4f38-a29a-f826c4c3c73e, 24224513-dc87-42f0-a217-d6989fd56cef, 28c135bc-5060-4690-8283-958b3ea57177, 2aaabbf4-407f-41c7-adfb-3dde6fa03e53, 2baa6583-9ab1-4895-a826-1065e5db4771, 2fadb9ed-d176-4c13-a9b6-c254ff21e251, 306c9fcd-8eb6-49de-9c11-07ed54e3756b, 32047b58-ec28-441d-b4c2-d9dde4090637, 3394d089-4040-47d0-8266-5153be572342, 33c12226-7eb9-44be-9dcc-1f6c76c7e21d, 3698e96d-8dc3-4546-b419-eb74ff971b10, 3e5aef45-b22a-49e5-af78-6191216f7c5b, 3fc799ed-0c0e-47b8-81ed-aea8cb12112f, 4031722a-ae25-4403-94bd-d7f814568280, 406d2464-c346-4ff9-96ce-b816a033fdf3, 408769b0-aa7f-4510-a93c-e9a9919763ac, 41995cc9-009d-46dc-97cc-0c74edaaa0f4, 4523648a-c188-4a90-a7aa-79ae3ad6dba7, 45a41e03-e8de-460c-9dde-1c830347b8cd, 466a0d6e-efbe-41b2-82d2-858b4ea9901a, 47f54b54-6586-4e8b-8c96-d3c51d14bb6a, 4944a146-657e-4bda-8579-63ed5b7a9ac2, 4b76831b-bd35-4c61-9fed-053859bd98f5, 529893a6-f3db-4638-bcec-21efd84502db, 52a2201b-f50d-4ac6-a772-704d9b3cefaf, 5b6b79cf-660e-4790-ad25-24da82547c67, 5c6fa68e-4fd3-4fc9-829d-b0590f5038d2, 5cafa9fb-ef88-4f12-a262-65c3e8aefb5e, 5cc09a20-b115-47fe-973c-94c3557607fc, 5fa47239-a223-415b-b1a4-6c82dfe8e77d, 5fe138a8-9260-4ae3-80fa-0b4cc84e3eac, 
6062fec6-2fea-4ba1-8b02-4257f57fbe13, 61efa618-e581-407e-8de0-4dc938906d2e, 63385570-0db4-4e83-947a-5f082821595f, 65c35109-79e6-4de9-9503-dd67e01e742e, 6760e784-41ee-4196-a02a-6514c50d4a69, 69347667-53a5-49ea-987f-c07cd1cb4f72, 6adfc7d6-badc-4201-9d37-a91ee321bcd1, 6bddaacf-6415-4112-b728-5dba8b8df8f2, 6c64bad5-d7f9-48f8-bff8-fb1cab6e3e2a, 6ffe6d4c-b6b2-4abd-90ea-65b5b6e59d19, 709ea29d-3205-4c96-bc9f-c60c7f97a061, 72437a2a-2c31-4777-a48b-e0c68cd8b149, 7314e15b-4fc0-4df3-8e23-45a104a5c50b, 73299c27-cf69-4323-8bd4-4abf78d77ad4, 7648f21d-f516-4ecf-9f83-032e2370ac38, 7832c190-7170-406f-bbe6-136d6922ec2e, 794cbe74-3d49-4465-9f06-194db10bdb1b, 7c0133eb-49b2-4b0e-8d3a-cb3ba1b6742d, 7d73aa68-f42d-4d08-967d-97f74473aae5, 7e32e54b-bb90-4014-bd03-c190ed3bdf42, 7e92d834-9136-4bd5-9ab3-67e2b9b6141f, 7f0b1aa9-f1de-4888-a184-85c3cff1cacf, 7f2469f1-618f-4b20-ae36-8cbf30b02cd9, 80445208-7d01-4e5e-9acb-a11b44bc3ffc, 805e0655-807b-4421-8a75-b762d453c600, 84d5f2f9-702d-477f-912a-38ad3d988efb, 85a588b4-ffff-4fdf-8854-b35d7fd20e52, 8715984e-a603-459f-a390-2d5480a19d88, 87fc6c6a-a95c-4bd0-b5fd-633458640a08, 889ac01b-6ae7-4663-84a2-673eaff5edaf, 88f45b38-433b-44b9-bbfe-694251f39bfa, 89150e36-08b5-4870-80a9-93a9737980f5, 904e98bf-870c-4266-91c7-8acb3a37940b, 90bf59f1-307f-43ab-b876-c44e6e7aeff9, 91204e01-9915-4966-8a2c-343888220209, 9581678f-6a44-498c-bd8c-054f2978d176, 988ee049-cc00-48b7-9f20-65c8ba466457, 9d8dc2f3-ef50-4dab-b313-11745ac02f99, 9dcb933a-3abe-4071-828c-c2394bc3d0dd, 9e2b88af-95fd-42ab-97e5-2dbcf5f2f6e7, a3289f9b-c21d-4bcd-8064-efa772328c43, a528d4d1-5534-4c8b-b7fd-ccf60dfbbcf6, a860e919-fadd-440b-a433-0974a706878e, a8f959de-ef15-44a7-9eb6-1fbb194689ac, adf9a254-1073-4665-b942-418afca79c33, b59b8867-b856-43bc-b042-085506e5f79b, b885c73e-e678-419d-99a8-3ab11a8f7ad4, bb065da6-44ab-4d52-baba-2104c17ceabd, bb47566a-670d-460d-b095-a8932a865461, c2aa77c8-6f9f-4575-a4dc-c4860fbe48a6, c382b3c4-793d-42a7-92a5-5f8842964069, c4f453a6-918d-4098-b05b-49faa5ac398a, 
c76b58ea-6d90-45d9-9e1a-7e248b235039, cc9d8595-faa2-4664-9ac0-0e45de62a89a, cd359af0-bce9-4e41-89c5-d527db64b412, d2a8c164-8e05-493d-acd6-04f78a0e4ebf, d3a72fbe-05ee-4ac5-ba9e-8d27f9d92fd9, d9196167-f375-4a98-9925-2337ce78f87f, db0fb7c1-455c-4444-885b-cd7211c101dc, db1bd0dd-4a6f-4180-b70b-f098f4dcdfb0, dc58f2a0-e7b2-4a72-a1d7-3a956cd488e5, df0473ef-7bb5-4c03-8b0b-d2d51f92641b, df1f19eb-fa07-4086-9cbc-cb136c0df953, e6b5aeef-6fb8-46c8-b5b0-bc54d07a76db, eb660488-4fa0-4a6f-a847-4d21f20b8b76, ef8c1276-96d5-4ad9-a0de-e65da539af42, f24d89bf-86b2-4f5d-847b-1df0619f351b, f2759e84-7c45-49e4-a8ea-17a7064e871d, f7c08540-9d72-4775-8c29-0a9d33baa58b, fa3c4a96-ef23-461a-84ea-4ca19c706edd, fa55205a-0549-4de6-829c-dad29dd5e4e6, fa6abd41-c99c-4f4f-849f-4eb36000535d, fb0aa3d1-8a45-4275-a879-d912fafa18d7, fb1755c6-1235-4595-98ce-be7bbe3c0ec4, fe8781af-fbe1-4c4c-aa50-1f58abdd0afe]

The problem is that the code in ovn-northd that generates these flows
does not take into account that multiple routers may share the same
load balancer and force_snat configuration, so the same work is
repeated for every router.

Measuring how long it takes to build these flows on a test machine, we
see that:

(1) prepending "flags.force_snat_for_lb = 1" to 'action' (which is
    precomputed for all routers) inside the per-logical-router loop
    adds up to ~300ms of the ~4000ms total needed to compute logical
    flows.
    https://github.com/ovn-org/ovn/blob/34f29acdfe216899bbdb51a74859af62b0c75d6c/northd/northd.c#L9983

(2) for routers that don't have "distributed gateway ports" (gateway
    routers), the 'new_match_p' and 'est_match_p' strings are actually
    the ones that were precomputed outside the loop.
    https://github.com/ovn-org/ovn/blob/34f29acdfe216899bbdb51a74859af62b0c75d6c/northd/northd.c#L9959

    In such cases, calling ovn_lflow_add_with_hint() and
    ovn_lflow_add_with_hint__() for every router on which the LB is
    applied is inefficient and adds up to ~1500ms out of the total of
    ~4000ms to compute logical flows.
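
The cost in (1) comes from rebuilding an identical action string once
per router.  A minimal sketch of hoisting that work out of the loop is
shown here.  All names (prefix_action, builds_per_router,
builds_hoisted, FORCE_SNAT_PREFIX) are hypothetical and are not the
identifiers used in northd.c; the functions only count string builds to
make the difference visible.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical constant for this sketch, not a northd.c identifier. */
#define FORCE_SNAT_PREFIX "flags.force_snat_for_lb = 1; "

/* Returns a newly allocated "<prefix><action>" string.
 * The caller owns the result and must free() it. */
char *
prefix_action(const char *prefix, const char *action)
{
    size_t len = strlen(prefix) + strlen(action) + 1;
    char *out = malloc(len);
    assert(out != NULL);
    snprintf(out, len, "%s%s", prefix, action);
    return out;
}

/* Per-router variant: one string build for every router that forces SNAT. */
size_t
builds_per_router(const bool *force_snat, size_t n_routers,
                  const char *action)
{
    size_t n_builds = 0;
    for (size_t i = 0; i < n_routers; i++) {
        if (force_snat[i]) {
            char *a = prefix_action(FORCE_SNAT_PREFIX, action);
            n_builds++;
            free(a);
        }
    }
    return n_builds;
}

/* Hoisted variant: the string is built at most once and then shared. */
size_t
builds_hoisted(const bool *force_snat, size_t n_routers, const char *action)
{
    char *shared = NULL;
    size_t n_builds = 0;
    for (size_t i = 0; i < n_routers; i++) {
        if (force_snat[i] && !shared) {
            shared = prefix_action(FORCE_SNAT_PREFIX, action);
            n_builds++;
        }
        /* A real implementation would use 'shared' for router i here. */
    }
    free(shared);
    return n_builds;
}
```

With 120 gateway routers sharing one load balancer, the per-router
variant builds the string 120 times per VIP, while the hoisted variant
builds it once.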

A potential solution, which might duplicate some code but should
perform better, is to first walk the list of routers to which a LB
is applied and partition it into:
a) gateway routers (od->n_l3dgw_ports == 0) with snat_type == SKIP_SNAT
b) gateway routers (od->n_l3dgw_ports == 0) with snat_type == FORCE_SNAT
c) gateway routers (od->n_l3dgw_ports == 0) with snat_type == NO_FORCE_SNAT
d) non-gateway routers.
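
The partitioning into cases a)-d) could be sketched as follows.  This
is only an illustration: 'struct router', 'enum lb_router_class',
classify_router() and partition_routers() are hypothetical simplified
stand-ins, and storing snat_type as a per-router field is an assumption
of the sketch rather than a description of the northd.c data structures.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the northd types (assumptions of this sketch). */
enum snat_type { NO_FORCE_SNAT, FORCE_SNAT, SKIP_SNAT };

struct router {
    size_t n_l3dgw_ports;       /* 0 for a gateway router. */
    enum snat_type snat_type;   /* SNAT behavior for the load balancer. */
};

/* The four sub-cases a)-d) from the list above. */
enum lb_router_class {
    GW_SKIP_SNAT,       /* a) */
    GW_FORCE_SNAT,      /* b) */
    GW_NO_FORCE_SNAT,   /* c) */
    NON_GW,             /* d) */
    N_CLASSES
};

/* Maps one router to its sub-case. */
enum lb_router_class
classify_router(const struct router *r)
{
    if (r->n_l3dgw_ports > 0) {
        return NON_GW;
    }
    switch (r->snat_type) {
    case SKIP_SNAT:
        return GW_SKIP_SNAT;
    case FORCE_SNAT:
        return GW_FORCE_SNAT;
    case NO_FORCE_SNAT:
    default:
        return GW_NO_FORCE_SNAT;
    }
}

/* Counts the routers in each sub-case in a single pass. */
void
partition_routers(const struct router *routers, size_t n,
                  size_t counts[N_CLASSES])
{
    assert(counts != NULL);
    for (size_t c = 0; c < N_CLASSES; c++) {
        counts[c] = 0;
    }
    for (size_t i = 0; i < n; i++) {
        counts[classify_router(&routers[i])]++;
    }
}
```

A real implementation would collect the routers of each class (for
example as a bitmap of datapaths) instead of only counting them.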

We can then write dedicated functions to generate the LB-VIP related
logical flows for each of the 4 sub-cases above.  For cases a)-c),
because the match and action can be precomputed for all applicable
routers, we could just use ovn_lflow_add_at_with_hash() combined with
ovn_dp_group_add_with_reference(), as we currently do for load
balancers in the logical switch pipeline.
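
The idea of adding one flow and attaching every applicable router to
its datapath group can be illustrated with the sketch below.  It does
not use the real signatures of ovn_lflow_add_at_with_hash() or
ovn_dp_group_add_with_reference(); 'struct shared_lflow', its helper
functions, and the MAX_DATAPATHS limit are hypothetical and exist only
to show the shape of the approach.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_DATAPATHS 256   /* Arbitrary limit, an assumption of the sketch. */

/* One match/action pair shared by a set of datapaths, tracked as a bitmap. */
struct shared_lflow {
    const char *match;
    const char *actions;
    unsigned char dp_bitmap[MAX_DATAPATHS / 8];
    size_t n_datapaths;
};

/* Creates the flow once, with an empty datapath group. */
void
shared_lflow_init(struct shared_lflow *f, const char *match,
                  const char *actions)
{
    f->match = match;
    f->actions = actions;
    memset(f->dp_bitmap, 0, sizeof f->dp_bitmap);
    f->n_datapaths = 0;
}

/* Adds a datapath index to the flow's group; duplicates are ignored. */
void
shared_lflow_add_dp(struct shared_lflow *f, size_t dp_index)
{
    assert(dp_index < MAX_DATAPATHS);
    unsigned char mask = (unsigned char) (1u << (dp_index % 8));
    if (!(f->dp_bitmap[dp_index / 8] & mask)) {
        f->dp_bitmap[dp_index / 8] |= mask;
        f->n_datapaths++;
    }
}

/* Returns true if the datapath index is part of the flow's group. */
bool
shared_lflow_has_dp(const struct shared_lflow *f, size_t dp_index)
{
    return dp_index < MAX_DATAPATHS
           && ((f->dp_bitmap[dp_index / 8] >> (dp_index % 8)) & 1);
}
```

In this shape, the per-router work for cases a)-c) is reduced to
setting one bit, while the match and action strings are built and
hashed once per VIP.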

Comment 1 lorenzo bianconi 2022-05-03 21:38:45 UTC
upstream series: https://patchwork.ozlabs.org/project/ovn/list/?series=298225

