Bug 2108026
| Summary: | [OVNK][LIBOVSDB] EnsureAddressSet doesn't seem to be able to find things in the libovsdb cache | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Surya Seetharaman <surya> |
| Component: | Networking | Assignee: | ffernand <ffernand> |
| Networking sub component: | ovn-kubernetes | QA Contact: | Anurag saxena <anusaxen> |
| Status: | CLOSED NOTABUG | Docs Contact: | |
| Severity: | low | ||
| Priority: | medium | CC: | ffernand |
| Version: | 4.12 | Keywords: | Triaged |
| Target Milestone: | --- | ||
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2022-09-28 20:43:42 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Description of problem:

Observed this in KIND upstream latest:

I0716 19:28:28.304473 53 model_client.go:348] Create operations generated as: [{Op:insert Table:Address_Set Row:map[external_ids:{GoMap:map[name:egressip-served-pods_v4]} name:a12749576804119081385] Rows:[] Columns:[] Mutations:[] Timeout:<nil> Where:[] Until: Durable:<nil> Comment:<nil> Lock:<nil> UUIDName:u2596996201}]
I0716 19:28:28.304512 53 transact.go:41] Configuring OVN: [{Op:insert Table:Address_Set Row:map[external_ids:{GoMap:map[name:egressip-served-pods_v4]} name:a12749576804119081385] Rows:[] Columns:[] Mutations:[] Timeout:<nil> Where:[] Until: Durable:<nil> Comment:<nil> Lock:<nil> UUIDName:u2596996201}]
I0716 19:28:28.304580 53 client.go:781] "msg"="transacting operations" "database"="OVN_Northbound" "operations"="[{Op:insert Table:Address_Set Row:map[external_ids:{GoMap:map[name:egressip-served-pods_v4]} name:a12749576804119081385] Rows:[] Columns:[] Mutations:[] Timeout:<nil> Where:[] Until: Durable:<nil> Comment:<nil> Lock:<nil> UUIDName:u2596996201}]"
I0716 19:28:28.304724 53 model_client.go:222] SURYA &{ [] map[name:egressip-served-pods_v4] a12749576804119081385}
I0716 19:28:28.304768 53 model_client.go:348] Create operations generated as: [{Op:insert Table:Address_Set Row:map[external_ids:{GoMap:map[name:egressip-served-pods_v4]} name:a12749576804119081385] Rows:[] Columns:[] Mutations:[] Timeout:<nil> Where:[] Until: Durable:<nil> Comment:<nil> Lock:<nil> UUIDName:u2596996202}]
I0716 19:28:28.304984 53 transact.go:41] Configuring OVN: [{Op:insert Table:Address_Set Row:map[external_ids:{GoMap:map[name:egressip-served-pods_v4]} name:a12749576804119081385] Rows:[] Columns:[] Mutations:[] Timeout:<nil> Where:[] Until: Durable:<nil> Comment:<nil> Lock:<nil> UUIDName:u2596996202}]
I0716 19:28:28.305085 53 client.go:781] "msg"="transacting operations" "database"="OVN_Northbound" "operations"="[{Op:insert Table:Address_Set Row:map[external_ids:{GoMap:map[name:egressip-served-pods_v4]} name:a12749576804119081385] Rows:[] Columns:[] Mutations:[] Timeout:<nil> Where:[] Until: Durable:<nil> Comment:<nil> Lock:<nil> UUIDName:u2596996202}]"
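For orientation, the "Create operations generated as" / "transacting operations" log lines above correspond roughly to the following libovsdb client flow. This is only a sketch, assuming the ovn-org/libovsdb client API and the generated nbdb.AddressSet model used by ovn-kubernetes; createAddressSet and nbClient are illustrative names, not the actual EnsureAddressSet code path:

```go
package sketch

import (
	"context"

	libovsdbclient "github.com/ovn-org/libovsdb/client"
	"github.com/ovn-org/ovn-kubernetes/go-controller/pkg/nbdb"
)

// createAddressSet mirrors the logged flow: build the model, generate the
// insert operation (with a temporary UUIDName) and transact it against
// OVN_Northbound. The new row only shows up in the client's local cache once
// the server's table update is received and processed by the monitor.
func createAddressSet(ctx context.Context, nbClient libovsdbclient.Client) error {
	as := &nbdb.AddressSet{
		Name:        "a12749576804119081385",
		ExternalIDs: map[string]string{"name": "egressip-served-pods_v4"},
	}
	ops, err := nbClient.Create(as) // "Create operations generated as: ..."
	if err != nil {
		return err
	}
	_, err = nbClient.Transact(ctx, ops...) // "transacting operations ..."
	return err
}
```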
The log below shows the cache update and insert:

I0716 19:28:28.305088 53 cache.go:982] cache "msg"="processing update" "database"="OVN_Northbound" "table"="Address_Set" "uuid"="78a24956-65ea-4c98-a3ee-6a665eb211f7"
I0716 19:28:28.305204 53 cache.go:1002] cache "msg"="inserting row" "database"="OVN_Northbound" "table"="Address_Set" "uuid"="78a24956-65ea-4c98-a3ee-6a665eb211f7" "model"="&{UUID:78a24956-65ea-4c98-a3ee-6a665eb211f7 Addresses:[] ExternalIDs:map[name:egressip-served-pods_v4] Name:a12749576804119081385}"
I0716 19:28:28.305304 53 model_client.go:207] SURYA &{78a24956-65ea-4c98-a3ee-6a665eb211f7 [] map[name:egressip-served-pods_v4] a12749576804119081385}
I0716 19:28:28.305332 53 model_client.go:298] SURYA &{78a24956-65ea-4c98-a3ee-6a665eb211f7 [] map[name:egressip-served-pods_v4] a12749576804119081385}
I0716 19:28:28.305476 53 model_client.go:222] SURYA &{ reroute map[] inport == "rtos-ovn-worker2" && ip4.src == $a12749576804119081385 && ip4.dst == $a11079093880111560446 <nil> [10.244.1.2] map[] 101}
I0716 19:28:28.305493 53 model_client.go:222] SURYA &{ reroute map[] inport == "rtos-ovn-control-plane" && ip4.src == $a12749576804119081385 && ip4.dst == $a11079093880111560446 <nil> [10.244.0.2] map[] 101}
I0716 19:28:28.305526 53 model_client.go:348] Create operations generated as: [{Op:insert Table:Logical_Router_Policy Row:map[action:reroute match:inport == "rtos-ovn-worker2" && ip4.src == $a12749576804119081385 && ip4.dst == $a11079093880111560446 nexthops:{GoSet:[10.244.1.2]} priority:101] Rows:[] Columns:[] Mutations:[] Timeout:<nil> Where:[] Until: Durable:<nil> Comment:<nil> Lock:<nil> UUIDName:u2596996203}]

The first EnsureAddressSet call, for node ovn-worker2, goes fine (logs above): it is able to create the address set, find it in the cache, and create the policy. The second EnsureAddressSet call, for node ovn-worker, does not go well: it fails to fetch the entry from the cache even though its timestamp is after the cache update:

E0716 19:28:28.305547 53 obj_retry.go:1496] Failed to create *factory.egressNode ovn-worker, error: unable to construct match for node ovn-worker's policy: cannot ensure that addressSet for cluster egressip-served-pods exists failed to create address set {UUID:u2596996202 Addresses:[] ExternalIDs:map[name:egressip-served-pods_v4] Name:a12749576804119081385}: error in transact with ops [{Op:insert Table:Address_Set Row:map[external_ids:{GoMap:map[name:egressip-served-pods_v4]} name:a12749576804119081385] Rows:[] Columns:[] Mutations:[] Timeout:<nil> Where:[] Until: Durable:<nil> Comment:<nil> Lock:<nil> UUIDName:u2596996202}] results [{Count:0 Error: Details: UUID:{GoUUID:e3ceb1ea-ffda-47a2-9ac3-44fd0a3060b0} Rows:[]} {Count:0 Error:constraint violation Details:Transaction causes multiple rows in "Address_Set" table to have identical values (a12749576804119081385) for index on column "name". First row, with UUID e3ceb1ea-ffda-47a2-9ac3-44fd0a3060b0, was inserted by this transaction. Second row, with UUID 78a24956-65ea-4c98-a3ee-6a665eb211f7, existed in the database before this transaction and was not modified by the transaction. UUID:{GoUUID:} Rows:[]}] and errors []: constraint violation: Transaction causes multiple rows in "Address_Set" table to have identical values (a12749576804119081385) for index on column "name". First row, with UUID e3ceb1ea-ffda-47a2-9ac3-44fd0a3060b0, was inserted by this transaction. Second row, with UUID 78a24956-65ea-4c98-a3ee-6a665eb211f7, existed in the database before this transaction and was not modified by the transaction.
I0716 19:28:28.305605 53 model_client.go:373] Mutate operations generated as: [{Op:mutate Table:Logical_Router Row:map[] Rows:[] Columns:[] Mutations:[{Column:policies Mutator:insert Value:{GoSet:[{GoUUID:u2596996203}]}}] Timeout:<nil> Where:[where column _uuid == {8a7687a2-417f-4170-aa47-2bf0730ff394}] Until: Durable:<nil> Comment:<nil> Lock:<nil> UUIDName:}]
I0716 19:28:28.305633 53 model_client.go:298] SURYA &{8a7687a2-417f-4170-aa47-2bf0730ff394 <nil> <nil> map[] [] [] ovn_cluster_router [] map[] [u2596996203] [] []}

Not sure how that happens.
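My reading (an assumption, not something I have confirmed in the code) is that the failing call's cache lookup is the step that came back empty: an ensure-style helper presumably resolves the address set by its schema-indexed "name" column before deciding to insert. A minimal sketch of that lookup with the libovsdb client, using a hypothetical helper name:

```go
package sketch

import (
	"context"

	libovsdbclient "github.com/ovn-org/libovsdb/client"
	"github.com/ovn-org/ovn-kubernetes/go-controller/pkg/nbdb"
)

// getAddressSetFromCache resolves an Address_Set purely from the client's
// local cache; no transaction is issued.
func getAddressSetFromCache(ctx context.Context, nbClient libovsdbclient.Client, hashedName string) (*nbdb.AddressSet, error) {
	as := &nbdb.AddressSet{Name: hashedName}
	// Get matches the model against the cache via the schema index on "name".
	// A libovsdbclient.ErrNotFound here is what would push an ensure-style
	// helper down the insert path; if the row is in fact already committed in
	// the DB, that insert then trips the "name" index constraint shown in the
	// error above.
	if err := nbClient.Get(ctx, as); err != nil {
		return nil, err
	}
	return as, nil
}
```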
The third EnsureAddressSet call, for ovn-control-plane, goes fine: it is able to get the entry from the libovsdb cache and create the policy:

I0716 19:28:28.305526 53 model_client.go:348] Create operations generated as: [{Op:insert Table:Logical_Router_Policy Row:map[action:reroute match:inport == "rtos-ovn-control-plane" && ip4.src == $a12749576804119081385 && ip4.dst == $a11079093880111560446 nexthops:{GoSet:[10.244.0.2]} priority:101] Rows:[] Columns:[] Mutations:[] Timeout:<nil> Where:[] Until: Durable:<nil> Comment:<nil> Lock:<nil> UUIDName:u2596996204}]
I0716 19:28:28.305646 53 transact.go:41] Configuring OVN: [{Op:wait Table:Logical_Router_Policy Row:map[] Rows:[map[match:inport == "rtos-ovn-worker2" && ip4.src == $a12749576804119081385 && ip4.dst == $a11079093880111560446 priority:101]] Columns:[priority match] Mutations:[] Timeout:0xc0021d41c0 Where:[where column priority == 101 where column match == inport == "rtos-ovn-worker2" && ip4.src == $a12749576804119081385 && ip4.dst == $a11079093880111560446] Until:!= Durable:<nil> Comment:<nil> Lock:<nil> UUIDName:} {Op:insert Table:Logical_Router_Policy Row:map[action:reroute match:inport == "rtos-ovn-worker2" && ip4.src == $a12749576804119081385 && ip4.dst == $a11079093880111560446 nexthops:{GoSet:[10.244.1.2]} priority:101] Rows:[] Columns:[] Mutations:[] Timeout:<nil> Where:[] Until: Durable:<nil> Comment:<nil> Lock:<nil> UUIDName:u2596996203} {Op:mutate Table:Logical_Router Row:map[] Rows:[] Columns:[] Mutations:[{Column:policies Mutator:insert Value:{GoSet:[{GoUUID:u2596996203}]}}] Timeout:<nil> Where:[where column _uuid == {8a7687a2-417f-4170-aa47-2bf0730ff394}] Until: Durable:<nil> Comment:<nil> Lock:<nil> UUIDName:}]
I0716 19:28:28.306651 53 cache.go:1002] cache "msg"="inserting row" "database"="OVN_Northbound" "table"="Logical_Router_Policy" "uuid"="b8a05594-6f28-41e0-b953-87224f447219" "model"="&{UUID:b8a05594-6f28-41e0-b953-87224f447219 Action:reroute ExternalIDs:map[] Match:inport == \"rtos-ovn-worker2\" && ip4.src == $a12749576804119081385 && ip4.dst == $a11079093880111560446 Nexthop:<nil> Nexthops:[10.244.1.2] Options:map[] Priority:101}"
I0716 19:28:28.306878 53 cache.go:1002] cache "msg"="inserting row" "database"="OVN_Northbound" "table"="Logical_Router_Policy" "uuid"="51d51554-5f36-4d91-9deb-262b8e08ab8a" "model"="&{UUID:51d51554-5f36-4d91-9deb-262b8e08ab8a Action:reroute ExternalIDs:map[] Match:inport == \"rtos-ovn-control-plane\" && ip4.src == $a12749576804119081385 && ip4.dst == $a11079093880111560446 Nexthop:<nil> Nexthops:[10.244.0.2] Options:map[] Priority:101}"

I guess the question is why the update that happened at 19:28:28.305204 was not reflected by the time of the failure at 19:28:28.305547 (obj_retry.go:1496, "Failed to create"). Maybe this isn't a bug and is simply a race. However, the way I worked around it was to introduce a `GetAddressSet` utility: instead of calling EnsureAddressSet, which does a create-or-update, I simply try to fetch the entry from the cache, and so far that has worked out well.

So I wanted to open this bug to make sure we investigate whether this is expected behavior with libovsdb, or whether there is a way to avoid these in-flight transaction issues and maybe even not return a failure but have a way to move on. If an entry already exists in the DB, then use it: why return an error for a duplicate insert when we could just terminate that transaction and move on (a rough sketch of this idea is included at the end of this report)?

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
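Putting the `GetAddressSet` workaround and the "use the existing row instead of erroring" suggestion together, this is the rough direction I have in mind. It is only a sketch assuming the ovn-org/libovsdb client API and the nbdb.AddressSet model; ensureAddressSetIdempotent is a hypothetical helper, not existing ovn-kubernetes code, and the error handling is simplified:

```go
package sketch

import (
	"context"
	"errors"

	libovsdbclient "github.com/ovn-org/libovsdb/client"
	"github.com/ovn-org/libovsdb/ovsdb"
	"github.com/ovn-org/ovn-kubernetes/go-controller/pkg/nbdb"
)

// ensureAddressSetIdempotent (hypothetical) prefers the cache, and if an
// insert still races with a row that already exists in the DB, it re-reads
// that row instead of surfacing the constraint violation to the caller.
func ensureAddressSetIdempotent(ctx context.Context, nbClient libovsdbclient.Client, hashedName string, externalIDs map[string]string) (*nbdb.AddressSet, error) {
	as := &nbdb.AddressSet{Name: hashedName}
	if err := nbClient.Get(ctx, as); err == nil {
		return as, nil // already known locally, nothing to transact
	} else if !errors.Is(err, libovsdbclient.ErrNotFound) {
		return nil, err
	}

	as.ExternalIDs = externalIDs
	ops, err := nbClient.Create(as)
	if err != nil {
		return nil, err
	}
	results, err := nbClient.Transact(ctx, ops...)
	if err == nil {
		// Per-operation failures (like the "name" index constraint violation
		// in the description) are reported in the results, not as a
		// transport error, so check them explicitly.
		_, err = ovsdb.CheckOperationResults(results, ops)
	}
	if err != nil {
		// The insert lost the race; fall back to the row that is already
		// there rather than reporting a failure to the caller.
		existing := &nbdb.AddressSet{Name: hashedName}
		if getErr := nbClient.Get(ctx, existing); getErr == nil {
			return existing, nil
		}
		return nil, err
	}
	return as, nil
}
```

Whether this kind of duplicate-insert recovery belongs in the caller or in libovsdb itself is exactly what I would like this bug to clarify.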