Bug 1944121

Summary: OVN-kubernetes references AddressSets after deleting them, causing ovn-controller errors
Product: OpenShift Container Platform Reporter: Andy Bartlett <andbartl>
Component: Networking Assignee: Casey Callendrello <cdc>
Networking sub component: ovn-kubernetes QA Contact: Anurag saxena <anusaxen>
Status: CLOSED ERRATA Docs Contact:
Severity: high    
Priority: high CC: anbhat, anusaxen, astoycos, cdc, cpassare, fiezzi, joboyer, openshift-bugs-escalate, zzhao
Version: 4.6   
Target Milestone: ---   
Target Release: 4.8.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 2037221 (view as bug list) Environment:
Last Closed: 2021-07-27 22:56:24 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1951552, 2037221    

Description Andy Bartlett 2021-03-29 11:03:47 UTC
Description of problem:

Hi,
 My customer is seeing the following errors in the OVN logs:

oc logs ovnkube-node-dl9rf -n openshift-ovn-kubernetes -c ovn-controller
~~~
2021-03-29T09:03:11Z|00023|lflow|WARN|Dropped 125 log messages in last 3608 seconds (most recently, 3604 seconds ago) due to excessive rate
2021-03-29T09:03:11Z|00024|lflow|WARN|error parsing match "reg0[8] == 1 && (ip4.dst == {$a10309787208889436959, $a10365219111236117829, $a10819938825290322714, $a10951847637192197900, $a11406730146667044950, $a11488600306090122364, $a11639418112040726272, $a11679209097797522690, $a11792605607458231430, $a11840947999323393980, $a12008966417577843415, $a12118860184270314321, $a12154940767642941581, $a12439030798087402947, $a13007429802480051810, $a13107763654225623860, $a13190512685297411187, $a13201061909865993731, $a13372283725684734569, $a13488436752166948783, $a1351790758523410338, $a13729220396487021194, $a13781166516790073860, $a13945312121057736541, $a14107835402720776790, $a14251709134719960830, $a14521909410631803261, $a14529579844747687031, $a14561768756155400207, $a14645179476519238814, $a14778536506385295319, $a14984764264663054966, $a15035842609893410826, $a15091597671988575549, $a1526648491631204789, $a15296080318842514484, $a15470091589984456145, $a15717357575191736570, $a1580195430150997937, $a15980954813698967384, $a16196029566881918112, $a16235039932615691331, $a16696894757989404559, $a16827882760058655782, $a17106607209580660163, $a17392117145536614473, $a17512017134359671539, $a18084239623829554092, $a18100547566674084158, $a18181855203662190716, $a18363165982804349389, $a18367586167066130605, $a202214916133963101, $a2269173857227433350, $a2480367723304591635, $a2505134621825326058, $a2532452745942987758, $a2596498882697482933, $a2608411444094720729, $a2626637162768014744, $a3250614850508762292, $a329654696166302970, $a3778783347055226282, $a3811068855386776286, $a3826097561732631257, $a3913954643872337278, $a4024032052162464345, $a4159773708941219606, $a4507119548603116395, $a4862355331019525072, $a4910645986623324574, $a4924844817631943359, $a5154718082306775057, $a5461954474119153551, $a5717617069769647295, $a5725778633432857673, $a6480381032413798865, $a676301189363102062, $a6937002112706621489, $a6966165646670463182, $a7228108612096671536, 
$a7360869138558469588, $a7709789839164300938, $a8335123182382710849, $a8449276449561422499, $a859969115225903002, $a868704743532850555, $a8796347983972862164, $a8865309346839741844, $a9031429284635959044, $a9059336463628850699, $a914642378985782992, $a9249499220598688301, $a9361912555461938690, $a9614653021011329238, $a9769903554508400075} && tcp && tcp.dst==5353 && inport == @a3121302962899173096)": Syntax error at `$a16696894757989404559' expecting address set name.
2021-03-29T09:35:22Z|00025|lflow|WARN|Dropped 319 log messages in last 1931 seconds (most recently, 1917 seconds ago) due to excessive rate
2021-03-29T09:35:22Z|00026|lflow|WARN|error parsing match "reg0[7] == 1 && (ip4.dst == {$a10309787208889436959, $a10365219111236117829, $a10819938825290322714, $a10951847637192197900, $a11406730146667044950, $a11488600306090122364, $a11639418112040726272, $a11679209097797522690, $a11792605607458231430, $a11840947999323393980, $a12008966417577843415, $a12118860184270314321, $a12154940767642941581, $a12439030798087402947, $a13007429802480051810, $a13107763654225623860, $a13190512685297411187, $a13201061909865993731, $a13372283725684734569, $a13488436752166948783, $a1351790758523410338, $a13729220396487021194, $a13781166516790073860, $a13945312121057736541, $a14107835402720776790, $a14251709134719960830, $a14521909410631803261, $a14529579844747687031, $a14561768756155400207, $a14645179476519238814, $a14778536506385295319, $a14984764264663054966, $a15035842609893410826, $a15091597671988575549, $a1526648491631204789, $a15296080318842514484, $a15470091589984456145, $a15717357575191736570, $a1580195430150997937, $a15980954813698967384, $a16196029566881918112, $a16235039932615691331, $a16827882760058655782, $a17392117145536614473, $a17512017134359671539, $a18084239623829554092, $a18100547566674084158, $a18181855203662190716, $a18363165982804349389, $a18367586167066130605, $a202214916133963101, $a2269173857227433350, $a2480367723304591635, $a2505134621825326058, $a2532452745942987758, $a2596498882697482933, $a2608411444094720729, $a2626637162768014744, $a3250614850508762292, $a329654696166302970, $a3778783347055226282, $a3811068855386776286, $a3826097561732631257, $a3913954643872337278, $a4159773708941219606, $a4507119548603116395, $a4862355331019525072, $a4910645986623324574, $a4924844817631943359, $a5154718082306775057, $a5461954474119153551, $a5585560803499416081, $a5717617069769647295, $a5725778633432857673, $a6480381032413798865, $a676301189363102062, $a6937002112706621489, $a6966165646670463182, $a7228108612096671536, $a7360869138558469588, $a7709789839164300938, 
$a8335123182382710849, $a8449276449561422499, $a859969115225903002, $a868704743532850555, $a8796347983972862164, $a8865309346839741844, $a9031429284635959044, $a9059336463628850699, $a914642378985782992, $a9249499220598688301, $a9361912555461938690, $a9369327724028753197, $a9614653021011329238, $a9769903554508400075} && tcp && tcp.dst==53 && inport == @a7955728082500497664)": Syntax error at `$a5585560803499416081' expecting address set name.
~~~


They are experiencing unexplained DNS issues in their cluster and suspect that these error messages are related.
They have a DNS NetworkPolicy, which we think may be involved:
~~~
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"networking.k8s.io/v1","kind":"NetworkPolicy","metadata":{"annotations":{},"labels":{"app.kubernetes.io/instance":"networkpolicies","app.kubernetes.io/managed-by":"Helm","app.kubernetes.io/name":"networkpolicies-config","app.kubernetes.io/version":"1.16.0","helm.sh/chart":"networkpolicies-config-0.1.0"},"name":"stex-rd-networkpolicies-config-allow-dns","namespace":"stex-rd"},"spec":{"egress":[{"ports":[{"port":53,"protocol":"TCP"},{"port":53,"protocol":"UDP"},{"port":5353,"protocol":"TCP"},{"port":5353,"protocol":"UDP"}],"to":[{"namespaceSelector":{}}]}],"podSelector":{},"policyTypes":["Egress"]}}
  creationTimestamp: null
  generation: 1
  labels:
    app.kubernetes.io/instance: networkpolicies
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: networkpolicies-config
    app.kubernetes.io/version: 1.16.0
    helm.sh/chart: networkpolicies-config-0.1.0
  managedFields:
  - apiVersion: networking.k8s.io/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .: {}
          f:kubectl.kubernetes.io/last-applied-configuration: {}
        f:labels:
          .: {}
          f:app.kubernetes.io/instance: {}
          f:app.kubernetes.io/managed-by: {}
          f:app.kubernetes.io/name: {}
          f:app.kubernetes.io/version: {}
          f:helm.sh/chart: {}
      f:spec:
        f:egress: {}
        f:policyTypes: {}
    manager: argocd-application-controller
    operation: Update
    time: "2021-03-29T09:08:48Z"
  name: stex-rd-networkpolicies-config-allow-dns
  selfLink: /apis/networking.k8s.io/v1/namespaces/stex-rd/networkpolicies/stex-rd-networkpolicies-config-allow-dns
spec:
  egress:
  - ports:
    - port: 53
      protocol: TCP
    - port: 53
      protocol: UDP
    - port: 5353
      protocol: TCP
    - port: 5353
      protocol: UDP
    to:
    - namespaceSelector: {}
  podSelector: {}
  policyTypes:
  - Egress
~~~

Is this normal?

Version-Release number of selected component (if applicable):

OpenShift 4.6.22 on OpenStack Train

How reproducible:

100%



Comment 4 Casey Callendrello 2021-03-31 12:30:42 UTC
On second thought, I take my statement back; 10 seconds is not an acceptable amount of time for traffic to drop.

It looks like we need to be more careful around deleting the address set. Let me take a look.
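The ordering problem can be illustrated with a toy model. This is only a sketch of the idea, not the ovn-kubernetes code and not the change made in the upstream fix; the class `ToyNB`, its methods, and the short names `a1`/`a2` are hypothetical stand-ins for address sets and the ACLs whose match expressions reference them.

```python
class ToyNB:
    """Toy stand-in for a northbound DB: address sets plus ACL references."""

    def __init__(self):
        self.address_sets = set()
        self.acl_refs = {}  # ACL name -> set of address set names it references

    def dangling(self):
        """Address set names still referenced by an ACL but no longer defined."""
        used = set().union(*self.acl_refs.values())
        return used - self.address_sets

    def delete_unsafe(self, name):
        # Deletes the set while ACLs may still reference it.
        self.address_sets.discard(name)

    def delete_safe(self, name):
        # Drop every reference first, then delete the set itself.
        for refs in self.acl_refs.values():
            refs.discard(name)
        self.address_sets.discard(name)


def make_db():
    nb = ToyNB()
    nb.address_sets.update({"a1", "a2"})
    nb.acl_refs["allow-dns"] = {"a1", "a2"}
    return nb


unsafe = make_db()
unsafe.delete_unsafe("a2")
unsafe_dangling = unsafe.dangling()  # {'a2'}: the ACL names a set that is gone

safe = make_db()
safe.delete_safe("a2")
safe_dangling = safe.dangling()  # set(): no stale reference remains
```

In the unsafe case the leftover reference corresponds to the situation in which ovn-controller reports "expecting address set name" for the whole match.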

As a workaround, they can allow DNS traffic to a static set of namespaces, rather than all of them, and this should be more effective.
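A sketch of what such a narrower policy could look like, generated as JSON (which `oc apply -f` accepts). The policy name `allow-dns-static` and the namespace label `dns-target: "true"` are hypothetical; the namespaces that should receive DNS traffic would have to carry whatever label is chosen. The ports mirror the customer's original policy.

```python
import json


def dns_egress_policy(name, namespace_labels):
    """Build a NetworkPolicy allowing DNS egress only to labeled namespaces."""
    ports = [
        {"protocol": proto, "port": port}
        for port in (53, 5353)
        for proto in ("TCP", "UDP")
    ]
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": name},
        "spec": {
            "podSelector": {},
            "policyTypes": ["Egress"],
            "egress": [
                {
                    # A label match instead of the empty selector {} that
                    # selects every namespace in the cluster.
                    "to": [{"namespaceSelector": {"matchLabels": namespace_labels}}],
                    "ports": ports,
                }
            ],
        },
    }


# Hypothetical label; adjust to the labels actually present on the DNS namespaces.
policy = dns_egress_policy("allow-dns-static", {"dns-target": "true"})
manifest = json.dumps(policy, indent=2)
print(manifest)
```

Because the selector no longer matches every namespace, the resulting match expression should reference far fewer address sets, which is the point of the workaround.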

Comment 8 Casey Callendrello 2021-04-12 15:30:00 UTC
Upstream PR https://github.com/ovn-org/ovn-kubernetes/pull/2168

Comment 12 zhaozhanqi 2021-04-20 04:28:31 UTC

This issue can be reproduced on 4.8.0-0.nightly-2021-04-19-175100 and earlier builds with the following steps:

oc new-project z1
oc create -f list.json -n z1

oc create -f policy.yaml -n z1

oc delete namespace z1

$ oc -n openshift-ovn-kubernetes -c ovn-controller logs ovnkube-node-tbjtt | grep error
2021-04-20T03:31:15Z|00012|lflow|WARN|error parsing match "reg0[7] == 1 && (ip4.dst == {$a10309787208889436959, $a10365219111236117829, $a10819938825290322714, $a10951847637192197900, $a10989964789905973848, $a11679209097797522690, $a11840947999323393980, $a12008966417577843415, $a12154940767642941581, $a12439030798087402947, $a12442592456685404899, $a12676868499161577573, $a13190512685297411187, $a13201061909865993731, $a13240627709167346629, $a1348114338668042846, $a13488436752166948783, $a13781166516790073860, $a13945312121057736541, $a14107835402720776790, $a14251709134719960830, $a1440651458415708593, $a14685874349853149463, $a14778536506385295319, $a14853475258048435235, $a14984764264663054966, $a15035842609893410826, $a15091597671988575549, $a1524509087118018451, $a15498572541984179350, $a15717357575191736570, $a1580195430150997937, $a16196029566881918112, $a16235039932615691331, $a16535584809086930420, $a16827882760058655782, $a16947928209580517504, $a17945288690632224513, $a18084239623829554092, $a18100547566674084158, $a18148441154061044714, $a18181855203662190716, $a18363165982804349389, $a18367586167066130605, $a202214916133963101, $a2269173857227433350, $a2464725673981131758, $a2480367723304591635, $a2532452745942987758, $a2548583302683441616, $a2596498882697482933, $a2608411444094720729, $a2626637162768014744, $a2945744646617718812, $a3028971819481556012, $a3443446865985225755, $a3568460697379707690, $a3826097561732631257, $a3913954643872337278, $a411509175204078706, $a4287555272933708487, $a4507119548603116395, $a4910645986623324574, $a4924844817631943359, $a5154718082306775057, $a5725778633432857673, $a5915752004141595053, $a6290201146921788953, $a6379087888064551080, $a6480381032413798865, $a6937002112706621489, $a7228108612096671536, $a7360869138558469588, $a7709789839164300938, $a7750122077141603738, $a7970571692240646921, $a8335123182382710849, $a8449276449561422499, $a859969115225903002, $a868704743532850555, $a8796347983972862164, 
$a8865309346839741844, $a9031429284635959044, $a9055623764263513107, $a914642378985782992, $a9550124085778963684, $a9638343899169163746, $a9737436360988937620, $a9769903554508400075} && tcp && tcp.dst==53 && inport == @a14458084718464326979)": Syntax error at `$a12442592456685404899' expecting address set name.


Verified this bug as fixed on 4.8.0-0.nightly-2021-04-19-225513.
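When checking node logs for this warning, it can help to pull out just the address set that ovn-controller complains about instead of reading the whole match expression. The helper below is a hypothetical sketch, not part of OVN or ovn-kubernetes; `parse_lflow_warning` and the shortened sample line are assumptions, and the regular expressions are based only on the message format shown in this bug.

```python
import re

# Tail of the warning: Syntax error at `$aNNN' expecting address set name.
_MISSING_RE = re.compile(r"Syntax error at `\$(a\d+)' expecting address set name")
# Any address set reference ($aNNN) inside the match expression.
_REF_RE = re.compile(r"\$(a\d+)")


def parse_lflow_warning(line):
    """Return (missing_set, referenced_sets) for one lflow warning, else None."""
    m = _MISSING_RE.search(line)
    if m is None:
        return None
    # Only scan the match expression, i.e. the text before the error tail.
    referenced = _REF_RE.findall(line[: m.start()])
    return m.group(1), referenced


# Shortened version of the warning from this comment (most sets omitted).
sample = (
    "2021-04-20T03:31:15Z|00012|lflow|WARN|error parsing match "
    '"reg0[7] == 1 && (ip4.dst == {$a10309787208889436959, '
    "$a12442592456685404899} && tcp && tcp.dst==53 && "
    'inport == @a14458084718464326979)": Syntax error at '
    "`$a12442592456685404899' expecting address set name."
)

missing, referenced = parse_lflow_warning(sample)
print(missing)     # a12442592456685404899
print(referenced)  # ['a10309787208889436959', 'a12442592456685404899']
```

Port group references such as `@a14458084718464326979` are deliberately not matched, since the warning concerns address sets only.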

Comment 13 zhaozhanqi 2021-04-20 04:47:45 UTC
Attaching the policy.yaml used in the steps above:

$ cat policy.yaml 
---
# Source: networkpolicies-config-values/charts/networkpolicies-config/templates/networkpolicies.yaml
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: stex-rd-networkpolicies-config-default-deny
spec:
  podSelector: {}
  policyTypes:
    - Ingress
---
# Source: networkpolicies-config-values/charts/networkpolicies-config/templates/networkpolicies.yaml
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: stex-rd-networkpolicies-config-allow-dns
spec:
  podSelector: {}
  egress:
  - to:
    - namespaceSelector: {}
    ports:
    - protocol: TCP
      port: 53
    - protocol: UDP
      port: 53
    - protocol: TCP
      port: 5353
    - protocol: UDP
      port: 5353
  policyTypes:
    - Egress

Comment 14 Casey Callendrello 2021-04-20 12:27:31 UTC
@Andy I don't think the large number of flows causes this per se. The bug is simple enough to be triggered by a small number of flows; it's the flows themselves that are wrong!

If you're still seeing something like this in the future after the fixes are rolled out, we can investigate.

Comment 18 errata-xmlrpc 2021-07-27 22:56:24 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438