Bug 1968151 - After scale-in and scale-out of application pods: unable to add external gwStr src-ip route to GR router, stderr:"ovn-nbctl: duplicate nexthop for the same ECMP route\n"
Keywords:
Status: CLOSED DUPLICATE of bug 1959909
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.7
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Assignee: Mohamed Mahmoud
QA Contact: Anurag saxena
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-06-06 08:56 UTC by Andreas Karis
Modified: 2021-06-09 15:39 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-06-09 15:39:45 UTC
Target Upstream Version:
Embargoed:



Description Andreas Karis 2021-06-06 08:56:42 UTC
Description of problem:

While troubleshooting a failing ICNI 2.0 lab setup, I scaled my application pods in and out.

After scale-in and scale-out of application pods I see: unable to add external gwStr src-ip route to GR router, stderr:"ovn-nbctl: duplicate nexthop for the same ECMP route\n"

I hit this in my lab while playing around with ICNI 2 and would like to report this for sanity checking:
~~~
[root@openshift-jumpserver-0 ~]# oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.7.12    True        False         10d     Cluster version is 4.7.12
~~~

Input and output in this public part are sanitized; replace name- with the actual name of the components:

~~~
[root@openshift-jumpserver-0 f5setup]# oc get pod -n name-ingress -o json name-8687f85d5b-fg4qv | jq '.metadata.annotations'
{
(...)
  "k8s.ovn.org/routing-namespaces": "spk-app",
  "k8s.ovn.org/routing-network": "spk-ingress/int-vlan-fg4qv",
(...)
~~~

~~~
[root@openshift-jumpserver-0 ~]# oc project
Using project "name-app" on server "https://api.ipi-cluster.example.com:6443".
~~~

~~~
oc scale deployment nginx-web-app --replicas=0
oc scale deployment nginx-web-app --replicas=1
~~~

Events:
~~~
[root@openshift-jumpserver-0 ~]# oc get events
LAST SEEN   TYPE      REASON                   OBJECT                               MESSAGE
7m2s        Normal    Killing                  pod/nginx-web-app-55498df695-446wn   Stopping container nginx
1s          Warning   ErrorAddingLogicalPort   pod/nginx-web-app-55498df695-hhsld   unable to add external gwStr src-ip route to GR router, stderr:"ovn-nbctl: duplicate nexthop for the same ECMP route\n", err:&{%!!(MISSING)g(string=OVN command '/usr/bin/ovn-nbctl --timeout=15 --may-exist --policy=src-ip --ecmp-symmetric-reply lr-route-add GR_openshift-worker-0 172.24.3.208/32 192.168.131.11' failed: exit status 1)}w
8m4s        Normal    Scheduled                pod/nginx-web-app-55498df695-hhsld   Successfully assigned name-app/nginx-web-app-55498df695-hhsld to openshift-worker-0
6m30s       Warning   FailedCreatePodSandBox   pod/nginx-web-app-55498df695-hhsld   Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create pod network sandbox k8s_nginx-web-app-55498df695-hhsld_name-app_c0774c27-a8c8-48c9-bc76-d0ec21764028_0(8f37a6712f142d8a0da3e4260b50af2275f10d617a1d5f96543bbe296e70386a): [name-app/nginx-web-app-55498df695-hhsld:ovn-kubernetes]: error adding container to network "ovn-kubernetes": CNI request failed with status 400: '[name-app/nginx-web-app-55498df695-hhsld 8f37a6712f142d8a0da3e4260b50af2275f10d617a1d5f96543bbe296e70386a] [name-app/nginx-web-app-55498df695-hhsld 8f37a6712f142d8a0da3e4260b50af2275f10d617a1d5f96543bbe296e70386a] failed to configure pod interface: error while waiting on flows for pod: timed out waiting for OVS flows
'
6m8s        Warning   FailedCreatePodSandBox   pod/nginx-web-app-55498df695-hhsld   Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create pod network sandbox k8s_nginx-web-app-55498df695-hhsld_name-app_c0774c27-a8c8-48c9-bc76-d0ec21764028_0(383676b1de599e83f27b7c3f0ae13a2c0f9f0e5b6f99524a55de44de5c1c9519): [name-app/nginx-web-app-55498df695-hhsld:ovn-kubernetes]: error adding container to network "ovn-kubernetes": CNI request failed with status 400: '[name-app/nginx-web-app-55498df695-hhsld 383676b1de599e83f27b7c3f0ae13a2c0f9f0e5b6f99524a55de44de5c1c9519] [name-app/nginx-web-app-55498df695-hhsld 383676b1de599e83f27b7c3f0ae13a2c0f9f0e5b6f99524a55de44de5c1c9519] failed to configure pod interface: error while waiting on flows for pod: timed out waiting for OVS flows
'
'
(...)
~~~
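For anyone checking whether a stale route is already present on the gateway router, here is a minimal sketch that pulls the router, prefix and next hop out of the failing command in the event above. `parse_route` is a hypothetical helper, not part of any tooling. The final line only prints a suggested `ovn-nbctl lr-route-list` invocation; where to run it (for example inside an OVN northbound DB container) and the exact output format depend on the environment and are assumptions.

```shell
#!/usr/bin/env bash
# Failing command as it appears in the event message above.
msg="OVN command '/usr/bin/ovn-nbctl --timeout=15 --may-exist --policy=src-ip --ecmp-symmetric-reply lr-route-add GR_openshift-worker-0 172.24.3.208/32 192.168.131.11' failed: exit status 1"

# parse_route (hypothetical helper): print "<router> <prefix> <nexthop>",
# i.e. the three arguments that follow lr-route-add.
parse_route() {
  printf '%s\n' "$1" |
    sed -n "s/.*lr-route-add \([^ ]*\) \([^ ]*\) \([^ ']*\).*/\1 \2 \3/p"
}

# Split the parsed fields into separate variables.
read -r router prefix nexthop <<EOF
$(parse_route "$msg")
EOF

echo "router=$router prefix=$prefix nexthop=$nexthop"

# Suggested check (printed only, not executed here): list the routes on the
# gateway router and look for the pod IP, with or without the /32 suffix.
echo "ovn-nbctl lr-route-list $router | grep -F ${prefix%/*}"
```

If the printed command shows an existing src-ip route for the same prefix and next hop, that would be consistent with the "duplicate nexthop" error reported by `lr-route-add`.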



Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 4 Mohamed Mahmoud 2021-06-09 15:39:45 UTC
unable to add external gwStr src-ip route to GR router, stderr:"ovn-nbctl: duplicate nexthop for the same ECMP route\n", err:&{%!!(MISSING)g(string=OVN command '/usr/bin/ovn-nbctl --timeout=15 --may-exist --policy=src-ip --ecmp-symmetric-reply lr-route-add GR_openshift-worker-0 172.24.3.208/32 192.168.131.11' failed: exit status 1)}w

This should have been handled by the fix for https://bugzilla.redhat.com/show_bug.cgi?id=1959909

*** This bug has been marked as a duplicate of bug 1959909 ***

