Bug 1923979 - kubernetes-nmstate: nmstate-handler pod crashes when configuring bridge device using ip tool
Summary: kubernetes-nmstate: nmstate-handler pod crashes when configuring bridge devic...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Container Native Virtualization (CNV)
Classification: Red Hat
Component: Networking
Version: 2.6.0
Hardware: Unspecified
OS: Unspecified
unspecified
high
Target Milestone: ---
: 2.6.0
Assignee: Petr Horáček
QA Contact: Yossi Segev
URL:
Whiteboard:
: 1923978 (view as bug list)
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2021-02-02 11:11 UTC by Yossi Segev
Modified: 2021-03-10 11:24 UTC (History)
2 users (show)

Fixed In Version: kubernetes-nmstate-handler-container-v2.6.0-18
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-03-10 11:23:40 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
nmstate-handler.log (8.69 KB, text/plain)
2021-02-02 11:12 UTC, Yossi Segev
no flags Details
journalctl.log (93.06 KB, text/plain)
2021-02-02 11:13 UTC, Yossi Segev
no flags Details


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2021:0799 0 None None None 2021-03-10 11:24:48 UTC

Description Yossi Segev 2021-02-02 11:11:43 UTC
Description of problem:
When configuring a bridge device using ip (netlink) tool - nmastate-handler on the node enters CrahsLoopBackOff.


Version-Release number of selected component (if applicable):
OCP Version: 4.7.0-fc.4
Kubernetes Version: v1.20.0+f0a2ec9
CNV Vesion: 2.6.0
nmstate verion: nmstate-0.3.4-17.el8_3.noarch


How reproducible:
Always


Steps to Reproduce:
1. In the cluster - login to one of the worker nodes.

[cnv-qe-jenkins@network02-khphv-executor ~]$ oc get nodes -l node-role.kubernetes.io/worker
NAME                             STATUS   ROLES    AGE    VERSION
network02-khphv-worker-0-4zmxs   Ready    worker   2d4h   v1.20.0+d9c52cc
network02-khphv-worker-0-7q8sr   Ready    worker   2d4h   v1.20.0+d9c52cc
network02-khphv-worker-0-cmqsb   Ready    worker   2d4h   v1.20.0+d9c52cc
[cnv-qe-jenkins@network02-khphv-executor ~]$ 
[cnv-qe-jenkins@network02-khphv-executor ~]$ oc debug node/network02-khphv-worker-0-cmqsb
Starting pod/network02-khphv-worker-0-cmqsb-debug ...
To use host binaries, run `chroot /host`
Pod IP: 192.168.3.248
If you don't see a command prompt, try pressing enter.
sh-4.4# chroot /host
sh-4.4# 

2. Add a bridge device using ip tool:
sh-4.4# ip link add name br-test type bridge


Actual results:
nmstate-handler pod on the node enters CrashLoopBackOff state.
[cnv-qe-jenkins@network02-khphv-executor ~]$ oc get pod -n openshift-cnv -l component=kubernetes-nmstate-handler  -o wide
NAME                    READY   STATUS             RESTARTS   AGE     IP              NODE                             NOMINATED NODE   READINESS GATES
nmstate-handler-2s6ff   1/1     Running            0          2d3h    192.168.2.23    network02-khphv-master-2         <none>           <none>
nmstate-handler-5g4jj   1/1     Running            0          2d3h    192.168.0.132   network02-khphv-master-1         <none>           <none>
nmstate-handler-9c8hs   1/1     Running            1          2d3h    192.168.0.14    network02-khphv-worker-0-7q8sr   <none>           <none>
nmstate-handler-ggfn7   1/1     Running            1          2d3h    192.168.2.46    network02-khphv-worker-0-4zmxs   <none>           <none>
nmstate-handler-r2kcd   1/1     Running            0          2d3h    192.168.1.146   network02-khphv-master-0         <none>           <none>
nmstate-handler-vt2ck   0/1     CrashLoopBackOff   18         3h42m   192.168.3.248   network02-khphv-worker-0-cmqsb   <none>           <none>


Additional info:
1. nmstate-handler log and journalctl (from the node) are attached.
2. Workaround - restart the damaged pod by deleting it:
[cnv-qe-jenkins@network02-khphv-executor yossi]$ oc delete pod -n openshift-cnv nmstate-handler-vt2ck
pod "nmstate-handler-vt2ck" deleted

Comment 1 Yossi Segev 2021-02-02 11:12:55 UTC
Created attachment 1754342 [details]
nmstate-handler.log

Comment 2 Yossi Segev 2021-02-02 11:13:35 UTC
Created attachment 1754343 [details]
journalctl.log

Comment 3 Yossi Segev 2021-02-03 19:38:50 UTC
Verified on:
CNV 2.6.0
kubernetes-nmstate-handler-container-v2.6.0-18 (sha256:ed5fc663a75c40c878da3ce85e85e2a0b27f4e8cf02f93d9822cfb627a8266b2).
by following the same scenario as in the bug description.
The nmstate-handler pod remained stable, without restarting or crashing.

Comment 4 Yossi Segev 2021-02-14 09:57:27 UTC
*** Bug 1923978 has been marked as a duplicate of this bug. ***

Comment 7 errata-xmlrpc 2021-03-10 11:23:40 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Virtualization 2.6.0 security and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:0799


Note You need to log in before you can comment on or make changes to this bug.