Description of problem:
Following https://docs.google.com/document/d/1pdEQnX1FXHP1h89lwZeEOt5m3uU-OyD5PqIUuHdo--8/edit#, if I skip step 1 (create the net-attach-def) and only perform step 2 (create a pod), the pod stays in ContainerCreating forever. However, I see 1 for both network_attachment_definition_enabled_instance_up{networks="any"} and network_attachment_definition_instances{networks="any"}. Because the pod never reaches the Running state, both metrics should be 0.

Version-Release number of selected component (if applicable):
4.3.0-0.nightly-2019-11-08-094604

How reproducible:
Always

Steps to Reproduce:
[root@dhcp-41-193 FILE]# oc get net-attach-def --all-namespaces
NAMESPACE   NAME             AGE
test1       macvlan-bridge   68m
[root@dhcp-41-193 FILE]# oc delete net-attach-def macvlan-bridge
networkattachmentdefinition.k8s.cni.cncf.io "macvlan-bridge" deleted
[root@dhcp-41-193 FILE]# oc login -u testuser-0 -p OC_IT3-uRzrF
Login successful.

You have one project on this server: "test1"

Using project "test1".
[root@dhcp-41-193 FILE]# oc create -f https://raw.githubusercontent.com/openshift-qe/v3-testfiles/master/networking/multus-cni/Pods/1interface-macvlan-bridge.yaml
pod/macvlan-bridge-pod-pd4ng created
[root@dhcp-41-193 FILE]# oc login -u kubeadmin -p 5CvKS-2xJay-TX85m-AyfXe
Login successful.

You have access to 54 projects, the list has been suppressed. You can list all projects with 'oc projects'

Using project "test1".
[root@dhcp-41-193 FILE]# oc rsh -n openshift-multus multus-admission-controller-5rh6d
sh-4.2# curl localhost:9091/metrics | grep network_attachment_definition
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  1202  100  1202    0     0  1016k      0 --:--:-- --:--:-- --:--:-- 1173k
# HELP network_attachment_definition_enabled_instance_up Metric to identify clusters with network attachment definition enabled instances.
# TYPE network_attachment_definition_enabled_instance_up gauge
network_attachment_definition_enabled_instance_up{networks="any"} 1
network_attachment_definition_enabled_instance_up{networks="sriov"} 0
# HELP network_attachment_definition_instances Metric to get number of instance using network attachment definition in the cluster.
# TYPE network_attachment_definition_instances gauge
network_attachment_definition_instances{networks="any"} 1
network_attachment_definition_instances{networks="macvlan"} 0
network_attachment_definition_instances{networks="sriov"} 0
sh-4.2# exit
exit
[root@dhcp-41-193 FILE]# oc get pods
NAME                       READY   STATUS              RESTARTS   AGE
macvlan-bridge-pod-pd4ng   0/1     ContainerCreating   0          72s
[root@dhcp-41-193 FILE]# oc get net-attach-def --all-namespaces
No resources found.
[root@dhcp-41-193 FILE]#
[root@dhcp-41-193 FILE]# oc delete pod macvlan-bridge-pod-pd4ng
pod "macvlan-bridge-pod-pd4ng" deleted
[root@dhcp-41-193 FILE]# oc rsh -n openshift-multus multus-admission-controller-5rh6d
sh-4.2# curl localhost:9091/metrics | grep network_attachment_definition
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  1202  100  1202    0     0   970k      0 --:--:-- --:--:-- --:--:-- 1173k
# HELP network_attachment_definition_enabled_instance_up Metric to identify clusters with network attachment definition enabled instances.
# TYPE network_attachment_definition_enabled_instance_up gauge
network_attachment_definition_enabled_instance_up{networks="any"} 0
network_attachment_definition_enabled_instance_up{networks="sriov"} 0
# HELP network_attachment_definition_instances Metric to get number of instance using network attachment definition in the cluster.
# TYPE network_attachment_definition_instances gauge
network_attachment_definition_instances{networks="any"} 0
network_attachment_definition_instances{networks="macvlan"} 0
network_attachment_definition_instances{networks="sriov"} 0
sh-4.2#

Actual results:
network_attachment_definition_enabled_instance_up{networks="any"} 1
network_attachment_definition_instances{networks="any"} 1

Expected results:
network_attachment_definition_enabled_instance_up{networks="any"} 0
network_attachment_definition_instances{networks="any"} 0

Additional info:
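For anyone re-running this check, comparing the metrics by eye is error-prone. The following is a minimal Python sketch, not part of the original report, that parses the grep output shown above and lists the samples that differ from the expected values. The helper names (parse_nad_metrics, unexpected) are hypothetical, and the sample text is copied from the first curl output in this report.

```python
import re

# Sample copied from the first curl output in this report
# (pod stuck in ContainerCreating, no net-attach-def present).
SAMPLE = '''\
# HELP network_attachment_definition_enabled_instance_up Metric to identify clusters with network attachment definition enabled instances.
# TYPE network_attachment_definition_enabled_instance_up gauge
network_attachment_definition_enabled_instance_up{networks="any"} 1
network_attachment_definition_enabled_instance_up{networks="sriov"} 0
# HELP network_attachment_definition_instances Metric to get number of instance using network attachment definition in the cluster.
# TYPE network_attachment_definition_instances gauge
network_attachment_definition_instances{networks="any"} 1
network_attachment_definition_instances{networks="macvlan"} 0
network_attachment_definition_instances{networks="sriov"} 0
'''

# Matches lines such as: metric_name{networks="any"} 1
LINE_RE = re.compile(r'^(\w+)\{networks="([^"]*)"\}\s+(\S+)$')


def parse_nad_metrics(text):
    """Return {(metric_name, networks_label): value} for labelled sample lines."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = LINE_RE.match(line)
        if match:
            samples[(match.group(1), match.group(2))] = float(match.group(3))
    return samples


def unexpected(samples, expected):
    """Return the expected keys whose observed value differs, with the observed value."""
    return {key: samples.get(key) for key, want in expected.items()
            if samples.get(key) != want}


# Expected results from this report: both "any" samples should be 0.
EXPECTED = {
    ("network_attachment_definition_enabled_instance_up", "any"): 0.0,
    ("network_attachment_definition_instances", "any"): 0.0,
}

if __name__ == "__main__":
    print(unexpected(parse_nad_metrics(SAMPLE), EXPECTED))
```

In practice the text would come from the curl command inside the multus-admission-controller pod rather than from a hard-coded string.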
By looking at metrics I think this might be a problem with some network component - reassigning.
Yes. Checking pod status periodically is expensive, so the controller does not check pod status. The event is captured when the pod is created, and the metrics are incremented at that point. The state of the pod (stuck in ContainerCreating, error) is not considered. The metrics are decremented when the delete event fires.
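To make the behaviour described above concrete, here is a toy Python model of event-driven counting. It is only a sketch under stated assumptions, not the controller's actual code: the class and method names are hypothetical, and it assumes that a pod whose net-attach-def cannot be resolved is still counted under the "any" label, which matches the values seen in this report.

```python
from collections import Counter


class EventDrivenNadCounter:
    """Toy model: count pods on create/delete events and ignore pod phase.

    Hypothetical illustration only; names do not come from the real controller.
    """

    def __init__(self):
        # Maps a networks label ("any", "macvlan", "sriov", ...) to a pod count.
        self.instances = Counter()

    def on_pod_created(self, network_types):
        # Every pod requesting an additional network counts under "any",
        # plus under each network type that could be resolved.
        for label in set(network_types) | {"any"}:
            self.instances[label] += 1

    def on_pod_deleted(self, network_types):
        # The count only goes down when the delete event fires.
        for label in set(network_types) | {"any"}:
            self.instances[label] = max(0, self.instances[label] - 1)

    def enabled_instance_up(self, label):
        # Gauge is 1 while at least one counted pod exists for the label.
        return 1 if self.instances[label] > 0 else 0


if __name__ == "__main__":
    counter = EventDrivenNadCounter()
    # Pod created while its net-attach-def is missing: no type is resolved,
    # the pod stays in ContainerCreating, yet "any" is already incremented.
    counter.on_pod_created([])
    print(counter.instances["any"], counter.enabled_instance_up("any"))
    counter.on_pod_deleted([])
    print(counter.instances["any"], counter.enabled_instance_up("any"))
```

Because nothing in this model looks at the pod phase, a pod stuck in ContainerCreating is indistinguishable from a running one until it is deleted, which is the discrepancy reported here.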
Verification failed on 4.3.0-0.nightly-2019-12-04-054458.

1. Create a pod:
   oc create -f https://raw.githubusercontent.com/openshift-qe/v3-testfiles/master/networking/multus-cni/Pods/1interface-macvlan-bridge.yaml

2. Check the pod: it is in the ContainerCreating state.

3. Metrics show:
   network_attachment_definition_enabled_instance_up{networks="any"} 0
   network_attachment_definition_instances{networks="any"} 0

4. Create the net-attach-def:
   curl -s https://raw.githubusercontent.com/openshift-qe/v3-testfiles/master/networking/multus-cni/NetworkAttachmentDefinitions/macvlan-bridge.yaml | sed s/eth0/ens5/g | oc create -f-

5. Check the pod: it is in the Running state.

6. Metrics show:
   network_attachment_definition_enabled_instance_up{networks="any"} 0
   network_attachment_definition_instances{networks="any"} 0

Expected:
network_attachment_definition_instances{networks="macvlan"} 1
Tried with an AWS cluster and could not recreate.

1. oc create -f https://raw.githubusercontent.com/openshift-qe/v3-testfiles/master/networking/multus-cni/Pods/1interface-macvlan-bridge.yaml

2. oc get pods
NAME                       READY   STATUS              RESTARTS   AGE
macvlan-bridge-pod-9swhw   0/1     ContainerCreating   0          4s

3. curl -s https://raw.githubusercontent.com/openshift-qe/v3-testfiles/master/networking/multus-cni/NetworkAttachmentDefinitions/macvlan-bridge.yaml | sed s/eth0/ens3/g | oc create -f-

4. oc get pods
NAME                       READY   STATUS    RESTARTS   AGE
macvlan-bridge-pod-9swhw   1/1     Running   0          3m20s

5. oc logs -f multus-admission-controller-nzl2p -n openshift-multus
....
I1204 17:54:36.245547 1 webhook.go:142] AdmissionReview request allowed: Network Attachment Definition '{"cniVersion":"0.3.0","ipam":{"gateway":"10.1.1.1","rangeEnd":"10.1.1.200","rangeStart":"10.1.1.100","routes":[{"dst":"0.0.0.0/0"}],"subnet":"10.1.1.0/24","type":"host-local"},"master":"ens3","mode":"bridge","type":"macvlan"}' is valid
I1204 17:54:49.247354 1 localmetrics.go:50] UPdating net-attach-def metrics for macvlan with value 1
I1204 17:54:49.247388 1 localmetrics.go:50] UPdating net-attach-def metrics for any with value 1

6. oc rsh -n openshift-multus multus-admission-controller-nzl2p curl localhost:9091/metrics
...
.....
# HELP network_attachment_definition_enabled_instance_up Metric to identify clusters with network attachment definition enabled instances.
# TYPE network_attachment_definition_enabled_instance_up gauge
network_attachment_definition_enabled_instance_up{networks="any"} 1
network_attachment_definition_enabled_instance_up{networks="sriov"} 0
# HELP network_attachment_definition_instances Metric to get number of instance using network attachment definition in the cluster.
# TYPE network_attachment_definition_instances gauge
network_attachment_definition_instances{networks="any"} 1
network_attachment_definition_instances{networks="macvlan"} 1
network_attachment_definition_instances{networks="sriov"} 0
(In reply to Aneesh Puttur from comment #7)
> Tried with AWS cluster and could not recreate.

Working with Aneesh, we found that the problem shows up when the pod and the NAD are created under a new project.
https://github.com/k8snetworkplumbingwg/net-attach-def-admission-controller/pull/36
Tested and verified on 4.3.0-0.nightly-2019-12-10-120829
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:0062