Bug 1758702 - Multus admission controller not detecting syntax issues in pod and net-attach-def CRDs
Summary: Multus admission controller not detecting syntax issues in pod and net-attach...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.3.0
Hardware: Unspecified
OS: Unspecified
high
high
Target Milestone: ---
: 4.5.0
Assignee: Douglas Smith
QA Contact: Anurag saxena
URL:
Whiteboard: SDN-CI-IMPACT,SDN-STALE
Depends On:
Blocks: 1837638
TreeView+ depends on / blocked
 
Reported: 2019-10-04 22:29 UTC by Anurag saxena
Modified: 2020-07-13 17:12 UTC (History)
7 users (show)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
: 1782301 (view as bug list)
Environment:
Last Closed: 2020-07-13 17:11:31 UTC
Target Upstream Version:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Github openshift cluster-network-operator pull 339 0 'None' closed Bug 1758702: The Multus admission controller should validate & gate on UPDATE events 2021-01-18 07:27:27 UTC
Red Hat Product Errata RHBA-2020:2409 0 None None None 2020-07-13 17:12:02 UTC

Description Anurag saxena 2019-10-04 22:29:28 UTC
Description of problem: Following net-attach-def say "macvlan-bridge" is defined

apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: macvlan-bridge
spec:
  config: '{
      "cniVersion": "0.3.0",
      "type": "macvlan",
      "master": "eth0",
      "mode": "bridge",
      "ipam": {
        "type": "dhcp"
      }
    }'

But following oc patch operations didn't complain anything about wrong syntax

1) It took empty config

$ oc patch networkattachmentdefinition.k8s.cni.cncf.io macvlan-bridge -p '{"spec":{"config":" " }}' --type=merge
networkattachmentdefinition.k8s.cni.cncf.io/macvlan-bridge patched

2) Its not detecting wrong config parameters

oc patch networkattachmentdefinition.k8s.cni.cncf.io macvlan-bridge -p '{"spec":{"config":"{"cniVersion": "0.3.0", "type": "cvlan", "master": "eth", "mode": "bridge", "ipam": { "type": "dhcp" }}"}}' --type=merge
networkattachmentdefinition.k8s.cni.cncf.io/macvlan-bridge patched

type:cvlan and master:eth above shouldn't be an acceptable standard

Version-Release number of selected component (if applicable):4.2.0-0.nightly-2019-10-03-224032


How reproducible: Always


Steps to Reproduce:
1.Create a net-attach-def
2.perform oc patch operations as defined in description above
3.

Actual results: Multus-admission-controller could't catch syntax issues in CRD


Expected results: Multus-admission-controller should catch syntax issues in CRD


Additional info:
$ oc get pods -n openshift-multus
NAME                                READY   STATUS    RESTARTS   AGE
multus-22rpt                        1/1     Running   0          6h4m
multus-admission-controller-9b6tx   1/1     Running   0          6h10m
multus-admission-controller-b4g5l   1/1     Running   0          6h10m
multus-admission-controller-rh4bd   1/1     Running   0          6h10m
multus-b9n6r                        1/1     Running   0          6h10m
multus-g4bh7                        1/1     Running   0          6h10m
multus-sgrtj                        1/1     Running   0          6h10m
multus-xdc4s                        1/1     Running   0          6h4m

Comment 1 Douglas Smith 2019-10-07 12:54:10 UTC
Looks like what we're currently doing is only gating on create events -- so when you issue `oc patch` we don't catch that change.

We may need to implement a watch on update events, as well.

Comment 2 Douglas Smith 2019-10-07 17:22:40 UTC
I have a pull request open on the cluster-network-operator to change the ValidatingWebhookConfiguration to add the UPDATE operation to the list of operations on which it performs validation.

Just one thing that's a slight tweak to what you're doing for testing... This is how I tested:

```
oc patch network-attachment-definitions.k8s.cni.cncf.io macvlan-bridge -p '{"spec":{"config":asdf }}' --type=merge
```

In the provided example `oc patch` -- we actually do support an empty spec.config field. This is used by KubeVirt/CNV (among other cases). So, the best way to verify this functionality is to use some bad JSON.

Thanks for the detailed information in the report, made it easy to diagnose.

Pull request available @ https://github.com/openshift/cluster-network-operator/pull/339

Comment 3 Anurag saxena 2019-10-08 00:04:57 UTC
Thanks,Doug for oc patch fix. Also, i am not sure if oc create is actually catching those errors. See below (despite of having errors in def, oc create just ceated the net-attach)


$cat def.yaml
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: macvlan-bridge
spec:
  config: '{
      "ersion": "0.3.0",  <<<<<<<<<<<<<<<<<<<<<<<it was CNIVersion
      "type": "macvlan",
      "aster": "eth0",    <<<<<<<<<<<<<<<<<<<<<<<it was master
      "ode": "bridge",    <<<<<<<<<<<<<<<<<<<<<<<it was mode
      "ipam": {
        "type": "dhcp"
      }
    }'
$ oc create -f def.yaml
networkattachmentdefinition.k8s.cni.cncf.io/macvlan-bridge created

Comment 5 Douglas Smith 2019-10-09 00:20:37 UTC
Passing it invalid JSON is a fair enough test to make sure that it's gating properly.

Unfortunately the Multus admission controller can't pick up on typographical errors like those you've reference in comment 3. Since the fields in CNI configurations are free-form and can be specified however they want for each CNI plugin (with the exception of a few fields, such as "type" and "cniversion", which can also be optional at times), we can't specifically pinpoint which fields needs to be present. So, this change of the names is not something this admission controller can pick up on, and therefore is expected functionality.

The flip side of this equation, however, is that we offer (for example) the "simpleMacVlan" capability from the cluster network operator, which automates the creation of these. So, while the manual creation of the custom resources can only provide so many fail safes on what it checks.

Comment 6 Anurag saxena 2019-10-09 21:07:24 UTC
Thanks Doug for clarifications regarding typographical errors. Make sense to me.

Also, I tested your PR on 4.3.0-0.ci-2019-10-09-161823. Seems like it still takes bad config via create and patch.

$ cat def.yaml 
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: macvlan-bridge
spec:
  config: 'asdf'

$ oc create -f def.yaml 
networkattachmentdefinition.k8s.cni.cncf.io/macvlan-bridge created

----------------------------------------------------------------------------------

$ cat mac.yaml 
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: macvlan-bridge
spec:
  config: '{
      "cniVersion": "0.3.0",
      "type": "macvlan",
      "master": "ens3",
      "mode": "bridge",
      "ipam": {
        "type": "host-local",
        "subnet": "20.2.2.0/24",
        "rangeStart": "20.2.2.100",
        "rangeEnd": "20.2.2.200"
      }
    }'

$ oc create -f mac.yaml 
networkattachmentdefinition.k8s.cni.cncf.io/macvlan-bridge created

$ oc patch networkattachmentdefinition.k8s.cni.cncf.io macvlan-bridge -p '{"spec":{"config": "asdf"}}' --type=merge
networkattachmentdefinition.k8s.cni.cncf.io/macvlan-bridge patched

Comment 7 Douglas Smith 2019-10-10 18:05:21 UTC
Anurag -- thanks for the thorough testing on this one! I can also replicate the second portion where the patch doesn't correctly gate. Apparently what I saw is an error message that is before it's validated 

```
[centos@kube-singlehost-master hack]$ kubectl create -f def.yaml 
Error from server: error when creating "def.yaml": admission webhook "net-attach-def-admission-controller-validating-config.k8s.io" denied the request: invalid config: error parsing configuration: invalid character 'a' looking for beginning of value
[centos@kube-singlehost-master hack]$ kubectl patch networkattachmentdefinition.k8s.cni.cncf.io macvlan-bridge -p '{"spec":{"config": asd}}' --type=merge
Error from server: Invalid JSON Patch
```

That is, the `Error from server: Invalid JSON Patch` is not from the admission controller (note in the first error message that it specifically calls out the admission controller).

I was not however able to replicate the first scenario where it failed on create (as above)

I'll be back in touch as soon as I know more about the patch scenario and/or your first problem regarding create.

Comment 8 Anurag saxena 2019-10-10 18:13:07 UTC
(In reply to Douglas Smith from comment #7)
> Anurag -- thanks for the thorough testing on this one! I can also replicate
> the second portion where the patch doesn't correctly gate. Apparently what I
> saw is an error message that is before it's validated 
> 
> ```
> [centos@kube-singlehost-master hack]$ kubectl create -f def.yaml 
> Error from server: error when creating "def.yaml": admission webhook
> "net-attach-def-admission-controller-validating-config.k8s.io" denied the
> request: invalid config: error parsing configuration: invalid character 'a'
> looking for beginning of value
> [centos@kube-singlehost-master hack]$ kubectl patch
> networkattachmentdefinition.k8s.cni.cncf.io macvlan-bridge -p
> '{"spec":{"config": asd}}' --type=merge
> Error from server: Invalid JSON Patch
> ```
> 
> That is, the `Error from server: Invalid JSON Patch` is not from the
> admission controller (note in the first error message that it specifically
> calls out the admission controller).
> 
> I was not however able to replicate the first scenario where it failed on
> create (as above)
> 
> I'll be back in touch as soon as I know more about the patch scenario and/or
> your first problem regarding create.

Thanks Doug for looking into this. I will also try to reproduce `oc create` issue again to make sure

Comment 9 Anurag saxena 2019-10-10 18:14:09 UTC
@weliang, Can you try to see if you are also seeing this?

$ cat def.yaml 
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: macvlan-bridge
spec:
  config: 'asdf'

$ oc create -f def.yaml 
networkattachmentdefinition.k8s.cni.cncf.io/macvlan-bridge created

Comment 10 Douglas Smith 2019-10-10 20:16:37 UTC
I have some more information... Apparently there may be a problem with TLS certificates. This may cause the admission controller to not gate at all. 

Looking at the logs I can see an error from an admission controller like:

```
./oc logs multus-admission-controller-hrdct --namespace=openshift-multus
I1010 19:47:15.851135       1 main.go:41] starting net-attach-def-admission-controller webhook server
2019/10/10 20:11:40 http: TLS handshake error from 10.130.0.1:35136: remote error: tls: bad certificate
```

This was from a recent CI build, 4.3.0-0.ci-2019-10-10-161743

```
./oc version
Client Version: v4.2.0-alpha.0-192-gab30f90
Server Version: 4.3.0-0.ci-2019-10-10-161743
Kubernetes Version: v1.16.0-beta.2+1e20c13
```

Comment 11 Douglas Smith 2019-10-11 14:19:14 UTC
I believe that the TLS certificate error is limited to 4.3 and may likely need its own BZ if this particular BZ is targeted at fixing the `oc patch` problems.

Using Anurag's def.yaml from above, with a 4.2 CI build, I successfully get:

```
$ ./oc version
Client Version: v4.2.0-alpha.0-87-gd5465d7
Server Version: 4.2.0-0.ci-2019-10-10-215838

$ ./oc create -f def.yaml 
Error from server: error when creating "def.yaml": admission webhook "multus-validating-config.k8s.io" denied the request: invalid config: error parsing configuration: invalid character 'a' looking for beginning of value
```

Also, for what it's worth, I hadn't back ported the referenced PR above into 4.2, I have a new PR open for that @ https://github.com/openshift/cluster-network-operator/pull/350

Comment 12 Anurag saxena 2019-10-11 18:16:51 UTC
Thanks Doug for figuring out the TLS cert issue involved with it. I will file a separate bug on that on 4.3. 
Meanwhile, i guess this can be verified on 4.2 only once PR get merged which i believe will land up in 4.2.z?

Comment 13 Douglas Smith 2019-10-11 18:35:55 UTC
Thanks for filing the 4.3 bug, appreciate it.

I believe this should land in the 4.2.z stream. Also, CNO PR #350 is complaining this doesn't target the correct release. So I'm going to change to 4.2.z -- although, anyone who knows BZ better than I do, please feel free to correct this change and/or inform me of how it should be setup.

Comment 14 Anurag saxena 2019-10-11 19:35:32 UTC
Sure, Doug. I guess i would wait for 4.3 nightly to land for proper bug filing or i guess can file on CI only

Comment 15 Casey Callendrello 2019-10-14 11:25:43 UTC
Hey Doug,
You should leave this bug as targeted for 4.3, and clone it for a 4.2.z backport. Once *this* bug is VERIFIED, you can merge the 4.2.z PR. Your 4.2 PR will need to reference the cloned bug.

Comment 16 Douglas Smith 2019-10-14 17:18:37 UTC
Appreciate the pointer there Casey, changed back to version 4.3.0

Comment 17 Casey Callendrello 2019-10-17 17:19:17 UTC
PR https://github.com/openshift/cluster-network-operator/pull/339 is merged.

Comment 19 Anurag saxena 2019-10-22 17:23:11 UTC
I can't verify this bug until resolution of TLS handshake error Bug 1762145. See comment 6 and comment 10. Bug 1762145 blocks this bug.

Comment 20 Douglas Smith 2019-10-25 12:10:01 UTC
Just an FYI that Bug 1762145 has a merged PR

Comment 21 Anurag saxena 2019-10-28 15:07:19 UTC
Verified on 4.3.0-0.nightly-2019-10-28-083944. Thanks for the fixes, this looks great now. Throwing expected errors


$ cat def.yaml 
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: macvlan-bridge
spec:
  config: 'asdf'


$ oc create -f def.yaml 
Error from server: error when creating "def.yaml": admission webhook "multus-validating-config.k8s.io" denied the request: invalid config: error parsing configuration: invalid character 'a' looking for beginning of value

$ cat def.yaml 
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: macvlan-bridge@$
spec:
  config: ''


$ oc create -f def.yaml 
The NetworkAttachmentDefinition "macvlan-bridge@$" is invalid: metadata.name: Invalid value: "macvlan-bridge@$": a DNS-1123 subdomain must consist of lower case alphanumeric characters, '-' or '.', and must start and end with an alphanumeric character (e.g. 'example.com', regex used for validation is '[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*')

Comment 22 Anurag saxena 2019-12-11 00:13:25 UTC
re-opening this as we are again taking bad config on 4.3.0-0.nightly-2019-12-09-035405

$ cat def.yaml 
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: macvlan-bridge
spec:
  config: 'asdf'


$ oc create -f def.yaml 
networkattachmentdefinition.k8s.cni.cncf.io/macvlan-bridge created

Comment 23 Anurag saxena 2019-12-11 14:48:58 UTC
Logs from admission controller while validating bad config

I1210 23:29:55.740963       1 webhook.go:111] validating network config spec: asdf
2019/12/10 23:29:55 http: panic serving 10.129.0.1:53696: assignment to entry in nil map
goroutine 31041 [running]:
net/http.(*conn).serve.func1(0xc000672000)
	/opt/rh/go-toolset-1.12/root/usr/lib/go-toolset-1.12-golang/src/net/http/server.go:1769 +0x139
panic(0x11bd820, 0x14aac10)
	/opt/rh/go-toolset-1.12/root/usr/lib/go-toolset-1.12-golang/src/runtime/panic.go:522 +0x1b5
github.com/k8snetworkplumbingwg/net-attach-def-admission-controller/pkg/webhook.preprocessCNIConfig(0xc000608cb0, 0xe, 0xc000608d80, 0x4, 0x8, 0x8, 0x1, 0xc0006c1950, 0x710299, 0xc000672140)
	/go/src/github.com/k8snetworkplumbingwg/net-attach-def-admission-controller/gopath/src/github.com/k8snetworkplumbingwg/net-attach-def-admission-controller/pkg/webhook/webhook.go:90 +0x197
github.com/k8snetworkplumbingwg/net-attach-def-admission-controller/pkg/webhook.validateNetworkAttachmentDefinition(0xc0007e78c0, 0x1b, 0xc0007e78a0, 0x12, 0xc000608cb0, 0xe, 0x0, 0x0, 0xc000608c98, 0x7, ...)
	/go/src/github.com/k8snetworkplumbingwg/net-attach-def-admission-controller/gopath/src/github.com/k8snetworkplumbingwg/net-attach-def-admission-controller/pkg/webhook/webhook.go:118 +0x27e
github.com/k8snetworkplumbingwg/net-attach-def-admission-controller/pkg/webhook.ValidateHandler(0x14e5020, 0xc0004520e0, 0xc000254a00)
	/go/src/github.com/k8snetworkplumbingwg/net-attach-def-admission-controller/gopath/src/github.com/k8snetworkplumbingwg/net-attach-def-admission-controller/pkg/webhook/webhook.go:403 +0x146
net/http.HandlerFunc.ServeHTTP(0x139e438, 0x14e5020, 0xc0004520e0, 0xc000254a00)
	/opt/rh/go-toolset-1.12/root/usr/lib/go-toolset-1.12-golang/src/net/http/server.go:1995 +0x44
net/http.(*ServeMux).ServeHTTP(0x2137ae0, 0x14e5020, 0xc0004520e0, 0xc000254a00)
	/opt/rh/go-toolset-1.12/root/usr/lib/go-toolset-1.12-golang/src/net/http/server.go:2375 +0x1d6
net/http.serverHandler.ServeHTTP(0xc0003ad5f0, 0x14e5020, 0xc0004520e0, 0xc000254a00)
	/opt/rh/go-toolset-1.12/root/usr/lib/go-toolset-1.12-golang/src/net/http/server.go:2774 +0xa8
net/http.(*conn).serve(0xc000672000, 0x14edc60, 0xc0004e8400)
	/opt/rh/go-toolset-1.12/root/usr/lib/go-toolset-1.12-golang/src/net/http/server.go:1878 +0x851
created by net/http.(*Server).Serve
	/opt/rh/go-toolset-1.12/root/usr/lib/go-toolset-1.12-golang/src/net/http/ser

Comment 26 Ben Bennett 2020-05-08 20:53:31 UTC
@doug - Can you take a look please?

Comment 27 Abhishek Jain 2020-05-09 16:37:22 UTC
Hi Team,

I am also getting similar error while pulling the image :

[root@oc-w3 ~]# podman image pull --authfile /var/lib/kubelet/kubeconfig quay.io/repository/openshift-release-dev/ocp-release@sha256:039a4ef7c128a049ccf916a1d68ce93e8f5494b44d5a75df60c85e9e7191dacc
Trying to pull quay.io/repository/openshift-release-dev/ocp-release@sha256:039a4ef7c128a049ccf916a1d68ce93e8f5494b44d5a75df60c85e9e7191dacc...
  invalid character 'a' looking for beginning of value
Error: error pulling image "quay.io/repository/openshift-release-dev/ocp-release@sha256:039a4ef7c128a049ccf916a1d68ce93e8f5494b44d5a75df60c85e9e7191dacc": unable to pull quay.io/repository/openshift-release-dev/ocp-release@sha256:039a4ef7c128a049ccf916a1d68ce93e8f5494b44d5a75df60c85e9e7191dacc: unable to pull image: Error initializing source docker://quay.io/repository/openshift-release-dev/ocp-release@sha256:039a4ef7c128a049ccf916a1d68ce93e8f5494b44d5a75df60c85e9e7191dacc: error getting username and password: error reading JSON file "/var/lib/kubelet/kubeconfig": error unmarshaling JSON at "/var/lib/kubelet/kubeconfig": invalid character 'a' looking for beginning of value

My OCP version is : 4.4.0

[root@oc-w3 ~]# oc version
Client Version: 4.4.0-202004250654-2576e48
Server Version: 4.4.0
Kubernetes Version: v1.17.1

Comment 29 Douglas Smith 2020-05-19 18:08:11 UTC
Setting to ON_QA. I believe this should be fixed in 4.5

Quick note: This BZ was moved from 4.4 to 4.5, so this should be validated on 4.5. Thanks!

Comment 30 Weibin Liang 2020-05-20 20:09:11 UTC
Tested and verified in 4.5.0-0.nightly-2020-05-20-053050

[weliang@weliang FILE]$ oc patch networkattachmentdefinition.k8s.cni.cncf.io macvlan-bridge -p '{"spec":{"config":" " }}' --type=merge
Error from server: admission webhook "multus-validating-config.k8s.io" denied the request: configuration string is not in JSON format

[weliang@weliang FILE]$ oc patch networkattachmentdefinition.k8s.cni.cncf.io macvlan-bridge -p '{"spec":{"config":"{"cniVersion": "0.3.0", "type": "cvlan", "master": "eth", "mode": "bridge", "ipam": { "type": "dhcp" }}"}}' --type=merge
Error from server: Invalid JSON Patch

Comment 33 errata-xmlrpc 2020-07-13 17:11:31 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2409


Note You need to log in before you can comment on or make changes to this bug.