Bug 1968625
| Summary: | Pods using SR-IOV interfaces failing to start with "Failed to create pod sandbox" | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Federico Paolinelli <fpaoline> |
| Component: | Networking | Assignee: | Douglas Smith <dosmith> |
| Networking sub component: | multus | QA Contact: | zhaozhanqi <zzhao> |
| Status: | CLOSED ERRATA | Docs Contact: | jfrye |
| Severity: | urgent | ||
| Priority: | urgent | CC: | dosmith, jfrye, yjoseph, ykashtan |
| Version: | 4.8 | ||
| Target Milestone: | --- | ||
| Target Release: | 4.8.0 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | Bug Fix | |
| Doc Text: |
RN content:
Previously, refactoring for a shadowed variable caused a regression related to the use of the checkpoint file, and SR-IOV pod sandboxes would not start up. A check for the path of the kubelet socket was not properly accounted for during the refactor. The fix properly restores the check for the kubelet socket path, and now the SR-IOV pod sandboxes are properly created.
------
Cause: Refactoring for a shadowed variable caused a regression related to the use of the checkpoint file.
Consequence: SR-IOV pod sandboxes would not come up.
Fix: Posted upstream @ https://github.com/k8snetworkplumbingwg/multus-cni/pull/683/files
Result: SR-IOV pod sandboxes are properly created.
|
Story Points: | --- |
| Clone Of: | Environment: | ||
| Last Closed: | 2021-07-27 23:11:53 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
|
Description
Federico Paolinelli
2021-06-07 17:07:41 UTC
This sure smells a lot like this upstream issue regarding the checkpoint file in k8s 1.21+: https://github.com/k8snetworkplumbingwg/multus-cni/issues/665 This k8s version may have just hit the nightlies.

Turns out this was caused by a regression after other changes were merged for Multus; Federico pointed out this commit in particular: Big thanks to Federico and Peng for identifying it, and Peng for posting the fix.

Upstream PR posted @ https://github.com/k8snetworkplumbingwg/multus-cni/pull/683/files Tomo is working to get it merged upstream and we'll backport it downstream.

Verified this bug on 4.8.0-0.nightly-2021-06-10-014052
```
# oc exec -n l044j testpod1sgzn2 -- ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
3: eth0@if44: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1400 qdisc noqueue state UP group default
    link/ether 0a:58:0a:83:00:10 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 10.131.0.16/23 brd 10.131.1.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fd01:0:0:1::10/64 scope global
       valid_lft forever preferred_lft forever
    inet6 fe80::858:aff:fe83:10/64 scope link
       valid_lft forever preferred_lft forever
29: net1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether ca:fe:c0:ff:ee:01 brd ff:ff:ff:ff:ff:ff
    inet 192.168.2.206/24 brd 192.168.2.255 scope global net1
       valid_lft forever preferred_lft forever
    inet6 2001::2/64 scope global
       valid_lft forever preferred_lft forever
    inet6 fe80::c8fe:c0ff:feff:ee01/64 scope link
       valid_lft forever preferred_lft forever
```
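The "shadowed variable" regression described in the Doc Text is a classic Go pitfall: a short variable declaration (`:=`) inside a branch creates a new local variable instead of assigning to the outer one, so the branch's result is silently discarded. A minimal sketch of this class of bug, using hypothetical function and path names (this is not the actual Multus code):

```go
package main

import "fmt"

// Buggy pattern: ":=" inside the if block declares a NEW 'path'
// that shadows the outer one, so the override is lost.
func resolveSocketPath(override string) string {
	path := "/var/lib/kubelet/pod-resources/kubelet.sock" // default
	if override != "" {
		path := override // shadows the outer 'path'
		_ = path         // local copy dies at the end of the block
	}
	return path // always the default, even when an override is set
}

// Fixed pattern: plain "=" assigns to the outer variable.
func resolveSocketPathFixed(override string) string {
	path := "/var/lib/kubelet/pod-resources/kubelet.sock"
	if override != "" {
		path = override // updates the outer 'path'
	}
	return path
}

func main() {
	fmt.Println(resolveSocketPath("/custom/kubelet.sock"))      // still the default (bug)
	fmt.Println(resolveSocketPathFixed("/custom/kubelet.sock")) // the override, as intended
}
```

Tools like `go vet -vettool` shadow checkers can catch this pattern during refactors.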
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:2438