Bug 1846922

Summary: "downloads" pod does not work on a node where IPv6 is disabled
Product: OpenShift Container Platform Reporter: Daein Park <dapark>
Component: Management Console Assignee: Daein Park <dapark>
Status: CLOSED ERRATA QA Contact: Yadan Pei <yapei>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 4.4 CC: aos-bugs, jokerman, spadgett, wking, yapei
Target Milestone: ---   
Target Release: 4.6.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Cause: When IPv6 is disabled on the node, the "downloads" pod's socket cannot bind. Consequence: The "downloads" pod crashes. Fix: If IPv6 is not enabled, IPv4 is used for the socket. Result: The "downloads" pod works whether or not IPv6 is enabled.
Story Points: ---
Clone Of: Environment:
Last Closed: 2020-10-27 16:07:13 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1847523    

Description Daein Park 2020-06-15 08:53:38 UTC
Description of problem:

When a "downloads" pod is scheduled on a worker node where IPv6 is disabled, the pod does not work and logs the following messages.

~~~
$ oc get pod -n openshift-console -o wide
NAME                             READY   STATUS             RESTARTS   AGE     IP            NODE                      NOMINATED NODE   READINESS GATES
pod/console-1111111111-aaaaa     1/1     Running            0          8d      10.129.0.7    node02.ocp4.example.com   <none>           <none>
pod/console-1111111111-bbbbb     1/1     Running            0          8d      10.128.0.12   node03.ocp4.example.com   <none>           <none>
pod/downloads-aaaaaaaaaa-xxxxx   0/1     CrashLoopBackOff   5          1m54s   10.129.2.18   node03.ocp4.example.com   <none>           <none>
pod/downloads-aaaaaaaaaa-yyyyy   0/1     CrashLoopBackOff   5          3m14s   10.129.2.68   node02.ocp4.example.com   <none>           <none>

$ oc logs pod/downloads-aaaaaaaaaa-xxxxx --container=download-server --timestamps
2020-06-11T00:47:48.113162338Z serving from /tmp/tmpC8Eoy8
2020-06-11T00:47:48.11326402Z Traceback (most recent call last):
2020-06-11T00:47:48.11326402Z   File "/tmp/serve.py", line 59, in <module>
2020-06-11T00:47:48.11326402Z     sock = socket.socket(socket.AF_INET6, socket.SOCK_STREAM)
2020-06-11T00:47:48.11326402Z   File "/usr/lib64/python2.7/socket.py", line 187, in __init__
2020-06-11T00:47:48.11326402Z     _sock = _realsocket(family, type, proto)
2020-06-11T00:47:48.11326402Z socket.error: [Errno 97] Address family not supported by protocol
~~~
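
The traceback shows serve.py creating an AF_INET6 socket unconditionally, which fails when the kernel has no IPv6 support. One possible workaround is sketched below: try IPv6 first and fall back to IPv4. This is only an illustration and not the actual patch; the function name create_listening_socket and its parameters are hypothetical and do not come from serve.py.

```python
import socket


def create_listening_socket(port=0, backlog=5):
    """Return a listening TCP socket, preferring IPv6 and falling back to IPv4.

    Hypothetical helper for illustration; not taken from serve.py.
    """
    last_error = None
    candidates = ((socket.AF_INET6, "::"), (socket.AF_INET, "0.0.0.0"))
    for family, host in candidates:
        sock = None
        try:
            # With IPv6 disabled in the kernel, this call raises
            # "[Errno 97] Address family not supported by protocol".
            sock = socket.socket(family, socket.SOCK_STREAM)
            sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
            sock.bind((host, port))
            sock.listen(backlog)
            return sock
        except socket.error as err:
            # Remember the failure and try the next address family.
            last_error = err
            if sock is not None:
                sock.close()
    raise last_error
```

Calling create_listening_socket(8080) would return an IPv6 socket where IPv6 is available and an IPv4 socket otherwise; port 0 lets the kernel pick a free port.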

Version-Release number of selected component (if applicable):

OCP 4.4.4

How reproducible:

Always reproducible once IPv6 is disabled on the worker node that runs the "downloads" pods.

Steps to Reproduce:
1.
2.
3.

Actual results:

The "downloads" pod does not work when IPv6 is disabled on the node.

Expected results:

The "downloads" pod works whether IPv6 is enabled or disabled.

Additional info:

Comment 4 Samuel Padgett 2020-06-15 17:33:44 UTC
*** Bug 1795325 has been marked as a duplicate of this bug. ***

Comment 7 Yadan Pei 2020-07-27 05:41:40 UTC
Hi,

As I understand it, we need the following steps to verify this bug:

1) Add RHEL worker nodes to the cluster (an IPv4 cluster is OK)
2) Disable IPv6 on the RHEL worker node
3) Reschedule the downloads pods to the newly added RHEL worker node and check whether they are running

Is it correct? Can you also please give some guidance about how to disable IPv6 for a RHEL worker node?

Comment 8 Daein Park 2020-07-28 01:49:41 UTC
@Yadan hi,

> Is it correct? Can you also please give some guidance about how to disable IPv6 for a RHEL worker node?

You do not need to add a new worker. Just add "ipv6.disable=1" to the kernel parameters of any existing worker node and reboot it; IPv6 is then disabled on that node.
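
To confirm that the parameter took effect after the reboot, the node's kernel command line can be inspected. The following is a minimal sketch, assuming a Linux node where /proc/cmdline is readable; both function names are hypothetical helpers, not part of any OpenShift tooling.

```python
def ipv6_disabled_by_cmdline(cmdline):
    """Return True if the kernel command line contains ipv6.disable=1."""
    # Kernel parameters are whitespace-separated tokens.
    return "ipv6.disable=1" in cmdline.split()


def read_kernel_cmdline(path="/proc/cmdline"):
    """Read the running kernel's command line (Linux only)."""
    with open(path) as handle:
        return handle.read()
```

On the node, ipv6_disabled_by_cmdline(read_kernel_cmdline()) should return True once the node has booted with the parameter set.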

Comment 11 errata-xmlrpc 2020-10-27 16:07:13 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196