Bug 1616150 - [3.7] [RHEL-7.6] Failed to execute iptables-restore: exit status 1 (iptables-restore: invalid option -- '5'
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 3.7.1
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 3.7.z
Assignee: Jacob Tanenbaum
QA Contact: Weihua Meng
URL:
Whiteboard:
Duplicates: 1651436 (view as bug list)
Depends On:
Blocks: 1632744
 
Reported: 2018-08-15 06:17 UTC by Gan Huang
Modified: 2019-03-20 19:42 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1632744 (view as bug list)
Environment:
Last Closed: 2018-11-21 11:56:23 UTC
Target Upstream Version:
Embargoed:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Knowledge Base (Solution) | 3677791 | 0 | None | None | None | 2018-11-09 18:46:23 UTC
Red Hat Product Errata | RHSA-2018:2906 | 0 | None | None | None | 2018-11-21 11:56:49 UTC

Description Gan Huang 2018-08-15 06:17:43 UTC
Description of problem:
Installed OCP v3.7.61 on RHEL-7.6 beta; the router pod then failed to start:

<--snip-->
  37s		37s		1	kubelet, qe-ghuang-merrn-1			Warning		FailedCreatePodSandBox	Failed create pod sandbox: rpc error: code = 2 desc = NetworkPlugin cni failed to set up pod "router-1-deploy_default" network: CNI request failed with status 400: 'Failed to execute iptables-restore: exit status 1 (iptables-restore: invalid option -- '5'
Try `iptables-restore -h' for more information.

<--snip-->

Version-Release number of selected component (if applicable):
openshift v3.7.61
kubernetes v1.7.6+a08f5eeb62
iptables-1.4.21-28.el7.x86_64

How reproducible:
always

Steps to Reproduce:
1. Trigger a 3.7 installation on RHEL-7.6 beta (same result whether firewalld or iptables is used)


Actual results:
Installation failed because the router pod failed to start:

  37s		37s		1	kubelet, qe-ghuang-merrn-1			Warning		FailedCreatePodSandBox	Failed create pod sandbox: rpc error: code = 2 desc = NetworkPlugin cni failed to set up pod "router-1-deploy_default" network: CNI request failed with status 400: 'Failed to execute iptables-restore: exit status 1 (iptables-restore: invalid option -- '5'
Try `iptables-restore -h' for more information.

Expected results:


Additional info:
Tested against OCP 3.6/3.9/3.10; did not hit the issue.

Tested with OCP 3.7 + RHEL-7.5 (iptables-1.4.21-24.1.el7_5.x86_64), also worked fine.

Comment 1 Gan Huang 2018-08-15 06:21:24 UTC
The router pod can be redeployed successfully after downgrading iptables to iptables-1.4.21-24.1.el7_5.x86_64.

Adding test blocker as it's blocking the OCP 3.7 testing against RHEL-7.6

Comment 3 Eric Garver 2018-08-16 15:47:45 UTC
Can you show what is being passed to iptables-restore on stdin?

Comment 4 Phil Sutter 2018-08-16 16:23:44 UTC
This looks like fallout from Bug 1465078 for which iptables-restore argument parser was changed to not ignore unknown parameters given on command line.

Gan Huang, could you please find out how exactly iptables-restore is being called, i.e. what parameters are passed to the command?

Thanks, Phil

Comment 5 Gan Huang 2018-08-17 01:37:29 UTC
Unfortunately I don't know how to check that; the error was thrown by Kubernetes.

A very similar issue was found upstream:
https://github.com/kubernetes/kubernetes/issues/58956

OpenShift networking team should have proper input here.

Comment 7 Meng Bo 2018-08-17 02:36:15 UTC
# iptables-restore -w5
iptables-restore: invalid option -- '5'
Try `iptables-restore -h' for more information.

I can get the same error with the command above.

And from the openshift iptables.go code here:
https://github.com/openshift/origin/blob/release-3.7/vendor/k8s.io/kubernetes/pkg/util/iptables/iptables.go#L123

It seems OpenShift failed to detect the iptables version correctly.

Comment 8 Phil Sutter 2018-08-17 12:44:12 UTC
Hi Meng Bo,

(In reply to Meng Bo from comment #7)
> # iptables-restore -w5

Ah yes, that's what I suspected. All iptables tools use getopt(), so '-w5' is equivalent to '-w -5' and there is no '-5' flag.

> iptables-restore: invalid option -- '5'
> Try `iptables-restore -h' for more information.
> 
> I can get the same error with the command above.
> 
> And from the openshift iptables.go code here:
> https://github.com/openshift/origin/blob/release-3.7/vendor/k8s.io/kubernetes/pkg/util/iptables/iptables.go#L123

Looking at the comments to that PR, it seems people are starting to forget how Unix program parameters typically work. :)

An easier solution than the one pointed out there would be to just pass '--wait=5' instead of '-w5'. That should reduce the change set considerably.

> It seems OpenShift failed to detect the iptables version correctly.

Maybe I don't get your point here, but '-w5' has never worked and will never work. It's just wrong syntax.

Cheers, Phil
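
Editor's illustration of the parsing behaviour described in comment 8, as a minimal sketch rather than the actual iptables code. It uses Python's standard getopt module, which follows the Unix getopt() conventions. The option table (a bare '-w' short flag and a '--wait' long option that takes a value) and the helper name parse are assumptions made for this example; they are not taken from the iptables-restore source.

```python
import getopt

# Hypothetical, simplified option table (an assumption for illustration,
# not the real iptables-restore table): "-w" takes no attached value,
# "--wait" takes a value.
SHORT_OPTS = "w"
LONG_OPTS = ["wait="]


def parse(argv):
    """Return the parsed (option, value) pairs, or an error string
    naming the option character that getopt rejected."""
    try:
        opts, _ = getopt.getopt(argv, SHORT_OPTS, LONG_OPTS)
        return opts
    except getopt.GetoptError as err:
        return "invalid option -- '%s'" % err.opt


# "-w5" is read as the flag cluster "-w -5"; "5" is not a known flag.
print(parse(["-w5"]))       # invalid option -- '5'
# The long form carries the value unambiguously.
print(parse(["--wait=5"]))  # [('--wait', '5')]
```

Under these assumptions the first call reproduces the error text from the report, while the '--wait=5' spelling suggested above parses cleanly.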

Comment 9 Casey Callendrello 2018-08-17 12:56:12 UTC
Not only that, but iptables-restore didn't get "--wait" support until v1.6.2. The version logic linked is for iptables, not iptables-restore.

Comment 10 Phil Sutter 2018-08-17 13:07:19 UTC
(In reply to Casey Callendrello from comment #9)
> Not only that, but iptables-restore didn't get "--wait" support until
> v1.6.2. The version logic linked is for iptables, not iptables-restore.

In fact, RHEL7 supports --wait option for iptables-restore since iptables-1.4.21-18.el7. The relevant bug is 1438597.

Cheers, Phil
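
Editor's illustration of why a check based on the upstream version number can misjudge a distribution build, as a rough sketch only. The function names and the threshold constant are hypothetical; the threshold value comes from comment 9 and the package name from the bug description. This is not the logic in iptables.go.

```python
# Upstream release in which iptables-restore gained --wait, per comment 9.
MIN_UPSTREAM_RESTORE_WAIT = (1, 6, 2)


def upstream_version(rpm_name):
    """Extract the upstream version tuple from a package name such as
    'iptables-1.4.21-28.el7.x86_64' (hypothetical helper)."""
    version = rpm_name.split("-")[1]
    return tuple(int(part) for part in version.split("."))


def version_check_says_wait_supported(rpm_name):
    """Naive check that looks only at the upstream version number."""
    return upstream_version(rpm_name) >= MIN_UPSTREAM_RESTORE_WAIT


# The RHEL-7.6 package from this report fails the naive check, although
# comment 10 states that RHEL7 has supported --wait in iptables-restore
# since iptables-1.4.21-18.el7.
print(version_check_says_wait_supported("iptables-1.4.21-28.el7.x86_64"))
```

The naive check reports no support for this package, which contradicts comment 10: backported features are invisible to a comparison on the upstream version alone.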

Comment 11 Casey Callendrello 2018-09-19 11:35:34 UTC
CC Dan Winship.

Dan, should we backport https://github.com/kubernetes/kubernetes/pull/60978 ?

Comment 12 Dan Winship 2018-09-19 13:04:22 UTC
Doh. Yes. We backported the fix as far back as OCP 3.9 because the bug was introduced in kube 1.9, but I forgot that we had backported the buggy kube 1.9 code into OCP 3.7 too.

Comment 13 Casey Callendrello 2018-09-20 13:09:40 UTC
Assigning to Jacob.

Comment 16 Weihua Meng 2018-10-08 10:18:30 UTC
fixed.

openshift v3.7.65

iptables-1.4.21-28.el7.x86_64

Kernel Version: 3.10.0-957.el7.x86_64
Operating System: Red Hat Enterprise Linux Server 7.6 (Maipo)

Comment 22 Casey Callendrello 2018-11-20 10:45:14 UTC
*** Bug 1651436 has been marked as a duplicate of this bug. ***

Comment 24 Dan Winship 2018-11-20 13:42:56 UTC
3.10 and later never had the bug; it was fixed upstream before that

Comment 25 errata-xmlrpc 2018-11-21 11:56:23 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:2906

