Bug 1710226

Summary: Upgrade RHEL node failed due to iptables rules changes
Product: OpenShift Container Platform
Component: Installer
Installer sub component: openshift-ansible
Version: 4.1.0
Target Release: 4.1.0
Hardware: Unspecified
OS: Unspecified
Status: CLOSED ERRATA
Severity: high
Priority: high
Reporter: Weihua Meng <wmeng>
Assignee: Russell Teague <rteague>
QA Contact: Johnny Liu <jialiu>
CC: amurdaca, bbennett, danw, gpei, vrutkovs, walters, wking
Type: Bug
Doc Type: Bug Fix
Doc Text:
Cause: The bootstrap MCS endpoint was restricted for existing nodes to improve security.
Consequence: Nodes were unable to retrieve configs during upgrade.
Fix: Tasks were updated to retrieve the Ignition config from the rendered worker MachineConfig and use it to update configs for the node.
Result: Upgrades for RHEL nodes complete with successful config updates.
Last Closed: 2019-06-04 10:48:48 UTC
Bug Blocks: 1708605

Description Weihua Meng 2019-05-15 06:59:37 UTC
Description of problem:
Upgrading a RHEL node failed due to iptables rule changes,
likely caused by PR https://github.com/openshift/origin/pull/22821, which added the following REJECT rules:

rules: [][]string{
    {"-p", "tcp", "-m", "tcp", "--dport", "22623", "-j", "REJECT"},
    {"-p", "tcp", "-m", "tcp", "--dport", "22624", "-j", "REJECT"},
},
 
Version-Release number of the following components:
openshift-ansible-4.1.0-201905132354.git.156.3c52fcf.el7.noarch

How reproducible:
Always

Steps to Reproduce:
1. Upgrade a RHEL7 node in an OCP4 cluster with the upgrade.yml playbook.
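A minimal sketch of the invocation (inventory and playbook paths are assumptions; adjust to your environment):

    # paths are assumptions; adjust to your environment
    ansible-playbook -i inventory/hosts \
        /usr/share/ansible/openshift-ansible/playbooks/upgrade.yml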


Actual results:
upgrade failed at TASK [openshift_node : Wait for bootstrap endpoint to show up]
/usr/share/ansible/openshift-ansible/roles/openshift_node/tasks/config.yml:29

fatal: [dell-r730-068.dsal.lab.eng.rdu2.redhat.com]: FAILED! => {
    "attempts": 60,
    "changed": false,
    "content": "",
    "invocation": {
        "module_args": {
            "attributes": null,
            "backup": null,
            "body": null,
            "body_format": "raw",
            "client_cert": null,
            "client_key": null,
            "content": null,
            "creates": null,
            "delimiter": null,
            "dest": null,
            "directory_mode": null,
            "follow": false,
            "follow_redirects": "safe",
            "force": false,
            "force_basic_auth": false,
            "group": null,
            "headers": {},
            "http_agent": "ansible-httpget",
            "method": "GET",
            "mode": null,
            "owner": null,
            "regexp": null,
            "remote_src": null,
            "removes": null,
            "return_content": false,
            "selevel": null,
            "serole": null,
            "setype": null,
            "seuser": null,
            "src": null,
            "status_code": [
                200
            ],
            "timeout": 30,
            "unsafe_writes": null,
            "url": "https://10.1.10.122:22623/config/worker",
            "url_password": null,
            "url_username": null,
            "use_proxy": true,
            "validate_certs": false
        }
    },
    "msg": "Status code was -1 and not [200]: Request failed: <urlopen error [Errno 111] Connection refused>",
    "redirected": false,
    "status": -1,
    "url": "https://10.1.10.122:22623/config/worker"
}


Expected results:
Upgrade succeeds.

Comment 1 Colin Walters 2019-05-15 12:43:13 UTC
Hmm... so the Ansible upgrade is running as a pod?

One workaround is to make the whole thing hostNetwork.

A slightly more elaborate fix: have a transient hostNetwork container as part of the pod; that container fetches the config, and then the rest of the code can just read it.

Comment 3 Dan Winship 2019-05-15 12:57:46 UTC
(In reply to Colin Walters from comment #1)
> One workaround is to make the whole thing hostNetwork.

No, we block hostNetwork pods from accessing the MCS too.

Why does the upgrade need to do this? Everyone said that 22623 was only for provisioning new nodes. At this point, everything the upgrade could learn from the MCS should already be present on the node itself.

Comment 4 Colin Walters 2019-05-15 14:28:38 UTC
>  Everyone said that 22623 was only for provisioning new nodes.

Everyone was thinking of RHCOS...

>  At this point, everything the upgrade could learn from the MCS should already be present on the node itself.

Kind of; I think the upgrade should instead be fetching the MachineConfig object from the cluster, not the Ignition config.

Comment 5 Antonio Murdaca 2019-05-15 15:46:54 UTC
(In reply to Colin Walters from comment #4)
> >  Everyone said that 22623 was only for provisioning new nodes.
> 
> Everyone was thinking of RHCOS...
> 
> >  At this point, everything the upgrade could learn from the MCS should already be present on the node itself.
> 
> Kind of, I think more the upgrade should be fetching the MachineConfig
> object from the cluster and not Ignition.

Adding Vadim: is the above suggestion from Colin something the BYOH playbooks can do instead of reaching the MCS directly?

Comment 6 Vadim Rutkovsky 2019-05-15 15:55:13 UTC
(In reply to Colin Walters from comment #1)
> Hum...so the Ansible upgrade is running as a pod?

Scaleup fetches the Ignition file from the MCS to run the MCO in once-from mode and bootstrap the node.

It seems the playbook should temporarily disable the firewall on the masters to fetch the Ignition file, then re-enable it after this change. Is there any other cloud-neutral way to get the worker Ignition file for scaling up the node?
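(For reference, the fetch that is now rejected is roughly the following; the master address is a placeholder:)

    # <master> is a placeholder for a master address
    curl -k https://<master>:22623/config/worker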

Comment 7 Antonio Murdaca 2019-05-15 16:08:29 UTC
(In reply to Vadim Rutkovsky from comment #6)
> (In reply to Colin Walters from comment #1)
> > Hum...so the Ansible upgrade is running as a pod?
> 
> Scaleup is fetching ign file from MCS to run MCO in once-from mode bootstrap
> the node.

--once-from also supports raw MachineConfig(s), without going through the MCS, using just the cluster. So we could do:

oc get -o yaml machineconfig $(oc get machineconfigpool worker -o jsonpath='{.status.configuration.name}') > workerMC.yaml

mcd --once-from=./workerMC.yaml


Comment 8 Antonio Murdaca 2019-05-15 16:11:45 UTC
(In reply to Antonio Murdaca from comment #7)
> (In reply to Vadim Rutkovsky from comment #6)
> > (In reply to Colin Walters from comment #1)
> > > Hum...so the Ansible upgrade is running as a pod?
> > 
> > Scaleup is fetching ign file from MCS to run MCO in once-from mode bootstrap
> > the node.
> 
> Since --once-from supports also raw MachineConfig(s), w/o going through the
> MCS, but just use the cluster. So we could do:
> 
> oc get -o yaml machineconfig $(oc get machineconfigpool worker -o
> jsonpath='{.status.configuration.name}') > workerMC.yaml
> 
> mcd --once-from=./workerMC.yaml

You will need to pass a kubeconfig to the MCD as well in order to reach the apiserver (--kubeconfig)
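For example (the kubeconfig path is an assumption):

    mcd --once-from=./workerMC.yaml \
        --kubeconfig=/etc/kubernetes/kubeconfig   # kubeconfig path is an assumption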


Comment 9 Russell Teague 2019-05-15 18:47:27 UTC
WIP PR: https://github.com/openshift/openshift-ansible/pull/11614

Comment 10 Antonio Murdaca 2019-05-15 19:32:16 UTC
(In reply to Vadim Rutkovsky from comment #6)
> (In reply to Colin Walters from comment #1)
> > Hum...so the Ansible upgrade is running as a pod?
> 
> Scaleup is fetching ign file from MCS to run MCO in once-from mode bootstrap
> the node.
> 
> Seems it should temporarily disable firewall on masters to fetch ignition
> file and enable it back after this change. Is there any other cloud-neutral
> way to get worker ignition file to scaleup the node?

Looking at what we have now, we believe it would be much simpler to do the above, based on https://github.com/openshift/origin/pull/22821#issuecomment-491947640

The reason is that the BYOH playbooks have always run the MCD with once-from pointing at the Ignition config from the MCS, and the MachineConfig path has some downsides. Also, Ignition from the MCS is what we've been testing for a while now, and I honestly don't feel comfortable switching to something else _today_.

Can the playbook just disable the firewall and re-enable it?

Comment 11 Colin Walters 2019-05-15 19:35:57 UTC
Another alternative is to fetch the MachineConfig from the cluster, and then extract the Ignition from it.  That should be pretty easy.

Comment 12 Antonio Murdaca 2019-05-15 19:38:38 UTC
(In reply to Colin Walters from comment #11)
> Another alternative is to fetch the MachineConfig from the cluster, and then
> extract the Ignition from it.  That should be pretty easy.

We're still missing some files that the MCS injects when serving the request for a pool, though (like the initial node annotations and the kubeconfig).

Comment 13 Dan Winship 2019-05-15 20:50:48 UTC
> Can the playbook just disable the firewall and re-enable it?

Yes. Assuming you're running as root, not in a pod,

    iptables -F OPENSHIFT-BLOCK-OUTPUT

should temporarily reopen access from the root netnamespace. If openshift-sdn is still running on the node at that point, it will eventually re-add the firewall rules (where "eventually" might mean "0.005 seconds later" if your timing is unlucky).

Comment 14 Antonio Murdaca 2019-05-16 10:37:46 UTC
(In reply to Dan Winship from comment #13)
> > Can the playbook just disable the firewall and re-enable it?
> 
> Yes. Assuming you're running as root, not in a pod,
> 
>     iptables -F OPENSHIFT-BLOCK-OUTPUT
> 
> should temporarily reopen access from the root netnamespace. If
> openshift-sdn is still running on the node at that point, it will eventually
> re-add the firewall rules (where "eventually" might mean "0.005 seconds
> later" if your timing is unlucky).

Looks like removing that rule isn't helpful though, given the speed with which the SDN puts it back. The playbook still needs to fetch the Ignition config from the MCS, and I doubt that can be done in ~0.005 seconds.

Comment 15 Antonio Murdaca 2019-05-16 10:59:09 UTC
oc get -o json machineconfig $(oc get machineconfigpool worker -o jsonpath='{.status.configuration.name}') | jq '.spec.config'

^ The above does what Colin suggested and extracts the Ignition config from the MachineConfig. We can see if it works for upgrade, given that we're going to miss: 1) the initial node annotation file, which we don't need anyway on upgrade, and 2) the kubeconfig on the node, which we might not need either.

Let's see if the above works, so we can keep running the MCD in once-from mode with Ignition that way.
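A minimal end-to-end sketch of that approach (the kubeconfig path is an assumption):

    # extract the Ignition config from the rendered worker MachineConfig
    oc get -o json machineconfig "$(oc get machineconfigpool worker -o jsonpath='{.status.configuration.name}')" \
        | jq '.spec.config' > worker.ign
    # kubeconfig path is an assumption
    mcd --once-from=./worker.ign --kubeconfig=/etc/kubernetes/kubeconfig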

Comment 16 Dan Winship 2019-05-16 13:28:45 UTC
Sorry, I didn't mean it will *normally* replace it that quickly. It's just that it periodically resyncs the rules, and if you're unlucky, you might happen to run right before the periodic resync happens. So you'd need to have a retry-on-failure. Or better yet, stop the SDN before running. (The node is going to get rebooted anyway, right?)

Alternatively, I guess you could do:

    iptables -I OUTPUT -p tcp -m tcp --dport 22623 -j ACCEPT

then grab the data, then do

    iptables -D OUTPUT -p tcp -m tcp --dport 22623 -j ACCEPT

to clean up afterwards ("-I" means "prepend", so this adds a rule that accepts the traffic before the existing rule rejecting it can run). Then it wouldn't matter what other rules existed.
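Putting those three steps together as a shell sketch (the master address is a placeholder):

    # open a hole for the MCS port ahead of the REJECT rule
    iptables -I OUTPUT -p tcp -m tcp --dport 22623 -j ACCEPT
    # fetch the worker Ignition config (<master> is a placeholder)
    curl -k https://<master>:22623/config/worker -o /tmp/worker.ign
    # remove the temporary rule again
    iptables -D OUTPUT -p tcp -m tcp --dport 22623 -j ACCEPT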

Comment 17 Russell Teague 2019-05-16 14:56:16 UTC
release-4.1: https://github.com/openshift/openshift-ansible/pull/11615

Comment 19 Weihua Meng 2019-05-17 04:30:04 UTC
Fixed.

openshift-ansible-4.1.0-201905161641.git.158.458bd44.el7.noarch

Comment 21 errata-xmlrpc 2019-06-04 10:48:48 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0758