Description of problem:
Upgrading OCP v3.9 to v3.10 failed at task [openshift_node : Wait for node to be ready] because /usr/bin/openshift-node-config was not available, which in turn was caused by the atomic-openshift-node package failing to install (this was a negative scenario, so the upgrade failure was expected). However, the upgrade log shows that the earlier task [openshift_node : download new node packages] had already failed before any of the fatal/unnecessary changes were made, and the installer did not catch that failure.

TASK [openshift_node : download new node packages] *****************************
task path: /usr/share/ansible/openshift-ansible/roles/openshift_node/tasks/upgrade/rpm_upgrade.yml:9
Tuesday 22 May 2018 06:45:33 +0000 (0:00:00.036) 0:08:51.121 ***********
changed: [x.x.x.x] => {"attempts": 1, "changed": true, "cmd": ["yum", "install", "-y", "--downloadonly", "atomic-openshift-node-3.10*", "atomic-openshift-clients-3.10*", "PyYAML"], "delta": "0:00:05.438444", "end": "2018-05-22 02:46:45.628826", "failed": false, "rc": 0, "start": "2018-05-22 02:46:40.190382", "stderr": "", "stderr_lines": [], "stdout": "Loaded plugins: product-id, search-disabled-repos, subscription-manager\nThis system is not registered with an entitlement server. You can use subscription-manager to register.\nNo package atomic-openshift-node-3.10* available.\nNo package atomic-openshift-clients-3.10* available.\nPackage PyYAML-3.10-11.el7.x86_64 already installed and latest version\nNothing to do", "stdout_lines": ["Loaded plugins: product-id, search-disabled-repos, subscription-manager", "This system is not registered with an entitlement server. You can use subscription-manager to register.", "No package atomic-openshift-node-3.10* available.", "No package atomic-openshift-clients-3.10* available.", "Package PyYAML-3.10-11.el7.x86_64 already installed and latest version", "Nothing to do"]}

Too many unnecessary/fatal changes were made after the task openshift_node : download new node packages failed, which left the original OCP installation unable to work.

/usr/share/ansible/openshift-ansible/roles/openshift_node/tasks/upgrade/rpm_upgrade_install.yml:11
openshift_node : download new node packages ----------------------------- 7.46s
/usr/share/ansible/openshift-ansible/roles/openshift_node/tasks/upgrade/rpm_upgrade.yml:9
openshift_node : Remove old service information ------------------------- 6.52s
/usr/share/ansible/openshift-ansible/roles/openshift_node/tasks/upgrade/config_changes.yml:48
openshift_node : Uninstall openvswitch ---------------------------------- 5.69s
/usr/share/ansible/openshift-ansible/roles/openshift_node/tasks/upgrade/config_changes.yml:42
openshift_node : Configure Node settings -------------------------------- 5.27s
/usr/share/ansible/openshift-ansible/roles/openshift_node/tasks/config/configure-node-settings.yml:2
openshift_node_group : create node config template ---------------------- 4.02s
/usr/share/ansible/openshift-ansible/roles/openshift_node_group/tasks/create_config.yml:22
openshift_excluder : Get available excluder version --------------------- 3.99s
/usr/share/ansible/openshift-ansible/roles/openshift_excluder/tasks/verify_excluder.yml:4
openshift_node : Install Node service file ------------------------------ 3.93s
/usr/share/ansible/openshift-ansible/roles/openshift_node/tasks/systemd_units.yml:8
openshift_node_group : create node-config.yaml configmap ---------------- 3.92s
/usr/share/ansible/openshift-ansible/roles/openshift_node_group/tasks/create_config.yml:50

Version-Release number of the following components:
openshift-ansible-3.10.0-0.50.0.git.0.bd68ade.el7.noarch

How reproducible:
always

Steps to Reproduce:
1. Run the upgrade against an RPM-based OCP install with openshift_enable_openshift_excluder=false (negative scenario)

Actual results:
Too many unnecessary/fatal config changes were made before the upgrade stopped, even though an earlier task had already hit errors.

Expected results:
The upgrade should abort earlier, as soon as it fails to download the new node packages.

Additional info:
Please attach logs from ansible-playbook with the -vvv flag
The problem here is that the yum command exits with rc 0 even though the packages were not found, so the task is not marked as failed. I guess we need to add a failed_when that checks for "No package (.*) available" in the output. Tim, do you mind taking a look at this?
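A minimal sketch of the kind of failed_when check suggested above, for illustration only. It is not the task from the actual fix: the package list is copied from the log in the report, the registered variable name `result` is an assumption, and the `search` test assumes an Ansible version that supports the `is` test syntax.

```yaml
# Hypothetical sketch, not the task from the actual PR.
- name: download new node packages
  # Package specs copied from the "cmd" field in the reported log.
  command: yum install -y --downloadonly atomic-openshift-node-3.10* atomic-openshift-clients-3.10* PyYAML
  register: result  # variable name is an assumption
  # In the reported log yum returned rc 0 while printing
  # "No package ... available", so the return code alone is not enough;
  # also fail when that message appears in stdout.
  failed_when: >-
    result.rc != 0 or
    result.stdout is search('No package .* available')
```

With a condition like this, the task would be reported as fatal when a requested package is missing, and the play would stop on that host before the later config changes run.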
PR created for this: https://github.com/openshift/openshift-ansible/pull/8481
Scott, the PR was against master. Does this need to be backported to any branches?
No, master is fine.
Verified on openshift-ansible-3.10.0-0.54.0.git.0.537c485.el7.noarch

TASK [openshift_node : download new node packages] *****************************
task path: /usr/share/ansible/openshift-ansible/roles/openshift_node/tasks/upgrade/rpm_upgrade.yml:16
Thursday 31 May 2018 08:23:27 +0000 (0:00:00.037) 0:05:59.890 **********
fatal: [qx]: FAILED! => {"attempts": 1, "changed": true, "cmd": ["yum", "install", "-y", "--downloadonly", "atomic-openshift-3.10*", "atomic-openshift-hyperkube-3.10*", "atomic-openshift-node-3.10*", "atomic-openshift-clients-3.10*"], "delta": "0:00:10.078409", "end": "2018-05-31 04:25:44.047840", "failed": true, "failed_when_result": true, "rc": 0, "start": "2018-05-31 04:25:33.969431", "stderr": "", "stderr_lines": [], "stdout": "Loaded plugins: product-id, search-disabled-repos, subscription-manager\nThis system is not registered with an entitlement server. You can use subscription-manager to register.\nNo package atomic-openshift-3.10* available.\nNo package atomic-openshift-node-3.10* available.\nNo package atomic-openshift-clients-3.10* available.\nResolving Dependencies\n--> Running transaction check\n---> Package atomic-openshift-hyperkube.x86_64 0:3.10.0-0.54.0.git.0.00a8b84.el7 will be installed\n--> Finished Dependency Resolution\n\nDependencies Resolved\n\n================================================================================\n Package Arch Version Repository Size\n================================================================================\nInstalling:\n atomic-openshift-hyperkube\n x86_64 3.10.0-0.54.0.git.0.00a8b84.el7 aos_addon3_10 33 M\n\nTransaction Summary\n================================================================================\nInstall 1 Package\n\nTotal download size: 33 M\nInstalled size: 229 M\nBackground downloading packages, then exiting:\nexiting because \"Download Only\" specified", "stdout_lines": ["Loaded plugins: product-id, search-disabled-repos, subscription-manager", "This system is not registered with an entitlement server. You can use subscription-manager to register.", "No package atomic-openshift-3.10* available.", "No package atomic-openshift-node-3.10* available.", "No package atomic-openshift-clients-3.10* available.", "Resolving Dependencies", "--> Running transaction check", "---> Package atomic-openshift-hyperkube.x86_64 0:3.10.0-0.54.0.git.0.00a8b84.el7 will be installed", "--> Finished Dependency Resolution", "", "Dependencies Resolved", "", "================================================================================", " Package Arch Version Repository Size", "================================================================================", "Installing:", " atomic-openshift-hyperkube", " x86_64 3.10.0-0.54.0.git.0.00a8b84.el7 aos_addon3_10 33 M", "", "Transaction Summary", "================================================================================", "Install 1 Package", "", "Total download size: 33 M", "Installed size: 229 M", "Background downloading packages, then exiting:", "exiting because \"Download Only\" specified"]}
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2018:1816