When doing a package update, we set the UpdateIdentifier parameter to a unique value. This triggers the SoftwareDeployment, and yum_update.sh also checks that UpdateIdentifier is non-empty and is a new value it has not seen before; only then does it proceed with updating packages.

Assuming the package update succeeds, an immediately following scaling attempt will likely fail. UpdateIdentifier is still set in the saved Heat environment (as expected), so when the new node being scaled out comes up for the first time, UpdateDeployment is triggered and sees a value of UpdateIdentifier that is new to that node. It then attempts a yum update, which fails because no repos are configured on the node yet. Even when RHEL registration is used (enabled via the NodeExtraConfig resource), this will fail because UpdateDeployment has no depends_on for NodeExtraConfig, so there is no guarantee that any repos have been enabled yet.

The error from UpdateDeployment looks like:

{
  "status": "FAILED",
  "server_id": "853ce223-2051-4cb5-868c-7cf72c312c2b",
  "config_id": "51cc12e8-b1bf-4b2a-b318-a977b1fc1a30",
  "output_values": {
    "deploy_stdout": "Started yum_update.sh on server 853ce223-2051-4cb5-868c-7cf72c312c2b at Thu Dec 10 21:33:33 EST 2015\nExcluding upgrading packages that are handled by config management tooling\nRunning: yum -y update --skip-broken\nLoaded plugins: product-id, subscription-manager\nThis system is not registered to Red Hat Subscription Management. You can use subscription-manager to register.\nyum return code: 1\nFinished yum_update.sh on server 853ce223-2051-4cb5-868c-7cf72c312c2b at Thu Dec 10 21:33:36 EST 2015\n",
    "deploy_stderr": "cat: /var/lib/tripleo/installed-packages/*: No such file or directory\nThere are no enabled repos.\n Run \"yum repolist all\" to see the repos you have.\n You can enable repos with yum-config-manager --enable <repo>\n",
    "update_managed_packages": "true",
    "deploy_status_code": 1
  },
  "creation_time": "2015-12-11T02:32:26Z",
  "updated_time": "2015-12-11T02:33:38Z",
  "input_values": {},
  "action": "CREATE",
  "status_reason": "deploy_status_code : Deployment exited with non-zero status code: 1",
  "id": "974ac676-67e1-4a1c-947d-92ed68b375f0"
}
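For clarity, the guard described above is roughly of this shape. This is an illustrative sketch, not the shipped yum_update.sh: the function name check_update_identifier and the state-file path are hypothetical, and the real script runs the actual yum command where this sketch only echoes it.

```shell
# Sketch of the UpdateIdentifier guard (hypothetical reconstruction).
# SEEN_FILE records the last identifier this node acted on; path is illustrative.
SEEN_FILE="${SEEN_FILE:-/tmp/update_identifier_seen}"

check_update_identifier() {
    local id="$1"
    # Empty identifier: Heat did not request an update, so do nothing.
    if [ -z "$id" ]; then
        echo "skip: UpdateIdentifier empty"
        return 0
    fi
    # Already-seen identifier: this update was applied previously, so do nothing.
    if [ -f "$SEEN_FILE" ] && [ "$(cat "$SEEN_FILE")" = "$id" ]; then
        echo "skip: UpdateIdentifier already applied"
        return 0
    fi
    # New identifier: record it and run the package update.
    echo "$id" > "$SEEN_FILE"
    echo "run: yum -y update"
}
```

On a freshly scaled-out node the state file has never recorded the current identifier, so the "new identifier" branch fires and a yum update is attempted even though no repos are configured yet, which is exactly the failure shown in the error output above.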
James, what is the severity here? Does it affect each scale out after update? Thanks
To verify:
1. Deploy with a 7.0 undercloud and a 7.0 overcloud, HA, net-iso.
2. Update the undercloud to 7.2, then update the overcloud to 7.2.
3. After the update completes successfully, attempt a scale-out of compute nodes. The compute nodes should scale out fine and should not run any yum update. You can verify this by checking the journalctl output for os-collect-config, or /var/log/yum.log, on the new compute nodes.
4. Repeat, but start at 7.1.
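The "no yum update ran" check in step 3 can be scripted along these lines. A sketch under stated assumptions: yum_ran_since is a hypothetical helper, /var/log/yum.log is the standard yum log as mentioned above, and GNU date is assumed for timestamp parsing.

```shell
# yum_ran_since CUTOFF [LOG]: return 0 if the yum log has any entry at or
# after CUTOFF (epoch seconds), i.e. an update ran; return 1 otherwise.
yum_ran_since() {
    local cutoff="$1"
    local log="${2:-/var/log/yum.log}"
    # No log file at all means yum never ran on this node.
    [ -f "$log" ] || return 1
    local line ts
    while IFS= read -r line; do
        # yum.log lines start with a timestamp like "Dec 10 21:33:35".
        ts=$(date -d "${line:0:15}" +%s 2>/dev/null) || continue
        if [ "$ts" -ge "$cutoff" ]; then
            return 0
        fi
    done < "$log"
    return 1
}
```

On a new compute node you would run it with the scale-out start time as the cutoff; a non-zero return (no entries since the cutoff) is the expected, fixed behavior.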
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2015:2650