Description of problem:
The CFME Update Root Password method runs and logs success, but Dynflow shows the task as failed. Attempts to resume or re-run it then fail because the password has already been changed.
Version-Release number of selected component (if applicable):
D, [2017-01-18T13:32:23.147884 #19110] DEBUG -- : ================ UpdateRootPassword run method ====================
I, [2017-01-18T13:32:23.148024 #19110] INFO -- : Updating CFME password...
D, [2017-01-18T13:32:24.563147 #19110] DEBUG -- : =========== completed entered =============
I, [2017-01-18T13:32:24.563413 #19110] INFO -- : Password updated successfully. Changing password for user root.
passwd: all authentication tokens updated successfully.
When the task is resumed, it fails:
D, [2017-01-18T14:24:41.844780 #19110] DEBUG -- : ================ UpdateRootPassword run method ====================
I, [2017-01-18T14:24:41.844924 #19110] INFO -- : Updating CFME password...
D, [2017-01-18T14:24:44.229068 #19110] DEBUG -- : =========== failed entered =============
D, [2017-01-18T14:24:44.229179 #19110] DEBUG -- : =========== failed exited =============
D, [2017-01-18T14:24:44.229289 #19110] DEBUG -- : =========== completed entered =============
E, [2017-01-18T14:24:44.229515 #19110] ERROR -- : Password was not updated. Error: Authentication failed for user firstname.lastname@example.org
The above repeats for a number of attempts.
Please test this again with the latest compose and if it fails assign it to me. There is a PR that may have fixed this in addition to another issue.
I encountered the same failure on Friday with the 20170118.t.1 build. I will try the latest 20170120.t.0 shortly.
After a little more digging I found that the root password update changes for multi-provider were pretty well flawed. I am testing a PR that should fix it. I will update this when a PR is created, which should hopefully be by this afternoon.
I'm still encountering the same bug, even with the latest 20170123.t.0 images for both QCI-1.1 and QCIOOO-10.
I checked, and the PR mentioned above *is* reflected in the ISO...
I can leave my environment in its current state if anybody would like to look in and pull logs.
Can confirm the PR is in QCI-1.1-RHEL-7-20170123.t.0, moving back to assigned.
It looks like the problem is in my environment's networking. I'm back to the drawing board on my environment setup, and most likely this bug (if it was a bug) is firmly squashed.
There was definitely a bug that could cause this error as well. In that case, the cause was that it did not retry the second host's password change if it failed the first time, rather than retrying 30 times before finally giving up.
It was especially hard to see in the original case because both the RHV and OSP CFME instances were being updated in one task. As you saw, if one succeeded and one failed, running the task again would bomb out on the host with the now-updated password. Now we use two tasks, so it's easier to get to the bottom of what's going on.
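To illustrate the idea described above, here is a minimal sketch in Python. It is not the actual fix from the PR. It assumes a hypothetical `change_password(host)` callable that returns True on success, hypothetical host labels, and a limit of 30 attempts taken from the comment above. Each host is retried independently, and a host that has already succeeded is skipped on resume, so a resumed run never tries the old password against a host where it has already been changed. Any delay between attempts is left out for brevity.

```python
from typing import Callable, Dict, List

# Assumption: 30 attempts, taken from the retry count mentioned in this thread.
MAX_ATTEMPTS = 30


def update_host(host: str,
                change_password: Callable[[str], bool],
                max_attempts: int = MAX_ATTEMPTS) -> bool:
    """Retry the password change on one host until it succeeds or attempts run out."""
    for _ in range(max_attempts):
        if change_password(host):
            return True
    return False


def update_all(hosts: List[str],
               change_password: Callable[[str], bool],
               done: Dict[str, bool]) -> Dict[str, bool]:
    """Run one independent update per host, skipping hosts already marked done."""
    for host in hosts:
        if done.get(host):
            # Already updated in an earlier run; the old password no longer
            # works there, so do not attempt the change again.
            continue
        done[host] = update_host(host, change_password)
    return done
```

Passing the `done` dictionary from a previous run back into `update_all` is what makes a resume safe in this sketch: only the hosts that failed earlier are attempted again.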
From my deployment today, as far as I can see, the bug is fixed. One successful run does not make a pattern, though, so I'll move it to ON_QA so it gets a little more testing by QE before we close it.
I have not seen this bug re-occur in my testing. If you do see it re-occur then please re-open this bug.