Bug 1414546 - UpdateRootPassword step marked as failed, but seems to succeed
Summary: UpdateRootPassword step marked as failed, but seems to succeed
Keywords:
Status: VERIFIED
Alias: None
Product: Red Hat Quickstart Cloud Installer
Classification: Red Hat
Component: Installation - CloudForms
Version: 1.1
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: 1.1
Assignee: Jason Montleon
QA Contact: Sudhir Mallamprabhakara
Docs Contact: Dan Macpherson
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-01-18 19:56 UTC by Chandler Wilkerson
Modified: 2020-01-08 16:38 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Target Upstream Version:



Description Chandler Wilkerson 2017-01-18 19:56:21 UTC
Description of problem:

The CFME UpdateRootPassword step runs and logs success, but Dynflow shows it as failed. Attempts to resume or re-run the step then fail because the password has already been changed.

Version-Release number of selected component (if applicable):
ISO QCI-1.1-20170118.t.0

How reproducible:

Always

Additional info:
From deployment.log

Original run:

D, [2017-01-18T13:32:23.147884 #19110] DEBUG -- : ================ UpdateRootPassword run method ====================
I, [2017-01-18T13:32:23.148024 #19110]  INFO -- : Updating CFME password...
D, [2017-01-18T13:32:24.563147 #19110] DEBUG -- : =========== completed entered =============
I, [2017-01-18T13:32:24.563413 #19110]  INFO -- : Password updated successfully. Changing password for user root.
passwd: all authentication tokens updated successfully.

Resumed, it fails:
D, [2017-01-18T14:24:41.844780 #19110] DEBUG -- : ================ UpdateRootPassword run method ====================
I, [2017-01-18T14:24:41.844924 #19110]  INFO -- : Updating CFME password...
D, [2017-01-18T14:24:44.229068 #19110] DEBUG -- : =========== failed entered =============
D, [2017-01-18T14:24:44.229179 #19110] DEBUG -- : =========== failed exited =============
D, [2017-01-18T14:24:44.229289 #19110] DEBUG -- : =========== completed entered =============
E, [2017-01-18T14:24:44.229515 #19110] ERROR -- : Password was not updated. Error: Authentication failed for user root@192.168.155.137

The above repeats for a number of attempts.
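
To compare many attempts like the two excerpts above, the log can be grouped by run. The following is a minimal sketch, not part of fusor: it assumes the Ruby Logger line layout shown in the excerpts, and the function name summarize_runs and the result keys are made up for illustration.

```python
import re

# Matches the log line layout seen in the excerpts above, e.g.
# "E, [2017-01-18T14:24:44.229515 #19110] ERROR -- : Password was not updated. ..."
LINE_RE = re.compile(
    r"^[A-Z], \[(?P<ts>\S+) #(?P<pid>\d+)\]\s+(?P<level>[A-Z]+) -- : (?P<msg>.*)$"
)


def summarize_runs(lines):
    """Group log lines into UpdateRootPassword runs and record each outcome."""
    runs = []
    current = None
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip continuation lines such as the passwd output
        msg = match.group("msg")
        if "UpdateRootPassword run method" in msg:
            # A new run starts at each "run method" banner.
            current = {"started": match.group("ts"), "outcome": "unknown"}
            runs.append(current)
        elif current is not None:
            if msg.startswith("Password updated successfully"):
                current["outcome"] = "success"
            elif msg.startswith("Password was not updated"):
                current["outcome"] = "failure"
    return runs
```

Calling summarize_runs(open("deployment.log")) would return one dictionary per run, which makes it easy to see whether a single success was followed by repeated authentication failures.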

Comment 1 Jason Montleon 2017-01-19 17:58:35 UTC
Please test this again with the latest compose and, if it fails, assign it to me. There is a PR that may have fixed this along with another issue.

Comment 2 Chandler Wilkerson 2017-01-23 15:45:24 UTC
I encountered the same failure on Friday with the 20170118.t.1 build. I will try the latest 20170120.t.0 build shortly.

Comment 3 Jason Montleon 2017-01-23 16:24:06 UTC
After a little more digging, I found that the root password update changes for multi-provider were pretty badly flawed. I am testing a fix and will update this bug when the PR is created, hopefully by this afternoon.

Comment 4 Jason Montleon 2017-01-23 18:32:28 UTC
https://github.com/fusor/fusor/pull/1358

Comment 5 Chandler Wilkerson 2017-01-24 12:36:30 UTC
I'm still encountering the same bug, even with the latest 20170123.t.0 images for both QCI-1.1 and QCIOOO-10.

I checked, and the PR mentioned above *is* reflected in the ISO...

I can leave my environment in its current state if anybody would like to look in and pull logs.

Comment 6 Dylan Murray 2017-01-24 14:25:27 UTC
Can confirm the PR is in QCI-1.1-RHEL-7-20170123.t.0, moving back to assigned.

Comment 7 Chandler Wilkerson 2017-01-24 17:26:36 UTC
It looks like the problem is in my environment's networking. I'm back to the drawing board on my environment setup, and most likely this bug (if it was a bug) is firmly squashed.

Comment 8 Jason Montleon 2017-01-24 17:42:22 UTC
There was definitely a bug that could cause this error as well. In that case, the second host's password change was not retried if it failed the first time, when it should have been retried 30 times before finally giving up.

It was especially hard to see in the original case because both the RHV and OSP CFME instances were being updated in one task. As you saw, if one succeeded and one failed, running the task again would bomb out on the host whose password had already been updated. Now we use two tasks, so it's a little easier to get to the bottom of what's going on.

From my deployment today, the bug appears to be fixed as far as I can see, but one successful run does not make a pattern, so I'll move it to ON_QA so it gets a little more testing by QE before we close it.
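
The behavior described in this comment (one unit of work per host, each retried up to 30 times) can be sketched roughly as follows. This is an illustration only, not the code from the fusor PR: the names AuthenticationError, update_root_password, update_all, and the change_password callable are all hypothetical, and only the limit of 30 attempts comes from the comment above.

```python
import time


class AuthenticationError(Exception):
    """Raised by the hypothetical change_password callable when login fails."""


def update_root_password(host, change_password, max_attempts=30, delay=0.0):
    """Retry the password change on a single host; return True on success."""
    for attempt in range(1, max_attempts + 1):
        try:
            change_password(host)
            return True
        except AuthenticationError:
            # Wait before the next attempt, but not after the final one.
            if attempt < max_attempts:
                time.sleep(delay)
    return False


def update_all(hosts, change_password, **kwargs):
    """Handle each host independently so one failure cannot mask another."""
    return {
        host: update_root_password(host, change_password, **kwargs)
        for host in hosts
    }
```

Because each host gets its own result, a host whose password was already changed does not have to be touched again when only the other host needs a retry.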

Comment 9 James Olin Oden 2017-02-07 20:11:01 UTC
Compose: QCI-1.1-RHEL-7-20170203.t.0

I have not seen this bug recur in my testing. If you do see it recur, please re-open this bug.

