Bug 1257970 - Failure updating root password on CFME: Failed to update root password on appliance. Error message: Failed to update root password on appliance, after a retry. Error message: Authentication failed for user root
Failure updating root password on CFME: Failed to update root password on ap...
Status: VERIFIED
Product: Red Hat Quickstart Cloud Installer
Classification: Red Hat
Component: Installation - CloudForms (Show other bugs)
1.0
Unspecified Unspecified
unspecified Severity unspecified
: TP2
: 1.0
Assigned To: Jesus M. Rodriguez
Dave Johnson
Dan Macpherson
: Triaged
Depends On:
Blocks:
  Show dependency treegraph
 
Reported: 2015-08-28 10:08 EDT by John Matthews
Modified: 2016-04-29 12:22 EDT (History)
3 users (show)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed:
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)

  None (edit)
Description John Matthews 2015-08-28 10:08:48 EDT
Description of problem:

This was seen on the development environment running latest code in master on 8/27.  

Ran into issue fusor_server thinks that updating the root password on CFME failed, when in fact it succeeded.

Looks like a possible timing problem with how we mark "success" in a callback and how we check for success then do a retry.

I suspect that the callback is running in a different thread than the main line of execution.  And the @success was not yet updated.  I am not sure if this is really happening, perhaps there is another explanation.

https://github.com/fusor/fusor/blob/master/server/app/lib/actions/fusor/deployment/cloud_forms/update_root_password.rb#L51
              if not @success && @retry


Output:

2015-08-28 00:22:39 [I] ================ Leaving UploadImage run method ====================
2015-08-28 00:22:39 [I] ================ ImportTemplate run method ====================
2015-08-28 00:22:39 [I] Running: /usr/share/fusor_ovirt/bin/ovirt_import_template.py --api_user 'admin@internal' --api_pass dog8code --api_host 192.168.152.14 --cluster_name Default --data_center_name Default --export_domain_name my_export --storage_domain_name my_storage --vm_template_name deploy_1-cfme-template
2015-08-28 00:23:44 [I] ================ Leaving ImportTemplate run method ====================
2015-08-28 00:23:44 [I] ================ VirtualMachine::Create run method ====================
2015-08-28 00:23:44 [I] Running: /usr/share/fusor_ovirt/bin/ovirt_create_vm_from_template.py --api_host 192.168.152.14 --api_pass dog8code --vm_template_name deploy_1-cfme-template --cluster_name Default --vm_name deploy_1-cfme
2015-08-28 00:23:56 [I] ================ Leaving VirtualMachine::Create run method ====================
2015-08-28 00:23:56 [I] ================ VirtualMachine::AddDisk run method ====================
2015-08-28 00:23:56 [I] Running: /usr/share/fusor_ovirt/bin/ovirt_add_disk_to_vm.py --api_host 192.168.152.14 --api_pass dog8code --size_gb 20 --storage_domain my_storage --vm_id 58022785-cb7f-403a-a359-20c9d5ad4245
2015-08-28 00:23:58 [I] ================ Leaving VirtualMachine::AddDisk run method ====================
2015-08-28 00:23:58 [I] ================ VirtualMachine::Start run method ====================
2015-08-28 00:23:58 [I] Running: /usr/share/fusor_ovirt/bin/ovirt_start_vm.py --api_host 192.168.152.14 --api_pass dog8code --vm_id 58022785-cb7f-403a-a359-20c9d5ad4245
2015-08-28 00:25:01 [I] ================ Leaving VirtualMachine::Start run method ====================
2015-08-28 00:25:01 [I] ================ VirtualMachine::GetIp run method ====================
2015-08-28 00:25:01 [I] Running: /usr/share/fusor_ovirt/bin/ovirt_get_ip_of_vm.py --api_host 192.168.152.14 --api_pass dog8code --vm_id 58022785-cb7f-403a-a359-20c9d5ad4245
2015-08-28 00:25:03 [I] ================ Leaving VirtualMachine::GetIp run method ====================
2015-08-28 00:25:03 [I] ================ UpdateRootPassword run method ====================
2015-08-28 00:25:04 [I] Password updated successfully. Changing password for user root.^M
passwd: all authentication tokens updated successfully.^M

2015-08-28 00:25:04 [I] Password updated successfully. Changing password for user root.^M
passwd: all authentication tokens updated successfully.^M

2015-08-28 00:25:04 [I] UpdateRootPassword will retry again once in 20.
2015-08-28 00:25:25 [E] Password was not updated. Error: Authentication failed for user root@192.168.152.15

2015-08-28 00:25:25 [E] Failed to update root password on appliance. Error message: Failed to update root password on appliance, after a retry. Error message: Authentication failed for user root@192.168.152.15
 (RuntimeError)
/git/RHCI/fusor/server/app/lib/actions/fusor/deployment/cloud_forms/update_root_password.rb:68:in `rescue in run'
/git/RHCI/fusor/server/app/lib/actions/fusor/deployment/cloud_forms/update_root_password.rb:30:in `run'
Comment 1 Jesus M. Rodriguez 2015-09-03 21:34:29 EDT
So ddavis fixed this today simply by cleaning up the code with rubocop. He changed:

if not @success && @retry

to 

if !@success && @retry

The ruby order of precedence means it evaluated the @success && @retry THEN it applied the not. This is pure EVIL and is why ruby is the spawn of the devil.

By change it to !@success then it does what we want.
Comment 2 Jesus M. Rodriguez 2015-09-03 21:36:13 EDT
if not @success && @retry scenario. @success is true, @retry is false. Because of the not the if statement evaluates to not (true && false) -> not (false) -> true

so we go into the retry code which is NOT what we wanted.

=========== completed entered =============
Password updated successfully. Changing password for user root.
passwd: all authentication tokens updated successfully.
=========== completed exited =============
=========== completed entered =============
Password updated successfully. Changing password for user root.
passwd: all authentication tokens updated successfully.
=========== completed exited =============
success: true
retry: false
UpdateRootPassword will retry again once in 10.
Authentication failed for user root@10.13.137.16
Comment 3 Jesus M. Rodriguez 2015-09-03 21:38:22 EDT
if !@success && retry scenario. @success is true, @retry is false. This will evaluate to !true && false -> false && false -> false.

so we do NOT go into the retry code which is EXACTLY what we wanted.

=========== completed entered =============
Password updated successfully. Changing password for user root.
passwd: all authentication tokens updated successfully.
=========== completed exited =============
=========== completed entered =============
Password updated successfully. Changing password for user root.
passwd: all authentication tokens updated successfully.
=========== completed exited =============
=> nil

The ssh_connection.rb calls the completed callback twice. That is a separate issue but it does not affect this particular scenario.
Comment 4 Jesus M. Rodriguez 2015-09-04 16:21:49 EDT
Fixed by https://github.com/fusor/fusor/pull/349
Comment 5 Antonin Pagac 2016-01-11 05:08:46 EST
John, how do I verify this?

Would it be enough to run a RHEV+CFME deployment and then check production.log for "[E] Failed to update root password on appliance."?
Comment 6 John Matthews 2016-01-11 11:13:59 EST
Antonin,

Yes, run a RHEV + CFME deployment.
Then verify:
  - ssh root@CFME_IP   use the root password for CFME you specified
  - log into CFME WebUI with the admin password you specified
Comment 7 Antonin Pagac 2016-02-03 08:33:15 EST
I tried RHELOSP+CFME combination and got similar error when deploying CFME:

"RuntimeError: Failed to update root password on appliance. Retries exhausted. Error: execution expired"

In my case, it seems that the update password action timed out, see attached log files, exception is at 07:49:43 in production.log. Note that I have deployed RHEV as first deployment, RHELOSP+CFME was my second one, and you may notice in the deployment.log there are bits from RHEV deployment.
Also, for this deployment I was re-using undercloud from previous RHCI install. I undeployed the Overcloud and left the nodes there. CFME was previously installed on the RHELOSP.

This is the Errors tab in the UI:


Action:

Actions::Fusor::Deployment::CloudForms::UpdateRootPassword

Input:

{"deployment_id"=>2, "locale"=>"en"}

Output:

{}

Exception:

RuntimeError: Failed to update root password on appliance. Retries exhausted. Error: execution expired

Backtrace:

/opt/rh/ruby193/root/usr/share/gems/gems/fusor_server-0.0.1/app/lib/actions/fusor/deployment/cloud_forms/update_root_password.rb:73:in `run'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/action.rb:487:in `block (3 levels) in execute_run'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/middleware/stack.rb:26:in `call'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/middleware/stack.rb:26:in `pass'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/middleware.rb:16:in `pass'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/action/progress.rb:30:in `with_progress_calculation'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/action/progress.rb:16:in `run'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/middleware/stack.rb:22:in `call'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/middleware/stack.rb:26:in `pass'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/middleware.rb:16:in `pass'
/opt/rh/ruby193/root/usr/share/gems/gems/katello-2.2.0.73/app/lib/actions/middleware/keep_locale.rb:23:in `block in run'
/opt/rh/ruby193/root/usr/share/gems/gems/katello-2.2.0.73/app/lib/actions/middleware/keep_locale.rb:34:in `with_locale'
/opt/rh/ruby193/root/usr/share/gems/gems/katello-2.2.0.73/app/lib/actions/middleware/keep_locale.rb:23:in `run'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/middleware/stack.rb:22:in `call'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/middleware/world.rb:30:in `execute'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/action.rb:486:in `block (2 levels) in execute_run'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/action.rb:485:in `catch'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/action.rb:485:in `block in execute_run'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/action.rb:402:in `call'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/action.rb:402:in `block in with_error_handling'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/action.rb:402:in `catch'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/action.rb:402:in `with_error_handling'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/action.rb:480:in `execute_run'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/action.rb:262:in `execute'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/execution_plan/steps/abstract_flow_step.rb:9:in `block (2 levels) in execute'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/execution_plan/steps/abstract.rb:155:in `call'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/execution_plan/steps/abstract.rb:155:in `with_meta_calculation'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/execution_plan/steps/abstract_flow_step.rb:8:in `block in execute'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/execution_plan/steps/abstract_flow_step.rb:22:in `open_action'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/execution_plan/steps/abstract_flow_step.rb:7:in `execute'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/executors/parallel/worker.rb:20:in `block in on_message'
/opt/rh/ruby193/root/usr/share/gems/gems/algebrick-0.4.0/lib/algebrick.rb:859:in `block in assigns'
/opt/rh/ruby193/root/usr/share/gems/gems/algebrick-0.4.0/lib/algebrick.rb:858:in `tap'
/opt/rh/ruby193/root/usr/share/gems/gems/algebrick-0.4.0/lib/algebrick.rb:858:in `assigns'
/opt/rh/ruby193/root/usr/share/gems/gems/algebrick-0.4.0/lib/algebrick.rb:138:in `match_value'
/opt/rh/ruby193/root/usr/share/gems/gems/algebrick-0.4.0/lib/algebrick.rb:116:in `block in match'
/opt/rh/ruby193/root/usr/share/gems/gems/algebrick-0.4.0/lib/algebrick.rb:115:in `each'
/opt/rh/ruby193/root/usr/share/gems/gems/algebrick-0.4.0/lib/algebrick.rb:115:in `match'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/executors/parallel/worker.rb:17:in `on_message'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/micro_actor.rb:82:in `on_envelope'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/micro_actor.rb:72:in `receive'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/micro_actor.rb:99:in `block (2 levels) in run'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/micro_actor.rb:99:in `loop'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/micro_actor.rb:99:in `block in run'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/micro_actor.rb:99:in `catch'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/micro_actor.rb:99:in `run'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/micro_actor.rb:13:in `block in initialize'
/opt/rh/ruby193/root/usr/share/gems/gems/logging-1.8.1/lib/logging/diagnostic_context.rb:323:in `call'
/opt/rh/ruby193/root/usr/share/gems/gems/logging-1.8.1/lib/logging/diagnostic_context.rb:323:in `block in create_with_logging_context'
Comment 10 Antonin Pagac 2016-02-03 10:13:05 EST
I then tried RHEV+CFME and that works as expected.
Comment 11 John Matthews 2016-02-03 10:23:16 EST
Antonin, 

Let's file a new BZ to track the issue mentioned in comment #7.  The re-use of an existing undercloud may be contributing to some additional issues, outside of the scope of this original BZ.
Comment 12 Antonin Pagac 2016-02-15 05:25:10 EST
I can login with specified root and admin passwords for CFME on RHEV and also on RHELOSP. Marking as verified.

TP2 RC9
RHCI-6.0-RHEL-7-20160208.1-RHCI-x86_64-dvd1.iso
RHCIOOO-7-RHEL-7-20160127.0-RHCIOOO-x86_64-dvd1.iso

Note You need to log in before you can comment on or make changes to this bug.