Bug 1257970 - Failure updating root password on CFME: Failed to update root password on appliance. Error message: Failed to update root password on appliance, after a retry. Error message: Authentication failed for user root
Summary: Failure updating root password on CFME: Failed to update root password on ap...
Keywords:
Status: VERIFIED
Alias: None
Product: Red Hat Quickstart Cloud Installer
Classification: Red Hat
Component: Installation - CloudForms
Version: 1.0
Hardware: Unspecified
OS: Unspecified
unspecified
unspecified
Target Milestone: TP2
: 1.0
Assignee: John Matthews
QA Contact: Dave Johnson
Dan Macpherson
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2015-08-28 14:08 UTC by John Matthews
Modified: 2022-12-01 05:23 UTC (History)
2 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Red Hat Bugzilla 1304662 0 unspecified CLOSED "RuntimeError: Failed to update root password on appliance. Retries exhausted. Error: execution expired" when deploying ... 2021-02-22 00:41:40 UTC

Internal Links: 1304662

Description John Matthews 2015-08-28 14:08:48 UTC
Description of problem:

This was seen on the development environment running latest code in master on 8/27.  

Ran into issue fusor_server thinks that updating the root password on CFME failed, when in fact it succeeded.

Looks like a possible timing problem with how we mark "success" in a callback and how we check for success then do a retry.

I suspect that the callback is running in a different thread than the main line of execution.  And the @success was not yet updated.  I am not sure if this is really happening, perhaps there is another explanation.

https://github.com/fusor/fusor/blob/master/server/app/lib/actions/fusor/deployment/cloud_forms/update_root_password.rb#L51
              if not @success && @retry


Output:

2015-08-28 00:22:39 [I] ================ Leaving UploadImage run method ====================
2015-08-28 00:22:39 [I] ================ ImportTemplate run method ====================
2015-08-28 00:22:39 [I] Running: /usr/share/fusor_ovirt/bin/ovirt_import_template.py --api_user 'admin@internal' --api_pass dog8code --api_host 192.168.152.14 --cluster_name Default --data_center_name Default --export_domain_name my_export --storage_domain_name my_storage --vm_template_name deploy_1-cfme-template
2015-08-28 00:23:44 [I] ================ Leaving ImportTemplate run method ====================
2015-08-28 00:23:44 [I] ================ VirtualMachine::Create run method ====================
2015-08-28 00:23:44 [I] Running: /usr/share/fusor_ovirt/bin/ovirt_create_vm_from_template.py --api_host 192.168.152.14 --api_pass dog8code --vm_template_name deploy_1-cfme-template --cluster_name Default --vm_name deploy_1-cfme
2015-08-28 00:23:56 [I] ================ Leaving VirtualMachine::Create run method ====================
2015-08-28 00:23:56 [I] ================ VirtualMachine::AddDisk run method ====================
2015-08-28 00:23:56 [I] Running: /usr/share/fusor_ovirt/bin/ovirt_add_disk_to_vm.py --api_host 192.168.152.14 --api_pass dog8code --size_gb 20 --storage_domain my_storage --vm_id 58022785-cb7f-403a-a359-20c9d5ad4245
2015-08-28 00:23:58 [I] ================ Leaving VirtualMachine::AddDisk run method ====================
2015-08-28 00:23:58 [I] ================ VirtualMachine::Start run method ====================
2015-08-28 00:23:58 [I] Running: /usr/share/fusor_ovirt/bin/ovirt_start_vm.py --api_host 192.168.152.14 --api_pass dog8code --vm_id 58022785-cb7f-403a-a359-20c9d5ad4245
2015-08-28 00:25:01 [I] ================ Leaving VirtualMachine::Start run method ====================
2015-08-28 00:25:01 [I] ================ VirtualMachine::GetIp run method ====================
2015-08-28 00:25:01 [I] Running: /usr/share/fusor_ovirt/bin/ovirt_get_ip_of_vm.py --api_host 192.168.152.14 --api_pass dog8code --vm_id 58022785-cb7f-403a-a359-20c9d5ad4245
2015-08-28 00:25:03 [I] ================ Leaving VirtualMachine::GetIp run method ====================
2015-08-28 00:25:03 [I] ================ UpdateRootPassword run method ====================
2015-08-28 00:25:04 [I] Password updated successfully. Changing password for user root.^M
passwd: all authentication tokens updated successfully.^M

2015-08-28 00:25:04 [I] Password updated successfully. Changing password for user root.^M
passwd: all authentication tokens updated successfully.^M

2015-08-28 00:25:04 [I] UpdateRootPassword will retry again once in 20.
2015-08-28 00:25:25 [E] Password was not updated. Error: Authentication failed for user root.152.15

2015-08-28 00:25:25 [E] Failed to update root password on appliance. Error message: Failed to update root password on appliance, after a retry. Error message: Authentication failed for user root.152.15
 (RuntimeError)
/git/RHCI/fusor/server/app/lib/actions/fusor/deployment/cloud_forms/update_root_password.rb:68:in `rescue in run'
/git/RHCI/fusor/server/app/lib/actions/fusor/deployment/cloud_forms/update_root_password.rb:30:in `run'

Comment 1 Jesus M. Rodriguez 2015-09-04 01:34:29 UTC
So ddavis fixed this today simply by cleaning up the code with rubocop. He changed:

if not @success && @retry

to 

if !@success && @retry

The ruby order of precedence means it evaluated the @success && @retry THEN it applied the not. This is pure EVIL and is why ruby is the spawn of the devil.

By change it to !@success then it does what we want.

Comment 2 Jesus M. Rodriguez 2015-09-04 01:36:13 UTC
if not @success && @retry scenario. @success is true, @retry is false. Because of the not the if statement evaluates to not (true && false) -> not (false) -> true

so we go into the retry code which is NOT what we wanted.

=========== completed entered =============
Password updated successfully. Changing password for user root.
passwd: all authentication tokens updated successfully.
=========== completed exited =============
=========== completed entered =============
Password updated successfully. Changing password for user root.
passwd: all authentication tokens updated successfully.
=========== completed exited =============
success: true
retry: false
UpdateRootPassword will retry again once in 10.
Authentication failed for user root.137.16

Comment 3 Jesus M. Rodriguez 2015-09-04 01:38:22 UTC
if !@success && retry scenario. @success is true, @retry is false. This will evaluate to !true && false -> false && false -> false.

so we do NOT go into the retry code which is EXACTLY what we wanted.

=========== completed entered =============
Password updated successfully. Changing password for user root.
passwd: all authentication tokens updated successfully.
=========== completed exited =============
=========== completed entered =============
Password updated successfully. Changing password for user root.
passwd: all authentication tokens updated successfully.
=========== completed exited =============
=> nil

The ssh_connection.rb calls the completed callback twice. That is a separate issue but it does not affect this particular scenario.

Comment 4 Jesus M. Rodriguez 2015-09-04 20:21:49 UTC
Fixed by https://github.com/fusor/fusor/pull/349

Comment 5 Antonin Pagac 2016-01-11 10:08:46 UTC
John, how do I verify this?

Would it be enough to run a RHEV+CFME deployment and then check production.log for "[E] Failed to update root password on appliance."?

Comment 6 John Matthews 2016-01-11 16:13:59 UTC
Antonin,

Yes, run a RHEV + CFME deployment.
Then verify:
  - ssh root@CFME_IP   use the root password for CFME you specified
  - log into CFME WebUI with the admin password you specified

Comment 7 Antonin Pagac 2016-02-03 13:33:15 UTC
I tried RHELOSP+CFME combination and got similar error when deploying CFME:

"RuntimeError: Failed to update root password on appliance. Retries exhausted. Error: execution expired"

In my case, it seems that the update password action timed out, see attached log files, exception is at 07:49:43 in production.log. Note that I have deployed RHEV as first deployment, RHELOSP+CFME was my second one, and you may notice in the deployment.log there are bits from RHEV deployment.
Also, for this deployment I was re-using undercloud from previous RHCI install. I undeployed the Overcloud and left the nodes there. CFME was previously installed on the RHELOSP.

This is the Errors tab in the UI:


Action:

Actions::Fusor::Deployment::CloudForms::UpdateRootPassword

Input:

{"deployment_id"=>2, "locale"=>"en"}

Output:

{}

Exception:

RuntimeError: Failed to update root password on appliance. Retries exhausted. Error: execution expired

Backtrace:

/opt/rh/ruby193/root/usr/share/gems/gems/fusor_server-0.0.1/app/lib/actions/fusor/deployment/cloud_forms/update_root_password.rb:73:in `run'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/action.rb:487:in `block (3 levels) in execute_run'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/middleware/stack.rb:26:in `call'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/middleware/stack.rb:26:in `pass'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/middleware.rb:16:in `pass'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/action/progress.rb:30:in `with_progress_calculation'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/action/progress.rb:16:in `run'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/middleware/stack.rb:22:in `call'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/middleware/stack.rb:26:in `pass'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/middleware.rb:16:in `pass'
/opt/rh/ruby193/root/usr/share/gems/gems/katello-2.2.0.73/app/lib/actions/middleware/keep_locale.rb:23:in `block in run'
/opt/rh/ruby193/root/usr/share/gems/gems/katello-2.2.0.73/app/lib/actions/middleware/keep_locale.rb:34:in `with_locale'
/opt/rh/ruby193/root/usr/share/gems/gems/katello-2.2.0.73/app/lib/actions/middleware/keep_locale.rb:23:in `run'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/middleware/stack.rb:22:in `call'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/middleware/world.rb:30:in `execute'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/action.rb:486:in `block (2 levels) in execute_run'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/action.rb:485:in `catch'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/action.rb:485:in `block in execute_run'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/action.rb:402:in `call'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/action.rb:402:in `block in with_error_handling'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/action.rb:402:in `catch'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/action.rb:402:in `with_error_handling'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/action.rb:480:in `execute_run'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/action.rb:262:in `execute'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/execution_plan/steps/abstract_flow_step.rb:9:in `block (2 levels) in execute'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/execution_plan/steps/abstract.rb:155:in `call'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/execution_plan/steps/abstract.rb:155:in `with_meta_calculation'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/execution_plan/steps/abstract_flow_step.rb:8:in `block in execute'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/execution_plan/steps/abstract_flow_step.rb:22:in `open_action'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/execution_plan/steps/abstract_flow_step.rb:7:in `execute'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/executors/parallel/worker.rb:20:in `block in on_message'
/opt/rh/ruby193/root/usr/share/gems/gems/algebrick-0.4.0/lib/algebrick.rb:859:in `block in assigns'
/opt/rh/ruby193/root/usr/share/gems/gems/algebrick-0.4.0/lib/algebrick.rb:858:in `tap'
/opt/rh/ruby193/root/usr/share/gems/gems/algebrick-0.4.0/lib/algebrick.rb:858:in `assigns'
/opt/rh/ruby193/root/usr/share/gems/gems/algebrick-0.4.0/lib/algebrick.rb:138:in `match_value'
/opt/rh/ruby193/root/usr/share/gems/gems/algebrick-0.4.0/lib/algebrick.rb:116:in `block in match'
/opt/rh/ruby193/root/usr/share/gems/gems/algebrick-0.4.0/lib/algebrick.rb:115:in `each'
/opt/rh/ruby193/root/usr/share/gems/gems/algebrick-0.4.0/lib/algebrick.rb:115:in `match'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/executors/parallel/worker.rb:17:in `on_message'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/micro_actor.rb:82:in `on_envelope'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/micro_actor.rb:72:in `receive'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/micro_actor.rb:99:in `block (2 levels) in run'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/micro_actor.rb:99:in `loop'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/micro_actor.rb:99:in `block in run'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/micro_actor.rb:99:in `catch'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/micro_actor.rb:99:in `run'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/micro_actor.rb:13:in `block in initialize'
/opt/rh/ruby193/root/usr/share/gems/gems/logging-1.8.1/lib/logging/diagnostic_context.rb:323:in `call'
/opt/rh/ruby193/root/usr/share/gems/gems/logging-1.8.1/lib/logging/diagnostic_context.rb:323:in `block in create_with_logging_context'

Comment 10 Antonin Pagac 2016-02-03 15:13:05 UTC
I then tried RHEV+CFME and that works as expected.

Comment 11 John Matthews 2016-02-03 15:23:16 UTC
Antonin, 

Let's file a new BZ to track the issue mentioned in comment #7.  The re-use of an existing undercloud may be contributing to some additional issues, outside of the scope of this original BZ.

Comment 12 Antonin Pagac 2016-02-15 10:25:10 UTC
I can login with specified root and admin passwords for CFME on RHEV and also on RHELOSP. Marking as verified.

TP2 RC9
RHCI-6.0-RHEL-7-20160208.1-RHCI-x86_64-dvd1.iso
RHCIOOO-7-RHEL-7-20160127.0-RHCIOOO-x86_64-dvd1.iso


Note You need to log in before you can comment on or make changes to this bug.