Description of problem:
Unable to set or change the release version, content view, or lifecycle environment (LCE) of a content host if the host is associated with a VMware compute resource and the connection from the Satellite server to vCenter is down.

Version-Release number of selected component (if applicable):
Satellite 6.7 (older versions as well)

How reproducible:
Always

Steps to Reproduce:
1. Configure a VMware compute resource.
2. Build a host from Satellite using the compute resource.
3. Use the following command to block the connection from Satellite to vCenter.
   # iptables -I OUTPUT -d vcenter.example.com -j DROP
4. Try to set or change the release version, CV, or LCE of the host.

Actual results:
* The request waits for some time and then the GUI shows the following error.
~~
An error occurred saving the Content Host: Failed to find compute attributes, please check if VM test-rhel7.example.com was deleted
~~
* The following details are logged in production.log at the same time.
~~
2020-07-20T22:54:33 [I|app|ad7b7494] Started PUT "/api/v2/hosts/24" for 10.74.9.157 at 2020-07-20 22:54:33 +0530
2020-07-20T22:54:33 [I|app|ad7b7494] Processing by Api::V2::HostsController#update as JSON
2020-07-20T22:54:33 [I|app|ad7b7494] Parameters: {"id"=>"24", "host"=>{"subscription_facet_attributes"=>{"id"=>19, "autoheal"=>true, "purpose_role"=>"", "purpose_usage"=>"", "service_level"=>"", "release_version"=>"7Server"}}, "apiv"=>"v2"}
2020-07-20T22:55:06 [W|app|22f93f5a] Action failed
2020-07-20T22:55:06 [I|app|22f93f5a] Deface: [WARNING] No :original defined for 'change 500 page content', you should change its definition to include: :original => '35d2b4f7aac0c083740c6de6775473457e9ae9d8'
2020-07-20T22:55:06 [I|app|22f93f5a] Rendering common/500.html.erb
2020-07-20T22:55:06 [I|app|22f93f5a] Rendered common/500.html.erb (5.4ms)
2020-07-20T22:55:06 [I|app|22f93f5a] Completed 500 Internal Server Error in 60039ms (Views: 19.5ms | ActiveRecord: 3.2ms)
2020-07-20T22:55:20 [I|app|1c9b8bb8] Started GET "/notification_recipients" for 10.74.9.157 at 2020-07-20 22:55:20 +0530
2020-07-20T22:55:20 [I|app|1c9b8bb8] Processing by NotificationRecipientsController#index as JSON
2020-07-20T22:55:20 [I|app|1c9b8bb8] Completed 200 OK in 12ms (Views: 0.1ms | ActiveRecord: 2.7ms)
2020-07-20T22:55:34 [I|app|ad7b7494] Adding Compute instance for test-rhel7.example.com
2020-07-20T22:55:34 [W|app|ad7b7494] Failed to find compute attributes, please check if VM test-rhel7.example.com was deleted
2020-07-20T22:55:34 [W|app|ad7b7494] Rolling back due to a problem: [#<Orchestration::Task:0x00007f479fe84a48 @name="Set up compute instance test-rhel7.example.com", @id="Set up compute instance test-rhel7.example.com", @status="failed", @priority=3, @action=[#<Host::Managed id: 24, name: "test-rhel7.example.com", last_compile: "2020-07-19 20:19:25", last_report: nil, updated_at: "2020-07-19 20:19:25", created_at: "2020-07-17 20:57:27", root_pass: "$5$S5makfAftJJIxgUd$dhLS9YrQMfXjNGD.ADRzt2oGzZgvqJ...", architecture_id: 1, operatingsystem_id: 4, environment_id: nil, ptable_id: 106, medium_id: nil, build: false, comment: "", disk: "", installed_at: "2020-07-19 17:24:08", model_id: 1, hostgroup_id: 1, owner_id: 4, owner_type: "User", enabled: true, puppet_ca_proxy_id: nil, managed: true, use_image: nil, image_file: nil, uuid: "5000ed8f-054b-f2c8-80ba-f993a114d1a5", compute_resource_id: 1, puppet_proxy_id: nil, certname: nil, image_id: nil, organization_id: 1, location_id: 2, type: "Host::Managed", otp: nil, realm_id: nil, compute_profile_id: 4, provision_method: "build", grub_pass: "$5$S5makfAftJJIxgUd$dhLS9YrQMfXjNGD.ADRzt2oGzZgvqJ...", discovery_rule_id: nil, global_status: 1, lookup_value_matcher: "fqdn=test-rhel7.example.com", pxe_loader: "PXELinux BIOS", initiated_at: "2020-07-19 17:18:22", build_errors: nil, openscap_proxy_id: nil>, :setCompute], @created=1595265934.121043, @timestamp=2020-07-20 17:25:34 UTC>]
2020-07-20T22:55:34 [I|app|ad7b7494] Processed 1 tasks from queue 'Host::Managed Main', completed 0/2
2020-07-20T22:55:34 [E|app|ad7b7494] Task 'Set up compute instance test-rhel7.example.com' *failed*
2020-07-20T22:55:34 [E|app|ad7b7494] Task 'Query instance details for test-rhel7.example.com' *canceled*
2020-07-20T22:55:34 [E|app|ad7b7494] Unprocessable entity Host::Managed (id: 24): Failed to find compute attributes, please check if VM test-rhel7.example.com was deleted
~~

Expected results:
1. Satellite should allow setting or changing content host values that are not related to the compute resource, e.g. the release version.
2. production.log should show a more meaningful message than just "Completed 500 Internal Server Error", for example:
~~
Unable to reach vcenter.example.com.
~~

Additional info:
NA
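When reproducing the steps above, it can help to confirm from the Satellite server that vCenter is really unreachable, since the GUI error only suggests that the VM was deleted. The following is a minimal sketch, not part of Satellite: `check_reachable` is a hypothetical helper, and port 443 is an assumption about where the vCenter API listens in your environment.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not a Satellite tool): report whether a TCP port on a
# host accepts a connection within a few seconds.
# Arguments: host, port (assumed default 443), timeout in seconds (default 5).
check_reachable() {
  local host="$1"
  local port="${2:-443}"
  local secs="${3:-5}"
  # bash's /dev/tcp pseudo-device opens a TCP connection; `timeout` stops the
  # attempt when packets are silently dropped, as with the DROP rule above.
  if timeout "$secs" bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null; then
    echo "reachable: $host:$port"
  else
    echo "unreachable: $host:$port"
    return 1
  fi
}

# Example (hostname taken from the reproducer):
#   check_reachable vcenter.example.com 443
# To undo the reproducer's rule afterwards:
#   iptables -D OUTPUT -d vcenter.example.com -j DROP
```

With the DROP rule in place the connection attempt hangs until the timeout expires, so the helper reports "unreachable" instead of waiting as long as the GUI request does.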
Created redmine issue https://projects.theforeman.org/issues/31307 from this bug
Moving this bug to POST for triage into Satellite since the upstream issue https://projects.theforeman.org/issues/31307 has been resolved.
Failed with Sat 6.10.0 snap 5.0.

Tl;dr: not much changed, and what actually improved is very slow to use. Some error messages are now more meaningful but the root issue seems to be unsolved. I'm open to other points of view but right now, I don't see this as verification material.

Snap 5.0, i.e. before fix:
===
Using the reproducer from the OP through the WebUI, trying to edit a host associated with a CR that is unavailable due to a network problem times out:
"Oops, we're sorry but something went wrong execution expired"

Using Hammer:
# hammer host update --name mae-opsahl.vms.sat.rdu2.redhat.com --content-view testcv --organization-id 1 --location-id 2
Could not update the host: Failed to find compute attributes, please check if VM mae-opsahl.vms.sat.rdu2.redhat.com was deleted

Creating a VMware host, changing the CR password to a wrong one and then going to Host -> Edit in the WebUI leads to:
"Oops, we're sorry but something went wrong InvalidLogin: Cannot complete login due to an incorrect user name or password."

Doing the same in Hammer:
# hammer host update --name <FQDN> --content-view testcv --organization-id 1 --location-id 2
Could not update the host: Failed to find compute attributes, please check if VM <FQDN> was deleted
===

Snap 7.0, i.e. after fix:
===
Using the reproducer from the OP through the WebUI, trying to edit a host associated with a CR that is unavailable due to a network problem: it takes several minutes for the edit page to appear. The page then allows editing some values (e.g. content view), but saving the changes again takes several minutes and then fails with:
"Receiving vm data for host '<FQDN>' from used compute resource 'testvmware (VMware)' failed: 'Connection to compute resource timed out'."
=> This still doesn't allow for editing. Opening the edit page works but very slowly. I understand this is due to waiting for network connectivity, but it would render the new functionality almost unusable by default even if saving actually worked.

Using Hammer:
# hammer host update --name <FQDN> --content-view testcv --organization-id 1 --location-id 2
Could not update the host: Receiving vm data for host '<FQDN>' from used compute resource 'testvmware (VMware)' failed: 'Connection to compute resource timed out'.
=> The error in Hammer has changed and probably does a better job of suggesting this is a network issue. But I highly doubt this is the intended result. Wasn't the goal to make the host actually editable, not only through the WebUI but also through Hammer?

Creating a VMware host, changing the CR password to a wrong one and then going to Host -> Edit in the WebUI leads to:
"Oops, we're sorry but something went wrong InvalidLogin: Cannot complete login due to an incorrect user name or password."

Doing the same in Hammer:
# hammer host update --name <FQDN> --content-view testcv --organization-id 1 --location-id 2
Could not update the host: Failed to find compute attributes, please check if VM <FQDN> was deleted
=> Nothing changed here. This is a similar case, but the CR is unavailable because of an incorrectly specified password rather than the network. Some parameters of the host could and should be editable even in this case. The impact is low, though: most people probably don't expect hosts on a CR with an incorrectly specified password to work in any way. That said, the error message in Hammer could be better. If this won't be fixed as part of this BZ, I'll perhaps file a separate, low severity and low priority BZ about it.
===
Hi, while the issue is valid, it has existed since Satellite day 1 and is currently not going to change. Modifying the host also tries to modify the host on the compute resource, and if the compute resource is unavailable, that would lead to an inconsistent state. If the compute resource is down intentionally and the user still needs to modify the host, they first need to disassociate the host (edit host -> disassociate VM button) and then perform the modifications. Feel free to re-open if the workaround doesn't fit in this case.
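For scripted use, the workaround above might look roughly like the sketch below. It assumes your Hammer version provides the `host disassociate` subcommand as the CLI counterpart of the "disassociate VM" button; `update_after_disassociate` and the `HAMMER` variable are hypothetical names introduced only for illustration, and the update options are the ones used earlier in this bug.

```shell
#!/usr/bin/env bash
# HAMMER is a hypothetical override so the sequence can be dry-run
# (e.g. HAMMER=echo); by default the real hammer CLI is used.
HAMMER="${HAMMER:-hammer}"

# Hypothetical wrapper: disassociate the host from its VM, then change the
# content view. Arguments: FQDN, content view, organization ID, location ID.
update_after_disassociate() {
  local fqdn="$1"
  local cv="$2"
  local org_id="$3"
  local loc_id="$4"
  # Assumption: `hammer host disassociate` exists in your Hammer version.
  $HAMMER host disassociate --name "$fqdn" || return 1
  # Same update command as in the verification comment above.
  $HAMMER host update --name "$fqdn" --content-view "$cv" \
    --organization-id "$org_id" --location-id "$loc_id"
}

# Example:
#   update_after_disassociate <FQDN> testcv 1 2
```

Disassociating removes the link between the host and its VM on the compute resource, so run it only when that is acceptable for the host in question.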