Description of problem:
When the customer adds the provider, the credentials validate and the provider is added successfully. However, when CloudForms refreshes the provider for the first time, the provider shows only some discovered networks and no other components. The provider's status in CloudForms always shows the last refresh as "never". The refresh process appears to get stuck and never completes. If the customer logs on to the appliance CLI, they can see a ruby process (MIQ: Azure::CloudManager::RefreshWorker) sitting at 100% CPU and consuming an ever-increasing amount of memory. On the first attempt it consumed nearly all memory and caused the appliance to become unresponsive.

Version-Release number of selected component (if applicable):
5.7.1.3

How reproducible:
NA

Steps to Reproduce:
1.
2.
3.

Actual results:
The refresh fails with the following log message:
"MIQ(ManageIQ::Providers::Azure::CloudManager::RefreshParser#download_template) Failed to download Azure template https://gallerystoreprodch.blob.core.windows.net/prod-microsoft-windowsazure-gallery/Microsoft.ClassicStorage.0.3.5-preview/DeploymentTemplates/ClassicStorageAccount.json. Reason: #<RestClient::NotFound: 404 Not Found>"

Expected results:
The Azure provider refreshes properly.

Additional info:
The appliance became unresponsive because "MIQ: Azure::CloudManager::RefreshWorker" consumed all CPU and memory resources. We tried to enable debug logging, but the appliance became unresponsive before we could.

Tasks: 286 total,   1 running, 285 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.3 us,  0.2 sy, 26.2 ni, 73.3 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem : 16004820 total,  5715776 free,  9339104 used,   949940 buff/cache
KiB Swap:  9957372 total,  9957372 free,        0 used.  6166148 avail Mem

 PID USER  PR  NI    VIRT    RES   SHR S  %CPU %MEM    TIME+ COMMAND
4378 root  27   7 2879.1m 2.454g  5.3m S 100.0 16.1  6:30.31 ruby
5432 root  27   7  698.2m 336.1m  4.8m S   4.7  2.2  0:03.23 ruby
...

UID   PID  PPID  C STIME TTY      TIME     CMD
root  4378 2594 78 10:35 ?        00:06:59 MIQ: Azure::CloudManager::RefreshWorker id: 106044, queue: ems_6
This should be fixed by these two PRs, which updated the azure-armrest gem to 0.7.0:
https://github.com/ManageIQ/manageiq-providers-azure/pull/42
https://github.com/ManageIQ/manageiq-gems-pending/pull/100
*** This bug has been marked as a duplicate of bug 1434988 ***
How many zones do they have? How many other providers do they have in the same zone? How many appliances do they have?
Just to be clear, does the customer have one appliance for the Azure zone, or do they have 8 zones on a single appliance? If it's the latter, that could be a problem.
Additional logging added in https://github.com/ManageIQ/manageiq-providers-azure/pull/128
Created attachment 1328487 [details]
Additional logging added to azure inventory collection

https://github.com/ManageIQ/manageiq-providers-azure/pull/128 cherry-picked back to 5.8.1.5, manageiq-providers-azure-ae2e0e06ee85

Extract it into /opt/rh/cfme-gemset/bundler/gems/manageiq-providers-azure-ae2e0e06ee85
*** Bug 1482028 has been marked as a duplicate of this bug. ***