Bug 1431912

Summary: Cannot add Azure provider to CloudForms 4.2
Product: Red Hat CloudForms Management Engine Reporter: tachoi
Component: Providers Assignee: Daniel Berger <dberger>
Status: CLOSED CURRENTRELEASE QA Contact: Leo Khomenko <lkhomenk>
Severity: medium Docs Contact:
Priority: high    
Version: 5.7.0 CC: agrare, bsorota, cpelland, dberger, djoo, ealcaniz, fdewaley, glamb, jfrey, jhardy, lkhomenk, mfeifer, myoder, ncatling, obarenbo, rspagnol, saali, simaishi, tachoi
Target Milestone: GA Keywords: Reopened, TestOnly, ZStream
Target Release: 5.9.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: 5.9.0.2 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1500049 1500050 Environment:
Last Closed: 2018-03-06 14:46:13 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: CFME Core Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1500049, 1500050, 1503797    
Attachments:
Description Flags
Additional logging added to azure inventory collection none

Description tachoi 2017-03-14 02:24:56 UTC
Description of problem:
When the customer adds the provider, the credentials validate successfully and the provider is added. However, when CloudForms refreshes the provider for the first time, the provider shows only some discovered networks and no other components. The provider's status in CloudForms always shows it as never refreshed. The refresh process appears to get stuck and never complete.

If the customer logs on to the appliance CLI, they can see a ruby process (MIQ: Azure::CloudManager::RefreshWorker) sitting at 100% CPU and consuming an ever-increasing amount of memory. On the first attempt it consumed nearly all memory and caused the appliance to become unresponsive.

Version-Release number of selected component (if applicable):
5.7.1.3

How reproducible:
NA

Steps to Reproduce:
1.
2.
3.

Actual results:
The refresh fails with the following log message:
"MIQ(ManageIQ::Providers::Azure::CloudManager::RefreshParser#download_template) Failed to download Azure template https://gallerystoreprodch.blob.core.windows.net/prod-microsoft-windowsazure-gallery/Microsoft.ClassicStorage.0.3.5-preview/DeploymentTemplates/ClassicStorageAccount.json. Reason: #<RestClient::NotFound: 404 Not Found>"

Expected results:
The Azure provider refreshes properly.

Additional info:
The appliance became unresponsive because "MIQ: Azure::CloudManager::RefreshWorker" was consuming all CPU and memory resources. Tried to enable debug logging, but the appliance became unresponsive before that stage.

Tasks: 286 total,   1 running, 285 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.3 us,  0.2 sy, 26.2 ni, 73.3 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem : 16004820 total,  5715776 free,  9339104 used,   949940 buff/cache
KiB Swap:  9957372 total,  9957372 free,        0 used.  6166148 avail Mem 

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND        
 4378 root      27   7 2879.1m 2.454g   5.3m S 100.0 16.1   6:30.31 ruby           
 5432 root      27   7  698.2m 336.1m   4.8m S   4.7  2.2   0:03.23 ruby 
...

UID        PID  PPID  C STIME TTY          TIME CMD
root      4378  2594 78 10:35 ?        00:06:59 MIQ: Azure::CloudManager::RefreshWorker id: 106044, queue: ems_6
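
The check described above (spotting a worker that pins the CPU while holding a large share of memory) can be sketched as a small script that parses batch-mode top output. This is only an illustration, not part of CloudForms: the function names and the 90% CPU / 10% memory thresholds are assumptions chosen for this example, and the sample rows are copied from the output above. A single snapshot cannot show growth over time, so in practice the check would be repeated at intervals.

```python
import re
from typing import List, NamedTuple


class TopRow(NamedTuple):
    pid: int
    user: str
    res_kib: float   # resident memory, converted to KiB
    cpu_pct: float
    mem_pct: float
    command: str


# top prints bare numbers in KiB and scales larger values with m/g/t suffixes.
_UNIT_TO_KIB = {"": 1, "k": 1, "m": 1024, "g": 1024 ** 2, "t": 1024 ** 3}


def to_kib(field: str) -> float:
    """Convert a top memory field such as '2.454g' or '336.1m' to KiB."""
    match = re.fullmatch(r"([\d.]+)([kmgt]?)", field.lower())
    if match is None:
        raise ValueError(f"unrecognized memory field: {field!r}")
    return float(match.group(1)) * _UNIT_TO_KIB[match.group(2)]


def parse_top_rows(text: str) -> List[TopRow]:
    """Extract process rows from top output, skipping the summary and header lines."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # Columns: PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
        if len(parts) < 12 or not parts[0].isdigit():
            continue
        rows.append(TopRow(
            pid=int(parts[0]),
            user=parts[1],
            res_kib=to_kib(parts[5]),
            cpu_pct=float(parts[8]),
            mem_pct=float(parts[9]),
            command=" ".join(parts[11:]),
        ))
    return rows


def find_runaway(rows: List[TopRow],
                 cpu_threshold: float = 90.0,   # hypothetical threshold
                 mem_threshold: float = 10.0) -> List[TopRow]:  # hypothetical threshold
    """Return rows that exceed both the CPU and the memory threshold."""
    return [r for r in rows
            if r.cpu_pct >= cpu_threshold and r.mem_pct >= mem_threshold]


# Sample rows taken from the top output in this report.
SAMPLE = """\
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 4378 root      27   7 2879.1m 2.454g   5.3m S 100.0 16.1   6:30.31 ruby
 5432 root      27   7  698.2m 336.1m   4.8m S   4.7  2.2   0:03.23 ruby
"""

if __name__ == "__main__":
    for row in find_runaway(parse_top_rows(SAMPLE)):
        print(row.pid, row.command, row.cpu_pct, row.mem_pct)
```

With the sample above, only PID 4378 is flagged. The PID can then be matched to the worker name with ps, as in the listing above.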

Comment 6 Daniel Berger 2017-04-03 15:21:12 UTC
Should be fixed with these two PRs, which updated the azure-armrest gem to 0.7.0:

  https://github.com/ManageIQ/manageiq-providers-azure/pull/42
  https://github.com/ManageIQ/manageiq-gems-pending/pull/100

Comment 7 Satoe Imaishi 2017-04-05 17:42:28 UTC

*** This bug has been marked as a duplicate of bug 1434988 ***

Comment 14 Daniel Berger 2017-08-07 19:38:44 UTC
How many zones do they have? How many other providers do they have in the same zone? How many appliances do they have?

Comment 18 Daniel Berger 2017-08-15 18:40:58 UTC
Just to be clear, the customer has one appliance for the Azure zone? Or do they have 8 zones on a single appliance? If it's the latter case, that could be a problem.

Comment 35 Adam Grare 2017-09-20 15:30:04 UTC
Additional logging added in https://github.com/ManageIQ/manageiq-providers-azure/pull/128

Comment 36 Adam Grare 2017-09-20 15:31:44 UTC
Created attachment 1328487
Additional logging added to azure inventory collection

https://github.com/ManageIQ/manageiq-providers-azure/pull/128 cherry-picked back to 5.8.1.5, manageiq-providers-azure-ae2e0e06ee85

Extract it into /opt/rh/cfme-gemset/bundler/gems/manageiq-providers-azure-ae2e0e06ee85

Comment 38 Felix Dewaleyne 2017-09-28 15:05:48 UTC
*** Bug 1482028 has been marked as a duplicate of this bug. ***