Bug 1420536
Summary: | Refresh of infrastructure provider fails with bad request with OSP director as provider | |
---|---|---|---
Product: | Red Hat CloudForms Management Engine | Reporter: | michael_rasoulian <michael_rasoulian>
Component: | Providers | Assignee: | Tzu-Mainn Chen <tzumainn>
Status: | CLOSED CURRENTRELEASE | QA Contact: | Ola Pavlenko <opavlenk>
Severity: | urgent | Docs Contact: |
Priority: | high | |
Version: | 5.7.0 | CC: | abellott, arkady_kanevsky, cdevine, christopher_dearborn, cpelland, dajohnso, david_paterson, dcain, gekis, gtanzill, jfrey, jhajyahy, jhardy, john_terpstra, John_walsh, kurt_hey, manisha_tripathy, maufart, mburns, michael_rasoulian, morazi, obarenbo, randy_perryman, rrasouli, scohen, simaishi, smerrow, sreichar, tzumainn
Target Milestone: | GA | Keywords: | TestOnly, ZStream
Target Release: | 5.8.0 | |
Hardware: | Unspecified | |
OS: | Unspecified | |
Whiteboard: | openstack | |
Fixed In Version: | 5.8.0.1 | Doc Type: | If docs needed, set a value
Doc Text: | | Story Points: | ---
Clone Of: | 1415544 | |
: | 1420916 1420919 | Environment: |
Last Closed: | 2017-06-12 16:42:37 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | Openstack | Target Upstream Version: |
Embargoed: | | |
Bug Depends On: | | |
Bug Blocks: | 1415544, 1420916 | |
Attachments: | | |

Created attachment 1248677 [details]
EVM log for 4.1 appliance

Hi! It looks like the error is caused by something we changed between 4.1 and 4.2 to make the infra provider's refresh more efficient: add a filter to the stack resource list query. However I can't seem to reproduce the error you're getting. Could you tell me what version of OSP you're using, and then run the following through the openstack CLI?

    nova list
    heat resource-list -n 50 --filter 'physical_resource_id=<nova server id 1>;physical_resource_id=<nova server id 2>;physical_resource_id=<etc>'

And then run it again with '-n 2'? So for example:

    [stack@instack ~]$ nova list
    +--------------------------------------+------------------------+--------+------------+-------------+---------------------+
    | ID                                   | Name                   | Status | Task State | Power State | Networks            |
    +--------------------------------------+------------------------+--------+------------+-------------+---------------------+
    | fcd55792-2134-4e2f-844a-0511a074d71b | overcloud-compute-0    | ACTIVE | -          | Running     | ctlplane=192.0.2.16 |
    | dc883110-979a-4706-bf87-cd5954383a64 | overcloud-controller-0 | ACTIVE | -          | Running     | ctlplane=192.0.2.12 |
    [stack@instack ~]$ heat resource-list -n 50 --filter 'physical_resource_id=fcd55792-2134-4e2f-844a-0511a074d71b;physical_resource_id=dc883110-979a-4706-bf87-cd5954383a64' overcloud
    WARNING (shell) "heat resource-list" is deprecated, please use "openstack stack resource list" instead
    +---------------+--------------------------------------+---------------------+-----------------+----------------------+--------------------------------------------------+
    | resource_name | physical_resource_id                 | resource_type       | resource_status | updated_time         | stack_name                                       |
    +---------------+--------------------------------------+---------------------+-----------------+----------------------+--------------------------------------------------+
    | Controller    | dc883110-979a-4706-bf87-cd5954383a64 | OS::TripleO::Server | CREATE_COMPLETE | 2017-02-08T20:24:36Z | overcloud-Controller-irlqbti2ss23-0-io26vdivgb32 |
    | NovaCompute   | fcd55792-2134-4e2f-844a-0511a074d71b | OS::TripleO::Server | CREATE_COMPLETE | 2017-02-08T20:24:38Z | overcloud-Compute-xc2ksahjmtfu-0-kf2r4yfbyc3x    |
    +---------------+--------------------------------------+---------------------+-----------------+----------------------+--------------------------------------------------+

We are using OSP 9.0.

    [osp_admin@director ~]$ nova list
    +--------------------------------------+-------------------------+--------+------------+-------------+--------------------------+
    | ID                                   | Name                    | Status | Task State | Power State | Networks                 |
    +--------------------------------------+-------------------------+--------+------------+-------------+--------------------------+
    | 2b669962-e924-49f6-abb2-90dcc05e9a2d | overcloud-cephstorage-0 | ACTIVE | -          | Running     | ctlplane=192.168.120.140 |
    | 8c6ddb1f-c9a5-4693-87e4-7d8ac587067d | overcloud-cephstorage-1 | ACTIVE | -          | Running     | ctlplane=192.168.120.127 |
    | 18b7d918-13aa-4b9d-af26-68f96c4472db | overcloud-cephstorage-2 | ACTIVE | -          | Running     | ctlplane=192.168.120.126 |
    | 1e0c5d0e-f13d-49cf-81c7-04cefc4d1c86 | overcloud-compute-0     | ACTIVE | -          | Running     | ctlplane=192.168.120.129 |
    | 53ac570d-cf52-45ba-9a7e-f0d14b9d6ab6 | overcloud-compute-1     | ACTIVE | -          | Running     | ctlplane=192.168.120.141 |
    | 127f31bf-6891-4d34-96e6-429f2fa3297c | overcloud-controller-0  | ACTIVE | -          | Running     | ctlplane=192.168.120.139 |
    | ae3e5d17-0544-42fb-9c34-ae9ca6a8d913 | overcloud-controller-1  | ACTIVE | -          | Running     | ctlplane=192.168.120.128 |
    | 0908613d-8847-4e28-b2fd-595c3fdbac17 | overcloud-controller-2  | ACTIVE | -          | Running     | ctlplane=192.168.120.146 |
    +--------------------------------------+-------------------------+--------+------------+-------------+--------------------------+

When I run the 2nd command as printed above, I get a "too few arguments" error. If I include the stack name, I get:

    [osp_admin@director ~]$ heat resource-list overcloud -n 50 --filter 'physical_resource_id=<nova server id 1>;physical_resource_id=<nova server id 2>;physical_resource_id=<etc>'
    WARNING (shell) "heat resource-list" is deprecated, please use "openstack stack resource list" instead
    ERROR: type object 'Resource' has no attribute 'physical_resource_id'

Ah, yep, sorry about that. Can you attach the output of just 'heat resource-list -n 50 overcloud'?

Created attachment 1248856 [details]
heat resource-list output n 50
Created attachment 1248857 [details]
heat resource-list output n 2

Okay - I'm not sure I know why the heat resource-list query fails, but I think I know what will fix it. The 4.2 evm.log shows that the fog error involves this API call:

    /v1/f626a28f15474e07af7734d60999f045/stacks/overcloud-CephStorageNodesPostDeployment-l6ur5gyetwlz-ExtraConfig-k5qw34aixb5g-ExtraDeployments-3e7a3d66tlz7/beb83f53-c7f3-4221-b3e1-caaa338f1ec7/resources

I'm guessing that the recursive resource query fails on that for some reason I don't understand. The good news is that in 4.2 this recursive query goes 50 deep, when in fact we only need to go 2 deep to get the information we need. The resource specified in the failing API call is present when you specify a depth of 50, but *not* when you specify a depth of 2. Would it be possible to try applying this one-line change to your CF 4.2 instance?

https://github.com/ManageIQ/manageiq/pull/13748

If that works, I can try to get it included in 4.2.1.

When is 4.2.1 expected to be released?

It looks like simply replacing heat resource-list with openstack stack resource list should fix the problem. How did it pass QE? Did anybody test CF-4.2 with OSP9 or newer?

Hi Arkady. It's actually not a question of replacing a command; both of those CLI commands are equivalent to the single fog command we're using, and won't cause an error here. We've definitely tested 4.2 on OSP9 and are currently developing against OSP10, and we've never seen this error. However, we can't test every overcloud deployment possibility (although this is an infra provider refresh issue, the error happens when the infra provider is trying to analyze its deployed overcloud), and it looks like this is one that caused an issue. The good news is I'm pretty sure the one-line fix will resolve the issue, but it would be great if we could get some confirmation.

4.2.1 GA is targeted for February 22nd.

I'll test the fix shortly and report back with the refresh results.

I'm still getting the same refresh error with the one-line fix implemented. Here's where I made the change:

    [root@localhost infra_manager]# pwd
    /var/www/miq/vmdb/app/models/manageiq/providers/openstack/infra_manager
    [root@localhost infra_manager]# grep orchestration_service.list_resources refresh_parser.rb
    @orchestration_service.list_resources(:stack => stack, :nested_depth => 2, :physical_resource_id => server_ids).body['resources']

Strange - can you attach evm.log and fog.log?

Created attachment 1248871 [details]
evm.log after one-line change
Created attachment 1248872 [details]
fog.log after one-line change
Created attachment 1248873 [details]
"rpm -qa | grep openstack" output
Detail on our installed packages

Actually, never mind - I figured it out. The Heat filtering option is a fairly recent feature; it was added in Mitaka, which corresponds to OSP9, but perhaps it actually slipped to one release later. To confirm, can you update the line you changed before to the following?

    @orchestration_service.list_resources(:stack => stack, :nested_depth => 2).body['resources']

Sorry for the confusion.

That worked! It seems I had to restart the appliance for it to take effect, however. The refresh was successful after reboot. Is the filtering referenced anywhere else that would potentially cause another issue?

Nope, the filtering only happens in this one place in the infra provider. I've already tested this fix against OSP10 successfully as well, so I'll create a PR and get it into 4.2.1. Thanks for your patience!

*** Bug 1421421 has been marked as a duplicate of this bug. ***

Verified on 5.8.0.3 with RHOS9.

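For anyone reading this later: the fix discussed above drops the server-side `physical_resource_id` filter and keeps only the depth limit. If a caller still only wants the resources backed by known Nova servers, the unfiltered list can be narrowed locally. The following is a minimal sketch under that assumption, not the actual ManageIQ code: `select_server_resources` is a hypothetical helper, the sample hashes only mimic the fields visible in the heat resource-list output earlier in this bug, and the third entry is made up for illustration.

```ruby
require 'set'

# Hypothetical helper (not part of ManageIQ or fog): keep only the Heat
# resources whose physical_resource_id matches one of the Nova server IDs.
def select_server_resources(resources, server_ids)
  wanted = server_ids.to_set
  resources.select { |r| wanted.include?(r['physical_resource_id']) }
end

# Sample data shaped like the rows of the heat resource-list output.
# The last entry is invented so the filter has something to drop.
resources = [
  { 'resource_name'        => 'Controller',
    'physical_resource_id' => 'dc883110-979a-4706-bf87-cd5954383a64',
    'resource_type'        => 'OS::TripleO::Server' },
  { 'resource_name'        => 'NovaCompute',
    'physical_resource_id' => 'fcd55792-2134-4e2f-844a-0511a074d71b',
    'resource_type'        => 'OS::TripleO::Server' },
  { 'resource_name'        => 'Unrelated',
    'physical_resource_id' => '00000000-0000-0000-0000-000000000000',
    'resource_type'        => 'Example::MadeUp' }
]

# Nova server IDs, as they would come from the server list.
server_ids = [
  'fcd55792-2134-4e2f-844a-0511a074d71b',
  'dc883110-979a-4706-bf87-cd5954383a64'
]

matched = select_server_resources(resources, server_ids)
puts matched.map { |r| r['resource_name'] }.inspect
# prints ["Controller", "NovaCompute"]
```

Filtering on the client side avoids depending on whether the deployed Heat version accepts the filter parameter, at the cost of transferring the full (depth-limited) resource list.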
Created attachment 1248676 [details]
EVM log for 4.2 CF appliance

Description of problem:
After successfully adding OSP director as an infrastructure provider (credentials validate successfully), the refresh fails and no infrastructure data is populated. The error given is:

    <Excon::Error::BadRequest: Expected(200) <=> Actual(400 Bad Request) excon.error.response

Attached is the evm_4.2.log with more error details. This issue is not seen in CF 4.1 with the same configuration. Attached is the evm_4.1.log, which refreshes cleanly.

Version-Release number of selected component (if applicable):
CF 4.2

How reproducible:
Refresh is never successful.

Steps to Reproduce:
1. Add OSP director as an infrastructure provider, selecting either AMQP or Ceilometer for Events.
2. Once added, attempt a refresh of the infrastructure to collect the data.

Actual results:
Error in status seen in "Last Refresh"

Expected results:
Refresh status successful

Additional info: