Description of problem:
When an OpenStack Cloud Provider has no Swift or Cinder service, we get another "[NoMethodError]: undefined method `name'" error. This leads to CF entering a retry loop and punishing OpenStack with a mini-DoS attack of constant refreshes instead of the hourly background refresh.

Version-Release number of selected component (if applicable):
cfme-5.7.1.3-1.el7cf.x86_64

How reproducible:
Always, for an OpenStack provider that does not have service endpoints for Swift or Cinder.

Steps to Reproduce:
1. Remove the endpoints for Swift or Cinder on OSP.
2. Make the OSP a Cloud Provider in CF.
3. Monitor the evm.log for failed Fog connections and "[NoMethodError]: undefined method `name'".

Actual results:
Functionality is mostly not affected, but the constant "provider refresh" calls will eventually bring OSP to its knees.

Speculation: maybe the insanely high number of keystone connections that OSP tracks from CF is a result of CF not realising it needs to close connections that were partially successful? We will see if there is a corresponding change with the fix for this issue.

Expected results:
CF should only attempt to refresh the state of endpoints that actually exist.

Additional info:

======== snippet of evm.log from cfme-5.7.1.3-1.el7cf.x86_64 ====================
[----] E, [2017-06-01T16:22:40.680885 #27279:3a113c] ERROR -- : MIQ(ManageIQ::Providers::StorageManager::SwiftManager::Refresher#refresh) EMS: [quartz.cbr.lab Swift Manager], id: [24000000000006] Refresh failed
[----] E, [2017-06-01T16:22:40.681628 #27279:3a113c] ERROR -- : [NoMethodError]: undefined method `name' for nil:NilClass Method:[rescue in block in refresh]
[----] E, [2017-06-01T16:22:40.681772 #27279:3a113c] ERROR -- : /var/www/miq/vmdb/app/models/manageiq/providers/openstack/cloud_manager.rb:80:in `swift_service'
/var/www/miq/vmdb/app/models/manageiq/providers/storage_manager/swift_manager/refresh_parser.rb:19:in `initialize'
/var/www/miq/vmdb/app/models/manageiq/providers/storage_manager/swift_manager/refresh_parser.rb:9:in `new'
/var/www/miq/vmdb/app/models/manageiq/providers/storage_manager/swift_manager/refresh_parser.rb:9:in `ems_inv_to_hashes'
/var/www/miq/vmdb/app/models/manageiq/providers/storage_manager/swift_manager/refresher.rb:6:in `parse_legacy_inventory'
/var/www/miq/vmdb/app/models/ems_refresh/refreshers/ems_refresher_mixin.rb:122:in `block in parse_targeted_inventory'

After applying the following patches, courtesy of Jerry Keselman, there are no more errors for Swift and Cinder refresh. The normal "info" level messages can be seen for each service being refreshed. Not sure if Cinder and Swift should be showing up at all, but the patch makes it better than before.

I am creating this bugzilla after the event so the changes can be appropriately tracked. I have tested the patch and confirm that it works. I will depend on Jerry to determine whether the SwiftManager and CinderManager log lines below are appropriate.

*********** Below are the patches from Jerry ********************

1) Delete the Openstack provider from your CloudForms appliance and wait for all instances, etc. to be removed from the configuration.

2) Shut down the Evm service on your appliance (rake evm:stop).

3) Apply these three changes (note that the line numbers may not be the same, but the method names and all surrounding lines should be consistent):

diff --git a/app/models/manageiq/providers/openstack/cloud_manager.rb b/app/models/manageiq/providers/openstack/cloud_manager.rb
index b580dc4..1df6a1d 100644
--- a/app/models/manageiq/providers/openstack/cloud_manager.rb
+++ b/app/models/manageiq/providers/openstack/cloud_manager.rb
@@ -99,12 +99,12 @@ class ManageIQ::Providers::Openstack::CloudManager < ManageIQ::Providers::CloudM
   def cinder_service
     vs = openstack_handle.detect_volume_service
-    vs.name == :cinder ? vs : nil
+    vs && vs.name == :cinder ? vs : nil
   end
   def swift_service
     vs = openstack_handle.detect_storage_service
-    vs.name == :swift ? vs : nil
+    vs && vs.name == :swift ? vs : nil
   end

git diff swift_manager_mixin.rb
diff --git a/app/models/mixins/swift_manager_mixin.rb b/app/models/mixins/swift_manager_mixin.rb
index 8bfc7df..44ad40c 100644
--- a/app/models/mixins/swift_manager_mixin.rb
+++ b/app/models/mixins/swift_manager_mixin.rb
@@ -16,6 +16,7 @@ module SwiftManagerMixin
   private
   def ensure_swift_managers
+    return false unless swift_service
     created = ensure_swift_manager
     swift_manager.name = "#{name} Swift Manager"
     swift_manager.zone_id = zone_id

git diff cinder_manager_mixin.rb
diff --git a/app/models/mixins/cinder_manager_mixin.rb b/app/models/mixins/cinder_manager_mixin.rb
index 96dd1c4..cadc8a7 100644
--- a/app/models/mixins/cinder_manager_mixin.rb
+++ b/app/models/mixins/cinder_manager_mixin.rb
@@ -19,6 +19,7 @@ module CinderManagerMixin
   private
   def ensure_cinder_managers
+    return false unless cinder_service
     created = ensure_cinder_manager
     cinder_manager.name = "#{name} Cinder Manager"
     cinder_manager.zone_id = zone_id

4) Restart the evm service (rake evm:start).

5) Add the Openstack Provider back into your CF configuration.

If all goes well, the Cinder and Swift Managers will not be added along with it. If they are added, however, it would be great if you could send me the relevant evm.log.
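
For readers who want to see the effect of the guard in isolation, here is a minimal sketch outside of ManageIQ. `FakeHandle` and `Service` are hypothetical stand-ins for the real OpenStack handle and service objects, and it assumes (as the stack trace suggests) that the detect call returns nil when no endpoint exists. Only the guard expressions themselves are taken from the patch.

```ruby
# Hypothetical stand-in for a detected service object.
Service = Struct.new(:name)

# Hypothetical stand-in for openstack_handle.
class FakeHandle
  def initialize(volume_service)
    @volume_service = volume_service
  end

  # Assumed behaviour: returns nil when no volume endpoint is registered.
  def detect_volume_service
    @volume_service
  end
end

# Pre-patch logic: raises NoMethodError when the service is nil.
def cinder_service_unguarded(handle)
  vs = handle.detect_volume_service
  vs.name == :cinder ? vs : nil
end

# Patched logic: the nil check short-circuits before .name is called.
def cinder_service_guarded(handle)
  vs = handle.detect_volume_service
  vs && vs.name == :cinder ? vs : nil
end

# Mirrors the early return added to ensure_cinder_managers.
def ensure_cinder_managers(handle)
  return false unless cinder_service_guarded(handle)
  true # the real method goes on to create and name the manager
end
```

Since `&&` binds tighter than the ternary operator, the guarded expression is evaluated as `(vs && vs.name == :cinder) ? vs : nil`, so no extra parentheses are needed.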

********** Below is the evm.log capture showing the remaining SwiftManager and CinderManager events after successfully applying the patches ****************************************

[----] I, [2017-06-06T09:47:41.608577 #12888:93d140] INFO -- : Q-task_id([log_status]) MIQ(ManageIQ::Providers::Openstack::NetworkManager::RefreshWorker#log_status) [Refresh Worker for Providers: Users Network Manager] Worker ID [24000000016081], PID [23867], GUID [cb4cf116-4a39-11e7-915e-fa163ee73436], Last Heartbeat [2017-06-05 23:47:30 UTC], Process Info: Memory Usage [403431424], Memory Size [799236096], Proportional Set Size: [243303000], Memory % [2.42], CPU Time [681.0], CPU % [0.01], Priority [27]
[----] I, [2017-06-06T09:47:41.609413 #12888:93d140] INFO -- : Q-task_id([log_status]) MIQ(ManageIQ::Providers::StorageManager::CinderManager::RefreshWorker#log_status) [Refresh Worker for Providers: Users Cinder Manager] Worker ID [24000000016082], PID [23876], GUID [cb525aca-4a39-11e7-915e-fa163ee73436], Last Heartbeat [2017-06-05 23:47:24 UTC], Process Info: Memory Usage [399781888], Memory Size [799236096], Proportional Set Size: [226746000], Memory % [2.4], CPU Time [362.0], CPU % [0.01], Priority [27]
[----] I, [2017-06-06T09:47:41.610260 #12888:93d140] INFO -- : Q-task_id([log_status]) MIQ(ManageIQ::Providers::StorageManager::SwiftManager::RefreshWorker#log_status) [Refresh Worker for Providers: Users Swift Manager] Worker ID [24000000016083], PID [23886], GUID [cb596856-4a39-11e7-915e-fa163ee73436], Last Heartbeat [2017-06-05 23:47:23 UTC], Process Info: Memory Usage [399777792], Memory Size [798183424], Proportional Set Size: [226222000], Memory % [2.4], CPU Time [333.0], CPU % [0.01], Priority [27]
[----] I, [2017-06-06T09:47:41.610475 #12888:93d140] INFO -- : Q-task_id([log_status]) MIQ(MiqReportingWorker#log_status) [Reporting Worker] Worker ID [24000000015216], PID [12952], GUID [25aaddec-49f6-11e7-915e-fa163ee73436], Last Heartbeat [2017-06-05 23:47:28 UTC], Process Info: Memory Usage [310841344], Memory Size [678096896], Proportional Set Size: [200126000], Memory % [1.87], CPU Time [1903.0], CPU % [0.03], Priority [27]

Maybe I reported too soon? There are *no* instances logged for CinderManager causing errors. However, there are two (2) different stack traces in relation to SwiftManager failing to perform a refresh against an OSP instance with no endpoints for Swift or Cinder. I have pasted snips of the 2 stack traces I thought were interesting and will attach the complete evm.log to this ticket.

[----] E, [2017-06-06T12:08:47.505432 #13306:93d140] ERROR -- : MIQ(ManageIQ::Providers::StorageManager::SwiftManager::Refresher#refresh) EMS: [Users Swift Manager], id: [24000000000014] Refresh failed
[----] E, [2017-06-06T12:08:47.506328 #13306:93d140] ERROR -- : [NoMethodError]: undefined method `each' for nil:NilClass Method:[rescue in block in refresh]
[----] E, [2017-06-06T12:08:47.506443 #13306:93d140] ERROR -- : /var/www/miq/vmdb/app/models/manageiq/providers/storage_manager/swift_manager/refresh_parser/cross_linkers/openstack.rb:14:in `cross_link'
/var/www/miq/vmdb/app/models/manageiq/providers/storage_manager/swift_manager/refresh_parser/cross_linkers.rb:13:in `cross_link'
/var/www/miq/vmdb/app/models/manageiq/providers/storage_manager/swift_manager/refresh_parser.rb:30:in `ems_inv_to_hashes'
/var/www/miq/vmdb/app/models/manageiq/providers/storage_manager/swift_manager/refresh_parser.rb:9:in `ems_inv_to_hashes'
/var/www/miq/vmdb/app/models/manageiq/providers/storage_manager/swift_manager/refresher.rb:6:in `parse_legacy_inventory'
/var/www/miq/vmdb/app/models/ems_refresh/refreshers/ems_refresher_mixin.rb:122:in `block in parse_targeted_inventory'
...
[----] E, [2017-06-06T12:08:47.506493 #13306:93d140] ERROR -- : MIQ(ManageIQ::Providers::StorageManager::SwiftManager::Refresher#refresh) EMS: [Users Swift Manager], id: [24000000000014] Unable to perform refresh for the following targets:
[----] E, [2017-06-06T12:08:47.506545 #13306:93d140] ERROR -- : MIQ(ManageIQ::Providers::StorageManager::SwiftManager::Refresher#refresh) --- ManageIQ::Providers::StorageManager::SwiftManager [Users Swift Manager] id [24000000000014]
[----] I, [2017-06-06T12:08:47.517491 #13306:93d140] INFO -- : MIQ(ManageIQ::Providers::StorageManager::SwiftManager::Refresher#refresh) Refreshing all targets...Complete
[----] E, [2017-06-06T12:08:47.517615 #13306:93d140] ERROR -- : MIQ(MiqQueue#deliver) Message id: [24000000301633], Error: [undefined method `each' for nil:NilClass]
[----] E, [2017-06-06T12:08:47.517703 #13306:93d140] ERROR -- : [EmsRefresh::Refreshers::EmsRefresherMixin::PartialRefreshError]: undefined method `each' for nil:NilClass Method:[rescue in deliver]
[----] E, [2017-06-06T12:08:47.517809 #13306:93d140] ERROR -- : /var/www/miq/vmdb/app/models/ems_refresh/refreshers/ems_refresher_mixin.rb:50:in `refresh'
/var/www/miq/vmdb/app/models/manageiq/providers/base_manager/refresher.rb:10:in `refresh'
/var/www/miq/vmdb/app/models/ems_refresh.rb:91:in `block in refresh'
/var/www/miq/vmdb/app/models/ems_refresh.rb:90:in `each'
/var/www/miq/vmdb/app/models/ems_refresh.rb:90:in `refresh'
/var/www/miq/vmdb/app/models/miq_queue.rb:347:in `block in deliver'
/opt/rh/rh-ruby23/root/usr/share/ruby/timeout.rb:91:in `block in timeout'
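
The traces above show `each` being called on nil inside `cross_link`. As a generic illustration only (this is not the code from the cross linker, and `container_names` and its hash layout are hypothetical), a collection that may be nil can be normalised with `Array()` before iterating, so a missing inventory yields an empty result instead of a NoMethodError:

```ruby
# Hypothetical helper: collects the :name of each entry in a collection
# that may be nil. Array(nil) returns [], so the block is simply skipped.
def container_names(containers)
  Array(containers).map { |container| container[:name] }
end
```

Note that `Array()` turns a bare Hash into an array of key/value pairs, so this pattern suits values that are either nil or already an array.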
Assigning to Jerry Keselman after a private gitter conversation with him.
Please assess the impact of this issue and update the severity accordingly. Please refer to https://bugzilla.redhat.com/page.cgi?id=fields.html#bug_severity for a reminder on each severity's definition. If it's something like a tracker bug where it doesn't matter, please set it to Low/Low.
I don't have the system or time to work on this for free any more. I don't know why this ticket has come back to me as I thought I had provided all requested information. Happy for it to be closed if it can't be worked on without me.
The errors in Comment 3 are a duplicate of BZ https://bugzilla.redhat.com/show_bug.cgi?id=1538501, which was fixed by this PR: https://github.com/ManageIQ/manageiq/pull/16922. I am going to add a PR for the fixes suggested in Comment 2, which should resolve this.
https://github.com/ManageIQ/manageiq/pull/17067
https://github.com/ManageIQ/manageiq-providers-openstack/pull/240
New commit detected on ManageIQ/manageiq/master:
https://github.com/ManageIQ/manageiq/commit/73f42d4b321dccdb1ba4a44f067784d36fda465e

commit 73f42d4b321dccdb1ba4a44f067784d36fda465e
Author:     Jerry Keselman <jkeselma>
AuthorDate: Wed Feb 28 11:26:14 2018 -0500
Commit:     Jerry Keselman <jkeselma>
CommitDate: Wed Feb 28 11:26:14 2018 -0500

    Fail Cinder/Swift Ensures if Service not Present

    Return false for cinder and swift ensure calls if the service is not present.

    Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1458959

    Requires a fix to the Openstack provider as well although order of merges is irrelevant.

 app/models/mixins/cinder_manager_mixin.rb | 1 +
 app/models/mixins/swift_manager_mixin.rb  | 1 +
 2 files changed, 2 insertions(+)
New commit detected on ManageIQ/manageiq-providers-openstack/master:
https://github.com/ManageIQ/manageiq-providers-openstack/commit/44b23b6959ddfab9095342b9d5041f979a5e4ad6

commit 44b23b6959ddfab9095342b9d5041f979a5e4ad6
Author:     Jerry Keselman <jkeselma>
AuthorDate: Wed Feb 28 11:42:07 2018 -0500
Commit:     Jerry Keselman <jkeselma>
CommitDate: Wed Feb 28 11:42:07 2018 -0500

    Dont return Storage Services if They arent present

    Check if the cinder and swift services are actually present before checking the names and returning them.

    Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1458959

 app/models/manageiq/providers/openstack/cloud_manager.rb | 4 +-
 1 file changed, 2 insertions(+), 2 deletions(-)
Verified on 5.10.0.27