Red Hat Bugzilla – Bug 1356112
sosreport runs glance-manage db_version and heat-manage db_version commands on compute nodes
Last modified: 2018-04-10 14:01:23 EDT
Description of problem:
sosreport runs the glance-manage db_version and heat-manage db_version commands on compute nodes. Each command hangs for a couple of minutes because non-controller nodes are not configured for these services.

During a sosreport run:

[root@overcloud-compute-0 heat-admin]# ps axu | grep glance
root 9282 0.0 0.0 8584 648 pts/1 S 12:27 0:00 timeout 300s glance-manage db_version
root 9283 2.4 1.2 312976 76896 pts/1 S 12:27 0:01 /usr/bin/python2 /usr/bin/glance-manage db_version
root 9428 0.0 0.0 112648 972 pts/2 S+ 12:28 0:00 grep --color=auto glance
[root@overcloud-compute-0 heat-admin]# ps axu | grep heat
root 9548 0.0 0.0 8584 652 pts/1 S 12:29 0:00 timeout 300s heat-manage db_version
root 9549 3.9 0.8 312172 52804 pts/1 S 12:29 0:01 /usr/bin/python /usr/bin/heat-manage db_version
root 9631 0.0 0.0 112648 972 pts/2 S+ 12:29 0:00 grep --color=auto heat

Version-Release number of selected component (if applicable):
sos-3.2-36.el7ost.2.noarch

How reproducible:
100%

Steps to Reproduce:
1. Run sosreport on a non-controller node.

Actual results:
The heat and glance db_version commands are running during the sosreport run.

Expected results:
No db_version commands are run, as the services are not configured to know about the database.
(showing the glance issue; the heat one is equivalent):

/usr/lib/python2.7/site-packages/sos/plugins/openstack_glance.py :

class OpenStackGlance(Plugin):
    """OpenStack Glance"""
    plugin_name = "openstack_glance"
    profiles = ('openstack', 'openstack_controller')

    option_list = []

    def setup(self):
        # Glance
        self.add_cmd_output(
            "glance-manage db_version",
            suggest_filename="glance_db_version"
        )

The "profiles = " line causes the plugin to run on a system with either the openstack or openstack_controller package installed. Then "glance-manage db_version" is called every time.

What should trigger this command? Only the presence of openstack_controller?

The other commands and logs from this plugin should be collected on both controller and compute nodes, right? Only this command should not be called on a compute node (where openstack_controller is missing, I guess)?
(In reply to Pavel Moravec from comment #4)
Hello, could you please comment here (I forgot to raise needinfo here)?
(In reply to Pavel Moravec from comment #6)
I think that with the new composable roles architecture we should trigger component-related commands based on whether the service is running on a particular node. We can no longer assume that a node is an OpenStack controller, because the controller services can be split across different nodes. For instance, in the case of glance we should check the openstack-glance-api service status and run the glance-manage db_version command only on nodes where it is running.
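The check proposed above could be sketched roughly as follows. This is a standalone illustration, not the sos plugin API and not the code from the eventual fix: the function names service_is_active and commands_to_run are hypothetical, and it assumes a systemd host where "systemctl is-active --quiet UNIT" exits with 0 only when the unit is active.

```python
import subprocess


def service_is_active(unit, run=subprocess.call):
    """Return True if systemd reports the unit as active.

    Assumption: `systemctl is-active --quiet UNIT` exits with 0 only
    when the unit is active. `run` is injectable so the check can be
    exercised without a real systemd.
    """
    try:
        return run(["systemctl", "is-active", "--quiet", unit]) == 0
    except OSError:
        # systemctl is not available: treat the service as not running.
        return False


def commands_to_run(unit, commands, run=subprocess.call):
    """Return the commands only when the unit is active, else an empty list."""
    if service_is_active(unit, run=run):
        return list(commands)
    return []
```

On a compute node where openstack-glance-api is not active, commands_to_run("openstack-glance-api", ["glance-manage db_version"]) would return an empty list, so the db_version call would never be started.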
Just to confirm the change:

- An additional check should be added for whether "service openstack-glance-api status" (or "service openstack-heat-api status") returns 0.

- If the check returns nonzero, must only the "[heat|glance]-manage db_version" command be skipped, or the whole plug-in as well? I.e. when the service isn't running on the particular node, shall we collect stuff like:

  /var/log/glance/*.log
  /etc/glance/*
  output of "openstack image list --long"

  (for glance; similar stuff applies to heat)

If the service isn't running on the particular node, should we or shouldn't we collect the above configs/logs/command output?
(In reply to Pavel Moravec from comment #8)
Bouncing needinfo, esp. if the BZ should be targeted to RHEL 7.5 (with devel freeze bit approaching already).
(In reply to Pavel Moravec from comment #8)
Sorry for the delay. To answer your question, we need to:

1. Check the service status (service, systemctl, etc.).
2. If the service is running, run all commands and collection related to that particular service. If it is not running, do not run the commands/collection related to the service.
I also asked whether it makes sense to stop collecting logs and configs when the service is not running. After clarification with Lee and mschuppert, we agreed it makes sense to keep collecting them (i.e. to troubleshoot cases where the service fails to start). See the linked upstream PR.
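The agreed behaviour can be summarised with a small sketch for glance. It is an illustration only and may differ from the linked upstream PR: the function name plan_glance_collection and the returned dictionary layout are hypothetical, the paths and commands are the ones mentioned earlier in this bug, and treating "openstack image list --long" as gated by the service state is an assumption.

```python
def plan_glance_collection(service_running):
    """Decide what to collect for glance on this node.

    Logs and configs are always collected, so that a service which
    fails to start can still be troubleshot. Commands are only run
    when the service is running.
    """
    files = ["/etc/glance/*", "/var/log/glance/*.log"]
    commands = []
    if service_running:
        # Assumption: both commands depend on a working glance service.
        commands = ["glance-manage db_version", "openstack image list --long"]
    return {"files": files, "commands": commands}
```

With service_running set to False, as on a compute node, the plan still lists the log and config paths but contains no commands, which avoids the hanging db_version call.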
Fixed via sos 3.5 rebase.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2018:0963