1447859 – Control plane service restart during compute node scaleout

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat OpenStack
Classification:	Red Hat
Component:	puppet-nova
Sub Component:
Version:	11.0 (Ocata)
Hardware:	x86_64
OS:	Linux
Priority:	high
Severity:	high
Target Milestone:	z1
Target Release:	11.0 (Ocata)
Assignee:	Emilien Macchi
QA Contact:	Gurenko Alex
Docs Contact:
URL:
Whiteboard:
Depends On:	1472142
Blocks:	1455175

Reported:	2017-05-04 05:25 UTC by VIKRANT
Modified:	2017-07-19 17:04 UTC
CC List:	8 users
Fixed In Version:	puppet-nova-10.4.0-6.el7ost
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Clones:	1455175
Environment:
Last Closed:	2017-07-19 17:04:56 UTC
Target Upstream Version:
Embargoed:

Attachments
controller sosreport before scaleout (10.08 MB, application/x-xz) 2017-05-06 01:47 UTC, VIKRANT
controller sosreport after scaleout (10.91 MB, application/x-xz) 2017-05-06 01:49 UTC, VIKRANT
compute sosreport before scaleout (8.63 MB, application/x-xz) 2017-05-06 01:49 UTC, VIKRANT
compute sosreport after scaleout (8.77 MB, application/x-xz) 2017-05-06 01:50 UTC, VIKRANT
undercloud (15.85 MB, application/x-xz) 2017-05-06 01:52 UTC, VIKRANT

Links
System	ID	Priority	Status	Summary	Last Updated
Launchpad	1690946	None	None	None	2017-05-17 20:36:55 UTC
OpenStack gerrit	457258	None	MERGED	Move gnocchi wsgi configuration to step 3	2021-01-04 08:19:19 UTC
OpenStack gerrit	457259	None	MERGED	Move ceilometer wsgi to step 3	2021-01-04 08:19:19 UTC
OpenStack gerrit	465980	None	MERGED	Properly handle arrays for enabled_perf_events	2021-01-04 08:19:19 UTC
Red Hat Product Errata	RHBA-2017:1778	normal	SHIPPED_LIVE	Red Hat OpenStack Platform 11.0 director Bug Fix Advisory	2017-07-19 21:01:28 UTC

Description VIKRANT 2017-05-04 05:25:46 UTC

Description of problem:

Services are getting restarted on existing controller and compute nodes during the scaleout of compute nodes. 

~~~
Controller:

Not restarted: neutron, nova, glance, rabbitmq, cinder, swift, ceilometer,
redis, heat

Restarted: httpd, keystone (httpd was restarted, so the keystone restart
was expected)

Compute:

Not restarted: neutron

Restarted: nova-compute
~~~


Version-Release number of selected component (if applicable):
RHEL OSP 11

How reproducible:
Every time.

Steps to Reproduce:
1. Deploy an OpenStack setup with 1 controller and 1 compute node.
2. Capture the sosreport from the overcloud nodes.
3. Scale out from 1 to 2 compute nodes.
4. Capture the sosreport again and compare which services were restarted after the scaleout.
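
The before/after comparison in the steps above can be sketched in Python. This is only a minimal sketch, assuming the start time of each service has already been extracted from the two sosreports into plain dictionaries. The function name `restarted_services` and the sample timestamps are hypothetical and are not taken from the attached reports.

```python
def restarted_services(before, after):
    """Return the sorted names of services whose start time changed.

    Both arguments map a service name to the start time recorded for it.
    Services present in only one snapshot are ignored.
    """
    return sorted(
        name
        for name, started in after.items()
        if name in before and before[name] != started
    )


# Hypothetical sample data, not taken from the attached sosreports.
before = {
    "httpd": "2017-05-04 10:00:00",
    "rabbitmq-server": "2017-05-04 10:00:05",
    "openstack-nova-api": "2017-05-04 10:01:00",
}
after = {
    "httpd": "2017-05-04 15:57:20",
    "rabbitmq-server": "2017-05-04 10:00:05",
    "openstack-nova-api": "2017-05-04 10:01:00",
}

print(restarted_services(before, after))
```

With this sample data only `httpd` is reported, because it is the only service whose start time differs between the two snapshots.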

Actual results:
Services are getting restarted. 

Expected results:
No service should be restarted.

Additional info:

More information is coming in the next comment.

Comment 7 VIKRANT 2017-05-06 01:47:25 UTC

Created attachment 1276727 [details]
controller sosreport before scaleout

Comment 8 VIKRANT 2017-05-06 01:49:09 UTC

Created attachment 1276728 [details]
controller sosreport after scaleout

Comment 9 VIKRANT 2017-05-06 01:49:49 UTC

Created attachment 1276729 [details]
compute sosreport before scaleout

Comment 10 VIKRANT 2017-05-06 01:50:44 UTC

Created attachment 1276730 [details]
compute sosreport after scaleout

Comment 11 VIKRANT 2017-05-06 01:52:14 UTC

Created attachment 1276731 [details]
undercloud

Comment 12 Alex Schultz 2017-05-15 20:02:56 UTC

For the controllers, httpd was restarted because the gnocchi/ceilometer configuration was removed in step 3 and then reapplied in step 4. This needs to be pulled downstream.
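
To illustrate why a configuration that is removed in one step and reapplied in the next causes a restart, here is a toy Python model. It is a sketch under stated assumptions, not the director's or Puppet's actual logic: each step is assumed to manage a set of files and purge the rest, and every file change is assumed to notify httpd. The function `run_steps` and the file names are hypothetical.

```python
def run_steps(on_disk, steps):
    """Apply each step's managed file set in order, purging unmanaged files.

    Returns a list of (step, action, filename) tuples. In this toy model
    every tuple stands for a change that would notify the service.
    """
    events = []
    files = set(on_disk)
    for name, managed in steps:
        for filename in sorted(files - managed):
            events.append((name, "removed", filename))
        for filename in sorted(managed - files):
            events.append((name, "created", filename))
        files = set(managed)
    return events


# Hypothetical file names standing in for the gnocchi/ceilometer wsgi config.
wsgi = {"10-gnocchi_wsgi.conf", "10-ceilometer_wsgi.conf"}

# Config only declared in step 4: step 3 removes it, step 4 recreates it.
declared_in_step4 = run_steps(wsgi, [("step3", set()), ("step4", wsgi)])

# Config already declared in step 3: nothing changes on a rerun.
declared_in_step3 = run_steps(wsgi, [("step3", wsgi), ("step4", wsgi)])

print(declared_in_step4)
print(declared_in_step3)
```

In this model the first scenario yields two removals and two creations on every rerun, while the second yields no events at all. That matches the direction of the linked upstream changes, which move the gnocchi and ceilometer wsgi configuration to step 3.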


For the compute node, the services were restarted because libvirt/enabled_perf_events was reported as 'created'. However, the option is not actually set in the configuration file. We need to track down why this is happening; usually it is caused by an issue with handling '' or []. This may not be fixed yet.
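
The '' or [] handling problem mentioned above can be sketched in Python. This is an illustration of the general idempotency pitfall, not puppet-nova's actual code: the helper names `normalize`, `naive_needs_change` and `needs_change` are hypothetical, a missing option is assumed to be read back as `None`, and the event names are example values only.

```python
def normalize(value):
    """Map None, '' and [] to None ("option not set"); join lists with commas."""
    if isinstance(value, (list, tuple)):
        value = ",".join(str(item) for item in value)
    if value is None or value == "":
        return None
    return str(value)


def naive_needs_change(desired, current):
    # Compares the raw values, so an empty list never equals a missing option.
    return desired != current


def needs_change(desired, current):
    # Compares normalized values, so "empty" and "missing" are the same state.
    return normalize(desired) != normalize(current)


# The option is missing from the file (None) and the desired value is empty.
print(naive_needs_change([], None))             # reports a change on every run
print(needs_change([], None))                   # no change, so no refresh
print(needs_change(["cmt", "mbml"], "cmt,mbml"))
print(needs_change(["cmt"], None))
```

The naive comparison reports a change on every run even though nothing is written to the file, and that spurious change is what would trigger a service refresh. Normalizing empty values first reports a change only when the effective setting differs.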

May  4 15:57:25 host-192-168-24-12 os-collect-config: Notice: /Stage[main]/Nova::Compute::Libvirt/Nova_config[libvirt/enabled_perf_events]/ensure: created
May  4 15:57:25 host-192-168-24-12 os-collect-config: Notice: /Stage[main]/Nova::Deps/Anchor[nova::config::end]: Triggered 'refresh' from 1 events
May  4 15:57:25 host-192-168-24-12 os-collect-config: Notice: /Stage[main]/Nova::Deps/Anchor[nova::service::begin]: Triggered 'refresh' from 1 events
May  4 15:57:25 host-192-168-24-12 os-collect-config: Notice: /Stage[main]/Nova::Compute/Nova::Generic_service[compute]/Service[nova-compute]: Triggered 'refresh' from 1 events
May  4 15:57:25 host-192-168-24-12 os-collect-config: Notice: /Stage[main]/Nova::Deps/Anchor[nova::service::end]: Triggered 'refresh' from 1 events
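
The log excerpt above can be split into the resource that changed and the services that were refreshed as a result. The following Python sketch shows one way to do that. It assumes the Notice line format seen in this excerpt, strips the `#033[...m` colour escapes if they are present, and uses hypothetical names (`summarize`, `ESCAPES`, `NOTICE`, `SERVICE`).

```python
import re

ESCAPES = re.compile(r"#033\[[0-9;]*m")           # syslog-escaped colour codes
NOTICE = re.compile(r"Notice: (\S+): (.*)$")      # resource path, then event
SERVICE = re.compile(r"Service\[([^\]]+)\]$")     # path ending in Service[name]


def summarize(lines):
    """Return (changed, restarted) extracted from Puppet Notice lines.

    changed is a list of (resource, event) tuples for real changes;
    restarted is a list of service names that received a refresh.
    """
    changed, restarted = [], []
    for line in lines:
        match = NOTICE.search(ESCAPES.sub("", line).strip())
        if not match:
            continue
        resource, event = match.groups()
        if event.startswith("Triggered 'refresh'"):
            service = SERVICE.search(resource)
            if service:
                restarted.append(service.group(1))
        else:
            changed.append((resource, event))
    return changed, restarted


sample = [
    "May  4 15:57:25 host-192-168-24-12 os-collect-config: #033[mNotice: "
    "/Stage[main]/Nova::Compute::Libvirt/Nova_config[libvirt/enabled_perf_events]"
    "/ensure: created#033[0m",
    "May  4 15:57:25 host-192-168-24-12 os-collect-config: #033[mNotice: "
    "/Stage[main]/Nova::Compute/Nova::Generic_service[compute]"
    "/Service[nova-compute]: Triggered 'refresh' from 1 events#033[0m",
]

changed, restarted = summarize(sample)
print(changed)
print(restarted)
```

For the two sample lines, `changed` holds the single `Nova_config[libvirt/enabled_perf_events]` entry with the event `created`, and `restarted` holds `nova-compute`.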

Comment 14 Alex Schultz 2017-07-18 19:00:01 UTC

Verified that nova-compute is not restarted during a compute scaleout with puppet-nova-10.4.1-1.el7ost.noarch.

Comment 16 errata-xmlrpc 2017-07-19 17:04:56 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:1778
