Bug 1447859 - Control plane service restart during compute node scaleout
Summary: Control plane service restart during compute node scaleout
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: puppet-nova
Version: 11.0 (Ocata)
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: z1
Target Release: 11.0 (Ocata)
Assignee: Emilien Macchi
QA Contact: Gurenko Alex
URL:
Whiteboard:
Depends On: 1472142
Blocks: 1455175
Reported: 2017-05-04 05:25 UTC by VIKRANT
Modified: 2017-07-19 17:04 UTC (History)
CC List: 8 users

Fixed In Version: puppet-nova-10.4.0-6.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1455175
Environment:
Last Closed: 2017-07-19 17:04:56 UTC
Target Upstream Version:
Embargoed:


Attachments
controller sosreport before scaleout (10.08 MB, application/x-xz), 2017-05-06 01:47 UTC, VIKRANT
controller sosreport after scaleout (10.91 MB, application/x-xz), 2017-05-06 01:49 UTC, VIKRANT
compute sosreport before scaleout (8.63 MB, application/x-xz), 2017-05-06 01:49 UTC, VIKRANT
compute sosreport after scaleout (8.77 MB, application/x-xz), 2017-05-06 01:50 UTC, VIKRANT
undercloud (15.85 MB, application/x-xz), 2017-05-06 01:52 UTC, VIKRANT


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Launchpad | 1690946 | 0 | None | None | None | 2017-05-17 20:36:55 UTC
OpenStack gerrit | 457258 | 0 | None | MERGED | Move gnocchi wsgi configuration to step 3 | 2021-01-04 08:19:19 UTC
OpenStack gerrit | 457259 | 0 | None | MERGED | Move ceilometer wsgi to step 3 | 2021-01-04 08:19:19 UTC
OpenStack gerrit | 465980 | 0 | None | MERGED | Properly handle arrays for enabled_perf_events | 2021-01-04 08:19:19 UTC
Red Hat Product Errata | RHBA-2017:1778 | 0 | normal | SHIPPED_LIVE | Red Hat OpenStack Platform 11.0 director Bug Fix Advisory | 2017-07-19 21:01:28 UTC

Description VIKRANT 2017-05-04 05:25:46 UTC
Description of problem:

Services are restarted on the existing controller and compute nodes during a compute node scaleout.

~~~
Controller:

Not restarted: neutron, nova, glance, rabbitmq, cinder, swift, ceilometer,
redis, heat

Restarted: httpd, keystone (the keystone restart was expected because httpd
was restarted)

Compute:

Not restarted: neutron

Restarted: nova-compute
~~~


Version-Release number of selected component (if applicable):
RHEL OSP 11

How reproducible:
Every time.

Steps to Reproduce:
1. Deploy an OpenStack setup with 1 controller and 1 compute node.
2. Capture the sosreport from the overcloud nodes.
3. Scale out from 1 to 2 compute nodes.
4. Capture the sosreport again and compare which services were restarted after the scaleout.
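
The comparison in the last step can be sketched in Python as follows. This is only an illustration, not the method used for this report: it assumes each snapshot has already been reduced to a mapping from service name to the time the service last became active (for example, the ActiveEnterTimestamp property reported by systemctl show). The function name restarted_services and the sample values are hypothetical.

```python
def restarted_services(before, after):
    """Return, sorted, the services whose start timestamp changed.

    Both arguments map a service name to the time it last entered the
    active state. Services missing from either snapshot are ignored.
    """
    return sorted(
        name
        for name in before.keys() & after.keys()
        if before[name] != after[name]
    )


# Hypothetical sample data, not taken from the attached sosreports.
before = {
    "httpd": "Thu 2017-05-04 14:10:02 UTC",
    "rabbitmq-server": "Thu 2017-05-04 14:05:41 UTC",
}
after = {
    "httpd": "Thu 2017-05-04 15:57:30 UTC",
    "rabbitmq-server": "Thu 2017-05-04 14:05:41 UTC",
}
print(restarted_services(before, after))
```

A service whose timestamp differs between the two snapshots was restarted somewhere in between; an unchanged timestamp means it kept running through the scaleout.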

Actual results:
Services are getting restarted. 

Expected results:
No service should be restarted.

Additional info:

More information is coming in the next comment.

Comment 7 VIKRANT 2017-05-06 01:47:25 UTC
Created attachment 1276727 [details]
controller sosreport before scaleout

Comment 8 VIKRANT 2017-05-06 01:49:09 UTC
Created attachment 1276728 [details]
controller sosreport after scaleout

Comment 9 VIKRANT 2017-05-06 01:49:49 UTC
Created attachment 1276729 [details]
compute sosreport before scaleout

Comment 10 VIKRANT 2017-05-06 01:50:44 UTC
Created attachment 1276730 [details]
compute sosreport after scaleout

Comment 11 VIKRANT 2017-05-06 01:52:14 UTC
Created attachment 1276731 [details]
undercloud

Comment 12 Alex Schultz 2017-05-15 20:02:56 UTC
For the controllers, httpd was restarted because the gnocchi/ceilometer configuration was removed in step 3 and then reapplied in step 4. This needs to be pulled downstream.


For the compute node, the services restarted because libvirt/enabled_perf_events was reported as 'created'. However, the option is not actually set in the configuration file. We need to track down why this is happening; usually it is caused by an issue handling '' or []. This may not be fixed yet.
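
The '' versus [] pitfall mentioned here can be illustrated with a small Python sketch. It is not the puppet-nova code and not the actual fix. It only assumes that a setting with no real value should be treated as absent rather than written, so that repeated runs report no change. The names desired_state and apply_setting, the dict-based config, and the sample event names are all hypothetical.

```python
def desired_state(value):
    """Map a raw parameter to ('absent', None) or ('present', text).

    None, '' and empty lists or tuples all mean "nothing to configure".
    A non-empty list is joined with commas; anything else is stringified.
    """
    if value is None:
        return ("absent", None)
    if isinstance(value, (str, list, tuple)) and len(value) == 0:
        return ("absent", None)
    if isinstance(value, (list, tuple)):
        return ("present", ",".join(str(item) for item in value))
    return ("present", str(value))


def apply_setting(config, key, value):
    """Apply the setting to a dict-based config; return True if it changed."""
    ensure, text = desired_state(value)
    if ensure == "absent":
        # Removing a key that is not there is not a change.
        return config.pop(key, None) is not None
    if config.get(key) == text:
        return False
    config[key] = text
    return True


config = {}
key = "libvirt/enabled_perf_events"
print(apply_setting(config, key, []))               # empty list: nothing to write
print(apply_setting(config, key, ["cmt", "mbml"]))  # first real value
print(apply_setting(config, key, ["cmt", "mbml"]))  # same value again
```

In this model only a call that returns True would count as a change event, and a change event is what would notify the dependent service. Treating the empty cases as absent keeps an unset option from producing one.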

May  4 15:57:25 host-192-168-24-12 os-collect-config: Notice: /Stage[main]/Nova::Compute::Libvirt/Nova_config[libvirt/enabled_perf_events]/ensure: created
May  4 15:57:25 host-192-168-24-12 os-collect-config: Notice: /Stage[main]/Nova::Deps/Anchor[nova::config::end]: Triggered 'refresh' from 1 events
May  4 15:57:25 host-192-168-24-12 os-collect-config: Notice: /Stage[main]/Nova::Deps/Anchor[nova::service::begin]: Triggered 'refresh' from 1 events
May  4 15:57:25 host-192-168-24-12 os-collect-config: Notice: /Stage[main]/Nova::Compute/Nova::Generic_service[compute]/Service[nova-compute]: Triggered 'refresh' from 1 events
May  4 15:57:25 host-192-168-24-12 os-collect-config: Notice: /Stage[main]/Nova::Deps/Anchor[nova::service::end]: Triggered 'refresh' from 1 events
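
A log excerpt like the one above can be scanned mechanically to see which resource change preceded which service refresh. The following Python sketch assumes the Puppet notice format shown in these lines. The function name summarize_notices and the regular expressions are illustrative only and are not part of any TripleO tooling.

```python
import re

# Escape-sequence debris as it can appear in syslog, e.g. "#033[m" or "#033[0m".
ANSI_DEBRIS = re.compile(r"#033\[[0-9;]*m")
# Resource path (no whitespace) followed by ": " and the notice message.
NOTICE = re.compile(r"Notice: (/\S+): (.+)$")
# A path that ends in a Service resource, capturing the service name.
SERVICE = re.compile(r"/Service\[([^\]]+)\]$")


def summarize_notices(lines):
    """Split Puppet notices into changed resources and refreshed services."""
    changed, refreshed = [], []
    for line in lines:
        match = NOTICE.search(ANSI_DEBRIS.sub("", line).rstrip())
        if not match:
            continue
        path, message = match.groups()
        if message.startswith("Triggered 'refresh'"):
            service = SERVICE.search(path)
            if service:
                refreshed.append(service.group(1))
        else:
            changed.append(f"{path}: {message}")
    return changed, refreshed


# Two of the lines quoted in this comment, with the raw escape debris kept.
sample = [
    "May  4 15:57:25 host-192-168-24-12 os-collect-config: #033[mNotice: "
    "/Stage[main]/Nova::Compute::Libvirt/Nova_config[libvirt/enabled_perf_events]"
    "/ensure: created#033[0m",
    "May  4 15:57:25 host-192-168-24-12 os-collect-config: #033[mNotice: "
    "/Stage[main]/Nova::Compute/Nova::Generic_service[compute]"
    "/Service[nova-compute]: Triggered 'refresh' from 1 events#033[0m",
]
changed, refreshed = summarize_notices(sample)
print(changed)
print(refreshed)
```

Anchor refreshes are skipped on purpose, so the result lists only real resource changes and the services that were refreshed.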

Comment 14 Alex Schultz 2017-07-18 19:00:01 UTC
Verified nova-compute is not restarted on a compute scaleout action with puppet-nova-10.4.1-1.el7ost.noarch

Comment 16 errata-xmlrpc 2017-07-19 17:04:56 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:1778

