Bug 1767138
Summary: | Excessive CPU usage in ProcessLauncher()'s wait loop | ||
---|---|---|---|
Product: | Red Hat OpenStack | Reporter: | John Eckersberg <jeckersb> |
Component: | python-oslo-service | Assignee: | John Eckersberg <jeckersb> |
Status: | CLOSED EOL | QA Contact: | pkomarov |
Severity: | medium | Docs Contact: | |
Priority: | medium | ||
Version: | 15.0 (Stein) | CC: | hberaud, lmiccini, mburns, michele, stchen |
Target Milestone: | --- | Keywords: | Triaged, ZStream |
Target Release: | --- | ||
Hardware: | x86_64 | ||
OS: | Linux | ||
Last Closed: | 2020-09-30 19:15:53 UTC | Type: | Bug |
Description
John Eckersberg
2019-10-30 18:41:21 UTC
Some baseline measurements on my devstack VM, very consistent:

```
[root@devstack ~]# perf stat -e syscalls:sys_enter_select -e syscalls:sys_enter_wait4 -a -I 1000
#           time    counts unit events
     1.000123502       612      syscalls:sys_enter_select
     1.000123502       491      syscalls:sys_enter_wait4
     2.000979655       617      syscalls:sys_enter_select
     2.000979655       492      syscalls:sys_enter_wait4
     3.001318574       615      syscalls:sys_enter_select
     3.001318574       492      syscalls:sys_enter_wait4
     4.002134374       610      syscalls:sys_enter_select
     4.002134374       491      syscalls:sys_enter_wait4
     5.002278994       616      syscalls:sys_enter_select
     5.002278994       493      syscalls:sys_enter_wait4
     6.002431351       611      syscalls:sys_enter_select
     6.002431351       488      syscalls:sys_enter_wait4
     7.003026036       615      syscalls:sys_enter_select
     7.003026036       493      syscalls:sys_enter_wait4
     8.003501667       613      syscalls:sys_enter_select
     8.003501667       490      syscalls:sys_enter_wait4
     9.004544555       622      syscalls:sys_enter_select
     9.004544555       500      syscalls:sys_enter_wait4
    10.004696272       608      syscalls:sys_enter_select
    10.004696272       485      syscalls:sys_enter_wait4
```

We can do better!

---

The main bit in cotyledon is here:

https://github.com/sileht/cotyledon/blob/master/cotyledon/_utils.py#L114-L203

with _wait_forever being the wait loop.

The main idea is that it sets up a pipe and uses set_wakeup_fd to trigger signal handling via that pipe. Then the wait loop can block forever on select() because the fdset includes the pipe, so signals will properly interrupt the loop. At that point you don't have to loop checking os.waitpid() for child exit; instead you'll catch SIGCHLD when the child terminates, and you can just call os.waitpid() at that time to do the final cleanup.

---

Thanks John for the details, I think I got it.

(In reply to John Eckersberg from comment #2)
> The main bit in cotyledon is here:
>
> https://github.com/sileht/cotyledon/blob/master/cotyledon/_utils.py#L114-L203
>
> with _wait_forever being the wait loop.
>
> The main idea is that it sets up a pipe and uses set_wakeup_fd to trigger
> signal handling via that pipe.

If I'm right, our ProcessLauncher could process signals by using set_wakeup_fd with our existing pipe, `self.writepipe` [1].

> Then the wait loop can block forever on select() because the fdset
> includes the pipe, so signals will properly interrupt the loop. At that
> point you don't have to loop checking os.waitpid() for child exit;
> instead you'll catch SIGCHLD when the child terminates, and you can just
> call os.waitpid() at that time to do the final cleanup.

We could adapt this code [2] to move from `os.waitpid` to `select.select` as cotyledon does; moving to it could also let us drop a few conditions and actions [3].

The major difference between cotyledon and oslo.service on this topic is that cotyledon splits the service management [4] from the signal handling [5], while oslo.service provides all of these features in a single class (`ProcessLauncher`). So it could be cleaner to do the same thing on the oslo side, by creating a SignalHandler class that `ProcessLauncher` inherits from.

The main goal would be to let `ProcessLauncher` manage worker adjustments as cotyledon does [7], by using a supervisor thread for the children [8] to handle changes in the number of workers needed. Indeed, oslo.service currently mixes worker adjustment and the wait loop, so splitting these parts could be valuable to isolate the mechanisms and keep things cleaner.

Thoughts?

[1] https://github.com/openstack/oslo.service/blob/046507db14ad48c6f085faa48f6e7d27eb6d45c1/oslo_service/service.py#L429
[2] https://github.com/openstack/oslo.service/blob/046507db14ad48c6f085faa48f6e7d27eb6d45c1/oslo_service/service.py#L611,L617
[3] https://github.com/openstack/oslo.service/blob/046507db14ad48c6f085faa48f6e7d27eb6d45c1/oslo_service/service.py#L648,L693
[4] https://cotyledon.readthedocs.io/en/latest/api.html#cotyledon.ServiceManager
[5] https://github.com/sileht/cotyledon/blob/master/cotyledon/_utils.py#L114-L203
[6] https://github.com/openstack/oslo.service/blob/046507db14ad48c6f085faa48f6e7d27eb6d45c1/oslo_service/service.py#L409
[7] https://github.com/sileht/cotyledon/blob/bcc34b6883f921e10689ad0cb4d734e4ac4cbfaf/cotyledon/_service_manager.py#L323
[8] https://github.com/sileht/cotyledon/blob/bcc34b6883f921e10689ad0cb4d734e4ac4cbfaf/cotyledon/_service_manager.py#L228

Side note: I never received notifications about this BZ, neither by email nor via the "My bugs" links in Bugzilla... so sorry for my late message on this topic.
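To make the proposal concrete, here is a minimal, self-contained sketch of the self-pipe pattern described above. This is illustrative only: the `SignalWaiter`, `wait_forever`, and `_reap_children` names are hypothetical, modeled on cotyledon's `_wait_forever`, and this is not actual oslo.service or cotyledon code.

```python
import os
import select
import signal


class SignalWaiter:
    """Sketch of a wait loop that blocks in select() instead of polling."""

    def __init__(self):
        # Self-pipe: CPython's C-level signal handler writes each delivered
        # signal's number into writepipe, which makes readpipe readable and
        # wakes up the select() below.
        self.readpipe, self.writepipe = os.pipe()
        os.set_blocking(self.writepipe, False)  # set_wakeup_fd needs this
        signal.set_wakeup_fd(self.writepipe)
        # A Python-level handler must be installed so the C handler runs
        # (and writes the wakeup byte) instead of the default disposition;
        # the handler itself can be a no-op because the pipe does the work.
        for signo in (signal.SIGCHLD, signal.SIGTERM, signal.SIGINT):
            signal.signal(signo, lambda signum, frame: None)

    def wait_forever(self):
        """Block with no timeout and return the first non-SIGCHLD signal."""
        while True:
            # No polling: the process sleeps here until a signal arrives.
            select.select([self.readpipe], [], [])
            # Each byte in the pipe is the number of one delivered signal.
            for signo in os.read(self.readpipe, 4096):
                if signo == signal.SIGCHLD:
                    self._reap_children()
                else:
                    return signo  # e.g. SIGTERM: let the caller shut down

    def _reap_children(self):
        # Runs only when SIGCHLD actually fired, instead of calling
        # waitpid() hundreds of times per second in a polling loop.
        while True:
            try:
                pid, status = os.waitpid(-1, os.WNOHANG)
            except ChildProcessError:
                return  # no children exist at all
            if pid == 0:
                return  # children exist but none have exited yet
            print("reaped child %d, status %d" % (pid, status))


if __name__ == "__main__":
    waiter = SignalWaiter()
    if os.fork() == 0:
        os._exit(0)  # child exits immediately; parent reaps it via SIGCHLD
    print("got signal %d" % waiter.wait_forever())  # Ctrl-C to exit
```

With this pattern the parent makes essentially zero syscalls while idle, versus the roughly 600 select() and 490 wait4() calls per second shown in the perf output above. Note that even if SIGCHLD is delivered before select() is entered, the wakeup byte is already sitting in the pipe, so the loop cannot miss it; that race-freedom is the point of the self-pipe trick.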
---

(In reply to Hervé Beraud from comment #3)
> ...
> while oslo.service provides all of these features in a single class
> (`ProcessLauncher`).
> ...

oslo.service provides all of these features in a single class (`ProcessLauncher`) [6]

[6] https://github.com/openstack/oslo.service/blob/046507db14ad48c6f085faa48f6e7d27eb6d45c1/oslo_service/service.py#L409

---

Closing EOL, OSP 15 has been retired as of Sept 19.