Bug 1476214 - running docker containers prevents processes to use real-time scheduling when restarted
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: corosync
Version: 7.4
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Jan Friesse
QA Contact: Marian Krcmarik
URL:
Whiteboard:
Depends On: 1467919 1469170
Blocks: 1415556 1477461
Reported: 2017-07-28 10:14 UTC by Chris Jones
Modified: 2018-04-10 16:53 UTC
CC List: 28 users

Fixed In Version: corosync-2.4.0-10.el7
Doc Type: Bug Fix
Doc Text:
Previously, when the corosync service was started or restarted after systemd had enabled CPU Accounting, corosync was not able to run with Real Time (RT) scheduling priority, which could reduce the stability of the High Availability (HA) cluster. This update moves corosync to the root CPU cgroup by default, and now corosync can run with Real Time priority, as expected.
Clone Of: 1467919
Clones: 1477461
Environment:
Last Closed: 2018-04-10 16:52:19 UTC
Target Upstream Version:
Embargoed:
igkioka: needinfo-


Attachments
Proposed patch (8.20 KB, patch)
2017-07-28 15:22 UTC, Jan Friesse
Proposed patch v2 - upstream (8.43 KB, patch)
2017-08-01 12:44 UTC, Jan Friesse
Proposed patch v2 - upstream (8.43 KB, patch)
2017-08-01 12:45 UTC, Jan Friesse


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHBA-2018:0920 | 0 | None | None | None | 2018-04-10 16:53:42 UTC

Comment 2 Jan Friesse 2017-07-28 15:20:54 UTC
@Chris
I've implemented a "workaround" that just moves corosync to the root cpu cgroup on start, using libcgroup.

Scratch build is https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=13742381

I've tested it by using CPUAccounting=True in one of the unit files (not the corosync one), but it would be nice if you could test it with the docker use case.

Comment 3 Jan Friesse 2017-07-28 15:22:27 UTC
Created attachment 1305986
Proposed patch

main: Add support for libcgroup

When corosync is started in an environment where it ends up in a cgroup
without a properly set rt_runtime_us, it's impossible to get RT priority.

The already implemented workaround is to use a higher non-RT priority.

This patch implements another solution. It moves corosync into the root cpu
cgroup. The root cpu cgroup hopefully has enough RT budget.

Another solution was mentioned on the ML
https://lists.freedesktop.org/archives/systemd-devel/2017-July/039353.html
but this would mean generating some "random" values.
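
To see which cpu cgroup a process has ended up in, one can read /proc/<pid>/cgroup. The following is a minimal diagnostic sketch, not part of the patch (the patch itself uses libcgroup from C). It assumes the cgroup v1 line format "hierarchy-ID:controller-list:cgroup-path"; the function names are made up for illustration.

```python
from typing import Optional


def cpu_cgroup_path(proc_cgroup_text: str) -> Optional[str]:
    """Return the cgroup path of the 'cpu' controller, or None if absent.

    Each cgroup v1 line in /proc/<pid>/cgroup has the form
    'hierarchy-ID:controller-list:cgroup-path'.
    """
    for line in proc_cgroup_text.splitlines():
        parts = line.strip().split(":", 2)
        if len(parts) != 3:
            continue  # skip malformed or empty lines
        controllers = parts[1].split(",")
        if "cpu" in controllers:
            return parts[2]
    return None


def in_root_cpu_cgroup(proc_cgroup_text: str) -> bool:
    """True when the process sits in the root cpu cgroup ('/')."""
    return cpu_cgroup_path(proc_cgroup_text) == "/"


if __name__ == "__main__":
    # Inspect the current process. On a non-Linux or cgroup v2-only host
    # the file may be missing or contain no 'cpu' line, so None is printed.
    try:
        with open("/proc/self/cgroup") as fh:
            text = fh.read()
    except OSError:
        text = ""
    print(cpu_cgroup_path(text))
```

With the patch applied, one would expect in_root_cpu_cgroup() to be true for the contents of /proc/$(pidof corosync)/cgroup; without it, the path would presumably point at a slice or service cgroup.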

Comment 4 Jan Friesse 2017-07-28 15:29:47 UTC
What I've tested ("Unit test"):
- Install httpd
- Copy httpd.service into /etc/systemd/system and add a "CPUAccounting=True" line to the [Service] section
- systemctl daemon-reload
- service httpd restart
- service corosync restart

(Before patch):
- corosync should have standard priority (no RT)

(After patch):
- Install updated corosync
- service corosync restart
- corosync should have RT priority
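
The "has RT priority" check in the steps above can also be done programmatically. This is a rough sketch, assuming Linux and Python 3; it uses os.sched_getscheduler and os.sched_getparam from the standard library, and the helper names (policy_name, is_realtime, describe) are hypothetical.

```python
import os

# Linux scheduling policy numbers (from <sched.h>).
POLICY_NAMES = {
    0: "SCHED_OTHER",
    1: "SCHED_FIFO",
    2: "SCHED_RR",
    3: "SCHED_BATCH",
    5: "SCHED_IDLE",
    6: "SCHED_DEADLINE",
}
REALTIME_POLICIES = {1, 2}  # SCHED_FIFO and SCHED_RR
SCHED_RESET_ON_FORK = 0x40000000  # flag that may be OR-ed into the policy


def policy_name(policy: int) -> str:
    """Map a policy number to its name, ignoring the reset-on-fork flag."""
    base = policy & ~SCHED_RESET_ON_FORK
    return POLICY_NAMES.get(base, "UNKNOWN(%d)" % base)


def is_realtime(policy: int) -> bool:
    """True for SCHED_FIFO and SCHED_RR, the two RT policies."""
    return (policy & ~SCHED_RESET_ON_FORK) in REALTIME_POLICIES


def describe(pid: int) -> str:
    """Describe the scheduling of a process (Linux only; 0 = this process)."""
    policy = os.sched_getscheduler(pid)
    priority = os.sched_getparam(pid).sched_priority
    kind = "RT" if is_realtime(policy) else "no RT"
    return "%s priority %d (%s)" % (policy_name(policy), priority, kind)
```

Calling describe() with the corosync PID should report an RT policy after the patch and SCHED_OTHER before it, matching the expectations listed above.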

Comment 5 Chris Jones 2017-07-28 15:54:14 UTC
Scratch build tested in two different scenarios (by poki and rasca) and worked in both :)

Comment 6 Raoul Scarazzini 2017-07-28 15:59:39 UTC
@Jan I tested the new package and it does what we need.

With the corosync shipped in osp12:

[root@overcloud-controller-0 ~]# rpm -qa corosync
corosync-2.4.0-9.el7.x86_64
[root@overcloud-controller-0 ~]# ps -eo pid,class,rtprio,command --sort=+class | grep [c]orosync
 20229 RR      99 corosync
[root@overcloud-controller-0 ~]# systemctl restart corosync
[root@overcloud-controller-0 ~]# ps -eo pid,class,rtprio,command --sort=+class | grep [c]orosync
191635 TS       - corosync

So the scheduler was changed. With the new corosync package:

[root@overcloud-controller-1 ~]# rpm -Uvh /home/heat-admin/corosync*
Preparing...                          ################################# [100%]
Updating / installing...
   1:corosynclib-2.4.0-9.el7.jf1      ################################# [ 25%]
   2:corosync-2.4.0-9.el7.jf1         ################################# [ 50%]
Cleaning up / removing...
   3:corosynclib-2.4.0-9.el7          ################################# [ 75%]
   4:corosync-2.4.0-9.el7             ################################# [100%]
[root@overcloud-controller-1 ~]# ps -eo pid,class,rtprio,command --sort=+class | grep [c]orosync
 19985 RR      99 corosync
[root@overcloud-controller-1 ~]# systemctl restart corosync
[root@overcloud-controller-1 ~]# ps -eo pid,class,rtprio,command --sort=+class | grep [c]orosync
 11204 RR      99 corosync

So the scheduler has been kept.
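
For anyone who wants to automate this comparison, here is a small sketch that parses the `ps -eo pid,class,rtprio,command` rows shown above. It assumes the four-column layout from this comment; PsRow, parse_ps_rows and has_rt_class are hypothetical names, not part of any tool.

```python
from typing import List, NamedTuple, Optional


class PsRow(NamedTuple):
    pid: int
    sched_class: str        # e.g. "TS" (time sharing), "RR", "FF"
    rtprio: Optional[int]   # None when ps prints "-"
    command: str


def parse_ps_rows(output: str) -> List[PsRow]:
    """Parse rows printed by `ps -eo pid,class,rtprio,command`."""
    rows = []
    for line in output.splitlines():
        fields = line.split(None, 3)
        if len(fields) < 4 or not fields[0].isdigit():
            continue  # skip blank lines and the header row
        pid, sched_class, rtprio, command = fields
        rows.append(PsRow(int(pid), sched_class,
                          None if rtprio == "-" else int(rtprio), command))
    return rows


def has_rt_class(row: PsRow) -> bool:
    """True when ps reports a real-time class (RR or FF)."""
    return row.sched_class in ("RR", "FF")
```

Fed the two outputs from the old package, this gives an RR row with rtprio 99 before the restart and a TS row with no rtprio after it.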

Comment 8 Jan Friesse 2017-08-01 12:44:46 UTC
Created attachment 1307550
Proposed patch v2 - upstream

Comment 9 Jan Friesse 2017-08-01 12:45:39 UTC
Created attachment 1307551
Proposed patch v2 - upstream

Comment 12 Chris Jones 2017-08-18 11:10:18 UTC
Can I suggest that the doctext be:

Previously, if corosync was started (or restarted) after systemd had enabled CPU Accounting, corosync would not be able to run with Real Time scheduling priority, which could reduce the stability of the High Availability (HA) cluster. This update moves corosync to the root CPU cgroup by default, allowing it to obtain Real Time priority.

Comment 14 Jan Friesse 2017-08-18 14:29:31 UTC
Yep, Chris's description sounds much better.

Comment 16 pkomarov 2018-02-15 11:53:29 UTC
Verified.

# rpm -qa | grep corosync
corosync-2.4.0-10.el7.x86_64
corosynclib-2.4.0-10.el7.x86_64

# systemctl is-active docker
active

# docker pull centos
Using default tag: latest
Trying to pull repository registry.access.redhat.com/centos ... 
Trying to pull repository docker.io/library/centos ... 
latest: Pulling from docker.io/library/centos
af4b0a2388c6: Pull complete 
Digest: sha256:2671f7a3eea36ce43609e9fe7435ade83094291055f1c96d9d1d1d7c0b986a5d

# docker run -it centos /bin/true

# pcs cluster stop
Stopping Cluster (pacemaker)...
Stopping Cluster (corosync)...
# pcs cluster start
Starting Cluster...

# chrt -p $(pidof corosync)
pid 461983's current scheduling policy: SCHED_RR
pid 461983's current scheduling priority: 99

Comment 19 errata-xmlrpc 2018-04-10 16:52:19 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0920

