Bug 1469170 - Corosync should set priority when set of RR scheduler fails
Summary: Corosync should set priority when set of RR scheduler fails
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: corosync
Version: 7.5
Hardware: All
OS: All
Priority: unspecified
Severity: low
Target Milestone: rc
Target Release: ---
Assignee: Jan Friesse
QA Contact: cluster-qe@redhat.com
URL:
Whiteboard:
Depends On:
Blocks: 1476214
Reported: 2017-07-10 14:50 UTC by Jan Friesse
Modified: 2021-09-09 12:25 UTC (History)

Fixed In Version: corosync-2.4.0-10.el7
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-04-10 16:52:19 UTC
Target Upstream Version:
Embargoed:


Attachments
Proposed patch (6.29 KB, patch)
2017-07-10 14:50 UTC, Jan Friesse


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Knowledge Base (Solution) | 5911451 | 0 | None | None | None | 2021-03-29 09:52:10 UTC
Red Hat Product Errata | RHBA-2018:0920 | 0 | None | None | None | 2018-04-10 16:53:42 UTC

Description Jan Friesse 2017-07-10 14:50:08 UTC
Created attachment 1295850
Proposed patch

Description of problem:
When sched_setscheduler fails to set the RR scheduler for some reason, corosync continues with unchanged priority (that is, like a standard process). This is not optimal because corosync has near-realtime requirements.

We cannot fix the sched_setscheduler failure itself, but we can at least change the nice value so that corosync gets some advantage over other processes.
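
The intended fallback can be sketched roughly as follows. This is only an illustration in Python for Linux, not the attached patch (which changes corosync's C code). The function name `raise_priority`, its injectable `set_scheduler`/`set_nice` parameters, and the returned labels are hypothetical; the nice value of -20 is the lowest value available to a SCHED_OTHER process on Linux.

```python
import os


def raise_priority(pid=0,
                   set_scheduler=os.sched_setscheduler,
                   set_nice=os.setpriority):
    """Try SCHED_RR first; if that fails, fall back to the lowest nice value.

    Returns "rr", "nice", or "unchanged" (labels invented for this sketch).
    pid=0 means the calling process.
    """
    rr_max = os.sched_get_priority_max(os.SCHED_RR)
    try:
        # Preferred path: round-robin realtime scheduling at maximum priority.
        set_scheduler(pid, os.SCHED_RR, os.sched_param(rr_max))
        return "rr"
    except OSError as err:
        print(f"Could not set SCHED_RR at priority {rr_max}: {err}")

    try:
        # Fallback: stay in SCHED_OTHER but ask for the lowest nice value.
        set_nice(os.PRIO_PROCESS, pid, -20)
        return "nice"
    except OSError as err:
        print(f"Could not set nice value: {err}")
        return "unchanged"
```

Both steps normally need root (or the matching capabilities), so an unprivileged caller would most likely end up with "unchanged". The two callables are parameters only so that the failure path can be exercised without actually changing the scheduling of a real process.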

Version-Release number of selected component (if applicable):
Every

How reproducible:
100%

Steps to Reproduce:
Force sched_setscheduler to fail

Actual results:
Nice value is unchanged

Expected results:
Nice value is set to the lowest possible value for SCHED_OTHER

Additional info:
"Unit test"
https://github.com/corosync/corosync/pull/228#issuecomment-313723620

Comment 2 Jan Friesse 2017-07-28 15:46:30 UTC
The "Unit test" may be invalid when https://bugzilla.redhat.com/show_bug.cgi?id=1476214 is also merged. The solution is to run corosync with -R so that it doesn't try to move itself into the root cgroup.

Comment 5 michal novacek 2018-01-16 10:32:07 UTC
I have verified with corosync-2.4.3-1.el7.x86_64 that the NICE value is set to -20 if RTPRIO cannot be set.

----

Common part
===========

> create new rt group
[root@virt-426 ~]# cgcreate -g cpu:test
[root@virt-426 ~]# cgset -r cpu.rt_runtime_us=100000 test
[root@virt-426 ~]# cgget -r cpu.rt_runtime_us test
test:
cpu.rt_runtime_us: 100000


> check that corosync starts in the correct group
[root@virt-426 ~]# cgexec -g cpu:test  corosync
notice  [MAIN  ] Corosync Cluster Engine ('2.4.3'): started and ready to provide service.
info    [MAIN  ] Corosync built-in features: dbus systemd xmlconf qdevices qnetd snmp libcgroup pie relro bindnow
notice  [MAIN  ] Corosync sucesfully moved to root cgroup

[root@virt-426 ~]# ps -T  -O cls,rtprio,pri,ni $(pidof corosync)
  PID CLS RTPRIO PRI  NI S TTY          TIME COMMAND
20334  RR     99 139   - S ?        00:00:05 corosync
20334  RR     99 139   - S ?        00:00:00 corosync

> kill corosync process
[root@virt-426 x86_64]# killall corosync
[root@virt-426 x86_64]# killall corosync
corosync: no process found

Before the patch (corosync-2.4.0-9.el7.x86_64)
==============================================

[root@virt-426 x86_64]# cgset  -r cpu.rt_runtime_us=0 test
[root@virt-426 x86_64]# cgget  -r cpu.rt_runtime_us test
test:
cpu.rt_runtime_us: 0

> run corosync in the changed group
[root@virt-426 x86_64]# cgexec -g cpu:test  corosync
notice  [MAIN  ] Corosync Cluster Engine ('2.4.0'): started and ready to provide service.
info    [MAIN  ] Corosync built-in features: dbus systemd xmlconf qdevices qnetd snmp pie relro bindnow

> RTPRIO not set and NICE not set
[root@virt-426 x86_64]# ps -T -O cls,rtprio,pri,ni $(pidof corosync)
  PID CLS RTPRIO PRI  NI S TTY          TIME COMMAND
 2463  TS      -  19   0 S ?        00:00:00 corosync
 2463  TS      -  19   0 S ?        00:00:00 corosync


After the patch (corosync-2.4.3-1.el7.x86_64)
=============================================

> use "-R" so corosync does not try to move to root group
[root@virt-426 ~]# cgexec -g cpu:test corosync -R
notice  [MAIN  ] Corosync Cluster Engine ('2.4.3'): started and ready to provide service.
info    [MAIN  ] Corosync built-in features: dbus systemd xmlconf qdevices qnetd snmp libcgroup pie relro bindnow
warning [MAIN  ] Could not set SCHED_RR at priority 99: Operation not permitted (1)

> RTPRIO not set but NICE value of corosync changed to -20
[root@virt-426 ~]# ps -T -O cls,rtprio,pri,ni $(pidof corosync)
  PID CLS RTPRIO PRI  NI S TTY          TIME COMMAND
 2013  TS      -  39 -20 S ?        00:00:00 corosync -R
 2013  TS      -  39 -20 S ?        00:00:00 corosync -R
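
The same check that `ps -O cls,ni` performs above could also be scripted. The following is a minimal Python sketch for Linux, not part of the original verification; the helper name `scheduling_summary` and the `POLICY_NAMES` table are hypothetical, with the short class names chosen to match the CLS column printed by ps.

```python
import os

# Short names as shown in the ps CLS column (assumed mapping for this sketch).
POLICY_NAMES = {
    os.SCHED_OTHER: "TS",
    os.SCHED_FIFO: "FF",
    os.SCHED_RR: "RR",
}


def scheduling_summary(pid=0):
    """Return (scheduling class, nice value) for pid; 0 means the caller."""
    policy = os.sched_getscheduler(pid)
    nice = os.getpriority(os.PRIO_PROCESS, pid)
    # Unknown or flagged policies are reported by their raw number.
    return POLICY_NAMES.get(policy, str(policy)), nice


if __name__ == "__main__":
    print(scheduling_summary())
```

Called with the corosync PID in the "after the patch" scenario, this would be expected to report the TS class with a nice value of -20, in line with the ps output above. Unlike `ps -T`, it inspects only the given ID and not every thread of the process.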

Comment 8 errata-xmlrpc 2018-04-10 16:52:19 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0920

