Bug 1897085
| Summary: | spausedd lacks capability to move to root cgroup [RHEL 8] | |||
|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 8 | Reporter: | Reid Wahl <nwahl> | |
| Component: | corosync | Assignee: | Jan Friesse <jfriesse> | |
| Status: | CLOSED ERRATA | QA Contact: | cluster-qe <cluster-qe> | |
| Severity: | medium | Docs Contact: | ||
| Priority: | medium | |||
| Version: | 8.3 | CC: | ccaulfie, cluster-maint, phagara, sfoucek | |
| Target Milestone: | rc | Flags: | pm-rhel: mirror+ | |
| Target Release: | 8.4 | |||
| Hardware: | All | |||
| OS: | Linux | |||
| Whiteboard: | ||||
| Fixed In Version: | corosync-3.1.0-3.el8 | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | ||
| Clone Of: | ||||
| : | 1897087 (view as bug list) | Environment: | ||
| Last Closed: | 2021-05-18 15:26:09 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Embargoed: | ||||
| Bug Depends On: | ||||
| Bug Blocks: | 1897087 | |||
Description
Reid Wahl
2020-11-12 09:43:54 UTC
Patch: https://github.com/jfriesse/spausedd/commit/21836a25

Reproducer is in the first comment.

Before fix (corosync and spausedd are at version 3.1.0-1; bz1896309 isn't fixed, so spausedd will throw SIGSEGV instead of "Can't set SCHED_RR"):

>[root@virt-038 ~]# rpm -q spausedd
>spausedd-3.1.0-1.el8.x86_64
>[root@virt-038 ~]# rpm -q corosync
>corosync-3.1.0-1.el8.x86_64
>[root@virt-038 ~]# cat /etc/systemd/system.conf | grep DefaultCPUAccounting
>#DefaultCPUAccounting=no
>[root@virt-038 ~]# sed -i 's/#DefaultCPUAccounting=no/DefaultCPUAccounting=yes/g' /etc/systemd/system.conf
>[root@virt-038 ~]# cat /etc/systemd/system.conf | grep DefaultCPUAccounting
>DefaultCPUAccounting=yes
>[root@virt-038 ~]# reboot
>Connection to virt-038.cluster-qe.lab.eng.brq.redhat.com closed by remote host.
>Connection to virt-038.cluster-qe.lab.eng.brq.redhat.com closed.
>[root@virt-038 ~]# systemctl start spausedd.service
>[root@virt-038 ~]# systemctl status spausedd.service
>● spausedd.service - Scheduler Pause Detection Daemon
> Loaded: loaded (/usr/lib/systemd/system/spausedd.service; disabled; vendor preset: disabled)
> Active: failed (Result: core-dump) since Fri 2020-12-04 13:01:18 CET; 1min 37s ago
> Docs: man:spausedd
> Process: 2473 ExecStart=/usr/bin/spausedd -D (code=exited, status=0/SUCCESS)
> Main PID: 2489 (code=dumped, signal=SEGV)
> CPU: 42ms
>
>Dec 04 13:01:17 virt-038.cluster-qe.lab.eng.brq.redhat.com systemd[1]: Starting Scheduler Pause Detection Daemon...
>Dec 04 13:01:17 virt-038.cluster-qe.lab.eng.brq.redhat.com systemd[1]: Started Scheduler Pause Detection Daemon.
>Dec 04 13:01:18 virt-038.cluster-qe.lab.eng.brq.redhat.com systemd[1]: spausedd.service: Main process exited, code=dumped, status=11/SEGV
>Dec 04 13:01:18 virt-038.cluster-qe.lab.eng.brq.redhat.com systemd[1]: spausedd.service: Failed with result 'core-dump'.
>Dec 04 13:01:18 virt-038.cluster-qe.lab.eng.brq.redhat.com systemd[1]: spausedd.service: Consumed 42ms CPU time
>[root@virt-038 ~]# spausedd
>Dec 04 13:05:14 spausedd: Segmentation fault (core dumped)

Result: After DefaultCPUAccounting is set to yes and the system is rebooted, spausedd.service fails with SIGSEGV.

After fix:

>[root@virt-489 ~]# rpm -q spausedd
>spausedd-3.1.0-3.el8.x86_64
>[root@virt-489 ~]# rpm -q corosync
>corosync-3.1.0-3.el8.x86_64
>[root@virt-489 ~]# cat /etc/systemd/system.conf | grep DefaultCPUAccounting
>#DefaultCPUAccounting=no
>[root@virt-489 ~]# sed -i 's/#DefaultCPUAccounting=no/DefaultCPUAccounting=yes/g' /etc/systemd/system.conf
>[root@virt-489 ~]# cat /etc/systemd/system.conf | grep DefaultCPUAccounting
>DefaultCPUAccounting=yes
>[root@virt-489 ~]# reboot
>Connection to virt-489.cluster-qe.lab.eng.brq.redhat.com closed by remote host.
>Connection to virt-489.cluster-qe.lab.eng.brq.redhat.com closed.
>[root@virt-489 ~]# systemctl start spausedd.service
>[root@virt-489 ~]# systemctl status spausedd.service
>● spausedd.service - Scheduler Pause Detection Daemon
> Loaded: loaded (/usr/lib/systemd/system/spausedd.service; disabled; vendor preset: disabled)
> Active: active (running) since Fri 2020-12-04 12:49:45 CET; 13min ago
> Docs: man:spausedd
> Process: 2312 ExecStart=/usr/bin/spausedd -D (code=exited, status=0/SUCCESS)
> Main PID: 2313 (spausedd)
> Tasks: 1 (limit: 25573)
> Memory: 1.7M
> CPU: 8ms
> CGroup: /system.slice/spausedd.service
> └─2313 /usr/bin/spausedd -D
>
>Dec 04 12:49:45 virt-489.cluster-qe.lab.eng.brq.redhat.com systemd[1]: Starting Scheduler Pause Detection Daemon...
>Dec 04 12:49:45 virt-489.cluster-qe.lab.eng.brq.redhat.com systemd[1]: Started Scheduler Pause Detection Daemon.
>Dec 04 12:49:45 virt-489.cluster-qe.lab.eng.brq.redhat.com spausedd[2313]: Running main poll loop with maximum timeout 200 and steal threshold 10%
>[root@virt-489 ~]# spausedd
>Dec 04 12:50:03 spausedd: Running main poll loop with maximum timeout 200 and steal threshold 10%
>[root@virt-489 ~]# chrt -p $(pidof spausedd)
>pid 2313's current scheduling policy: SCHED_RR
>pid 2313's current scheduling priority: 99

Result: After DefaultCPUAccounting is set to yes and the system is rebooted, spausedd.service starts normally and runs with the SCHED_RR scheduling policy at priority 99.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (corosync bug fix and enhancement update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:1780
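
For anyone repeating the verification above, the two mechanical steps (enabling DefaultCPUAccounting and checking the resulting scheduling policy) could be wrapped roughly as follows. This is only a sketch: the function names `enable_cpu_accounting` and `is_sched_rr` are hypothetical and are not part of corosync or spausedd, the config path is taken as a parameter so the edit can be tried on a scratch copy first, and `sed -i` assumes GNU sed as shipped on RHEL.

```shell
#!/bin/sh
# Hypothetical helpers; names are illustrative, not part of spausedd/corosync.

# Apply the same substitution used in the transcript to the given file and
# succeed only if the setting is active afterwards.
enable_cpu_accounting() {
    conf="$1"
    sed -i 's/#DefaultCPUAccounting=no/DefaultCPUAccounting=yes/g' "$conf"
    grep -q '^DefaultCPUAccounting=yes$' "$conf"
}

# Read `chrt -p <pid>` output on stdin and succeed if the policy is SCHED_RR.
is_sched_rr() {
    grep -q 'scheduling policy: SCHED_RR'
}

# Try the edit on a scratch copy rather than on /etc/systemd/system.conf.
scratch=$(mktemp)
printf '[Manager]\n#DefaultCPUAccounting=no\n' > "$scratch"
enable_cpu_accounting "$scratch" && echo "CPU accounting enabled in scratch copy"
rm -f "$scratch"
```

On a real node, after pointing `enable_cpu_accounting` at /etc/systemd/system.conf, rebooting, and starting spausedd.service, the policy check would be `chrt -p "$(pidof spausedd)" | is_sched_rr`.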