Back to bug 2024347
| Who | When | What | Removed | Added |
|---|---|---|---|---|
| Red Hat Bugzilla | 2021-11-17 21:57:51 UTC | Pool ID | sst_pt_gcc_glibc_rhel_9 | |
| Red Hat One Jira (issues.redhat.com) | 2021-11-17 22:01:23 UTC | Link ID | Red Hat Issue Tracker RHELPLAN-103091 | |
| Florian Weimer | 2021-11-18 10:18:56 UTC | Doc Type | --- | If docs needed, set a value |
| Florian Weimer | 2021-12-06 20:14:20 UTC | Type | Bug | Enhancement |
| | | Keywords | FutureFeature, Triaged | |
| Jeremy Linton (ARM) | 2021-12-09 21:33:23 UTC | Blocks | 1877135 | |
| Florian Weimer | 2021-12-09 23:14:28 UTC | Depends On | 2030872 | |
| | | Keywords | Patch | |
| | | Doc Type | If docs needed, set a value | Enhancement |
| | | Hardware | aarch64 | All |
| | | Doc Text | Feature: sched_getcpu implementation using rseq (restartable sequences) Reason: sched_getcpu is implemented in terms of the getcpu system call on AArch64, which is very slow for typical usage of sched_getcpu in parallel algorithms. On other architectures, vDSO acceleration is already used, but the vDSO uses a special register which might stall execution. Result: sched_getcpu performance on AArch64 is significantly improved. Other architectures see a slight improvement. | |
| | | Summary | Please improve sched_getcpu performance to avoid syscalls | glibc: Implement sched_getcpu using rseq |
| | | Flags | needinfo?(jlinton) | |
| Martin Cermak | 2021-12-09 23:30:06 UTC | QA Contact | qe-baseos-tools-bugs | skolosov |
| Florian Weimer | 2021-12-16 20:12:30 UTC | Depends On | 2033446 | |
| Florian Weimer | 2022-01-14 19:52:13 UTC | Summary | glibc: Implement sched_getcpu using rseq | glibc: Implement optional sched_getcpu using rseq |
| | | Assignee | glibc-bugzilla | fweimer |
| | | Status | NEW | ASSIGNED |
| Florian Weimer | 2022-01-19 11:29:53 UTC | Summary | glibc: Implement optional sched_getcpu using rseq | glibc: Optional sched_getcpu acceleration using rseq |
| Florian Weimer | 2022-01-20 20:35:15 UTC | Fixed In Version | glibc-2.34-19.el9 | |
| | | Status | ASSIGNED | MODIFIED |
| errata-xmlrpc | 2022-01-31 21:03:34 UTC | Status | MODIFIED | ON_QA |
| Sergey Kolosov | 2022-02-03 20:43:02 UTC | Status | ON_QA | VERIFIED |
| Florian Weimer | 2022-04-11 04:55:37 UTC | Docs Contact | mtimar | |
| | | CC | mtimar | |
| | | Doc Text | Feature: sched_getcpu implementation using rseq (restartable sequences) Reason: sched_getcpu is implemented in terms of the getcpu system call on AArch64 | .`sched_getcpu` implementation now uses `rseq` (restartable sequences) to achieve improved performance on AArch64 and other architectures Standard implementation of `sched_getcpu` on AArch64 uses `getcpu` system call |
| | | Doc Text | , which is very slow for typical usage of sched_getcpu in parallel algorithms. On other architectures, vDSO acceleration is already used | , which is very slow when called in parallel algorithms. Other architectures use `vDSO` acceleration to get around this. Implementing `sched_getcpu` using `rseq` greatly improves performance on AArch64 architectures. Other architectures see a slight |
| | | Doc Text | , but the vDSO uses a special register which might stall execution. Result: sched_getcpu performance on AArch64 is significantly improved. Other architectures see a slight improvement. | improvement. |
| | | Flags | needinfo?(fweimer) | |
| | | Flags | needinfo?(fweimer) | |
| Jeremy Linton | 2022-04-25 15:17:31 UTC | Flags | needinfo?(fweimer) | |
| | | CC | jeremy.linton | |
| Jeremy Linton | 2022-05-11 17:00:25 UTC | Flags | needinfo?(jeremy.linton) | |
| | | Flags | needinfo?(jeremy.linton) | |
| Florian Weimer | 2022-05-13 09:28:16 UTC | Flags | needinfo?(fweimer) | |
| Florian Weimer | 2022-05-13 09:28:32 UTC | Flags | needinfo?(jlinton) | |
| Florian Weimer | 2022-05-13 15:20:24 UTC | Depends On | 2033446 | |
| errata-xmlrpc | 2022-05-17 00:33:35 UTC | Status | VERIFIED | RELEASE_PENDING |
| errata-xmlrpc | 2022-05-17 15:48:51 UTC | Resolution | --- | ERRATA |
| | | Status | RELEASE_PENDING | CLOSED |
| | | Last Closed | 2022-05-17 15:48:51 UTC | |
| errata-xmlrpc | 2022-05-17 15:49:17 UTC | Link ID | Red Hat Product Errata RHBA-2022:3917 | |
| Gabi Fialová | 2022-06-09 08:00:31 UTC | CC | gfialova | |
| | | Flags | needinfo?(mtimar) | |
| Gabi Fialová | 2022-06-20 13:01:26 UTC | Doc Text | .`sched_getcpu` implementation now uses `rseq` (restartable sequences) to achieve improved performance on AArch64 and other architectures Standard implementation of `sched_getcpu` on AArch64 uses `getcpu` system call | .`sched_getcpu` implementation now uses `rseq` (restartable sequences) to achieve improved performance on the 64-bit ARM architectures and other architectures Standard implementation of `sched_getcpu` on the 64-bit ARM architectures uses `getcpu` |
| | | Doc Text | , which is very slow when called in parallel algorithms. Other architectures use `vDSO` acceleration to get around this. Implementing `sched_getcpu` using `rseq` greatly improves performance on AArch64 architectures. Other architectures see a slight | system call |
| | | Doc Text | improvement. | , which is very slow when called in parallel algorithms. Other architectures use `vDSO` acceleration to get around this. Implementing `sched_getcpu` using `rseq` greatly improves performance on the 64-bit ARM architectures. Other architectures see a |
| | | Doc Text | slight improvement. | |
| | | Flags | needinfo?(fweimer) | |
| Florian Weimer | 2022-06-20 14:22:43 UTC | Flags | needinfo?(fweimer) | |
| Jacob Taylor Valdez | 2022-06-21 07:48:05 UTC | CC | jvaldez | |
| | | Flags | needinfo?(fweimer) | |
| Florian Weimer | 2022-06-21 08:00:08 UTC | Flags | needinfo?(fweimer) | |
| Jacob Taylor Valdez | 2022-06-21 08:39:48 UTC | Doc Text | .`sched_getcpu` implementation now uses `rseq` (restartable sequences) to achieve improved performance on the 64-bit ARM architectures and other architectures Standard implementation of `sched_getcpu` on the 64-bit ARM architectures uses `getcpu` system call, which is very slow when called in parallel algorithms. Other architectures use `vDSO` acceleration to get around this. Implementing `sched_getcpu` using `rseq` greatly improves performance on the 64-bit ARM architectures. Other architectures see a slight improvement. | .`sched_getcpu` implementation can now, optionally, use `rseq` (restartable sequences) to improve performance on the 64-bit ARM architectures and other architectures The previous implementation of `sched_getcpu` on the 64-bit ARM architectures uses the `getcpu` system call, which is too slow for efficient use in most parallel algorithms. Other architectures use vDSO (virtual dynamic shared object) acceleration to work around this. Implementing `sched_getcpu` using `rseq` greatly improves performance on the 64-bit ARM architectures. Other architectures see a slight improvement. To configure `sched_getcpu` to use `rseq`, set the `GLIBC_TUNABLES=glibc.pthread.rseq=1` environment variable: ---- # GLIBC_TUNABLES=glibc.pthread.rseq=1 # export GLIBC_TUNABLES ---- |
| Florian Weimer | 2022-06-22 08:14:02 UTC | Flags | needinfo?(mtimar) | |
| Mark O'Brien | 2023-07-18 14:29:19 UTC | Pool ID | sst_pt_glibc_rhel_9 | sst_pt_libraries_rhel_9 |
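
The final Doc Text revision above describes enabling the optional `rseq` path through the `GLIBC_TUNABLES` environment variable. A minimal sketch of that setting in a POSIX shell follows. It assumes the leading `#` characters in the Doc Text snippet are root shell prompts rather than comments. The child `sh -c` invocation is only an illustrative stand-in for a real program and is not part of the original text.

```shell
# Set the tunable named in the Doc Text for this shell session.
GLIBC_TUNABLES=glibc.pthread.rseq=1
# Export it so that processes started from this shell inherit the setting.
export GLIBC_TUNABLES

# Illustrative check (assumption: any child process stands in for the real
# workload): print the value as seen from a child shell.
sh -c 'printf "%s\n" "$GLIBC_TUNABLES"'
```

The variable is read when a process starts, so it presumably has to be set before launching the program that calls `sched_getcpu`. Exporting it in one shell does not affect processes that are already running.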