Bug 1956237
| Summary: | Running runqlat from bcc-tools hangs the system | ||
|---|---|---|---|
| Product: | [Fedora] Fedora | Reporter: | Marko Myllynen <myllynen> |
| Component: | bcc | Assignee: | Jiri Olsa <jolsa> |
| Status: | CLOSED DUPLICATE | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 34 | CC: | acaringi, agerstmayr, jmarchan, jolsa, rdossant, skozina |
| Target Milestone: | --- | ||
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2021-05-03 15:44:57 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
|
Description
Marko Myllynen
2021-05-03 09:30:37 UTC
Cannot reproduce with kernel-5.11.17-300:
$: uname -r
5.11.17-300.fc34.x86_64
$: rpm -q bcc-tools
bcc-tools-0.18.0-4.fc34.x86_64
# /usr/share/bcc/tools/runqlat
...
3 warnings generated.
Tracing run queue latency... Hit Ctrl-C to end.
^C
usecs : count distribution
0 -> 1 : 0 | |
2 -> 3 : 0 | |
4 -> 7 : 3 |* |
8 -> 15 : 40 |************************ |
16 -> 31 : 66 |****************************************|
32 -> 63 : 9 |***** |
64 -> 127 : 15 |********* |
128 -> 255 : 7 |**** |
256 -> 511 : 0 | |
512 -> 1023 : 0 | |
1024 -> 2047 : 1 | |
2048 -> 4095 : 1 | |
4096 -> 8191 : 0 | |
8192 -> 16383 : 1 | |
16384 -> 32767 : 3 |* |
32768 -> 65535 : 4 |** |
65536 -> 131071 : 1 | |
131072 -> 262143 : 2 |* |
Thanks for looking into this. I now installed a new Fedora 34 Server VM (on otherwise idle RHEL 7 host) using all defaults and tried .12/.16/.17 kernels. This happens with all those kernel versions here occasionally. Changing the VM CPU doesn't seem to have any notable effect on the frequency of the issue. My testing procedure is: 1) Force off the VM 2) Power on the VM 3) Login via console as root 4) Run 'sync' 5) Run /usr/share/bcc/tools/runqlat After rebooting if the command works, it seems to work several times in row. However, after that the system seems to get stuck on shutdown. If it fails, it gets stuck after "3 warnings generated" or after "Hit Ctrl-C" lines printed. If booting without "rhgb quiet" boot parameters sometimes I see audit messages on consoles about ~100 messages suppressed and backlog limit exceeded. When booting with "audit=0" then it seems runqlat works reliably on two different F34 VMs and the system does not hang on power down. Does this help explaining what might be going on here? Thanks. It's a known deadlock issue in the kernel. AFAIK, it's still being worked on upstream. *** This bug has been marked as a duplicate of bug 1938312 *** |