Created attachment 1242573
dmesg log file
Description of problem:
Virsh hangs when running cyclictest on a multi-vcpu machine.
The test was run for 24 hours.
Other issues found:
A number of processes, such as tuned, were stopped.
Version-Release number of selected component (if applicable):
How reproducible:
If related to BZ https://bugzilla.redhat.com/show_bug.cgi?id=1403265, then pretty frequent.
Steps to Reproduce:
1. Create an RT environment
2. Create an RT guest
Maybe related to this BZ: https://bugzilla.redhat.com/show_bug.cgi?id=1403265
By multi-vcpu, I meant that the machine has a guest with several real-time CPUs.
*Very initial* debugging seems to show that systemd executes the following code path in the kernel:
Then, two things happen:
1. It blocks for good in wait_for_completion()
2. Because it took the cgroups global lock before blocking, everyone who tries to acquire that lock blocks too. That's why we have a bunch of processes blocked (kworkers, libvirtd, systemd-journal, etc.).
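The blocking pattern described in the two points above can be illustrated with a small user-space analogy. This is only a sketch, not the kernel code path: `run_demo`, `holder`, `waiter` and the variable names are hypothetical, a Python `threading.Lock` stands in for the cgroups global lock, and a `threading.Event` that is never set during the test stands in for the completion that never arrives.

```python
import threading


def run_demo(timeout=0.2):
    """User-space analogy of the hang: one thread holds a global lock while
    waiting on a completion, so every other thread that needs the lock stalls."""
    global_lock = threading.Lock()    # stands in for the cgroups global lock
    completion = threading.Event()    # stands in for the awaited completion
    lock_taken = threading.Event()    # signals that the holder owns the lock
    results = {}

    def holder():
        # Take the lock first, then block, mirroring the order in the report.
        with global_lock:
            lock_taken.set()
            completion.wait()         # analogue of wait_for_completion()

    def waiter(name):
        # A real task would block forever; a timeout keeps the demo finite.
        acquired = global_lock.acquire(timeout=timeout)
        results[name] = acquired
        if acquired:
            global_lock.release()

    holder_thread = threading.Thread(target=holder, daemon=True)
    holder_thread.start()
    lock_taken.wait()                 # make sure the holder really has the lock

    names = ("kworker", "libvirtd", "systemd-journal")
    waiters = [threading.Thread(target=waiter, args=(n,)) for n in names]
    for t in waiters:
        t.start()
    for t in waiters:
        t.join()

    # Fire the completion only after the waiters gave up, so the demo exits.
    completion.set()
    holder_thread.join()
    return results


if __name__ == "__main__":
    print(run_demo())
```

In this sketch every waiter fails to get the lock within the timeout, which corresponds to the pile of blocked processes seen in the dmesg log. In the kernel there is no timeout, so the tasks stay blocked.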
I'll keep investigating...
It seems that this issue happens as a result of bug 1403265 triggering first. In that case this BZ might be a duplicate.
I'll focus on getting bug 1403265 fixed first and will get back to this afterwards.
I've confirmed that this issue only happens as a result of bug 1403265 triggering first. Closing as a dupe.
*** This bug has been marked as a duplicate of bug 1403265 ***
I have no permission on 'bug 1403265'. Which version of redhat kernel fixed this issue? @Luiz Capitulino
(In reply to wbs9399 from comment #7)
> I have no permission on 'bug 1403265'. Which version of redhat kernel fixed
> this issue? @Luiz Capitulino
This was fixed a while back, in August 2017, in kernel-rt-3.10.0-693.rt56.617.el7. If you install the most recent kernel-rt release, you will get this fix plus all the bug and security fixes released since then.