Bug 1388528 - KVM-RT: halting and starting guests cause latency spikes [rhel-rt-7.3.z]
Summary: KVM-RT: halting and starting guests cause latency spikes [rhel-rt-7.3.z]
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: kernel-rt
Version: 7.2
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Clark Williams
QA Contact: Pei Zhang
URL:
Whiteboard:
Depends On: 1378172
Blocks: 1353018
 
Reported: 2016-10-25 14:53 UTC by Marcel Kolaja
Modified: 2016-12-06 17:10 UTC
CC List: 21 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: synchronize_rcu_expedited() is a call used upstream to increase the priority of RCU synchronize operations. Consequence: Calling this may hold off realtime operations and cause latency spikes. Fix: Make the call to synchronize_rcu_expedited() conditional on not being in an RT kernel. Result: No latency spikes caused by the RCU expedited call.
Clone Of: 1378172
Environment:
Last Closed: 2016-12-06 17:10:29 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)


Links
System ID: Red Hat Product Errata RHBA-2016:2883
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: kernel-rt bug fix update
Last Updated: 2016-12-06 22:01:07 UTC

Description Marcel Kolaja 2016-10-25 14:53:24 UTC
This bug has been copied from bug #1378172 and has been proposed
to be backported to 7.3 z-stream (EUS).

Comment 5 Pei Zhang 2016-11-15 10:10:19 UTC
Hi Clark,

QE failed to reproduce this issue with the rhel7.3 GA version.

The scenarios below were tested, but QE still failed to reproduce the issue; there were no spikes in the testing (the max latencies were < 20):
1. Run cyclictest on vm1 for 15m; reboot/halt/shutdown vm2 5 minutes later
2. Run cyclictest on vm1, vm2, and vm3 for 15m; reboot vm2 several times


However, QE can reproduce this issue with the rhel7.2.z version (3.10.0-327.36.1.rt56.237.el7.x86_64).


We also tested with this bug's fixed version, kernel-rt-3.10.0-514.1.1.rt56.422.el7; no spikes occurred.


Could you give QE some suggestions about this bug?  Thanks.


Best Regards,
-Pei

Comment 6 Pei Zhang 2016-11-15 10:15:45 UTC
The rhel7.3GA version we tested: 3.10.0-514.rt56.420.el7.x86_64

Comment 7 Luiz Capitulino 2016-11-15 14:46:24 UTC
The rhel7.3 GA kernel does have the bug; I think it's just a matter of trying harder to reproduce it.

What you could do is:

1. Run cyclictest for longer (e.g., 1 hour)

2. The second VM should keep rebooting in a loop while cyclictest runs on the other VM

Comment 8 Luiz Capitulino 2016-11-15 15:03:28 UTC
Another note: make sure that the VM that reboots is a "standard" VM, meaning that it has a NIC, etc.

The best way is probably to install it with virt-install and don't change the XML.
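
As a rough sketch only (the guest name, memory, disk size, and install source below are placeholders I am assuming, not values from this bug), such a default-style guest could be created with something like:

    # Hypothetical example: adjust the name, resources, and install source.
    virt-install \
        --name vm2 \
        --memory 2048 \
        --vcpus 2 \
        --disk size=10 \
        --location http://example.com/rhel7/os/ \
        --os-variant rhel7 \
        --graphics none \
        --extra-args 'console=ttyS0'

The point is simply to keep the libvirt XML as virt-install generates it, so the guest has the usual devices (NIC, console, and so on).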

Comment 9 Luiz Capitulino 2016-11-16 18:21:34 UTC
I talked to Pei Zhang today on IRC and I think we have found out why the problem is not reproducing. As it turns out, the bug reproduces on halt and re-start, not on reboots (as I mentioned in bug 1378172 comment 22). Sorry for having forgotten about that.

The reproducer I've been using is:

1. Install a "standard" VM with virt-manager (that is, don't change the XML)

2. In the VM, add "halt -p" to /etc/rc.d/rc.local (save a snapshot before doing this if you plan to use the VM afterwards)

3. In the host, write a script that does "virsh start VM" every few seconds in a loop

Then, while this is running, run the cyclictest test case in the RT VM.
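
A minimal sketch of step 3, assuming the guest is named "vm2" and is restarted roughly every 10 seconds (both the name and the interval are placeholders, not values from this bug); the guest powers itself back off via the "halt -p" entry added in step 2:

    # Hypothetical host-side loop: keep restarting the self-halting guest.
    while true; do
        virsh start vm2 2>/dev/null   # fails harmlessly if the guest is still shutting down
        sleep 10
    done

While that loop runs, cyclictest is run in the RT VM; an invocation along the lines of "cyclictest -m -p95 -D 1h" would match the one-hour duration used in comment 10, though the exact options of the KVM-RT test case are not given here.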

Comment 10 Pei Zhang 2016-11-17 10:50:01 UTC
Thanks, Luiz, for providing the detailed reproduction method.

==Reproduce==
Versions:
RHEL7.3GA version: 3.10.0-514.rt56.420.el7.x86_64

Steps:
Same as Comment 9, and run the cyclictest test in the RT VM for 1 hour.

Results:
# Min Latencies: 00003
# Avg Latencies: 00005
# Max Latencies: 00033

The max latency 33 > 20, so this bug has been reproduced.


==Verification==
Versions:
3.10.0-514.1.1.rt56.422.el7.x86_64

Steps:
Same as the reproduce steps above.

Results:
# Min Latencies: 00003
# Avg Latencies: 00005
# Max Latencies: 00011


So this bug has been fixed.

Comment 11 Pei Zhang 2016-11-17 10:51:04 UTC
Setting this bug to 'VERIFIED' per Comment 10.

Comment 12 Luiz Capitulino 2016-11-17 13:22:05 UTC
Thanks for insisting on having a reproducer, Pei Zhang!

Comment 14 errata-xmlrpc 2016-12-06 17:10:29 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-2883.html

