Bug 1332563 - tuned-profiles-nfv: accommodate new ktimersoftd thread
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: tuned
Version: 7.4
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assigned To: Jaroslav Škarvada
QA Contact: Tereza Cerna
Keywords: Patch, Upstream, ZStream
Depends On:
Blocks: kvm-rt-tuned 1400961 1273048 1440663
Reported: 2016-05-03 09:20 EDT by Luiz Capitulino
Modified: 2017-08-01 08:32 EDT (History)
CC: 20 users

See Also:
Fixed In Version: tuned-2.8.0-1.el7
Doc Type: Enhancement
Doc Text:
The priority of the "ktimersoftd" and "ksoftirqd" kernel threads has been increased, which improves Real Time kernel performance when using the tuned service.
Story Points: ---
Clone Of:
: 1440663
Environment:
Last Closed: 2017-08-01 08:32:51 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Description Luiz Capitulino 2016-05-03 09:20:32 EDT
Description of problem:

The RHEL7.3 RT kernel has a new per-CPU kernel thread called ktimersoftd. This thread has fifo:1 priority, which is the same priority we assign to KVM's vCPU threads.

This is how the KVM-RT per-CPU threads look today:

   167  FF  99  [posixcputmr/15] *
   164  FF  99  [migration/15] *
   163  FF   3  [rcuc/15] *
   166  FF   2  [ksoftirqd/15] *
  3272  FF   1  qemu-kvm *
   165  FF   1  [ktimersoftd/15] *

Maybe we should bump rcuc, ksoftirqd and ktimersoftd by 1.

Version-Release number of selected component (if applicable): 3.10.0-382.rt56.261.el7.x86_64
Comment 1 Luiz Capitulino 2016-05-09 10:25:13 EDT
I've bumped the RT prio for per-cpu threads in the following way (in comparison to what I posted in the description):

   167  FF  99  [posixcputmr/15] *
   164  FF  99  [migration/15] *
   163  FF   4  [rcuc/15] *
   166  FF   3  [ksoftirqd/15] *
   165  FF   2  [ktimersoftd/15] *

And the qemu-kvm thread keeps fifo:1.
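The manual bump described above can be sketched as a small dry-run script. This is my illustration, not part of the bug report: `print_bump_cmds` is a hypothetical helper, the CPU number and fifo priorities (rcuc=4, ksoftirqd=3, ktimersoftd=2) come from the listing above, and the echoed `chrt`/`pgrep` commands would need root and an RT kernel to actually run.

```shell
#!/bin/sh
# Hypothetical helper: print (without executing) the chrt commands that
# would reproduce the manual priority bump above for one isolated CPU.
print_bump_cmds() {
    cpu=$1
    # name:prio pairs taken from the per-CPU thread listing in this comment
    for entry in rcuc:4 ksoftirqd:3 ktimersoftd:2; do
        name=${entry%:*}
        prio=${entry#*:}
        printf 'chrt -f -p %s $(pgrep -x %s/%s)\n' "$prio" "$name" "$cpu"
    done
}

print_bump_cmds 15
```

Running the printed commands as root would pin each kernel thread to SCHED_FIFO at the given priority, leaving qemu-kvm at fifo:1.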

Before this change I was seeing spikes of max=40us in the multiple-VMs test case. After this change, I got max=21us (which is considered good). This seems to indicate that the change is good, but more testing is necessary.

However, I don't know what the relationship between ksoftirqd and ktimersoftd is, so I don't know whether, for example, they should have the same priority.
Comment 2 Clark Williams 2016-05-11 11:02:35 EDT
Luiz,

We should take a look at who is raising the timer softirq. If it is something that's part of our KVM-RT stack, we may want the timer softirq to have a higher priority than the default ksoftirqd thread.
Comment 3 Luiz Capitulino 2016-05-11 11:13:45 EDT
Oh, I raised it myself by hand :)

The KVM-RT tuned profile is responsible for the settings I showed in the description (rcuc fifo:3 and ksoftirqd fifo:2). The priorities in comment 1 were set by hand for testing.

I can test with ktimersoftd > ksoftirqd and, if that works fine, post a patch for the KVM-RT profile to make this the default.
Comment 4 Beth Uptagrafft 2016-06-15 13:48:25 EDT
Luiz, any updates on this BZ?
Comment 5 Luiz Capitulino 2016-06-15 14:04:42 EDT
Not yet. What's pending here is running a 24-hour test to confirm that raising the ktimersoftd thread priority won't cause any regressions.

I haven't done this yet for three reasons:

 - I was triggering bug 1328890 in my test run (this is now fixed)
 - I'm triggering a new issue where my VMs lose networking (I'm debugging this now)
 - PTO time (I may have more to come)

So, as the last two items are still in progress, I may not have an update for the next week or so.
Comment 6 Luiz Capitulino 2016-08-31 10:55:50 EDT
Jaroslav,

Before I start, let me say that it's totally my fault that this BZ fell through the cracks. But as it turns out, we need it for 7.3.

We have to change the realtime-virtual-host profile so that the kernel thread priorities on an isolated core look like this:

   135  FF  99  [posixcputmr/13] *
   132  FF  99  [migration/13] *
   131  FF   4  [rcuc/13] *
   133  FF   3  [ktimersoftd/13] *
   134  FF   2  [ksoftirqd/13] *

This is needed to accommodate the new ktimersoftd thread, otherwise its default priority will conflict with the vCPU thread priority on an isolated core.

Here's the change we have to make:

--- tuned.conf.orig     2016-08-31 08:21:50.978757302 -0400
+++ tuned.conf  2016-08-31 09:34:55.618890005 -0400
@@ -33,10 +33,13 @@ isolated_cores_expanded=${f:cpulist_unpa
 group.ksoftirqd=0:f:2:*:ksoftirqd.*
 
 # for i in `pgrep rcuc` ; do grep Cpus_allowed_list /proc/$i/status ; done
-group.rcuc=0:f:3:*:rcuc.*
+group.rcuc=0:f:4:*:rcuc.*
 
 # for i in `pgrep rcub` ; do grep Cpus_allowed_list /proc/$i/status ; done
-group.rcub=0:f:3:*:rcub.*
+group.rcub=0:f:4:*:rcub.*
+
+# for i in `pgrep ktimersoftd` ; do grep Cpus_allowed_list /proc/$i/status ; done
+group.ktimersoftd=0:f:3:*:ktimersoftd.*
 
 [script]
 script=script.sh
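For reference, each `group.<name>` line in the diff above follows the colon-separated format used by tuned's scheduler group options: rule priority, scheduler ("f" = SCHED_FIFO), RT priority, affinity ("*" = leave unchanged), and a regex matching thread names. The field names in this sketch are my own descriptive labels, not official option names:

```shell
#!/bin/sh
# Split the new ktimersoftd rule from the diff into its five fields.
# Format assumed: rule_prio:sched:prio:affinity:regex
rule='0:f:3:*:ktimersoftd.*'

IFS=: read -r rule_prio sched prio affinity regex <<EOF
$rule
EOF

printf 'rule_prio=%s sched=%s prio=%s affinity=%s regex=%s\n' \
    "$rule_prio" "$sched" "$prio" "$affinity" "$regex"
```

So the added rule asks tuned to set every thread matching `ktimersoftd.*` to SCHED_FIFO priority 3, between rcuc/rcub at 4 and ksoftirqd at 2.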
Comment 15 Luiz Capitulino 2016-09-12 14:29:23 EDT
There are two ways to reproduce this bug:

1. Simply check the priorities of the ktimersoftd and ksoftirqd kernel threads

On current (not-fixed) profile:

# ps axo pid,class,rtprio,comm | grep ktimersoft
     4 FF       1 ktimersoftd/0
     ...

# ps axo pid,class,rtprio,comm | grep softirq
     3 FF       2 ksoftirqd/0
     ...

Here we see that both ksoftirqd and ktimersoftd are SCHED_FIFO tasks, but ksoftirqd has a higher RT priority than ktimersoftd. What we want instead is (fixed-profile output):

# ps axo pid,class,rtprio,comm | grep ktimersoft
     4 FF       3 ktimersoftd/0
     ...

# ps axo pid,class,rtprio,comm | grep softirq
     3 FF       2 ksoftirqd/0

2. Try to reproduce one of the issues we think are possible

This will be a bit hard to do and will require you to set up a system for KVM-RT, so let's only do this if it turns out to be really necessary.
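Reproduction method 1 can be scripted. The sketch below is mine, not from the report: it runs the comparison against embedded sample lines mirroring the fixed-profile output above; on a live host you would instead feed in the output of `ps axo pid,class,rtprio,comm`.

```shell
#!/bin/sh
# Check that ktimersoftd's RT priority is higher than ksoftirqd's.
# Sample data mirrors the fixed-profile ps output shown above.
sample='4 FF 3 ktimersoftd/0
3 FF 2 ksoftirqd/0'

# Field 3 is rtprio, field 4 is the thread name.
kt=$(printf '%s\n' "$sample" | awk '$4 ~ /^ktimersoftd/ {print $3; exit}')
ks=$(printf '%s\n' "$sample" | awk '$4 ~ /^ksoftirqd/ {print $3; exit}')

if [ "$kt" -gt "$ks" ]; then
    echo "OK: ktimersoftd ($kt) > ksoftirqd ($ks)"
else
    echo "FAIL: ktimersoftd ($kt) <= ksoftirqd ($ks)"
fi
```

With the fixed profile this prints the OK line; with the old profile ktimersoftd has no rtprio at all ("-"), which the script would report as a failure once real ps output is substituted.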
Comment 17 Jaroslav Škarvada 2017-03-21 05:48:00 EDT
Upstream commit fixing the problem:
https://github.com/redhat-performance/tuned/commit/3ca7cfceb155104b73144826af35e42a363b7072

Available for preliminary testing in tuned-*-2.7.1-1.20170321git3ca7cfce.el7 from:
https://jskarvad.fedorapeople.org/tuned/devel/repo/
Comment 19 Luiz Capitulino 2017-03-21 10:51:48 EDT
(In reply to Jaroslav Škarvada from comment #17)

> Available for preliminary testing in tuned-*-2.7.1-1.20170321git3ca7cfce.el7
> from:
> https://jskarvad.fedorapeople.org/tuned/devel/repo/

Works as expected, also passed short duration tests.
Comment 22 Tereza Cerna 2017-04-12 05:03:02 EDT
This is only a sanity check that tuned can set the priority of a ktimersoftd process.


I did these steps:

# tuned-adm profile realtime-virtual-host

# cat ktimersoftd 
#!/bin/bash
while true
do
sleep 3600
done

# chmod a+rx ktimersoftd

# ./ktimersoftd &


=========================================
Verified in:
    tuned-2.8.0-1.el7.noarch
    tuned-profiles-nfv-2.8.0-1.el7.noarch
PASS
=========================================

# ps axo pid,class,rtprio,comm | grep ktimersoft
16976 FF       3 ktimersoftd

# ps axo pid,class,rtprio,comm | grep softirq
    3 FF       2 ksoftirqd/0
   13 FF       2 ksoftirqd/1
   17 FF       2 ksoftirqd/2
   21 FF       2 ksoftirqd/3


=========================================
Reproduced in:
    tuned-2.7.1-3.el7_3.1.noarch
    tuned-profiles-nfv-2.7.1-3.el7_3.1.noarch
FAIL
=========================================

# ps axo pid,class,rtprio,comm | grep ktimersoft
17361 TS       - ktimersoftd

# ps axo pid,class,rtprio,comm | grep softirq
    3 FF       2 ksoftirqd/0
   13 FF       2 ksoftirqd/1
   17 FF       2 ksoftirqd/2
   21 FF       2 ksoftirqd/3
Comment 24 Pei Zhang 2017-04-12 06:18:21 EDT
Hi Luiz,

Before QE verifies the functionality, I'd like to confirm the testing method.

I noticed you mentioned the max latency of multiple VMs in comment 1, so is this the check point of this bug? If so, it seems hard to reproduce: I did the latency testing without this fix, using tuned-2.7.1-5.20170314git92d558b8.el7.noarch, and the max latency was already < 20us.

Running for 12 hours, booting 4 VMs at the same time, their latency values are:
min(us)  avg(us)  max(us)
00005    00006    00011
00005    00006    00012
00005    00006    00012
00005    00006    00012


Could you please share more details or methods about how to verify the functionality? Thanks.



Best Regards,
Pei
Comment 25 Luiz Capitulino 2017-04-12 15:39:21 EDT
Pei,

I'm not sure what I said in comment 1 makes sense. Before the fix for this issue, ktimersoftd had the same priority as the vCPU thread in the host. This can have two negative implications:

1. If ktimersoftd becomes runnable, it won't execute until the vCPU thread relinquishes the CPU, which could be forever. This can lead to a bad system state.

2. If ktimersoftd becomes runnable, and the vCPU relinquishes the CPU and then becomes runnable again, the vCPU thread will have to wait for the ktimersoftd thread to block.

Item 2 could explain the spike I was seeing, but I never confirmed it, and it's an extremely hard scenario to reproduce.

The way I recommend verifying this BZ is just to check that ktimersoftd has the expected priority, which is what Tereza did and what is listed in comment 6.
Comment 26 Pei Zhang 2017-04-12 20:47:05 EDT
(In reply to Luiz Capitulino from comment #25)
> Pei,
> 
> I'm not sure what I said in comment 1 makes sense. Before the fix for
> this issue, ktimersoftd had the same priority as the vCPU thread in the
> host. This can have two negative implications:
> 
> 1. If ktimersoftd becomes runnable, it won't execute until the vCPU thread
> relinquishes the CPU, which could be forever. This can lead to a bad system
> state.
> 
> 2. If ktimersoftd becomes runnable, and the vCPU relinquishes the CPU and
> then becomes runnable again, the vCPU thread will have to wait for the
> ktimersoftd thread to block.
> 
> Item 2 could explain the spike I was seeing, but I never confirmed it, and
> it's an extremely hard scenario to reproduce.
> 
> The way I recommend verifying this BZ is just to check that ktimersoftd has
> the expected priority, which is what Tereza did and what is listed in comment 6.

OK. Thanks Luiz for your confirmation about verification of this bug.
Comment 29 Tereza Cerna 2017-04-26 06:56:05 EDT
Tested manually (see c#22) and by the automated test case /CoreOS/tuned/Regression/create-new-nfv-profiles.

 
Verified in:
    tuned-2.8.0-2.el7.noarch
    tuned-profiles-nfv-host-2.8.0-2.el7.noarch
    tuned-profiles-nfv-guest-2.8.0-2.el7.noarch
PASS

::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
:: [   LOG    ] :: Set priority of ktimersoftd process [BZ#1332563, BZ#1440663]
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::

:: [   PASS   ] :: Command 'chmod +x ktimersoftd' (Expected 0, got 0)
:: [   PASS   ] :: Command './ktimersoftd &' (Expected 0, got 0)
:: [ 06:43:38 ] :: Priority of ktimersoft process is '3'
:: [ 06:43:38 ] :: Priority of ksoftirqd process is '2'
:: [   PASS   ] :: Ksoftirqd process shoud have bigger priority than ktimersoft process. (Assert: "3" should be greater than "2")
:: [   PASS   ] :: Command 'kill -9 143272' (Expected 0, got 0)
:: [  BEGIN   ] :: Running 'killall sleep'
:: [   PASS   ] :: Command 'killall sleep' (Expected 0, got 0)




Reproduced in:
    tuned-2.7.1-3.el7_3.1.noarch
    tuned-profiles-nfv-2.7.1-3.el7_3.1.noarch
FAIL

::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
:: [   LOG    ] :: Set priority of ktimersoftd process [BZ#1332563, BZ#1440663]
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::

:: [   PASS   ] :: Command 'chmod +x ktimersoftd' (Expected 0, got 0)
:: [   PASS   ] :: Command './ktimersoftd &' (Expected 0, got 0)
:: [ 06:48:47 ] :: Priority of ktimersoft process is '-'
:: [ 06:48:47 ] :: Priority of ksoftirqd process is '2'
/usr/share/beakerlib/testing.sh: line 289: [: -: integer expression expected
:: [   FAIL   ] :: Ksoftirqd process shoud have bigger priority than ktimersoft process. (Assert: "-" should be greater than "2")
:: [   PASS   ] :: Command 'kill -9 145964' (Expected 0, got 0)
:: [   PASS   ] :: Command 'killall sleep' (Expected 0, got 0)
Comment 30 errata-xmlrpc 2017-08-01 08:32:51 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:2102
