Bug 1286221

Summary: Tuna is not moving threads away from isolated (-i) CPUs.
Product: Red Hat Enterprise Linux 7
Reporter: Daniel Bristot de Oliveira <daolivei>
Component: tuna
Assignee: John Kacur <jkacur>
Status: CLOSED ERRATA
QA Contact: Jiri Kastner <jkastner>
Severity: urgent
Docs Contact:
Priority: high
Version: 7.3
CC: bhu, jkacur, jkastner
Target Milestone: rc
Keywords: ZStream
Target Release: 7.3
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Before moving a thread or a process, tuna checks the stat file to see whether PF_NO_SETAFFINITY is set, which means the thread or process cannot be migrated. This worked correctly for processes, but the stat file is in a slightly different location for threads, which caused the check to fail. The code has been modified to handle thread migration correctly.
Story Points: ---
Clone Of:
: 1292537 1293353
Environment:
Last Closed: 2016-11-04 05:15:23 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1203710, 1274397, 1282960, 1292537, 1293353    

Description Daniel Bristot de Oliveira 2015-11-27 13:33:32 UTC
Description of problem:

Tuna is not moving threads away from isolated (-i) CPUs.

Version-Release number of selected component (if applicable):
tuna-0.11.1-10.el7.noarch

How reproducible:
Always

Steps to Reproduce:
1. Isolate some CPUs using tuna. For example, on a 4-CPU box, isolate CPUs
2 and 3:

# tuna -c 2,3 -i

2. Check the cpumask of all threads of a threaded application. For example, on
RHEL 7, check the cpumask of tuned's threads:

# ps -eLo lwp,comm | grep " tuned"  | while read lwp comm; do taskset -p $lwp; done
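
The same per-thread check can be sketched in Python, which may be handy for scripting the verification. This is only an illustration: the function names are hypothetical, and it assumes a Linux procfs where every thread of a process is listed under /proc/<pid>/task.

```python
import os


def thread_affinities(pid):
    """Map each thread ID of `pid` to the set of CPUs it may run on."""
    result = {}
    for name in os.listdir("/proc/%d/task" % pid):
        tid = int(name)
        try:
            # On Linux, sched_getaffinity accepts a thread ID.
            result[tid] = os.sched_getaffinity(tid)
        except ProcessLookupError:
            # The thread exited between the listing and the query.
            continue
    return result


def threads_outside(pid, allowed_cpus):
    """Return the TIDs whose affinity includes CPUs outside allowed_cpus."""
    allowed = set(allowed_cpus)
    return sorted(tid for tid, cpus in thread_affinities(pid).items()
                  if not cpus <= allowed)
```

With CPUs 2 and 3 isolated on a 4-CPU box, threads_outside(<pid of tuned>, {0, 1}) would be expected to return an empty list.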

Actual results:
Only the thread whose TID equals the PID (the main thread) has the correct
cpumask (excluding the isolated CPUs). For example:

# ps -eLo lwp,comm | grep " tuned"  | while read lwp comm; do taskset -p $lwp; done
pid 761's current affinity mask: 3
pid 820's current affinity mask: f
pid 822's current affinity mask: f
pid 829's current affinity mask: f


Expected results:
All threads have the correct cpumask (excluding the isolated CPUs). For example:

# ps -eLo lwp,comm | grep " tuned"  | while read lwp comm; do taskset -p $lwp; done
pid 761's current affinity mask: 3
pid 820's current affinity mask: 3
pid 822's current affinity mask: 3
pid 829's current affinity mask: 3
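
taskset prints the affinity mask in hexadecimal, with bit N standing for CPU N. As a small worked example (the helper names are hypothetical), the masks in this report decode as follows:

```python
def mask_to_cpus(mask):
    """Decode an affinity bitmask into a sorted list of CPU numbers."""
    return [cpu for cpu in range(mask.bit_length()) if (mask >> cpu) & 1]


def mask_without(ncpus, isolated):
    """Affinity mask covering all `ncpus` CPUs except the isolated ones."""
    mask = (1 << ncpus) - 1
    for cpu in isolated:
        mask &= ~(1 << cpu)
    return mask


print(mask_to_cpus(int("f", 16)))            # [0, 1, 2, 3]
print(mask_to_cpus(int("3", 16)))            # [0, 1]
print(format(mask_without(4, [2, 3]), "x"))  # 3
```

So a mask of f means the thread may still run on all four CPUs, including the isolated CPUs 2 and 3, while 3 restricts it to CPUs 0 and 1.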

Additional info:
tuna-0.11.1-8.el7.noarch works fine.

Comment 1 Daniel Bristot de Oliveira 2015-11-27 19:30:33 UTC
Upstream patch commit:

https://git.kernel.org/cgit/utils/tuna/tuna.git/commit/?id=95c4e2ad2603cd29af1357c0ceb780da8dc161cc

Comment 2 John Kacur 2015-12-21 13:41:28 UTC
The following commit caused the regression:
commit 29fbb6e82357c87be652c6717ef52d808ec0af78
tuna: Decide whether to isolate a thread based on PF_NO_SETAFFINITY
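
For illustration, the kind of check this commit introduced can be sketched roughly as below. This is not tuna's actual code: the helper names are hypothetical, and the sketch assumes the standard procfs layout, where a thread's stat file is /proc/<pid>/task/<tid>/stat rather than /proc/<pid>/stat, the kernel task flags are field 9 of that file, and PF_NO_SETAFFINITY is 0x04000000 as defined in the kernel's include/linux/sched.h.

```python
# Value of PF_NO_SETAFFINITY in the kernel's include/linux/sched.h.
PF_NO_SETAFFINITY = 0x04000000


def parse_flags(stat_line):
    """Return the task flags (field 9) from the contents of a stat file."""
    # Field 2 (comm) is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the last ')' before tokenizing the rest.
    _, _, rest = stat_line.rpartition(")")
    fields = rest.split()
    # fields[0] is field 3 (state), so field 9 (flags) is fields[6].
    return int(fields[6])


def stat_path(pid, tid=None):
    """Hypothetical helper: per-process or per-thread stat file path."""
    if tid is None or tid == pid:
        return "/proc/%d/stat" % pid
    return "/proc/%d/task/%d/stat" % (pid, tid)


def is_migratable(pid, tid=None):
    """True if user space is allowed to change the task's CPU affinity."""
    with open(stat_path(pid, tid)) as f:
        flags = parse_flags(f.read())
    return not (flags & PF_NO_SETAFFINITY)
```

The point of the sketch is the path selection in stat_path: a check that only looks at the per-process location does not find the stat data of the other threads.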

Comment 3 Daniel Bristot de Oliveira 2015-12-21 14:07:28 UTC
John pointed me to a tuna build with the proposed fix, and I tested it.

It works!

Test output:
# rpm -Uvh tuna-0.11.1-11.el7.noarch.rpm 
Preparing...                          ################################# [100%]
Updating / installing...
   1:tuna-0.11.1-11.el7               ################################# [ 50%]
Cleaning up / removing...
   2:tuna-0.11.1-10.el7               ################################# [100%]
# tuna -c 2-6 -i
# ps -eLo lwp,comm | grep " tuned"  | while read lwp comm; do taskset -p $lwp; done
pid 1483's current affinity mask: 3
pid 1691's current affinity mask: 3
pid 1696's current affinity mask: 3
pid 1697's current affinity mask: 3

Comment 4 John Kacur 2015-12-21 14:08:44 UTC
Fixed in version tuna-0.11.1-11.el7 and up

Comment 9 errata-xmlrpc 2016-11-04 05:15:23 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2016-2392.html