Bug 1307175

Summary: oo-accept-node does not validate whether threads are in cgroups
Product: OpenShift Container Platform Reporter: Miciah Dashiel Butler Masters <mmasters>
Component: Containers Assignee: Miciah Dashiel Butler Masters <mmasters>
Status: CLOSED ERRATA QA Contact: DeShuai Ma <dma>
Severity: medium Docs Contact:
Priority: medium    
Version: 2.2.0 CC: agrimm, anli, aos-bugs, jokerman, libra-bugs, mmccomas, rchopra, rthrashe
Target Milestone: --- Keywords: UpcomingRelease
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: openshift-origin-node-util-1.38.6.1-1.el6op Doc Type: Bug Fix
Doc Text:
Cause: Consequence: Fix: Result: The oo-accept-node script incorrectly verified that processes were listed in cgroup procs. Output from the script could be misleading, because it should have checked that threads are listed in cgroup tasks. oo-accept-node now correctly compares threads with cgroup tasks.
Story Points: ---
Clone Of: 1067107 Environment:
Last Closed: 2016-03-22 16:54:22 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1067107, 1308740    
Bug Blocks:    

Description Miciah Dashiel Butler Masters 2016-02-12 22:29:54 UTC
+++ This bug was initially created as a clone of Bug #1067107 +++

Description of problem:

oo-accept-node only verifies that processes are in cgroups, not threads.  It should pass the -L flag to ps and compare the result against tasks instead of cgroup.procs.
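
The check described above can be sketched in shell. This is an illustration only, not the code in oo-accept-node: the helper name missing_tids is hypothetical, and the usage comment assumes that the gear's processes run as a user named after the gear UUID and that the cgroup hierarchy is mounted under /cgroup, as in the paths quoted elsewhere in this report.

```shell
#!/bin/sh
# Hypothetical helper: print every thread ID listed in the first file
# that is absent from the second file (a cgroup "tasks" file, which
# holds one thread ID per line).
missing_tids() {
    tid_list=$1
    tasks_file=$2
    # -v inverts the match, -x matches whole lines only, -F treats the
    # patterns as fixed strings, -f reads the patterns from tasks_file.
    grep -vxFf "$tasks_file" "$tid_list"
}

# Usage on a node (placeholders as in the report, not real values):
#   ps -L -u <myapp_uuid> -o tid= | tr -d ' ' > /tmp/tids
#   missing_tids /tmp/tids /cgroup/cpu/openshift/<myapp_uuid>/tasks
```

Without -L, ps prints one line per process, so a thread that has been moved out of the cgroup would never appear in the list being compared.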

This was mentioned in https://bugzilla.redhat.com/show_bug.cgi?id=1020029 , but was not addressed as part of that fix.

--- Additional comment from openshift-github-bot on 2016-01-28 16:00:26 EST ---

Commits pushed to master at https://github.com/openshift/origin-server

https://github.com/openshift/origin-server/commit/5c6699e31aee4139d976bb9a254d252eebdb719c
oo-accept-node: update to check cgroup tasks

Bug 1067107
BZ https://bugzilla.redhat.com/show_bug.cgi?id=1067107

Updates oo-accept-node to check thread ids against cgroup tasks instead of
checking process ids against cgroup procs.

https://github.com/openshift/origin-server/commit/c9ee5b56766f9cfb0b2b9e35b21b32ba4b30e971
Merge pull request #6347 from thrasher-redhat/bug1067107

Merged by openshift-bot

--- Additional comment from Rory Thrasher on 2016-01-28 17:08:38 EST ---

QA,

Please verify that oo-accept-node now checks threads against tasks and will report when a thread is missing from the gear's tasks.

1. Create a multi-threaded app and then any second app.

# rhc app-create myapp jbossas-7
# rhc app-create other python-3.3

2. Check the cgroup tasks for a resource in 'myapp' and pick a thread id.

# cat /cgroup/cpu/openshift/<myapp_uuid>/tasks
7621
...
7655

3. Add the task id to the same resource in another gear (in 'other'), which will remove it from the original tasks.

# echo 7655 > /cgroup/cpu/openshift/<other_uuid>/tasks

4. Verify that oo-accept-node will find the missing task with an error message similar to "#{uuid} has a thread: tid:#{tid}, pid:#{pid} missing from cgroups controller: #{controller}".

# oo-accept-node
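
Before running step 4, the move made in step 3 can be confirmed from /proc, since each thread has its own cgroup file at /proc/<pid>/task/<tid>/cgroup with lines of the form id:controllers:path. The sketch below is an assumption-laden illustration: the helper name cgroup_of is hypothetical and is not part of oo-accept-node.

```shell
#!/bin/sh
# Hypothetical helper: print the cgroup path recorded for one controller
# in a /proc/<pid>/task/<tid>/cgroup file.
cgroup_of() {
    controller=$1
    cgroup_file=$2
    awk -F: -v c="$controller" '{
        # The second field is a comma-separated list of controller names.
        n = split($2, names, ",")
        for (i = 1; i <= n; i++)
            if (names[i] == c) {
                sub(/^[^:]*:[^:]*:/, "")  # drop "id:controllers:", keep the path
                print
                exit
            }
    }' "$cgroup_file"
}

# Usage on a node (pid and tid are placeholders):
#   cgroup_of cpu /proc/<pid>/task/<tid>/cgroup
```

If the thread was moved as intended, the path printed for the cpu controller should be that of the 'other' gear rather than 'myapp'.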


Thank you.

Comment 4 Rory Thrasher 2016-02-22 20:21:14 UTC
*** Bug 1308740 has been marked as a duplicate of this bug. ***

Comment 6 Anping Li 2016-02-26 09:11:26 UTC
Verified; the test passes. oo-accept-node reports the following:
INFO: find district uuid: 56cfcbfe82611dc2ef000001
INFO: determining node uid range: 1000 to 6999
FAIL: anlidom-myapp-1 has a thread: tid:27872, pid:27723 missing from cgroups controller: cpu
INFO: checking presence of tc qdisc
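
For scripted verification, the missing-thread FAIL lines in a report like the one above can be reduced to their fields. This is only a sketch: the helper name missing_thread_fields is hypothetical, the pattern is taken from the message format quoted in this report, and it assumes the report is available on standard input.

```shell
#!/bin/sh
# Hypothetical helper: read oo-accept-node output on stdin and print
# "gear tid pid controller" for each missing-thread FAIL line.
# All other lines (INFO and unrelated FAIL lines) are dropped.
missing_thread_fields() {
    sed -n 's/^FAIL: \([^ ]*\) has a thread: tid:\([0-9]*\), pid:\([0-9]*\) missing from cgroups controller: \(.*\)$/\1 \2 \3 \4/p'
}

# Usage on a node:
#   oo-accept-node | missing_thread_fields
```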

Comment 8 errata-xmlrpc 2016-03-22 16:54:22 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2016-0489.html