Bug 1067107 - oo-accept-node does not validate whether threads are in cgroups
Status: CLOSED WONTFIX
Alias: None
Product: OpenShift Online
Classification: Red Hat
Component: Containers
Version: 2.x
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Rory Thrasher
QA Contact: Meng Bo
Blocks: 1307175 1308740
 
Reported: 2014-02-19 17:23 UTC by Andy Grimm
Modified: 2017-05-31 18:22 UTC
CC: 5 users

Doc Type: Bug Fix
Clones: 1307175 1308740
Last Closed: 2017-05-31 18:22:11 UTC



Description Andy Grimm 2014-02-19 17:23:24 UTC
Description of problem:

oo-accept-node only verifies that processes are in cgroups, not threads. It should use the -L flag to ps and compare against the tasks file instead of cgroup.procs.
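The check being requested can be sketched in shell (a minimal illustration, not the oo-accept-node source): enumerate a process's thread IDs with `ps -L` and flag any TID that is absent from the cgroup's tasks file. Linux procps `ps` is assumed, and a temp file stands in for a real path such as /cgroup/cpu/openshift/&lt;uuid&gt;/tasks; it is deliberately built with one TID missing so the check fires.

```shell
# Sketch: verify every thread (TID) of a process appears in a cgroup tasks file.
# The tasks file below is a temp-file stand-in for a real path like
# /cgroup/cpu/openshift/<uuid>/tasks; we omit one TID so the check reports it.
pid=$$                                        # process to audit (this shell)
tasks_file=$(mktemp)

# Simulated tasks file: all of the process's TIDs except the first one.
ps -L -o tid= -p "$pid" | tr -d ' ' | sed '1d' > "$tasks_file"

missing=0
for tid in $(ps -L -o tid= -p "$pid" | tr -d ' '); do
    if ! grep -qx "$tid" "$tasks_file"; then
        echo "FAIL: tid:$tid, pid:$pid missing from tasks file"
        missing=$((missing + 1))
    fi
done
echo "$missing missing thread(s)"
rm -f "$tasks_file"
```

Checking process IDs against cgroup.procs would miss this case: a multi-threaded process can have its main PID in the group while an individual thread has been moved elsewhere.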

This was mentioned in https://bugzilla.redhat.com/show_bug.cgi?id=1020029, but was not addressed as part of that fix.

Comment 1 openshift-github-bot 2016-01-28 21:00:26 UTC
Commits pushed to master at https://github.com/openshift/origin-server

https://github.com/openshift/origin-server/commit/5c6699e31aee4139d976bb9a254d252eebdb719c
oo-accept-node: update to check cgroup tasks

Bug 1067107
BZ https://bugzilla.redhat.com/show_bug.cgi?id=1067107

Updates oo-accept-node to check thread ids against cgroup tasks instead of
checking process ids against cgroup procs.

https://github.com/openshift/origin-server/commit/c9ee5b56766f9cfb0b2b9e35b21b32ba4b30e971
Merge pull request #6347 from thrasher-redhat/bug1067107

Merged by openshift-bot

Comment 2 Rory Thrasher 2016-01-28 22:08:38 UTC
QA,

Please verify that oo-accept-node now checks threads against tasks and will report when a thread is missing from the gear's tasks.

1. Create a multi-threaded app and then any second app.

# rhc app-create myapp jbossas-7
# rhc app-create other python-3.3

2. Check the cgroup tasks for a resource in 'myapp' and pick a thread id.

# cat /cgroup/cpu/openshift/<myapp_uuid>/tasks
7621
...
7655

3. Write the thread id into the same controller's tasks file in another gear ('other'); this removes it from the original gear's tasks.

# echo 7655 > /cgroup/cpu/openshift/<other_uuid>/tasks

4. Verify that oo-accept-node will find the missing task with an error message similar to "#{uuid} has a thread: tid:#{tid}, pid:#{pid} missing from cgroups controller: #{controller}".

# oo-accept-node
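Step 3 relies on a cgroup-v1 rule: a task can belong to only one cgroup per hierarchy, so writing a TID into one group's tasks file makes the kernel drop it from its previous group. Below is a small illustration using plain temp files with the TIDs from the example above; since these are ordinary files rather than a mounted cgroup hierarchy, the kernel's automatic removal is emulated with `sed`.

```shell
src=$(mktemp)    # stand-in for /cgroup/cpu/openshift/<myapp_uuid>/tasks
dst=$(mktemp)    # stand-in for /cgroup/cpu/openshift/<other_uuid>/tasks
printf '7621\n7655\n' > "$src"   # threads currently in the 'myapp' gear
tid=7655

echo "$tid" >> "$dst"            # step 3: echo 7655 > .../<other_uuid>/tasks
sed -i "/^${tid}\$/d" "$src"     # emulate the kernel removing it from 'myapp'

moved=no
if grep -qx "$tid" "$dst" && ! grep -qx "$tid" "$src"; then
    moved=yes                    # tid now appears only in the destination group
fi
echo "moved=$moved"
rm -f "$src" "$dst"
```

After the move, the thread still belongs to the 'myapp' process, which is exactly the inconsistency oo-accept-node should now detect.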


Thank you.

Comment 3 Meng Bo 2016-03-07 06:18:34 UTC
Checked on devenv_5778, bug has been fixed.

# oo-accept-node -v
INFO: loading node configuration file /etc/openshift/node.conf
INFO: loading resource limit file /etc/openshift/resource_limits.conf
INFO: finding external network device
INFO: checking that external network device has a globally scoped IPv4 address
INFO: checking node public hostname resolution
INFO: checking selinux status
INFO: checking selinux openshift-hosted policy
INFO: checking selinux booleans
INFO: checking selinux nodes
INFO: checking package list
INFO: checking services
INFO: checking kernel semaphores >= 512
INFO: checking cgroups configuration
INFO: checking cgroups tasks
INFO: find district uuid: NONE
INFO: determining node uid range: 1000 to 6999
FAIL: 56dd1ae67804afdf14000002 has a thread: tid:11214, pid:7559 missing from cgroups controller: cpu
INFO: checking presence of tc qdisc
INFO: checking for cgroup filter
INFO: checking presence of tc classes
INFO: checking filesystem quotas
INFO: checking quota db file selinux label
INFO: checking 2 user accounts
INFO: checking application dirs
INFO: checking system httpd configs
INFO: checking cartridge repository
1 ERRORS

Comment 5 Eric Paris 2017-05-31 18:22:11 UTC
We apologize; however, we do not plan to address this report at this time. The majority of our active development is for the v3 version of OpenShift. If you would like Red Hat to reconsider this decision, please reach out to your support representative. We are very sorry for any inconvenience this may cause.

