Bug 1320170

Summary: Unable to set perFSGroup local quota before first node startup.
Product: OpenShift Container Platform
Reporter: Devan Goodwin <dgoodwin>
Component: Installer
Assignee: Devan Goodwin <dgoodwin>
Status: CLOSED ERRATA
QA Contact: Ma xiaoqiang <xiama>
Severity: high
Docs Contact:
Priority: unspecified
Version: 3.2.0
CC: aos-bugs, jokerman, mmccomas, tdawson, xtian
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: v3.2.0.8
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-05-12 16:39:07 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:

Description Devan Goodwin 2016-03-22 13:38:20 UTC
Description of problem:

If you specify perFSGroup in your node-config.yaml before the first node startup, the volume directory does not yet exist when the checks for the quota plugin run. The checks fail and the node will not start.

Ops is planning to use openshift-ansible to manage node configs, which will soon support setting perFSGroup quota. Thus any time a new node is created it will fail to start.


Version-Release number of selected component (if applicable):

Problem is present in origin master as of 1ae5b25eaf86357306ce9f0c2fca4bba7e81a6da


How reproducible:

Should be 100% if you create a new node, and have a way to set the perFSGroup quota *before* the node is first launched.

Easiest way to do this is probably to wait until the openshift-ansible fix that enables this setting is in. (will link to PR as soon as it's available)


Steps to Reproduce:

1. Ensure node-config.yaml has a perFSGroup setting before the node is first started:

volumeConfig:
  localQuota:
    perFSGroup: "512Mi"

Actual results:

Node service will refuse to start and show errors in journalctl similar to:

Mar 22 08:06:09 node3.aos.example.com atomic-openshift-node[4638]: I0322 08:06:09.004776    4638 node_config.go:179] Replacing empty-dir volume plugin with quota wrapper
Mar 22 08:06:09 node3.aos.example.com atomic-openshift-node[4638]: F0322 08:06:09.008850    4638 start_node.go:124] unable to check filesystem type for emptydir volume /var/lib/origin/openshift.local.volumes: exit status 1

You will see that /var/lib/origin/openshift.local.volumes does not yet exist.

Expected results:

The checks should only be performed *after* the volumeDirectory is known to exist.
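
The ordering requested above can be sketched roughly as follows. This is an illustrative Python sketch only, not the origin code (which is written in Go). The names prepare_volume_dir and check_quota_support are hypothetical, and the directory mode is an assumption.

```python
import os


def prepare_volume_dir(volume_dir, per_fs_group, check_quota_support):
    """Create volume_dir if needed, then run the quota check only if a quota is set.

    check_quota_support is a hypothetical callable standing in for the node's
    filesystem-type check; it is expected to raise if the check fails.
    """
    # Create the directory first so the check never sees a missing path.
    # The 0o750 mode is an assumption made for this sketch.
    os.makedirs(volume_dir, mode=0o750, exist_ok=True)

    # An empty or missing perFSGroup means no local quota, so skip the check.
    if per_fs_group:
        check_quota_support(volume_dir)
    return volume_dir
```

With this ordering, a first startup with perFSGroup already set would reach the filesystem check only after the volume directory has been created.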


Additional info:

Comment 1 Devan Goodwin 2016-03-22 15:39:28 UTC
PR up in: https://github.com/openshift/origin/pull/8193

Comment 2 Devan Goodwin 2016-03-24 19:25:38 UTC
Merged in origin, should appear in OSE v3.2.0.8.

Comment 3 Troy Dawson 2016-03-28 20:59:59 UTC
OSE v3.2.0.8 has been built and pushed to testing.

Comment 4 Ma xiaoqiang 2016-03-29 01:17:02 UTC
Checked on OSE v3.2.0.8.

Scenario 1
Install the environment with the following parameter:
<--snip-->
openshift_node_local_quota_per_fsgroup=50Mi
<--snip-->

Check the node configuration
#vim /etc/origin/node/node-config.yaml
<--snip-->
volumeConfig:
  localQuota:
    perFSGroup: 50Mi

Scenario 2
Install the environment without 'openshift_node_local_quota_per_fsgroup'.
Check the node configuration
#vim /etc/origin/node/node-config.yaml
<--snip-->
volumeConfig:
  localQuota:
    perFSGroup:

The node service is running; moving this issue to VERIFIED.
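
The config inspection in the two scenarios above could be scripted roughly as follows. This is a minimal sketch that scans the file text with a regular expression instead of using a real YAML parser; read_per_fs_group is a hypothetical helper and not part of any OpenShift tooling.

```python
import re

# Matches an indented "perFSGroup:" key and captures the rest of that line.
_PER_FS_GROUP = re.compile(r"^[ \t]+perFSGroup:[ \t]*(.*)$", re.MULTILINE)


def read_per_fs_group(config_text):
    """Return the perFSGroup value from node-config.yaml text, or None if unset."""
    match = _PER_FS_GROUP.search(config_text)
    if match is None:
        return None
    # Drop surrounding whitespace and optional quotes, e.g. "512Mi" -> 512Mi.
    value = match.group(1).strip().strip("\"'")
    return value or None
```

For Scenario 1 this would be expected to return the configured value, and for Scenario 2 (empty perFSGroup) it returns None.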

Comment 6 errata-xmlrpc 2016-05-12 16:39:07 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2016:1065