Bug 1317463 - [infrastructure_public_295] Master service boot failure due to Predicate type not found for MaxGCEPDVolumeCount
Status: CLOSED NOTABUG
Product: OpenShift Container Platform
Classification: Red Hat
Component: Pod
Version: 3.1.0
Hardware: Unspecified Unspecified
Priority: medium Severity: medium
Assigned To: Solly Ross
QA Contact: Qixuan Wang
Depends On:
Blocks:
Reported: 2016-03-14 06:17 EDT by Qixuan Wang
Modified: 2016-05-25 09:28 EDT (History)

Doc Type: Bug Fix
Last Closed: 2016-03-15 02:41:14 EDT
Type: Bug


Attachments: None

Description Qixuan Wang 2016-03-14 06:17:38 EDT
Description of problem:
Set up an OSE environment on GCE, configure GCE as the cloud provider on the master, and enable the MaxGCEPDVolumeCount scheduler predicate. The master service then fails to start and restarts continually. The log shows "Predicate type not found for MaxGCEPDVolumeCount".

Version-Release number of selected component (if applicable):
openshift v3.1.1.6-21-gcd70c35
kubernetes v1.1.0-origin-1107-g4c8e6f4
etcd 2.1.2


How reproducible:
Always

Steps to Reproduce:
1. Configure the master to be aware of GCE
# cat master-config.yaml

<----------snip--------->
kubernetesMasterConfig:
  apiServerArguments:
    cloud-provider:
      - "gce"
  controllerArguments:
    cloud-provider:
      - "gce"
<----------snip--------->


2. Enable the "MaxGCEPDVolumeCount" predicate in scheduler.json
# cat /etc/origin/master/scheduler.json
{
  "kind": "Policy",
  "apiVersion": "v1",
  "predicates": [
    {"name": "MaxGCEPDVolumeCount"},
    {"name": "MatchNodeSelector"},
    {"name": "PodFitsResources"},
    {"name": "PodFitsPorts"},
    {"name": "NoDiskConflict"},
    {"name": "Region", "argument": {"serviceAffinity" : {"labels" : ["region"]}}}
  ],"priorities": [
    {"name": "LeastRequestedPriority", "weight": 1},
    {"name": "ServiceSpreadingPriority", "weight": 1},
    {"name": "Zone", "weight" : 2, "argument": {"serviceAntiAffinity" : {"label": "zone"}}}
  ]
}


3. Restart the master service
# service atomic-openshift-master restart


Actual results:
3. The master restarts continually.

# journalctl -f -u atomic-openshift-master
Mar 14 09:30:07 10.240.0.8 systemd[1]: Started Atomic OpenShift Master.
Mar 14 09:30:08 10.240.0.8 atomic-openshift-master[53514]: I0314 09:30:08.642847   53514 nodecontroller.go:133] Sending events to api server.
Mar 14 09:30:08 10.240.0.8 atomic-openshift-master[53514]: I0314 09:30:08.643384   53514 factory.go:137] Creating scheduler from configuration: {{ } [{MaxGCEPDVolumeCount <nil>} {MatchNodeSelector <nil>} {PodFitsResources <nil>} {PodFitsPorts <nil>} {NoDiskConflict <nil>} {Region 0xc20db82a90}] [{LeastRequestedPriority 1 <nil>} {ServiceSpreadingPriority 1 <nil>} {Zone 2 0xc20db82d90}]}
Mar 14 09:30:08 10.240.0.8 atomic-openshift-master[53514]: I0314 09:30:08.643437   53514 factory.go:146] Registering predicate: MaxGCEPDVolumeCount
Mar 14 09:30:08 10.240.0.8 atomic-openshift-master[53514]: F0314 09:30:08.643452   53514 plugins.go:123] Invalid configuration: Predicate type not found for MaxGCEPDVolumeCount
Mar 14 09:30:08 10.240.0.8 systemd[1]: atomic-openshift-master.service: main process exited, code=exited, status=255/n/a
Mar 14 09:30:08 10.240.0.8 systemd[1]: Unit atomic-openshift-master.service entered failed state.
Mar 14 09:30:08 10.240.0.8 systemd[1]: atomic-openshift-master.service failed.
Mar 14 09:30:08 10.240.0.8 systemd[1]: atomic-openshift-master.service holdoff time over, scheduling restart.
Mar 14 09:30:08 10.240.0.8 systemd[1]: Starting Atomic OpenShift Master...
Mar 14 09:30:09 10.240.0.8 atomic-openshift-master[53543]: W0314 09:30:09.214164   53543 start_master.go:269] assetConfig.loggingPublicURL: invalid value '', Details: required to view aggregated container logs in the console
Mar 14 09:30:09 10.240.0.8 atomic-openshift-master[53543]: W0314 09:30:09.214260   53543 start_master.go:269] assetConfig.metricsPublicURL: invalid value '', Details: required to view cluster metrics in the console
Mar 14 09:30:09 10.240.0.8 atomic-openshift-master[53543]: I0314 09:30:09.816416   53543 master_config.go:113] Successfully initialized cloud provider: "gce" from the config file: ""
Mar 14 09:30:09 10.240.0.8 atomic-openshift-master[53543]: I0314 09:30:09.964754   53543 start_master.go:380] Starting master on 0.0.0.0:8443 (v3.1.1.6-21-gcd70c35)
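
In a restart loop like this, the one fatal line is easy to miss among the repeated startup messages. The following is a minimal Python sketch for pulling fatal entries out of saved journal output. It assumes the glog-style header seen in the log above (severity letter, MMDD date, time, process id, file:line). The regular expression and the function name fatal_messages are illustrative and not part of any OpenShift tooling.

```python
import re

# Assumed header layout, taken from the log lines in this report:
#   F0314 09:30:08.643452   53514 plugins.go:123] message
GLOG_RE = re.compile(
    r"(?P<severity>[IWEF])(?P<date>\d{4}) (?P<time>\d{2}:\d{2}:\d{2}\.\d+)\s+"
    r"(?P<pid>\d+) (?P<source>[\w.]+:\d+)\] (?P<message>.*)"
)


def fatal_messages(journal_text):
    """Return (source, message) pairs for every fatal (F) entry."""
    results = []
    for line in journal_text.splitlines():
        match = GLOG_RE.search(line)
        if match and match.group("severity") == "F":
            results.append((match.group("source"), match.group("message")))
    return results


# Two lines copied from the journal output above.
sample = """\
Mar 14 09:30:08 10.240.0.8 atomic-openshift-master[53514]: I0314 09:30:08.643437   53514 factory.go:146] Registering predicate: MaxGCEPDVolumeCount
Mar 14 09:30:08 10.240.0.8 atomic-openshift-master[53514]: F0314 09:30:08.643452   53514 plugins.go:123] Invalid configuration: Predicate type not found for MaxGCEPDVolumeCount
"""

for source, message in fatal_messages(sample):
    print(source, "->", message)
```

Lines that journalctl has wrapped mid-word would need to be rejoined before parsing.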


Expected results:
MaxGCEPDVolumeCount should be recognized and the master service should start normally.


Additional info:
# cat /usr/lib/systemd/system/atomic-openshift-master.service 
[Unit]
Description=Atomic OpenShift Master
Documentation=https://github.com/openshift/origin
After=network.target
After=etcd.service
Before=atomic-openshift-node.service
Requires=network.target

[Service]
Type=notify
EnvironmentFile=/etc/sysconfig/atomic-openshift-master
Environment=GOTRACEBACK=crash
ExecStart=/usr/bin/openshift start master --config=${CONFIG_FILE} $OPTIONS
LimitNOFILE=131072
LimitCORE=infinity
WorkingDirectory=/var/lib/origin/
SyslogIdentifier=atomic-openshift-master
Restart=on-failure

[Install]
WantedBy=multi-user.target
WantedBy=atomic-openshift-node.service
Comment 1 Andy Goldstein 2016-03-14 09:08:32 EDT
This is a new feature for 3.2 so it's not available in 3.1.1.6. Please retest with the 3.2 puddles.
Comment 2 Qixuan Wang 2016-03-15 02:41:14 EDT
Can't reproduce on OSE 3.2 puddle.
openshift v3.2.0.3
kubernetes v1.2.0-origin-41-g91d3e75
etcd 2.2.5
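
Since the failure comes from a predicate name the installed version does not know, a pre-flight check of scheduler.json before restarting the master could catch it earlier. The following is a rough Python sketch, not an official tool. The function unknown_predicates is hypothetical, and the set of supported names must be supplied by the caller for the version in use. The set in the example is only the argument-less names from the policy in this report minus MaxGCEPDVolumeCount, not an authoritative list for 3.1. The sketch also assumes that entries carrying an "argument" (such as "Region") define custom predicates and need no registered name.

```python
import json


def unknown_predicates(policy_text, supported):
    """Return names of predicates without an argument that are not in `supported`."""
    policy = json.loads(policy_text)
    unknown = []
    for predicate in policy.get("predicates", []):
        # Assumption: entries with an "argument" are custom predicates
        # defined by that argument, so their name is not looked up.
        if "argument" in predicate:
            continue
        if predicate.get("name") not in supported:
            unknown.append(predicate.get("name"))
    return unknown


# Predicates section of the scheduler.json from this report.
policy_text = """
{
  "kind": "Policy",
  "apiVersion": "v1",
  "predicates": [
    {"name": "MaxGCEPDVolumeCount"},
    {"name": "MatchNodeSelector"},
    {"name": "PodFitsResources"},
    {"name": "PodFitsPorts"},
    {"name": "NoDiskConflict"},
    {"name": "Region", "argument": {"serviceAffinity": {"labels": ["region"]}}}
  ]
}
"""

# Illustrative set only; replace with the names valid for your version.
supported = {"MatchNodeSelector", "PodFitsResources", "PodFitsPorts", "NoDiskConflict"}

print(unknown_predicates(policy_text, supported))
```

With the illustrative set above, the check flags only MaxGCEPDVolumeCount, which matches the fatal log message in the description.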
