Bug 1317463

Summary: [infrastructure_public_295] Master service boot failure due to Predicate type not found for MaxGCEPDVolumeCount
Product: OpenShift Container Platform
Component: Node
Version: 3.1.0
Status: CLOSED NOTABUG
Severity: medium
Priority: medium
Reporter: Qixuan Wang <qixuan.wang>
Assignee: Solly Ross <sross>
QA Contact: Qixuan Wang <qixuan.wang>
CC: agoldste, aos-bugs, jokerman, mmccomas, qixuan.wang
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Type: Bug
Last Closed: 2016-03-15 06:41:14 UTC

Description Qixuan Wang 2016-03-14 10:17:38 UTC

Description of problem:
Set up an OSE environment on GCE, configure GCE as the cloud provider on the master, and enable the MaxGCEPDVolumeCount predicate in the scheduler policy. The master service then fails to start and restarts continually. The log shows "Predicate type not found for MaxGCEPDVolumeCount".

Version-Release number of selected component (if applicable):
openshift v3.1.1.6-21-gcd70c35
kubernetes v1.1.0-origin-1107-g4c8e6f4
etcd 2.1.2


How reproducible:
Always

Steps to Reproduce:
1. Configure the master to be aware of GCE
# cat master-config.yaml

<----------snip--------->
kubernetesMasterConfig:
  apiServerArguments:
    cloud-provider:
      - "gce"
  controllerArguments:
    cloud-provider:
      - "gce"
<----------snip--------->


2. Enable the "MaxGCEPDVolumeCount" predicate in scheduler.json
# cat /etc/origin/master/scheduler.json
{
  "kind": "Policy",
  "apiVersion": "v1",
  "predicates": [
    {"name": "MaxGCEPDVolumeCount"},
    {"name": "MatchNodeSelector"},
    {"name": "PodFitsResources"},
    {"name": "PodFitsPorts"},
    {"name": "NoDiskConflict"},
    {"name": "Region", "argument": {"serviceAffinity" : {"labels" : ["region"]}}}
  ],
  "priorities": [
    {"name": "LeastRequestedPriority", "weight": 1},
    {"name": "ServiceSpreadingPriority", "weight": 1},
    {"name": "Zone", "weight" : 2, "argument": {"serviceAntiAffinity" : {"label": "zone"}}}
  ]
}


3. Restart master service
# service atomic-openshift-master restart


Actual results:
3. The master restarts continually.

# journalctl -f -u atomic-openshift-master
Mar 14 09:30:07 10.240.0.8 systemd[1]: Started Atomic OpenShift Master.
Mar 14 09:30:08 10.240.0.8 atomic-openshift-master[53514]: I0314 09:30:08.642847   53514 nodecontroller.go:133] Sending events to api server.
Mar 14 09:30:08 10.240.0.8 atomic-openshift-master[53514]: I0314 09:30:08.643384   53514 factory.go:137] Creating scheduler from configuration: {{ } [{MaxGCEPDVolumeCount <nil>} {MatchNodeSelector <nil>} {PodFitsResources <nil>} {PodFitsPorts <nil>} {NoDiskConflict <nil>} {Region 0xc20db82a90}] [{LeastRequestedPriority 1 <nil>} {ServiceSpreadingPriority 1 <nil>} {Zone 2 0xc20db82d90}]}
Mar 14 09:30:08 10.240.0.8 atomic-openshift-master[53514]: I0314 09:30:08.643437   53514 factory.go:146] Registering predicate: MaxGCEPDVolumeCount
Mar 14 09:30:08 10.240.0.8 atomic-openshift-master[53514]: F0314 09:30:08.643452   53514 plugins.go:123] Invalid configuration: Predicate type not found for MaxGCEPDVolumeCount
Mar 14 09:30:08 10.240.0.8 systemd[1]: atomic-openshift-master.service: main process exited, code=exited, status=255/n/a
Mar 14 09:30:08 10.240.0.8 systemd[1]: Unit atomic-openshift-master.service entered failed state.
Mar 14 09:30:08 10.240.0.8 systemd[1]: atomic-openshift-master.service failed.
Mar 14 09:30:08 10.240.0.8 systemd[1]: atomic-openshift-master.service holdoff time over, scheduling restart.
Mar 14 09:30:08 10.240.0.8 systemd[1]: Starting Atomic OpenShift Master...
Mar 14 09:30:09 10.240.0.8 atomic-openshift-master[53543]: W0314 09:30:09.214164   53543 start_master.go:269] assetConfig.loggingPublicURL: invalid value '', Details: required to view aggregated container logs in the console
Mar 14 09:30:09 10.240.0.8 atomic-openshift-master[53543]: W0314 09:30:09.214260   53543 start_master.go:269] assetConfig.metricsPublicURL: invalid value '', Details: required to view cluster metrics in the console
Mar 14 09:30:09 10.240.0.8 atomic-openshift-master[53543]: I0314 09:30:09.816416   53543 master_config.go:113] Successfully initialized cloud provider: "gce" from the config file: ""
Mar 14 09:30:09 10.240.0.8 atomic-openshift-master[53543]: I0314 09:30:09.964754   53543 start_master.go:380] Starting master on 0.0.0.0:8443 (v3.1.1.6-21-gcd70c35)


Expected results:
MaxGCEPDVolumeCount should be recognized and the master service should start normally.


Additional info:
# cat /usr/lib/systemd/system/atomic-openshift-master.service 
[Unit]
Description=Atomic OpenShift Master
Documentation=https://github.com/openshift/origin
After=network.target
After=etcd.service
Before=atomic-openshift-node.service
Requires=network.target

[Service]
Type=notify
EnvironmentFile=/etc/sysconfig/atomic-openshift-master
Environment=GOTRACEBACK=crash
ExecStart=/usr/bin/openshift start master --config=${CONFIG_FILE} $OPTIONS
LimitNOFILE=131072
LimitCORE=infinity
WorkingDirectory=/var/lib/origin/
SyslogIdentifier=atomic-openshift-master
Restart=on-failure

[Install]
WantedBy=multi-user.target
WantedBy=atomic-openshift-node.service

Comment 1 Andy Goldstein 2016-03-14 13:08:32 UTC

This is a new feature for 3.2 so it's not available in 3.1.1.6. Please retest with the 3.2 puddles.
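
For anyone hitting the same message on an older build, the failure looks like what happens when a predicate name in scheduler.json has no match among the predicates compiled into the binary. The following is a minimal sketch of that kind of lookup, not the actual scheduler code. The function unknownPredicates, the two struct types, and the contents of the registered set are all hypothetical and chosen for illustration; the set is not an authoritative list of predicates for any release. The sketch also assumes that entries carrying an "argument" (such as "Region" in the report) are custom predicates that do not need a registry entry.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// predicatePolicy mirrors one entry of the "predicates" array in scheduler.json.
// Argument stays nil when the entry has no "argument" key.
type predicatePolicy struct {
	Name     string           `json:"name"`
	Argument *json.RawMessage `json:"argument,omitempty"`
}

// policy holds only the part of the policy file that this check needs.
type policy struct {
	Predicates []predicatePolicy `json:"predicates"`
}

// unknownPredicates (hypothetical helper) returns the names of predicates that
// have no argument and are absent from the registered set.
func unknownPredicates(data []byte, registered map[string]bool) ([]string, error) {
	var p policy
	if err := json.Unmarshal(data, &p); err != nil {
		return nil, err
	}
	var missing []string
	for _, pred := range p.Predicates {
		if pred.Argument != nil {
			// Assumption: entries with an argument define custom predicates.
			continue
		}
		if !registered[pred.Name] {
			missing = append(missing, pred.Name)
		}
	}
	return missing, nil
}

func main() {
	// Illustrative only: NOT the real list of predicates in any release.
	registered := map[string]bool{
		"MatchNodeSelector": true,
		"PodFitsResources":  true,
		"PodFitsPorts":      true,
		"NoDiskConflict":    true,
	}
	data := []byte(`{
	  "kind": "Policy",
	  "apiVersion": "v1",
	  "predicates": [
	    {"name": "MaxGCEPDVolumeCount"},
	    {"name": "MatchNodeSelector"},
	    {"name": "Region", "argument": {"serviceAffinity": {"labels": ["region"]}}}
	  ]
	}`)
	missing, err := unknownPredicates(data, registered)
	if err != nil {
		fmt.Println("cannot parse policy:", err)
		return
	}
	for _, name := range missing {
		fmt.Printf("Predicate type not found for %s\n", name)
	}
}
```

Running a check of this shape against the policy file before restarting the master would surface an unsupported predicate name without putting the service into a restart loop, provided the registered set is filled in from the documentation for the installed version.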

Comment 2 Qixuan Wang 2016-03-15 06:41:14 UTC

Cannot reproduce on the OSE 3.2 puddle.
openshift v3.2.0.3
kubernetes v1.2.0-origin-41-g91d3e75
etcd 2.2.5