Bug 1525162
| Summary: | [free-int] unable to start atomic-openshift-master-api due to admission plugin marshaling error | ||||||
|---|---|---|---|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Justin Pierce <jupierce> | ||||
| Component: | Master | Assignee: | Michal Fojtik <mfojtik> | ||||
| Status: | CLOSED CURRENTRELEASE | QA Contact: | ge liu <geliu> | ||||
| Severity: | urgent | Docs Contact: | |||||
| Priority: | unspecified | ||||||
| Version: | 3.8.0 | CC: | aos-bugs, eparis, gpei, jokerman, mifiedle, mmccomas, sdodson, wmeng | ||||
| Target Milestone: | --- | Keywords: | TestBlocker | ||||
| Target Release: | 3.8.0 | ||||||
| Hardware: | Unspecified | ||||||
| OS: | Unspecified | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | Doc Type: | No Doc Update | |||||
| Doc Text: | | Story Points: | --- | ||||
| Clone Of: | Environment: | ||||||
| Last Closed: | 2019-11-21 18:38:16 UTC | Type: | Bug | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Embargoed: | |||||||
| Attachments: | ||||||
Took the obvious approach of changing this field to an int instead of a string and moved further to an error:

Couldn't init admission plugin "RunOnceDuration": json: cannot unmarshal string into Go struct field RunOnceDurationConfig.activeDeadlineSecondsOverride of type int64

Fixing that went to:

Couldn't init admission plugin "ClusterResourceOverride": json: cannot unmarshal string into Go struct field ClusterResourceOverrideConfig.cpuRequestToLimitPercent of type int64

I continued to replace '1234' with 1234 to address the remainder of these in master-config.yml in order to make progress with v3.8 testing.

maxProjects and activeDeadlineSecondsOverride are both numeric fields and must be specified as numbers, not strings.

scott, do we have the ability to run `oc adm diagnostics` with the MasterConfigCheck and NodeConfigCheck diagnostics with 3.8 as a pre-upgrade check? That would flag these for fixing prior to upgrade, rather than failing to come up after upgrade.

> maxProjects and activeDeadlineSecondsOverride are both numeric fields and must be specified as numbers, not strings
(as is cpuRequestToLimitPercent)
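To illustrate the failure mode described above, here is a minimal Go sketch of strict decoding with the standard `encoding/json` package. The `projectLimit` struct and `decodeMaxProjects` helper are hypothetical stand-ins, not the actual OpenShift config types; only the field name `maxProjects` comes from this report.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// projectLimit is a hypothetical stand-in for a config struct with a numeric field.
type projectLimit struct {
	MaxProjects int `json:"maxProjects"`
}

// decodeMaxProjects decodes doc strictly, with no string-to-number coercion.
func decodeMaxProjects(doc string) (int, error) {
	var limit projectLimit
	if err := json.Unmarshal([]byte(doc), &limit); err != nil {
		return 0, err
	}
	return limit.MaxProjects, nil
}

func main() {
	// The first document quotes the number and is rejected; the second is accepted.
	for _, doc := range []string{`{"maxProjects": "10"}`, `{"maxProjects": 10}`} {
		value, err := decodeMaxProjects(doc)
		if err != nil {
			fmt.Println("rejected:", doc, "->", err)
			continue
		}
		fmt.Println("accepted:", doc, "->", value)
	}
}
```

The quoted value fails with a "cannot unmarshal string" error of the same family as the ones quoted in this bug, which is why replacing '1234' with 1234 in the config gets past each plugin.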
(In reply to Jordan Liggitt from comment #2)
> maxProjects and activeDeadlineSecondsOverride are both numeric fields and
> must be specified as numbers, not strings
>
> scott, do we have the ability to run `oc adm diagnostics` with the
> MasterConfigCheck and NodeConfigCheck diagnostics with 3.8 as a pre-upgrade
> check? That would flag these for fixing prior to upgrade, rather than
> failing to come up after upgrade.

We could run it inside a container. If we were to do anything else we'd have to upgrade the package first, which would mean that if the master were restarted via other means it'd immediately start failing.

> if the installer attempts to correct the config issue before performing the upgrade
I would not expect the installer to modify config it did not generate
I think the best we can do for config we don't own is detect it by running diagnostics prior to upgrade.
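As a rough illustration of what detecting this before an upgrade could look like, the following Go sketch scans config text for known numeric fields whose values are quoted. It is not how `oc adm diagnostics` works; the `findQuotedNumbers` helper, the regular expression, and the sample config fragment are all assumptions made for the example, and only the three field names come from this report.

```go
package main

import (
	"bufio"
	"fmt"
	"regexp"
	"strings"
)

// quotedNumber matches a YAML line of the form `key: "123"` or `key: '123'`.
var quotedNumber = regexp.MustCompile(`^\s*(?:-\s+)?([A-Za-z0-9_]+):\s*["'](-?\d+)["']\s*$`)

// findQuotedNumbers returns one finding per line where a field listed in
// numericFields carries a quoted integer value.
func findQuotedNumbers(config string, numericFields map[string]bool) []string {
	var findings []string
	scanner := bufio.NewScanner(strings.NewReader(config))
	for line := 1; scanner.Scan(); line++ {
		m := quotedNumber.FindStringSubmatch(scanner.Text())
		if m != nil && numericFields[m[1]] {
			findings = append(findings,
				fmt.Sprintf("line %d: %s is quoted (%s); use a bare number", line, m[1], m[2]))
		}
	}
	return findings
}

func main() {
	// Hypothetical config fragment; the real master-config.yml layout is not shown in this bug.
	config := "maxProjects: \"10\"\nactiveDeadlineSecondsOverride: '3600'\ncpuRequestToLimitPercent: 25\n"
	fields := map[string]bool{
		"maxProjects":                   true,
		"activeDeadlineSecondsOverride": true,
		"cpuRequestToLimitPercent":      true,
	}
	for _, finding := range findQuotedNumbers(config, fields) {
		fmt.Println(finding)
	}
}
```

A line-based scan like this only reports problems and never rewrites the file, which matches the point above about not modifying config the installer did not generate.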
I guess this is upgrade component then. :-=( We'll add a task to validate current config with the upgrade target version via a container.

Does this parser change also break what might have been working API object definitions? Could I have created an rc with size "2" which would work on 3.7 but wouldn't on 3.8?

Confirmed. This replicaset works fine on 3.7. Does not work at all on 3.8. (Both port and replicas are in "")
# oc create -f /tmp/nginx.yaml -n eparis
Error from server (BadRequest): error when creating "/tmp/nginx.yaml": ReplicaSet in version "v1beta1" cannot be handled as a ReplicaSet: json: cannot unmarshal string into Go struct field ReplicaSetSpec.replicas of type int32
# cat /tmp/nginx.yaml
apiVersion: extensions/v1beta1
kind: ReplicaSet
metadata:
  labels:
    app: nginx
  name: nginx
spec:
  replicas: "1"
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - image: nginx:latest
        imagePullPolicy: Always
        name: nginx
        ports:
        - containerPort: "80"
          protocol: TCP
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
opened PRs to restore type coercion:
https://github.com/openshift/origin/pull/17764 (3.8)
https://github.com/openshift/origin/pull/17768 (3.9)

*** Bug 1525828 has been marked as a duplicate of this bug. ***

Fixes are merged in 3.8.19+
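For readers unfamiliar with the idea of type coercion during decoding, here is a minimal Go sketch of a numeric type that accepts both `2` and `"2"`. This is only an illustration under stated assumptions: the `lenientInt64` type and `decodeReplicas` helper are hypothetical, and this is not claimed to be the approach taken in the PRs above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
)

// lenientInt64 is a hypothetical integer type that also accepts quoted integers.
type lenientInt64 int64

// UnmarshalJSON accepts a bare JSON number or a JSON string holding an integer.
func (n *lenientInt64) UnmarshalJSON(data []byte) error {
	s := string(data)
	if s == "null" {
		return nil // by convention, null leaves the value unchanged
	}
	// Strip one pair of surrounding double quotes, if present.
	if len(s) >= 2 && s[0] == '"' && s[len(s)-1] == '"' {
		s = s[1 : len(s)-1]
	}
	v, err := strconv.ParseInt(s, 10, 64)
	if err != nil {
		return fmt.Errorf("expected an integer or a quoted integer, got %s", data)
	}
	*n = lenientInt64(v)
	return nil
}

// replicaSpec is a hypothetical stand-in for a spec with a replica count.
type replicaSpec struct {
	Replicas lenientInt64 `json:"replicas"`
}

// decodeReplicas decodes doc and returns the replica count.
func decodeReplicas(doc string) (int64, error) {
	var spec replicaSpec
	if err := json.Unmarshal([]byte(doc), &spec); err != nil {
		return 0, err
	}
	return int64(spec.Replicas), nil
}

func main() {
	for _, doc := range []string{`{"replicas": "2"}`, `{"replicas": 2}`, `{"replicas": "two"}`} {
		value, err := decodeReplicas(doc)
		fmt.Println(doc, "->", value, err)
	}
}
```

The trade-off is that coercion keeps previously working manifests and configs loading, at the cost of accepting input that does not match the declared field type.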
Created attachment 1366815 [details]
Excerpt from master-config.yml

Description of problem:
During an upgrade of free-int from v3.7 to v3.8, the openshift-ansible installer timed out waiting for a master to come back online. I ssh'd in to the master and found that the atomic-openshift-master-api server was failing repeatedly due to:
"cannot unmarshal string into Go struct field ProjectLimitBySelector.maxProjects of type int"

Version-Release number of selected component (if applicable):
v3.8.18

How reproducible:
100%