Bug 1777337 - Build pod label truncation can cause build failure
Summary: Build pod label truncation can cause build failure
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Build
Version: 3.11.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.5.0
Assignee: Gabe Montero
QA Contact: wewang
URL:
Whiteboard:
Depends On:
Blocks: 1804934
 
Reported: 2019-11-27 12:53 UTC by Mark McLoughlin
Modified: 2020-05-04 11:18 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: build label generation/validation was not fully conforming to k8s expectations.
Consequence: builds could fail with certain build config names with invalid label errors.
Fix: build controller and build apiserver now use complete k8s validation routines to ensure any added build labels will meet k8s label criteria.
Result: builds with any valid build config name will not fail because of invalid build label values.
Clone Of:
: 1804934 (view as bug list)
Environment:
Last Closed: 2020-05-04 11:17:52 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Github openshift openshift-apiserver pull 72 0 None closed Bug 1777337: employ k8s label value validation when creating build pod build label… 2020-10-12 13:45:29 UTC
Github openshift openshift-controller-manager pull 62 0 None closed Bug 1777337: employ k8s label value validation when creating build pod build label… 2020-10-12 13:45:30 UTC
Red Hat Product Errata RHBA-2020:0581 0 None None None 2020-05-04 11:18:16 UTC

Description Mark McLoughlin 2019-11-27 12:53:57 UTC
Description of problem:

Trying to do a build named cluster-kube-controller-manager-operator-4.3.0.ipv6-2019-11-27-001 is failing with this error:

Error creating: Pod "cluster-kube-controller-manager-operator-4.3.0.ipv6-2019-11-27-0001-build" is invalid: metadata.labels: Invalid value: "cluster-kube-controller-manager-operator-4.3.0.ipv6-2019-11-27-": a valid label must be an empty string or consist of alphanumeric characters, '-', '_' or '.', and must start and end with an alphanumeric character (e.g. 'MyValue',  or 'my_value',  or '12345', regex used for validation is '(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])?')
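The rejection above can be reproduced outside the cluster. The following is a minimal sketch, not the Kubernetes implementation: it uses only the Go standard library, anchors the regex quoted in the error message, and assumes the 63-character label limit. The names `isValidLabelValue` and `maxLabelLen` are hypothetical and introduced only for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// maxLabelLen is the assumed maximum length of a label value (63 characters).
const maxLabelLen = 63

// labelValueRE is the pattern quoted in the error message, anchored so that
// the whole string must match.
var labelValueRE = regexp.MustCompile(`^(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])?$`)

// isValidLabelValue is a hypothetical helper: it reports whether v fits the
// length limit and matches the quoted pattern.
func isValidLabelValue(v string) bool {
	return len(v) <= maxLabelLen && labelValueRE.MatchString(v)
}

func main() {
	name := "cluster-kube-controller-manager-operator-4.3.0.ipv6-2019-11-27-0001-build"

	// Plain truncation to 63 bytes leaves a trailing '-', which the
	// pattern rejects because a value must end with an alphanumeric.
	truncated := name[:maxLabelLen]
	fmt.Printf("%q valid=%v\n", truncated, isValidLabelValue(truncated))
}
```

The truncated value matches the one shown in the error message: 63 characters, ending in a hyphen.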


Version-Release number of selected component (if applicable):

This is on the ci.openshift.org cluster:

$ oc version
Client Version: openshift-clients-4.2.0-201910041700
Kubernetes Version: v1.11.0+d4cacc0


Expected results:

The label truncation happens here:

// LabelValue returns a string to use as a value for the Build
// label in a pod. If the length of the string parameter exceeds
// the maximum label length, the value will be truncated.
func LabelValue(name string) string {
        if len(name) <= validation.DNS1123LabelMaxLength {
                return name
        }
        return name[:validation.DNS1123LabelMaxLength]
}



If a label must end with an alphanum, then a simple truncation improvement could be to walk back from :63 to the first alphanum
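The walk-back idea suggested above could look roughly like the following. This is a sketch of the reporter's suggestion only, not the fix that was eventually merged, and it may produce a different value than the merged code does. `truncateLabelValue`, `isAlphaNumeric`, and `maxLabelLen` are hypothetical names, and the byte-wise slicing assumes ASCII input, which holds for DNS-style build names.

```go
package main

import "fmt"

// maxLabelLen is the assumed maximum length of a label value (63 characters).
const maxLabelLen = 63

// isAlphaNumeric reports whether b is an ASCII letter or digit.
func isAlphaNumeric(b byte) bool {
	return (b >= 'a' && b <= 'z') || (b >= 'A' && b <= 'Z') || (b >= '0' && b <= '9')
}

// truncateLabelValue is a hypothetical replacement for plain truncation:
// it cuts name to maxLabelLen bytes and then walks back from the end until
// the last byte is alphanumeric. Names within the limit are returned as-is.
func truncateLabelValue(name string) string {
	if len(name) <= maxLabelLen {
		return name
	}
	end := maxLabelLen
	for end > 0 && !isAlphaNumeric(name[end-1]) {
		end--
	}
	return name[:end]
}

func main() {
	name := "cluster-kube-controller-manager-operator-4.3.0.ipv6-2019-11-27-0001-build"
	fmt.Println(truncateLabelValue(name))
}
```

For the build name in this report, the sketch drops only the trailing hyphen, giving a 62-character value that ends in "27". It does not check the first character, since a valid build name is assumed to start with an alphanumeric already.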

Comment 3 wewang 2020-02-12 12:16:38 UTC
The test now has expectedOutput: "cluster-kube-controller-manager-operator-4". I counted it, and it has 43 characters, not 63. Also, when I set expectedOutput: "cluster-kube-controller-manager-operator-4.3.0.ipv6-2019-11-27-0001-build", the unit test still passes. It should fail, right? If I am wrong, please correct me, thanks.

Comment 4 wewang 2020-02-12 12:21:43 UTC
Tested in version 4.4.0-0.nightly-2020-02-11-200858; the error still exists.

Steps:
1. Using bc:
apiVersion: build.openshift.io/v1
kind: BuildConfig
metadata:
  name: cluster-kube-controller-manager-operator-4.3.0.ipv6-2019-11-27-001
spec:
  postCommit: {}
  resources: {}
  runPolicy: Serial
  source:
    type: Git
    git:
      uri: https://github.com/openshift/ruby-hello-world.git
  strategy:
    sourceStrategy:
      from:
        kind: ImageStreamTag
        name: ruby:2.5
        namespace: openshift
    type: Source

2. [root@wangwen ~]# oc start-build cluster-kube-controller-manager-operator-4.3.0.ipv6-2019-11-27-001
The Build "cluster-kube-controller-manager-operator-4.3.0.ipv6-2019-11-27-001-1" is invalid: metadata.labels: Invalid value: "cluster-kube-controller-manager-operator-4.3.0.ipv6-2019-11-27-": a valid label must be an empty string or consist of alphanumeric characters, '-', '_' or '.', and must start and end with an alphanumeric character (e.g. 'MyValue',  or 'my_value',  or '12345', regex used for validation is '(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])?')

Comment 5 Gabe Montero 2020-02-12 15:21:39 UTC
RE #Comment 3 @Wen

When I made the change you described, the unit test failed for me:

=== RUN   TestLabelValue
--- FAIL: TestLabelValue (0.00s)
    util_test.go:32: tc do-not-end-with-hyphen got cluster-kube-controller-manager-operator-4 for cluster-kube-controller-manager-operator-4.3.0.ipv6-2019-11-27-0001-build instead of cluster-kube-controller-manager-operator-4.3.0.ipv6-2019-11-27-0001-build
FAIL


That was after changing the section in util_test.go to:

		{
			name:           "do-not-end-with-hyphen",
			input:          "cluster-kube-controller-manager-operator-4.3.0.ipv6-2019-11-27-0001-build",
			expectedOutput: "cluster-kube-controller-manager-operator-4.3.0.ipv6-2019-11-27-0001-build",
		},

Also, on the expected correct value of cluster-kube-controller-manager-operator-4 remember that the length is not the only factor.
As the error message in #Comment 4 notes, you cannot have "-", "_", nor "." in addition to the length.

So that eliminates the entire section ".3.0.ipv6-2019-11-27-0001-build" from the original string.

I am waiting for my dev cluster to come up and will try your BC from #Comment 4 when it is ready.

Comment 6 Gabe Montero 2020-02-12 15:53:58 UTC
Yep looks like there are some openshift-apiserver analogous changes that we'll need as well

I suspect our CI did not hit them because our CI instantiates build objects directly vs. hitting the buildconfigs instantiate endpoint

Moving back to POST to work on that scenario.

Comment 9 wewang 2020-02-21 08:07:54 UTC
There is no valid 4.5 nightly build payload yet; I will wait for one.

Comment 10 wewang 2020-03-09 03:26:17 UTC
Verified in version:
4.5.0-0.nightly-2020-03-06-190457

Steps are the same as for the related 4.4 bug.

Comment 12 errata-xmlrpc 2020-05-04 11:17:52 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0581

