Bug 2057209 - Excessive Policy Name Length Will Prevent Child Policy Creation
Summary: Excessive Policy Name Length Will Prevent Child Policy Creation
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Telco Edge
Version: 4.10
Hardware: All
OS: All
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.11.0
Assignee: Joshua Clark
QA Contact: Joshua Clark
URL:
Whiteboard:
Depends On:
Blocks: 2094413
Reported: 2022-02-22 23:19 UTC by Joshua Clark
Modified: 2022-09-15 20:53 UTC

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Clones: 2094413
Environment:
Last Closed: 2022-08-18 04:08:08 UTC
Target Upstream Version:
Embargoed:


Links
| System | ID | Private | Priority | Status | Summary | Last Updated |
|---|---|---|---|---|---|---|
| Github | openshift-kni cluster-group-upgrades-operator pull 145 | 0 | None | Merged | Random suffix and smart truncation for talo created object names | 2022-05-27 15:43:03 UTC |
| Red Hat Product Errata | RHEA-2022:6110 | 0 | None | None | None | 2022-08-18 04:08:31 UTC |

Description Joshua Clark 2022-02-22 23:19:45 UTC
Description of problem:

Kubernetes limits object names to 63 characters. If a policy name defined in a PolicyGenTemplate approaches this limit, the Topology Aware Life-cycle Operator (TALO) cannot create child policies. When this occurs, the parent policy remains in a "NonCompliant" state.
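
As an illustration of the failure mode, here is a minimal Python sketch. It assumes the child policy name is formed by joining the ClusterGroupUpgrade name and the policy name with a hyphen, which is how the keys under safeResourceNames in Comment 2 appear to be built. The helper names are hypothetical and this is not TALO code.

```python
MAX_NAME_LEN = 63  # limit cited in this bug description


def child_policy_name(cgu_name: str, policy_name: str) -> str:
    # Assumption: the child name is "<cgu name>-<policy name>".
    return f"{cgu_name}-{policy_name}"


def exceeds_limit(name: str, limit: int = MAX_NAME_LEN) -> bool:
    # True when the name is longer than the allowed length.
    return len(name) > limit


# Names taken from the ClusterGroupUpgrade dump in Comment 2.
name = child_policy_name(
    "cnfdf18-new-with-super-long-name",
    "common-cnfdf18-subscriptions-policy",
)
print(len(name), exceeds_limit(name))  # prints: 68 True
```

Neither input is close to 63 characters on its own, but the combined name is, which may be why the problem only shows up once the child policy is generated.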

Version-Release number of selected component (if applicable):
4.10

How reproducible:


100%

Steps to Reproduce:
1. Install OCP with TALO and GitOps operators
2. Create a PolicyGenTemplate with a policy name and cluster name near the 63-character limit
3. Install a cluster via ZTP using GitOps and TALO
4. Verify that the parent policy remains in the NonCompliant state and the child policy is never created.

Actual results:

Child policy is not created.

Expected results:

TALO creates the child policy, which eventually reaches the "Compliant" state.

Additional info:

Kubernetes character limit documented here: https://access.redhat.com/documentation/en-us/red_hat_advanced_cluster_management_for_kubernetes/2.3/html/governance/governance

Comment 2 jun 2022-06-09 14:29:00 UTC
Verified with the latest upstream image: quay.io/openshift-kni/cluster-group-upgrades-operator@sha256:71180138852a342c9d55b0c730eaea5c7c708fdf87001543e44e0d31511eaf6d

apiVersion: ran.openshift.io/v1alpha1
kind: ClusterGroupUpgrade
metadata:
  resourceVersion: '204296253'
  name: cnfdf18-new-with-super-long-name
  uid: 0f5752af-2bc1-42ba-9bd4-d3e142e1efac
  creationTimestamp: '2022-06-09T14:17:06Z'
  generation: 1
  namespace: ztp-install
  ownerReferences:
    - apiVersion: cluster.open-cluster-management.io/v1
      blockOwnerDeletion: true
      controller: true
      kind: ManagedCluster
      name: cnfdf18
      uid: 76f7eaec-3321-4858-93be-a49ea9db290d
  finalizers:
    - ran.openshift.io/cleanup-finalizer
spec:
  actions:
    afterCompletion:
      addClusterLabels:
        ztp-done: ''
      deleteClusterLabels:
        ztp-running: ''
      deleteObjects: true
    beforeEnable:
      addClusterLabels:
        ztp-running: ''
  backup: false
  clusters:
    - cnfdf18
  enable: true
  managedPolicies:
    - common-cnfdf18-subscriptions-policy
  preCaching: false
  remediationStrategy:
    maxConcurrency: 1
    timeout: 240
status:
  computedMaxConcurrency: 1
  conditions:
    - lastTransitionTime: '2022-06-09T14:20:38Z'
      message: >-
        The ClusterGroupUpgrade CR has all clusters compliant with all the
        managed policies
      reason: UpgradeCompleted
      status: 'True'
      type: Ready
  managedPoliciesContent:
    common-cnfdf18-subscriptions-policy: >-
      [{"kind":"Subscription","name":"sriov-network-operator-subscription","namespace":"openshift-sriov-network-operator"},{"kind":"Subscription","name":"ptp-operator-subscription","namespace":"openshift-ptp"},{"kind":"Subscription","name":"performance-addon-operator","namespace":"openshift-performance-addon-operator"}]
  managedPoliciesForUpgrade:
    - name: common-cnfdf18-subscriptions-policy
      namespace: ztp-common-cnfdf18
  managedPoliciesNs:
    common-cnfdf18-subscriptions-policy: ztp-common-cnfdf18
  remediationPlan:
    - - cnfdf18
  safeResourceNames:
    cnfdf18-new-with-super-long-name-common-cnfdf18-subscriptions-policy: cnfdf18-new-with-super-long-name-common-cnfdf-tqq9m
    cnfdf18-new-with-super-long-name-common-cnfdf18-subscriptions-policy-config: cnfdf18-new-with-super-long-name-common-cnfdf18-subscript-ccqzs
    cnfdf18-new-with-super-long-name-common-cnfdf18-subscriptions-policy-placement: >-
      cnfdf18-new-with-super-long-name-common-cnfdf18-subscriptions-policy-placement-nc67f
    cnfdf18-new-with-super-long-name-ztp-install-installplan-install-g4t9k: >-
      cnfdf18-new-with-super-long-name-ztp-install-installplan-install-g4t9k-dbtxj
    cnfdf18-new-with-super-long-name-ztp-install-subscription-performance-addon-operator: >-
      cnfdf18-new-with-super-long-name-ztp-install-subscription-performance-addon-operator-drhnj
    cnfdf18-new-with-super-long-name-ztp-install-subscription-ptp-operator-subscription: >-
      cnfdf18-new-with-super-long-name-ztp-install-subscription-ptp-operator-subscription-562h7
    cnfdf18-new-with-super-long-name-ztp-install-subscription-sriov-network-operator-subscription: >-
      cnfdf18-new-with-super-long-name-ztp-install-subscription-sriov-network-operator-subscription-b9rgs
  status:
    completedAt: '2022-06-09T14:20:39Z'
    currentBatchRemediationProgress:
      cnfdf18:
        state: Completed
    startedAt: '2022-06-09T14:17:06Z'
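
The safeResourceNames mapping above pairs each desired object name with the name that was actually used: a possibly truncated base plus a five-character random suffix. A minimal Python sketch of that idea follows. It assumes a single 63-character limit and a lowercase alphanumeric suffix; the function name and its parameters are hypothetical, and the operator's real truncation rules may differ per object kind (several values in the dump are not truncated at all).

```python
import random
import string

MAX_NAME_LEN = 63  # limit cited in the bug description
SUFFIX_LEN = 5     # matches the five-character suffixes in the dump


def safe_name(desired: str, max_len: int = MAX_NAME_LEN, rng=random) -> str:
    """Truncate `desired` so that '<base>-<suffix>' fits in `max_len` chars."""
    alphabet = string.ascii_lowercase + string.digits
    suffix = "".join(rng.choice(alphabet) for _ in range(SUFFIX_LEN))
    # Reserve room for the hyphen and suffix, and avoid a doubled hyphen
    # if the cut lands right after one.
    base = desired[: max_len - SUFFIX_LEN - 1].rstrip("-")
    return f"{base}-{suffix}"


long_name = (
    "cnfdf18-new-with-super-long-name-"
    "common-cnfdf18-subscriptions-policy-config"
)
print(safe_name(long_name))
```

With this input the base is cut to `cnfdf18-new-with-super-long-name-common-cnfdf18-subscript`, the same prefix as the `-config` entry in the dump; the suffix differs on every call.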

Comment 5 errata-xmlrpc 2022-08-18 04:08:08 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.11 CNF vRAN extras update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2022:6110

Comment 6 davis phillips 2022-09-07 18:48:36 UTC
We ran into this issue setting up a POC for a customer in their lab. We had to truncate the initial names:

```
2022-09-07T18:22:59.679Z	ERROR	controller-runtime.manager.controller.clustergroupupgrade	Reconciler error	{"reconciler group": "ran.openshift.io", "reconciler kind": "ClusterGroupUpgrade", "name": "cgu-platform-upgrade-prep", "namespace": "default", "error": "Operation cannot be fulfilled on clustergroupupgrades.ran.openshift.io \"cgu-platform-upgrade-prep\": the object has been modified; please apply your changes to the latest version and try again"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
	/workspace/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:253
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
	/workspace/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:214
```

The child policy was not being created.

Comment 7 jun 2022-09-08 14:18:14 UTC
@dphillip Can you provide more info? Either a CGU dump or the TALM pod log. The "object modified" reconcile error is a very generic one and it can happen in many cases. Usually it resolves by itself on the next reconcile.

Comment 8 davis phillips 2022-09-15 20:53:01 UTC
Hey Jun, we talked today and decided to upgrade the TALM operator. Thanks for taking a look at it.

