Bug 2004510 - openshift-gitops operator hooks get unauthorized (401) errors during job execution
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Telco Edge
Version: 4.8
Hardware: x86_64
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 4.10.0
Assignee: Ian Miller
QA Contact: yliu1
URL:
Whiteboard:
Depends On:
Blocks: 2010529
Reported: 2021-09-15 13:21 UTC by Juan Manuel Parrilla Madrid
Modified: 2022-03-10 16:11 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Clones: 2010529
Environment:
Last Closed: 2022-03-10 16:10:42 UTC
Target Upstream Version:
Embargoed:


Links
| System | ID | Private | Priority | Status | Summary | Last Updated |
|---|---|---|---|---|---|---|
| Github | openshift-kni cnf-features-deploy pull 722 | 0 | None | open | Bug 2004510: ztp: Add authentication retries to ztp container | 2021-09-28 16:23:15 UTC |
| Red Hat Product Errata | RHSA-2022:0056 | 0 | None | None | None | 2022-03-10 16:11:15 UTC |

Description Juan Manuel Parrilla Madrid 2021-09-15 13:21:18 UTC
Description of problem:

After installing OCP, ACM and the GitOps operator, we hit an issue when using the pre and post hooks from https://github.com/openshift-kni/cnf-features-deploy/ztp/gitops-subscriptions/argocd.

The behaviour is:

- We load the deployment folder as the README.md says
- Our DU manifests (clusters and policies) are being loaded using Argo
- When PolicyGen tries to execute the hooks, an Unauthorized error appears

Pre Hook
```
[root@bastion1 gitops-verizon]# oc logs siteconfig-pre-6ncnx
error: failed to create configmap: Unauthorized
ztp-hooks.presync Wed, 15 Sep 2021 13:02:10 +0000 ERROR [pre-sync-entrypoint] Config map of siteconfigs resourceVersion creation failed
```

So we debugged the pod, entered it, and forced the execution by sourcing common.sh to force a login into the OCP cluster. After that the clusters were loaded, but the error appeared again on the post hook.

Post Hook
```
[root@bastion1 gitops-verizon]# oc logs siteconfig-post-9zksw
ztp-hooks.postsync Wed, 15 Sep 2021 13:04:27 +0000 ERROR [post-sync-entrypoint] Failed to get RAN sites resourceVersion
ztp-hooks.watcher 2021-09-15 13:04:28 UTC [DEBUG]             [watcher:217]: 0, siteconfigs
ztp-hooks.watcher 2021-09-15 13:04:28 UTC [ERROR]             [watcher:53]: (401)
Reason: Unauthorized
HTTP response headers: HTTPHeaderDict({'Audit-Id': '56919b9c-db7f-4d6f-a9a4-7916f94777d9', 'Cache-Control': 'no-cache, private', 'Content-Type': 'application/json', 'Date': 'Wed, 15 Sep 2021 13:04:28 GMT', 'Content-Length': '129'})
HTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"Unauthorized","reason":"Unauthorized","code":401}

Traceback (most recent call last):
  File "watcher.py", line 51, in watch_resources
    resource_version=rv, timeout_seconds=5)
  File "/usr/local/lib/python3.6/site-packages/kubernetes/client/api/custom_objects_api.py", line 2087, in list_cluster_custom_object_with_http_info
    collection_formats=collection_formats)
  File "/usr/local/lib/python3.6/site-packages/kubernetes/client/api_client.py", line 353, in call_api
    _preload_content, _request_timeout, _host)
  File "/usr/local/lib/python3.6/site-packages/kubernetes/client/api_client.py", line 184, in __call_api
    _request_timeout=_request_timeout)
  File "/usr/local/lib/python3.6/site-packages/kubernetes/client/api_client.py", line 377, in request
    headers=headers)
  File "/usr/local/lib/python3.6/site-packages/kubernetes/client/rest.py", line 243, in GET
    query_params=query_params)
  File "/usr/local/lib/python3.6/site-packages/kubernetes/client/rest.py", line 233, in request
    raise ApiException(http_resp=r)
kubernetes.client.exceptions.ApiException: (401)
Reason: Unauthorized
HTTP response headers: HTTPHeaderDict({'Audit-Id': '56919b9c-db7f-4d6f-a9a4-7916f94777d9', 'Cache-Control': 'no-cache, private', 'Content-Type': 'application/json', 'Date': 'Wed, 15 Sep 2021 13:04:28 GMT', 'Content-Length': '129'})
HTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"Unauthorized","reason":"Unauthorized","code":401}


ztp-hooks.watcher 2021-09-15 13:04:28 UTC [ERROR]             [watcher:223]: 'NoneType' object is not subscriptable
Traceback (most recent call last):
  File "watcher.py", line 221, in <module>
    ApiResponseParser(resp, resourcename=sys.argv[2], debug=debug)
  File "watcher.py", line 122, in __init__
    if api_response[1] != 200:
TypeError: 'NoneType' object is not subscriptable
```
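
The second traceback in the post hook log suggests a follow-on failure: after the 401, the value passed to `ApiResponseParser` was `None`, and indexing it with `[1]` raised the `TypeError`. A minimal sketch of a guard is shown here, assuming the response is the `(data, status, headers)` tuple that the client's `*_with_http_info` calls return. The helper name `response_ok` is hypothetical and is not taken from `watcher.py`.

```python
from typing import Any, Optional, Tuple


def response_ok(api_response: Optional[Tuple[Any, int, Any]]) -> bool:
    """Return True only for a (data, status, headers) tuple whose status is 200."""
    if api_response is None:
        # The upstream call failed (for example with a 401) and produced no response.
        return False
    return api_response[1] == 200
```

With a check like this, a failed request could be reported as an authentication problem instead of surfacing as an unrelated `TypeError`.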

If we repeat the same workaround (debugging the node and forcing the generation), it works.
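
Since logging in again manually makes the hooks succeed, one possible mitigation is to retry the failing call after re-authenticating. The following is only a sketch under stated assumptions, not the change made in the linked pull request: `call_with_reauth`, `UnauthorizedError` and the `login` callback are hypothetical names. The only real API detail relied on is that `kubernetes.client.exceptions.ApiException` exposes the HTTP code as a `status` attribute.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class UnauthorizedError(Exception):
    """Hypothetical stand-in for an API exception that carries HTTP status 401."""
    status = 401


def call_with_reauth(call: Callable[[], T], login: Callable[[], None],
                     attempts: int = 3, delay_seconds: float = 0.0) -> T:
    """Run call(); on a 401, run login() and try again, up to `attempts` tries."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return call()
        except Exception as err:
            # Re-raise anything that is not a 401, and the 401 from the last try.
            if getattr(err, "status", None) != 401 or attempt == attempts:
                raise
            login()  # hypothetical: refresh the token, like sourcing common.sh
            time.sleep(delay_seconds)
    raise AssertionError("unreachable")
```

In a hook, `call` would wrap the API request that currently fails and `login` would repeat whatever login step the manual workaround performs.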

Version-Release number of selected component (if applicable):

OCP: 4.8.12
ACM: 2.3.3-7
Openshift-gitops Operator: latest from channel 4.8
Platform: IPv6/Disconnected


How reproducible:

- always

Steps to Reproduce:
1. Create the NS and Secrets for your clusters
2. Have the repo updated with the cluster definitions
3. Clone https://github.com/openshift-kni/cnf-features-deploy and fill in https://github.com/openshift-kni/cnf-features-deploy/tree/master/ztp/gitops-subscriptions/argocd/deployment/{policies-app.yaml,cluster-app.yaml}
4. Execute oc apply -k on the deployment folder

Actual results:

It fails most of the time (90%)

Expected results:

It works most of the time.
Additional info:

Comment 2 Juan Manuel Parrilla Madrid 2021-10-05 13:54:04 UTC
PR https://github.com/openshift-kni/cnf-features-deploy/pull/722 tested on customer environment with:

- OCP 4.8.5 as hub
- OCP 4.8.11 as spoke
- ACM 2.3.3
- GitOps Operator on 4.8 channel

Comment 3 yliu1 2021-10-05 15:24:39 UTC
We did not encounter this issue in our test environment with the same steps. From a regression perspective, the PR did not introduce any regressions.
Marking this as verified, combined with the results from Juan in comment #2.

Comment 7 errata-xmlrpc 2022-03-10 16:10:42 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:0056

