Bug 2011845
| Summary: | Oc exec command returning inconsistent output when printing huge amount of data | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Petr Balogh <pbalogh> |
| Component: | oc | Assignee: | Arda Guclu <aguclu> |
| oc sub component: | oc | QA Contact: | zhou ying <yinzhou> |
| Status: | CLOSED UPSTREAM | Docs Contact: | |
| Severity: | medium | | |
| Priority: | medium | CC: | aos-bugs, maszulik, mfojtik, nstielau |
| Version: | 4.8 | Keywords: | Automation |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2022-10-11 10:36:53 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
|
Description
Petr Balogh
2021-10-07 13:59:52 UTC
BTW, the tar binary was not available in the container either, so we could not use the `oc cp` command. But now when I look at the container it already has tar in it, at least in the OCS 4.8 version. So I will also check OCS 4.6 and 4.7, and if it's already there we can probably change the logic in the code to use `oc cp` instead. I'm lowering the priority since it looks like there are alternatives and this problem appears to happen less than 20% of the time.

This bug hasn't had any activity in the last 30 days. Maybe the problem got resolved, was a duplicate of something else, or became less pressing for some reason - or maybe it's still relevant but just hasn't been looked at yet. As such, we're marking this bug as "LifecycleStale" and decreasing the severity/priority. If you have further information on the current state of the bug, please update it; otherwise this bug can be closed in about 7 days. The information can be, for example, that the problem still occurs, that you still want the feature, that more information is needed, or that the bug is (for whatever reason) no longer relevant. Additionally, you can add LifecycleFrozen to the Whiteboard if you think this bug should never be marked as stale. Please consult with the bug assignee before you do that.

Hello, I haven't tested it recently, but I don't think this issue got fixed, so it's still valid. We started using a different approach for getting data out of the pod, but there was definitely some regression introduced between the mentioned versions, as we saw more inconsistency in the data returned by that exec command. I'm not sure why it happens with large amounts of data; maybe if you create a 50 MB file full of zeros or ones and try to cat it, you might spot some error in the output. But I haven't tested with a file of zeros or ones, so maybe it's related to some symbol from the binary file.
Currently, since we changed our approach to gathering the data, I don't need a fix myself, but I'm wondering whether it can affect some customer, as some regression was clearly introduced in that area.

On IBM Cloud we see the inconsistency also when using the `oc cp` command.

https://github.com/red-hat-storage/ocs-ci/pull/4958/files#diff-1d32a061b0f98ee283c52d1f7e9ca822177982cee47f7daa76904f5d8e1c3bfdR985

https://ocs4-jenkins-csb-odf-qe.apps.ocp-c1.prod.psi.redhat.com/job/qe-deploy-ocs-cluster-prod/2772/consoleFull

```
2022-01-05 15:25:44 14:25:31 - MainThread - ocs_ci.utility.utils - INFO - Executing command: oc -n openshift-storage rsh noobaa-operator-6d97549cc6-ngwwg bash -c "md5sum /usr/local/bin/noobaa-operator"
2022-01-05 15:25:44 14:25:32 - MainThread - ocs_ci.ocs.resources.pod - INFO - md5sum of file /usr/local/bin/noobaa-operator: 2e5a528a73ecf8b15e8a14105be0f82a
2022-01-05 15:25:44 14:25:32 - MainThread - /home/jenkins/workspace/qe-deploy-ocs-cluster-prod/ocs-ci/ocs_ci/ocs/resources/mcg.py - INFO - Remote noobaa cli md5 hash: 2e5a528a73ecf8b15e8a14105be0f82a
2022-01-05 15:25:44 14:25:32 - MainThread - /home/jenkins/workspace/qe-deploy-ocs-cluster-prod/ocs-ci/ocs_ci/ocs/resources/mcg.py - INFO - Local noobaa cli md5 hash: ed6f3d31987a18ebb731a67a8cf076e6
2022-01-05 15:25:44 14:25:32 - MainThread - ocs_ci.utility.retry - WARNING - Binary hash doesn't match the one on the operator pod, Retrying in 15 seconds...
2022-01-05 15:25:48 14:25:47 - MainThread - ocs_ci.utility.utils - INFO - Executing command: oc -n openshift-storage get Pod noobaa-operator-6d97549cc6-ngwwg -n openshift-storage -o yaml
2022-01-05 15:25:48 14:25:48 - MainThread - ocs_ci.utility.utils - INFO - Executing command: oc -n openshift-storage rsh noobaa-operator-6d97549cc6-ngwwg bash -c "md5sum /usr/local/bin/noobaa-operator"
2022-01-05 15:25:50 14:25:50 - MainThread - ocs_ci.ocs.resources.pod - INFO - md5sum of file /usr/local/bin/noobaa-operator: 2e5a528a73ecf8b15e8a14105be0f82a
2022-01-05 15:25:50 14:25:50 - MainThread - /home/jenkins/workspace/qe-deploy-ocs-cluster-prod/ocs-ci/ocs_ci/ocs/resources/mcg.py - INFO - Remote noobaa cli md5 hash: 2e5a528a73ecf8b15e8a14105be0f82a
2022-01-05 15:25:50 14:25:50 - MainThread - /home/jenkins/workspace/qe-deploy-ocs-cluster-prod/ocs-ci/ocs_ci/ocs/resources/mcg.py - INFO - Local noobaa cli md5 hash: ed6f3d31987a18ebb731a67a8cf076e6
2022-01-05 15:38:44 14:38:32 - MainThread - ocs_ci.utility.utils - INFO - Executing command: oc -n openshift-storage get Pod noobaa-operator-6d97549cc6-ngwwg -n openshift-storage -o yaml
2022-01-05 15:38:44 14:38:34 - MainThread - ocs_ci.utility.utils - INFO - Executing command: oc -n openshift-storage rsh noobaa-operator-6d97549cc6-ngwwg bash -c "md5sum /usr/local/bin/noobaa-operator"
2022-01-05 15:38:44 14:38:36 - MainThread - ocs_ci.ocs.resources.pod - INFO - md5sum of file /usr/local/bin/noobaa-operator: 2e5a528a73ecf8b15e8a14105be0f82a
2022-01-05 15:38:44 14:38:36 - MainThread - /home/jenkins/workspace/qe-deploy-ocs-cluster-prod/ocs-ci/ocs_ci/ocs/resources/mcg.py - INFO - Remote noobaa cli md5 hash: 2e5a528a73ecf8b15e8a14105be0f82a
2022-01-05 15:38:44 14:38:36 - MainThread - /home/jenkins/workspace/qe-deploy-ocs-cluster-prod/ocs-ci/ocs_ci/ocs/resources/mcg.py - INFO - Local noobaa cli md5 hash: 73d3fe6d2f58e61c0e0361cfb30d5675
2022-01-05 15:38:44 14:38:36 - MainThread - ocs_ci.utility.retry - WARNING - Binary hash doesn't match the one on the operator pod, Retrying in 15 seconds...
2022-01-05 15:38:53 14:38:51 - MainThread - ocs_ci.utility.utils - INFO - Executing command: oc -n openshift-storage get Pod noobaa-operator-6d97549cc6-ngwwg -n openshift-storage -o yaml
2022-01-05 15:38:53 14:38:52 - MainThread - ocs_ci.utility.utils - INFO - Executing command: oc -n openshift-storage rsh noobaa-operator-6d97549cc6-ngwwg bash -c "md5sum /usr/local/bin/noobaa-operator"
2022-01-05 15:38:53 14:38:53 - MainThread - ocs_ci.ocs.resources.pod - INFO - md5sum of file /usr/local/bin/noobaa-operator: 2e5a528a73ecf8b15e8a14105be0f82a
2022-01-05 15:38:53 14:38:53 - MainThread - /home/jenkins/workspace/qe-deploy-ocs-cluster-prod/ocs-ci/ocs_ci/ocs/resources/mcg.py - INFO - Remote noobaa cli md5 hash: 2e5a528a73ecf8b15e8a14105be0f82a
2022-01-05 15:38:53 14:38:53 - MainThread - /home/jenkins/workspace/qe-deploy-ocs-cluster-prod/ocs-ci/ocs_ci/ocs/resources/mcg.py - INFO - Local noobaa cli md5 hash: 73d3fe6d2f58e61c0e0361cfb30d5675
2022-01-05 15:39:06 14:39:05 - MainThread - ocs_ci.utility.utils - INFO - Executing command: oc -n openshift-storage get Pod noobaa-operator-6d97549cc6-ngwwg -n openshift-storage -o yaml
2022-01-05 15:39:06 14:39:06 - MainThread - ocs_ci.utility.utils - INFO - Executing command: oc -n openshift-storage rsh noobaa-operator-6d97549cc6-ngwwg bash -c "md5sum /usr/local/bin/noobaa-operator"
2022-01-05 15:39:08 14:39:07 - MainThread - ocs_ci.ocs.resources.pod - INFO - md5sum of file /usr/local/bin/noobaa-operator: 2e5a528a73ecf8b15e8a14105be0f82a
2022-01-05 15:39:08 14:39:07 - MainThread - /home/jenkins/workspace/qe-deploy-ocs-cluster-prod/ocs-ci/ocs_ci/ocs/resources/mcg.py - INFO - Remote noobaa cli md5 hash: 2e5a528a73ecf8b15e8a14105be0f82a
2022-01-05 15:39:08 14:39:08 - MainThread - /home/jenkins/workspace/qe-deploy-ocs-cluster-prod/ocs-ci/ocs_ci/ocs/resources/mcg.py - INFO - Local noobaa cli md5 hash: 84135e3ece427cf19817a68a782e0ba3
2022-01-05 15:39:08 14:39:08 - MainThread - ocs_ci.utility.retry - WARNING - Binary hash doesn't match the one on the operator pod, Retrying in 15 seconds...
2022-01-05 15:39:23 14:39:23 - MainThread - ocs_ci.utility.utils - INFO - Executing command: oc -n openshift-storage get Pod noobaa-operator-6d97549cc6-ngwwg -n openshift-storage -o yaml
2022-01-05 15:39:24 14:39:23 - MainThread - ocs_ci.utility.utils - INFO - Executing command: oc -n openshift-storage rsh noobaa-operator-6d97549cc6-ngwwg bash -c "md5sum /usr/local/bin/noobaa-operator"
2022-01-05 15:39:26 14:39:25 - MainThread - ocs_ci.ocs.resources.pod - INFO - md5sum of file /usr/local/bin/noobaa-operator: 2e5a528a73ecf8b15e8a14105be0f82a
2022-01-05 15:39:26 14:39:25 - MainThread - /home/jenkins/workspace/qe-deploy-ocs-cluster-prod/ocs-ci/ocs_ci/ocs/resources/mcg.py - INFO - Remote noobaa cli md5 hash: 2e5a528a73ecf8b15e8a14105be0f82a
2022-01-05 15:39:26 14:39:25 - MainThread - /home/jenkins/workspace/qe-deploy-ocs-cluster-prod/ocs-ci/ocs_ci/ocs/resources/mcg.py - INFO - Local noobaa cli md5 hash: 84135e3ece427cf19817a68a782e0ba3
```

You can see that the local noobaa cli md5 hash is different across several retry attempts. I am running it once again here:

https://ocs4-jenkins-csb-odf-qe.apps.ocp-c1.prod.psi.redhat.com/job/qe-trigger-ibmcloud-managed-1az-rhel-3w-tier1/59/

I will be able to provide you a cluster for investigation. Client used in the job above: 4.9.0-0.nightly-2022-01-05-035431. The underlying cluster is IBM Cloud ROKS 4.9.8.

Must-gather data: http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/j-058icm1r3-t1/j-058icm1r3-t1_20220105T130458/logs/failed_testcase_ocs_logs_1641388138/test_deployment_ocs_logs/

I see the issue is also discussed here: https://github.com/kubernetes/kubernetes/issues/60140 and it looks like an identical problem. When using `oc rsync` I don't see the issue.

This bug hasn't had any activity in the last 30 days.
Maybe the problem got resolved, was a duplicate of something else, or became less pressing for some reason - or maybe it's still relevant but just hasn't been looked at yet. As such, we're marking this bug as "LifecycleStale" and decreasing the severity/priority. If you have further information on the current state of the bug, please update it; otherwise this bug can be closed in about 7 days. The information can be, for example, that the problem still occurs, that you still want the feature, that more information is needed, or that the bug is (for whatever reason) no longer relevant. Additionally, you can add LifecycleFrozen to the Whiteboard if you think this bug should never be marked as stale. Please consult with the bug assignee before you do that.

I think this is still relevant.

As stated in https://github.com/kubernetes/kubernetes/issues/60140, this is a well-known upstream issue. I'd prefer closing this bug as WONTFIX because we need to wait for an upstream fix in any case. Feel free to re-open it if you think otherwise.
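The retry behaviour visible in the ocs-ci log above (re-fetch the file and compare checksums until they agree) can be sketched as follows. This is a minimal illustration, not the actual `ocs_ci.utility.retry` API: `fetch_until_hash_matches` and its parameters are hypothetical, and in practice switching the transfer to `oc rsync`, which Petr reports is unaffected, avoids the problem entirely.

```python
import hashlib
import time

def fetch_until_hash_matches(fetch, expected_md5, attempts=5, delay=15):
    """Call `fetch()` (a callable returning the file's bytes) until the
    md5 of the result matches `expected_md5`, sleeping `delay` seconds
    between tries - mirroring the WARNING/retry loop in the log above.

    Raises RuntimeError if the transfer keeps corrupting the data.
    """
    for attempt in range(attempts):
        data = fetch()
        if hashlib.md5(data).hexdigest() == expected_md5:
            return data
        if attempt < attempts - 1:
            time.sleep(delay)
    raise RuntimeError("hash never matched expected %s" % expected_md5)
```

In the real test suite, `fetch` would wrap the `oc exec`/`oc cp` transfer and `expected_md5` would come from running `md5sum` inside the pod; the retry only masks the corruption rather than fixing it, which is why the upstream issue still matters.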