Bug 1979871

Summary: 2 oc commands are executed at the same moment, the command that doesn’t go through should report valid error status
Product: OpenShift Container Platform
Reporter: akgunjal <akgunjal>
Component: oc
Assignee: Maciej Szulik <maszulik>
Status: CLOSED WONTFIX
QA Contact: zhou ying <yinzhou>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 4.7
CC: aos-bugs, fbalak, jokerman, mbukatov, mfojtik
Target Milestone: ---
Keywords: Reopened
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2021-07-21 11:41:02 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description akgunjal@in.ibm.com 2021-07-07 09:44:59 UTC
Description of problem:
When two oc commands are executed at the same moment, the command that does not go through should report a valid error status.

Version-Release number of selected component (if applicable):


How reproducible:
This was reproduced as part of CI tests, and the QSE team advised opening a bug after discussions.

Error :- 
tests.manage.monitoring.prometheus.test_deployment_status.test_ceph_monitor_stopped 
   Reason: test setup failure
  AssertionError: Downscaled monitors ['rook-ceph-mon-c'] were not replaced

Tier 3 :- 
https://ocs4-jenkins-csb-ocsqe.apps.ocp4.prod.psi.redhat.com/view/Tier3/job/qe-trigger-ibmcloud-managed-1az-rhel-3w-tier3/7/testReport/tests.manage.monitoring.prometheusmetrics/test_monitoring_negative/test_monitoring_shows_mon_down/

Tier 4a :- 
https://ocs4-jenkins-csb-ocsqe.apps.ocp4.prod.psi.redhat.com/view/Tier4/job/qe-trigger-ibmcloud-managed-1az-rhel-3w-tier4a/7/testReport/

Actual results:


Expected results:


Additional info:

Comment 1 akgunjal@in.ibm.com 2021-07-07 09:48:25 UTC
As discussed with Filip Balak from Blue squad:

"In the test, one of the monitor deployments is downscaled to 0, so the monitor pod is deleted. The test waits for the alert to appear and then upscales the deployment back to 1. From the test logs it looks to me like the monitor was not downscaled. From the logs I see that 2 oc commands were executed at the same time from different threads (but on the same machine). This might have affected the oc command that should have downscaled the monitor, so the monitor was not downscaled."
Test logs snapshot :- 
 2021-06-25 15:20:41,640 - MainThread - INFO - tests.manage.monitoring.conftest.stop_mon.117 - Downscaling deployment rook-ceph-mon-c to 0
 2021-06-25 15:20:41,640 - MainThread - INFO - ocs_ci.utility.utils.exec_cmd.486 - Executing command: oc -n openshift-storage scale --replicas=0 deployment/rook-ceph-mon-c
 2021-06-25 15:20:41,647 - Thread-17 - INFO - ocs_ci.utility.utils.exec_cmd.486 - Executing command: ['oc', 'login', '-u', 'apikey', '-p', '*****']
 2021-06-25 15:20:47,371 - MainThread - DEBUG - ocs_ci.utility.utils.exec_cmd.499 - Command stdout: deployment.apps/rook-ceph-mon-c scaled
 2021-06-25 15:20:47,371 - MainThread - DEBUG - ocs_ci.utility.utils.exec_cmd.507 - Command stderr is empty
 2021-06-25 15:20:47,371 - MainThread - DEBUG - ocs_ci.utility.utils.exec_cmd.508 - Command return code: 0
 2021-06-25 15:20:47,372 - MainThread - INFO - tests.manage.monitoring.conftest.stop_mon.119 - Waiting for 840 seconds
 2021-06-25 15:20:59,470 - Thread-17 - DEBUG - ocs_ci.utility.utils.exec_cmd.499 - Command stdout: Login successful.
 You have access to 66 projects, the list has been suppressed. You can list all projects with 'oc projects'
 Using project "default".
 2021-06-25 15:20:59,471 - Thread-17 - DEBUG - ocs_ci.utility.utils.exec_cmd.507 - Command stderr is empty
 2021-06-25 15:20:59,471 - Thread-17 - DEBUG - ocs_ci.utility.utils.exec_cmd.508 - Command return code: 0

Comment 2 Maciej Szulik 2021-07-13 12:56:28 UTC
It's perfectly acceptable and expected that if you run oc commands simultaneously, the results might be unexpected.
Here you're invoking a scale command in one thread and a login command in the other thread. The latter modifies the kubeconfig
file that the scale command uses during its operation. The suggested workaround is to use separate kubeconfig files
when dealing with multiple processes working at the same time.
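
The workaround could be sketched in Python roughly as follows. This is only an illustration, not code from ocs-ci or oc: the helper names `isolated_kubeconfig_env` and `run_isolated` are hypothetical, and the sketch assumes that oc honours the `KUBECONFIG` environment variable, so each child process reads and writes its own private file.

```python
import os
import shutil
import subprocess
import tempfile


def isolated_kubeconfig_env(base_kubeconfig=None):
    """Return (env, path) where env is a copy of os.environ whose
    KUBECONFIG points at a freshly created private file.

    If base_kubeconfig exists, its content is copied into the private
    file so the child process starts from the same contexts.
    """
    fd, path = tempfile.mkstemp(prefix="kubeconfig-", suffix=".yaml")
    os.close(fd)
    if base_kubeconfig and os.path.exists(base_kubeconfig):
        shutil.copyfile(base_kubeconfig, path)
    env = dict(os.environ, KUBECONFIG=path)
    return env, path


def run_isolated(cmd, base_kubeconfig=None):
    """Run cmd with its own kubeconfig file and remove the file afterwards."""
    env, path = isolated_kubeconfig_env(base_kubeconfig)
    try:
        return subprocess.run(
            cmd, env=env, capture_output=True, text=True, check=False
        )
    finally:
        os.remove(path)
```

With this approach a background thread could call something like `run_isolated(["oc", "login", ...], base_kubeconfig=...)` without touching the kubeconfig file that a concurrent `oc scale` in the main thread is reading. Because the private file is deleted afterwards, this suits one-off commands; a login whose session must be reused later would need to keep the file.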

Comment 3 Filip Balák 2021-07-14 10:53:19 UTC
This bug is about oc not reporting a valid error status (an error message) when a command fails. If a command fails to execute correctly and does not provide an error message in its output, then that is a bug.

I suggest closing this as WONTFIX instead of NOTABUG if this is not going to be fixed.
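
Until oc reports such failures itself, a caller could check the outcome explicitly. The following is a minimal sketch under stated assumptions, not an ocs-ci function: `scale_and_verify` is a hypothetical helper, the `run` parameter exists only so the command runner can be swapped out, and the check reads `.spec.replicas` back with `oc get -o jsonpath`.

```python
import subprocess


def scale_and_verify(namespace, deployment, replicas, run=subprocess.run):
    """Scale a deployment with oc, then read spec.replicas back.

    Raises RuntimeError if either command fails or if the observed
    replica count differs from the requested one.
    """
    target = f"deployment/{deployment}"
    scale = run(
        ["oc", "-n", namespace, "scale", f"--replicas={replicas}", target],
        capture_output=True,
        text=True,
    )
    if scale.returncode != 0:
        raise RuntimeError(f"oc scale failed: {scale.stderr.strip()}")

    get = run(
        ["oc", "-n", namespace, "get", target, "-o", "jsonpath={.spec.replicas}"],
        capture_output=True,
        text=True,
    )
    if get.returncode != 0:
        raise RuntimeError(f"oc get failed: {get.stderr.strip()}")

    observed = get.stdout.strip()
    if observed != str(replicas):
        raise RuntimeError(
            f"{target}: expected {replicas} replicas, observed {observed!r}"
        )
    return int(observed)
```

In the scenario from the logs, the call would be `scale_and_verify("openshift-storage", "rook-ceph-mon-c", 0)`. The check only confirms the requested replica count in the deployment spec; it does not wait for the monitor pod to be deleted.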