Bug 1365525 - Difficult to find details on RC error through web console
Summary: Difficult to find details on RC error through web console
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Management Console
Version: 3.2.1
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: low
Target Milestone: ---
Target Release: ---
Assignee: Jessica Forrester
QA Contact: Yadan Pei
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2016-08-09 12:48 UTC by Justin Pierce
Modified: 2017-08-16 19:51 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
It was difficult to find the underlying reason for a failed deployment from the project overview. The overview will now link to the Events page in these scenarios, which typically contains useful information about what went wrong.
Clone Of:
Environment:
Last Closed: 2017-08-10 05:15:47 UTC
Target Upstream Version:
Embargoed:


Attachments
failed-deployment-help (49.21 KB, image/png)
2017-07-07 16:42 UTC, Jessica Forrester
help debug link (32.32 KB, image/png)
2017-07-11 08:01 UTC, shahan
useful info (61.79 KB, image/png)
2017-07-11 08:02 UTC, shahan


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHEA-2017:1716 0 normal SHIPPED_LIVE Red Hat OpenShift Container Platform 3.6 RPM Release Advisory 2017-08-10 09:02:50 UTC

Description Justin Pierce 2016-08-09 12:48:41 UTC
Description of problem:
Several deployments failed for me during hack day. From the web console, it was difficult to find out why. Although I could have poked around with the CLI, I wanted to stay within the GUI.
- Clicking the word "FAILED" in the overview of the deployment did not take me to information about the failure.
- The logs for the deployment# did not identify the underlying cause; they merely indicated a timeout.
- I ultimately found the reason under Browse/Events, but the events there were in no way connected to my deployment. The event:
     "hellow-2 	Replication Controller 	Warning 	Failed create 	Error creating: pods "hellow-2-" is forbidden: service account jupierce1/jws-service-account was not found, retry after the service account is created (8 times in the last 3 minutes)"
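
To illustrate the kind of correlation that was missing, here is a minimal sketch of picking out the warning events that belong to one replication controller. It assumes events are available as plain dictionaries shaped like Kubernetes Event objects (`type`, `reason`, `message`, `count`, `involvedObject`). The function name `warning_events_for`, the `FailedCreate` reason spelling, and the second sample entry are illustrative assumptions, not taken from the console's code.

```python
from typing import List


def warning_events_for(events: List[dict], kind: str, name: str) -> List[dict]:
    """Return Warning events whose involvedObject matches the given kind and name."""
    matches = []
    for event in events:
        involved = event.get("involvedObject", {})
        if (
            event.get("type") == "Warning"
            and involved.get("kind") == kind
            and involved.get("name") == name
        ):
            matches.append(event)
    return matches


# Sample data: the first entry is modelled on the event quoted above,
# the second entry is made up purely for contrast.
sample_events = [
    {
        "type": "Warning",
        "reason": "FailedCreate",  # assumed raw form of "Failed create"
        "count": 8,
        "message": (
            'Error creating: pods "hellow-2-" is forbidden: service account '
            "jupierce1/jws-service-account was not found, retry after the "
            "service account is created"
        ),
        "involvedObject": {"kind": "ReplicationController", "name": "hellow-2"},
    },
    {
        "type": "Normal",
        "reason": "Started",
        "count": 1,
        "message": "Started container",
        "involvedObject": {"kind": "Pod", "name": "some-other-pod"},
    },
]

for event in warning_events_for(sample_events, "ReplicationController", "hellow-2"):
    print(event["reason"], event["count"], event["message"])
```

Filtering on the involved object is what would let a deployment view show only the events relevant to its own replication controller instead of the project-wide list.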

Version-Release number of selected component (if applicable):


How reproducible:
100%.

Steps to Reproduce:
1. Attempt to use the template jws30-tomcat8-mysql-s2i. Do not create a JWS service account in advance.
2. Leave all template parameters at their defaults.

Actual results:
No apparent way to directly analyze the DC's failure.

Expected results:
Information about the RC's failure is correlated in some way with the DC in the GUI.

Additional info:
I understand that this particular deployment failure is valid (https://bugzilla.redhat.com/show_bug.cgi?id=1313556). I'm just hoping that finding the underlying cause could have been more intuitive.

Comment 1 Samuel Padgett 2016-11-11 18:29:43 UTC
Ideally the cause of the failure would be written back to deployment status so we could display it directly. I have opened a PR that adds a link to the events tab to encourage users to check there.
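
As a rough sketch of what "displaying the cause directly" could look like if it had to be derived from events, the snippet below picks the most frequently repeated warning and turns it into a one-line hint. This is only a heuristic under stated assumptions: events are plain dictionaries shaped like Kubernetes Event objects, and the helper names `most_repeated_warning` and `describe_cause` are hypothetical. It is not how the web console or the deployment status works.

```python
from typing import List, Optional


def most_repeated_warning(events: List[dict]) -> Optional[dict]:
    """Return the Warning event with the highest count, or None if there is none."""
    warnings = [e for e in events if e.get("type") == "Warning"]
    if not warnings:
        return None
    # Events without a count are treated as having occurred once.
    return max(warnings, key=lambda e: e.get("count", 1))


def describe_cause(events: List[dict]) -> str:
    """Build a one-line hint from the most repeated Warning event."""
    event = most_repeated_warning(events)
    if event is None:
        return "No warning events recorded."
    return "{}: {} (seen {} times)".format(
        event.get("reason", "Unknown"),
        event.get("message", ""),
        event.get("count", 1),
    )
```

A heuristic like this can easily pick the wrong event, which is one reason linking to the full events list is the more conservative choice.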

https://github.com/openshift/origin-web-console/pull/864

Comment 2 Jessica Forrester 2016-11-17 17:58:42 UTC
Is this not something that is being handled with conditions? Or are conditions only showing up on the DC and not the RCs?

Comment 3 Jessica Forrester 2016-12-08 21:40:49 UTC
We will have conditions on the RCs once the kube 1.5 rebase lands; then we can address this.

Comment 4 Jessica Forrester 2017-07-07 16:41:17 UTC
The conditions on RCs don't provide us any useful information about what might have gone wrong.  We now link you to both the log and the events for a failed deployment, which should help diagnose the problem.

Comment 5 Jessica Forrester 2017-07-07 16:42:02 UTC
Created attachment 1295345 [details]
failed-deployment-help

Comment 6 Jessica Forrester 2017-07-07 16:44:15 UTC
This is probably the best improvement we can make any time soon; there are too many reasons (events) that could be the underlying cause of a failure.

Comment 7 Jessica Forrester 2017-07-07 16:50:35 UTC
Just going to link to the overview redesign PR for this: https://github.com/openshift/origin-web-console/pull/1335

Comment 8 shahan 2017-07-11 07:59:27 UTC
Checked this issue in openshift v3.6.139. The web console now displays links to the logs and events to help users analyze the failure. See the attachments.
BTW, QE is checking many Modified bugs to see if they're verifiable. Because this one is fixed, I am moving it to Verified. If you have other concerns, please tell me. Thanks.

Comment 9 shahan 2017-07-11 08:01:14 UTC
Created attachment 1296107 [details]
help debug link

Comment 10 shahan 2017-07-11 08:02:05 UTC
Created attachment 1296108 [details]
useful info

Comment 12 errata-xmlrpc 2017-08-10 05:15:47 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:1716

