Description of problem:
Several deployments failed for me during hack day. From the web console, it was difficult to find information about why. Although I could have poked around with the CLI, I wanted to stay within the GUI.
- Clicking the word "FAILED" in the deployment overview did not take me to information about the failure.
- The logs for the deployment did not identify the underlying cause; they merely indicated a timeout.
- I ultimately found the reason under Browse/Events, but these events were in no way connected to my deployment. The event:
"hellow-2 Replication Controller Warning Failed create Error creating: pods "hellow-2-" is forbidden: service account jupierce1/jws-service-account was not found, retry after the service account is created 8 times in the last 3 minutes"

Version-Release number of selected component (if applicable):

How reproducible:
100%

Steps to Reproduce:
1. Attempt to use the template jws30-tomcat8-mysql-s2i. Do not create a JWS service account in advance.
2. Leave all template parameters at their defaults.

Actual results:
No apparent way to directly analyze the DC's failure.

Expected results:
Information about the RC's failure correlated with the DC in the GUI.

Additional info:
I understand that this particular deployment failure is valid (https://bugzilla.redhat.com/show_bug.cgi?id=1313556). I'm just hoping that finding the underlying cause could be more intuitive.
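For reference, the root cause reported by the event is a missing service account. A minimal sketch of the resource that would have avoided the failure, assuming the namespace `jupierce1` and service account name `jws-service-account` from the event text (the exact name the template expects is defined by the jws30-tomcat8-mysql-s2i template, not by this report):

```yaml
# Hypothetical fix sketch: create the service account the template's
# deployment configs reference, so the RC can create pods.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: jws-service-account
  namespace: jupierce1
```

Creating this object before (or shortly after) instantiating the template should let the replication controller proceed, since the event says the create is retried once the service account exists.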
Ideally, the cause of the failure would be written back to the deployment status so we could display it directly. I have opened a PR that adds a link to the Events tab to encourage users to check there: https://github.com/openshift/origin-web-console/pull/864
Is this not something that is being handled with conditions? Or are conditions only showing up on the DC and not the RCs?
We will have conditions on the RCs once the kube 1.5 rebase lands; then we can address this.
The conditions on RCs don't provide any useful information about what might have gone wrong. We now link you to both the log and the events for a failed deployment, which should help diagnose the problem.
Created attachment 1295345 [details] failed-deployment-help
This is probably the best improvement we can make any time soon; there are too many possible reasons (events) that could be the underlying cause of a failure.
Just linking to the overview redesign PR for this: https://github.com/openshift/origin-web-console/pull/1335
Checked this issue in openshift v3.6.139; the web console now displays logs and events to help users analyze the failure. See attachment. BTW, QE is checking many MODIFIED bugs to see whether they're verifiable. Since this is fixed, moving to VERIFIED. If you have other concerns, please let me know. Thanks!
Created attachment 1296107 [details] help debug link
Created attachment 1296108 [details] useful info
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2017:1716