Bug 1946243 - No relevant error when pg limit is reached in block pools page
Summary: No relevant error when pg limit is reached in block pools page
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Console Storage Plugin
Version: 4.8
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.8.0
Assignee: gowtham
QA Contact: Shay Rozen
URL:
Whiteboard:
Depends On:
Blocks: 1959715
Reported: 2021-04-05 13:46 UTC by Shay Rozen
Modified: 2021-07-27 22:57 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1959715
Environment:
Last Closed: 2021-07-27 22:57:19 UTC
Target Upstream Version:


Attachments
rbd pool create page after failure (173.67 KB, image/png)
2021-04-05 13:46 UTC, Shay Rozen
storageclass create page failure (202.03 KB, image/png)
2021-04-05 13:47 UTC, Shay Rozen
fixed_timed_out (28.43 KB, image/png)
2021-06-01 13:25 UTC, gowtham


Links
System ID Private Priority Status Summary Last Updated
Github openshift console pull 8689 0 None open Bug 1946243: Fix pool creation timout issue when PG count is limit is… 2021-04-17 10:32:17 UTC
Red Hat Product Errata RHSA-2021:2438 0 None None None 2021-07-27 22:57:46 UTC

Description Shay Rozen 2021-04-05 13:46:56 UTC
Created attachment 1769267 [details]
rbd pool create page after failure

Description of problem (please be as detailed as possible and provide log
snippets):
When the PG limit is reached by creating multiple pools, the console only states "Failure" instead of "Ready". I think the user should be notified when the PG limit is reached.
Also, a pool can be created from the storageclass creation page, and there the error message is "Pool as09 creation timed out. Please check if ocs-operator and rook operator are running". If you need a separate BZ for that, please let me know.


Version of all relevant components (if applicable):
ocs4.8 
ocp4.8 

Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?


Is there any workaround available to the best of your knowledge?


Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?


Is this issue reproducible?


Can this issue be reproduced from the UI?
Only from UI

If this is a regression, please provide more details to justify this:


Steps to Reproduce:
1. Install OCP and OCS
2. Create RBD pools until creation fails



Actual results:
The user only gets the word "Failure" when the PG limit is reached.

Expected results:
The user should get an appropriate error stating that the PG limit has been reached.

Additional info:

Comment 1 Shay Rozen 2021-04-05 13:47:40 UTC
Created attachment 1769268 [details]
storageclass create page failure

Comment 3 Shay Rozen 2021-04-06 11:24:50 UTC
The same issue exists with pool creation on the storageclass page. Maybe the fix can cover both: https://bugzilla.redhat.com/show_bug.cgi?id=1890135

Comment 4 Yaniv Kaul 2021-04-06 12:22:36 UTC
I wonder what the user can do about such an alert.
How many pools are we talking about here?

Comment 6 Shay Rozen 2021-04-06 13:37:28 UTC
I created 15 pools (with replication 2) before I got the failure. With replication 3 I think it will hit the limit sooner. Is 4.8.0-303.ci enough?
The alert can tell the user why the failure happened and that they need to add capacity to create more pools.
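
The replication remark above can be made concrete with rough arithmetic. This is only a sketch, assuming the limit works as a cluster-wide budget of PG replicas (number of OSDs times a per-OSD cap, as with Ceph's mon_max_pg_per_osd option) and that each pool consumes pg_num times its replica size. The helper name, the OSD count, the cap of 250, and pg_num of 32 are all assumptions for illustration, not values taken from this cluster.

```typescript
// Hypothetical helper: estimates how many more pools fit under a per-OSD PG cap.
// Every number used below is an illustrative assumption, not data from this bug.
interface PgBudget {
  osdCount: number;       // number of OSDs in the cluster
  maxPgPerOsd: number;    // per-OSD cap (assumed to mirror mon_max_pg_per_osd)
  usedPgReplicas: number; // sum of pgNum * replicaSize over existing pools
}

function poolsThatFit(budget: PgBudget, pgNum: number, replicaSize: number): number {
  // Total PG replicas the cluster may hold under the assumed rule.
  const capacity = budget.osdCount * budget.maxPgPerOsd;
  const free = Math.max(0, capacity - budget.usedPgReplicas);
  // Each new pool consumes pgNum * replicaSize PG replicas.
  return Math.floor(free / (pgNum * replicaSize));
}

const emptyCluster: PgBudget = { osdCount: 3, maxPgPerOsd: 250, usedPgReplicas: 0 };
console.log(poolsThatFit(emptyCluster, 32, 2)); // 11
console.log(poolsThatFit(emptyCluster, 32, 3)); // 7
```

Under these assumed numbers, replication 3 exhausts the budget after fewer pools than replication 2, which matches the expectation stated above. Real clusters also carry default pools, so the actual count will differ.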

Comment 7 gowtham 2021-04-07 08:11:23 UTC
This needs to be fixed in the backend first: the backend should report a proper failure reason in the Ceph block pool YAML alongside the phase "Failure". I agree that the timeout is a UI bug, but I still can't think of a proper solution without that error message.

@Afreen, if an error message is present in the YAML, will the block pool list page (the default page created by OLM) display it along with the phase?

Comment 11 gowtham 2021-04-17 10:56:22 UTC
The UI does not receive a proper error message from the backend when the PG limit is reached. To fix the timeout issue I have sent a PR: https://github.com/openshift/console/pull/8689. Pool creation now fails with a general error message: "Pool {poolName} got created with non ready state."

Pool CR creation is successful, but its status is a failure. Since the UI works at the CR level, I report that pool creation completed with an error. Once PGs are available again, the pool will become ready.
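
A minimal sketch of the behaviour described above, not the code from the PR: the UI inspects the phase on the created pool CR and reports a generic error as soon as the phase is "Failure", rather than waiting for the timeout. The type names, the evaluatePool helper, and the elapsed/timeout parameters are hypothetical; only the two message strings come from this report.

```typescript
// Sketch only: these types and this helper are hypothetical, not console code.
interface PoolResource {
  metadata: { name: string };
  status?: { phase?: string }; // e.g. "Ready" or "Failure"
}

type PoolResult =
  | { kind: 'ready' }
  | { kind: 'pending' }
  | { kind: 'error'; message: string };

function evaluatePool(pool: PoolResource, elapsedMs: number, timeoutMs: number): PoolResult {
  const name = pool.metadata.name;
  const phase = pool.status?.phase;
  if (phase === 'Ready') {
    return { kind: 'ready' };
  }
  if (phase === 'Failure') {
    // The CR exists but is not usable: report this now instead of timing out.
    return { kind: 'error', message: `Pool ${name} got created with non ready state.` };
  }
  if (elapsedMs >= timeoutMs) {
    // No phase reported in time: fall back to the original timeout message.
    return {
      kind: 'error',
      message: `Pool ${name} creation timed out. Please check if ocs-operator and rook operator are running`,
    };
  }
  return { kind: 'pending' };
}

const failed: PoolResource = { metadata: { name: 'as09' }, status: { phase: 'Failure' } };
console.log(evaluatePool(failed, 500, 30000));
```

Checking the phase before the timeout means a pool that has already failed is reported immediately, and the timeout message is reserved for the case where the backend reports nothing at all.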

Comment 13 Shay Rozen 2021-05-31 12:36:01 UTC
@gowtham Why was this BZ moved to ON_QA? Is there a fix that shows an error message other than "Failure"?

Comment 14 gowtham 2021-05-31 13:34:33 UTC
The fix shows a failure error message, "pool {pool-name} got created with errors", instead of timing out the pool creation request. From the UI's point of view the CR is created, but its status is a failure.

Comment 15 Shay Rozen 2021-06-01 12:36:45 UTC
Where can I see the message? I hit the pool limit yesterday and didn't see any message. Can you elaborate?

Comment 16 gowtham 2021-06-01 13:21:03 UTC
You have mentioned two errors here:
   1. The block pool creation page under the OCS operator shows "Failure". (Failure can happen for various reasons, and the UI does not receive any failure error message from the backend. Please raise a separate BZ for Rook.)
   2. Pool creation on the storage class creation page times out without any error message. (I fixed this by showing a generic error message.)

Comment 17 gowtham 2021-06-01 13:25:13 UTC
Created attachment 1788508 [details]
fixed_timed_out

Comment 20 errata-xmlrpc 2021-07-27 22:57:19 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438

