Bug 1332856

Summary: Authentication invalidates when provider is down
Product: Red Hat CloudForms Management Engine
Reporter: Einat Pacifici <epacific>
Component: UI - OPS
Assignee: Greg Blomquist <gblomqui>
Status: CLOSED ERRATA
QA Contact: Einat Pacifici <epacific>
Severity: high
Priority: high
Version: 5.6.0
CC: azellner, bazulay, cpelland, dajohnso, dron, epacific, fsimonce, hkataria, jfrey, jhardy, mfeifer, mpovolny, obarenbo, simaishi
Target Milestone: GA
Target Release: 5.6.0
Hardware: Unspecified
OS: Unspecified
Fixed In Version: 5.6.0.11
Last Closed: 2016-06-29 15:57:15 UTC
Type: Bug
Attachments:
Attached: screenshot + evm.log + oc get pods result
Recreated evm.log and CFME screenshots

Description Einat Pacifici 2016-05-04 08:46:50 UTC
Created attachment 1153747 [details]
Attached: screenshot + evm.log + oc get pods result

Description of problem:
When viewing the list of Pods in CFME - Containers, the list contains pods that no longer exist.
The list also shows these deleted pods as running.

Version-Release number of selected component (if applicable):
5.6.0.4

How reproducible:
Always

Steps to Reproduce:
1. In CFME, create a provider and ensure there are several pods.
2. Delete some pods and add new ones (see the example commands below).
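
For example, on the OpenShift master (a minimal sketch; the pod name and image are placeholders):

[root@ose-master ~]# oc get pods                                   # note the pods CFME has inventoried
[root@ose-master ~]# oc delete pod my-pod1                         # remove an existing pod
[root@ose-master ~]# oc run my-pod2 --image=nginx --restart=Never  # create a replacement pod
[root@ose-master ~]# oc get pods                                   # the deleted pod should no longer appear here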

Actual results:
All pods, both the deleted ones and the new ones, are listed, and all of them are shown as running.

Expected results:
Only existing pods should be listed.

Comment 2 Federico Simoncelli 2016-05-04 16:56:05 UTC
Ari, please work with Einat to understand whether what she's seeing is an old issue we already fixed or something new. Thanks.

Comment 3 Ari Zellner 2016-05-05 11:55:41 UTC
Unable to reproduce. Einat, this may already have been fixed. If not, let's review it together when you're available.

Comment 4 Dave Johnson 2016-05-05 18:09:46 UTC
Hey Dafna, like we discussed, assigning this to you for a retest.

Comment 6 Einat Pacifici 2016-05-08 11:58:20 UTC
This is currently blocked by: Bug 1333258

Comment 7 Barak 2016-05-15 09:19:21 UTC
Einat, any updates?

Comment 8 Einat Pacifici 2016-05-18 07:40:58 UTC
Created attachment 1158643 [details]
Recreated evm.log and CFME screenshots

Comment 9 Einat Pacifici 2016-05-18 07:43:00 UTC
Barak, this issue is still occurring. I have attached screenshots and evm.log.
On the master I see:

[root@ose-master ~]# oc get pods --all-namespaces
NAMESPACE         NAME                         READY     STATUS    RESTARTS   AGE
default           management-metrics-1-6r0z3   0/1       Pending   0          8h
default           my-pod1                      1/1       Running   0          1m
default           router-1-978yd               0/1       Pending   0          8h
default           router-1-uigub               1/1       Running   0          1d
openshift-infra   hawkular-cassandra-1-8l8uz   1/1       Running   0          8h
openshift-infra   hawkular-metrics-w2c7h       1/1       Running   0          8h
openshift-infra   heapster-taryt               1/1       Running   4          8h
openshift-infra   stress-1-jl5lv               1/1       Running   0          8h
openshift-infra   stress1-1-cy2i2              1/1       Running   0          8h

Comment 10 Ari Zellner 2016-05-26 11:07:56 UTC
Einat, I'm having a hard time reproducing this and I'd like your help. Please show me your environment when this happens.

Comment 12 Einat Pacifici 2016-05-31 06:19:14 UTC
Dafna, the containers remain in CFME for as long as CFME is running, which means the obsolete/deleted containers stay visible (for the 7 days that CFME was running and available).
During this time the system that OpenShift was on (rhevm3) went down and was brought back up, and new containers were created as a result.
The visible result was that CFME showed the old containers as well as the new ones.

Comment 13 Federico Simoncelli 2016-06-01 08:59:34 UTC
As discussed with Ari yesterday, this is a side effect of authentication failures that deactivate the provider refresh workers: once those workers stop, the inventory is never refreshed, so deleted pods remain listed as running.

Ari, please add more information here.
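
To illustrate the failure mode (a hypothetical Ruby sketch with made-up names, not the actual ManageIQ code): if credential verification rescues every exception, a provider outage looks exactly like bad credentials, and the provider's refresh is deactivated.

# Hypothetical sketch of the failure mode; the names are illustrative,
# not the real ManageIQ API.
def verify_credentials(provider)
  provider.connect.get_pods                # any probe of the provider API
  provider.authentication_status = "Valid"
rescue StandardError
  # A refused connection (provider down) lands here too, so the
  # credentials are marked Invalid and the refresh worker is deactivated.
  provider.authentication_status = "Invalid"
end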

Comment 14 Jason Frey 2016-06-03 18:31:59 UTC
Federico, is there a separate BZ about the authentication failures that are deactivating the provider refresh workers?  I ask because *that* one should be marked as blocker as well, if so.

Comment 15 Federico Simoncelli 2016-06-03 20:34:19 UTC
(In reply to Jason Frey from comment #14)
> Federico, is there a separate BZ about the authentication failures that are
> deactivating the provider refresh workers?

Jason, not that I know of. I was thinking of using this one (unless you think it could be misleading).

> I ask because *that* one should be marked as blocker as well, if so.

Yes, this BZ is a blocker because it is the one about the authentication failures that are preventing the refresh worker from running (therefore "CFME shows pods that do not exist in openshift", as reported in the subject of the BZ).

Ari should confirm that (needinfo added in comment 13), or update us in case he has found any additional issue.

Comment 16 Ari Zellner 2016-06-06 21:36:11 UTC
This is a provider infrastructure problem with a possible fix here: https://github.com/ManageIQ/manageiq/pull/8912
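
The idea behind the fix, sketched with the same hypothetical names as above (see the linked PR for the real change): distinguish connectivity errors from genuine authentication errors, so that a provider outage does not invalidate the credentials and stop the refresh worker.

require 'timeout'

# Hypothetical auth-specific error class; illustrative only, not the
# real ManageIQ API.
class AuthenticationError < StandardError; end

def verify_credentials(provider)
  provider.connect.get_pods
  provider.authentication_status = "Valid"
rescue Errno::ECONNREFUSED, Timeout::Error
  # Provider unreachable: record an error state but keep the credentials,
  # so the refresh worker can resume once the provider is back up.
  provider.authentication_status = "Error"
rescue AuthenticationError
  # Only a genuine authentication failure invalidates the credentials.
  provider.authentication_status = "Invalid"
end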

Comment 17 Dave Johnson 2016-06-08 22:41:55 UTC
PR merged and backported.

Comment 19 errata-xmlrpc 2016-06-29 15:57:15 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2016:1348