Bug 1506546

Summary:	Virt-who polls job status too quickly [rhel-7.3.z]
Product:	Red Hat Enterprise Linux 7	Reporter:	Oneata Mircea Teodor <toneata>
Component:	virt-who	Assignee:	candlepin-bugs
Status:	CLOSED ERRATA	QA Contact:	Eko <hsun>
Severity:	medium	Docs Contact:
Priority:	high
Version:	7.4	CC:	candlepin-bugs, csnyder, hsun, rjerrido, toneata, yuefliu
Target Milestone:	rc	Keywords:	Triaged, ZStream
Target Release:	---
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:	virt-who-0.17-15.el7_3	Doc Type:	If docs needed, set a value
Doc Text:		Story Points:	---
Clone Of:	1503700	Environment:
Last Closed:	2017-12-13 08:03:01 UTC	Type:	---
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:
Bug Depends On:	1503700
Bug Blocks:

Description Oneata Mircea Teodor 2017-10-26 10:00:47 UTC

This bug has been copied from bug #1503700 and has been proposed to be backported to 7.3 z-stream (EUS).

Comment 3 Eko 2017-10-30 08:19:37 UTC

Checked this issue with virt-who-0.17-14.el7_3: when a 429 response has no Retry-After setting, the default of 30 seconds is not used and virt-who checks the job state immediately.


1). If there is no 429 code, virt-who checks the job state after 15 seconds [PASS]
2017-10-30 15:52:27,600 [virtwho.main DEBUG] MainProcess(6060):MainThread @executor.py:send_report:105 - Report for config "esx.conf" sent

==== waiting for 15s to check the job status ====

2017-10-30 15:52:42,621 [virtwho.main DEBUG] MainProcess(6060):MainThread @subscriptionmanager.py:_connect:121 - Authenticating with RHSM username admin
2017-10-30 15:52:42,952 [virtwho.main DEBUG] MainProcess(6060):MainThread @subscriptionmanager.py:check_report_state:227 - Checking status of job hypervisor_update_4ce60f8a-4040-42d0-899f-6b3dcab33151

2). If there is a 429 code but no Retry-After setting, virt-who should wait 30 seconds by default before checking the job state; in fact it does not wait at all, and the default 30s is not used. [FAILED]

2017-10-30 16:07:26,965 [virtwho.main DEBUG] MainProcess(6267):MainThread @executor.py:run:287 - HTTP 429 received during job polling
2017-10-30 16:07:26,969 [virtwho.main DEBUG] MainProcess(6267):MainThread @subscriptionmanager.py:_connect:121 - Authenticating with RHSM username admin

==== no waiting; the default 30-second retry is not applied ====

2017-10-30 16:07:27,329 [virtwho.main DEBUG] MainProcess(6267):MainThread @subscriptionmanager.py:check_report_state:227 - Checking status of job hypervisor_update_200b2461-117e-4123-b2a0-9e52c544697e


3). If there is a 429 code and Retry-After is set to 10, virt-who waits 10 seconds before checking the job state. [PASS]
2017-10-30 16:11:21,783 [virtwho.main DEBUG] MainProcess(6267):MainThread @executor.py:run:287 - HTTP 429 received during job polling

==== waiting 10 seconds, per the Retry-After setting ====

2017-10-30 16:11:31,798 [virtwho.main DEBUG] MainProcess(6267):MainThread @subscriptionmanager.py:_connect:121 - Authenticating with RHSM username admin
2017-10-30 16:11:32,144 [virtwho.main DEBUG] MainProcess(6267):MainThread @subscriptionmanager.py:check_report_state:227 - Checking status of job hypervisor_update_200b2461-117e-4123-b2a0-9e52c544697e


4). If there is a 500 or 404 error code, virt-who shows the error message and exits.
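
The four cases above can be summarized as a small delay-selection rule. The following is only a minimal sketch of the expected behavior, not virt-who's actual code: the names `next_poll_delay`, `JobPollError`, `DEFAULT_POLL_INTERVAL` and `DEFAULT_RETRY_AFTER` are hypothetical, and it assumes Retry-After arrives as an integer number of seconds under exactly that header key.

```python
from typing import Mapping, Optional

# Values taken from the test cases above; the constant names are hypothetical.
DEFAULT_POLL_INTERVAL = 15  # seconds to wait when no 429 was received
DEFAULT_RETRY_AFTER = 30    # seconds to wait on 429 without a usable Retry-After


class JobPollError(Exception):
    """Raised for responses that should stop polling (404 or 500 here)."""


def next_poll_delay(status_code: int,
                    headers: Optional[Mapping[str, str]] = None) -> int:
    """Return how many seconds to wait before the next job status check."""
    headers = headers or {}
    if status_code == 429:
        raw = headers.get("Retry-After")
        try:
            delay = int(raw)
        except (TypeError, ValueError):
            # Header missing or not an integer: fall back to the default.
            # This is the fallback that case 2 found was not being applied.
            return DEFAULT_RETRY_AFTER
        # A negative value is treated as unusable in this sketch.
        return delay if delay >= 0 else DEFAULT_RETRY_AFTER
    if status_code in (404, 500):
        raise JobPollError("job polling failed with HTTP %d" % status_code)
    return DEFAULT_POLL_INTERVAL
```

With this rule, a 429 with no Retry-After yields 30, a 429 with `Retry-After: 10` yields 10, and any response that is neither 429 nor 404/500 yields the regular 15-second interval.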

Comment 4 Eko 2017-11-01 03:48:55 UTC

Checked this with Chris: a new patch will fix this issue. Marking it verified for now; when the new build is available, we will test it again.

Comment 6 Eko 2017-11-03 03:34:48 UTC

Verified in virt-who-0.17-16.el7_3:
1. When the virt-who service starts for the first time, it waits 15s before checking the job state [PASS]
2. If a 429 code is received with no Retry-After setting, it waits 30s before checking the job state [PASS]
3. If a 429 code is received with Retry-After set to 10, it waits 10s before checking the job state [PASS]
4. If a 404/500 code is received, virt-who is killed [PASS]
5. If there are two or more config files in /etc/virt-who.d, all the reports are sent together, and then the job states are checked one by one (every 15s) [PASS]
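
Item 5 describes an ordering: every report is sent first, and only then are the resulting jobs checked, one at a time, 15 seconds apart. A rough sketch of that ordering follows, assuming hypothetical callables `send_report` and `check_job` that stand in for the real network calls; none of these names come from virt-who itself.

```python
import time
from typing import Callable, Iterable, List


def send_then_poll(configs: Iterable[str],
                   send_report: Callable[[str], str],
                   check_job: Callable[[str], str],
                   sleep: Callable[[float], None] = time.sleep,
                   interval: float = 15) -> List[str]:
    """Send a report per config, then check each job after `interval` seconds."""
    # Phase 1: send all reports together and remember the returned job ids.
    job_ids = [send_report(config) for config in configs]

    # Phase 2: check the jobs one by one, waiting before each check.
    states = []
    for job_id in job_ids:
        sleep(interval)
        states.append(check_job(job_id))
    return states
```

Passing `sleep` as a parameter is a choice made for this sketch so that the timing can be replaced by a stub in a test instead of actually waiting.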

Comment 8 errata-xmlrpc 2017-12-13 08:03:01 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:3447