Bug 1564442 - virt-who hangs on check_report_state
Summary: virt-who hangs on check_report_state
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Satellite
Classification: Red Hat
Component: Subscriptions - virt-who
Version: 6.3.0
Hardware: Unspecified
OS: Unspecified
high
high
Target Milestone: Unspecified
Assignee: Barnaby Court
QA Contact: Katello QA List
URL:
Whiteboard: scale_lab
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2018-04-06 09:57 UTC by sbadhwar
Modified: 2019-09-26 14:05 UTC (History)
9 users (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-04-09 14:36:09 UTC
Target Upstream Version:


Attachments (Terms of Use)

Description sbadhwar 2018-04-06 09:57:16 UTC
Description of problem:
While trying to test virt-who at Scale, we discovered that virt-who hangs quite a lot of times at the step check_report_state for the created jobs.

Virt-who was ran in one shot mode with debug messages enabled trying to report the hosts running in VMWare ESXi environment. The setup consisted of reporting 100 hosts with 6000 virtual machines.

The debug output is as follows:
2018-04-06 05:40:09,619 [rhsm.connection DEBUG] MainProcess(1213):Thread-3 @connection.py:_request:516 - Making request: POST /rhsm/hypervisors/Default_Organization?reporter_id=gprfc023.sbu.lab.eng.bos.redhat.com-e542c5cdcd5a42b9afddad8b4529e928&cloaked=False&env=Library
2018-04-06 05:40:12,391 [rhsm.connection INFO] MainProcess(1213):Thread-3 @connection.py:_request:552 - Response: status=200, request="POST /rhsm/hypervisors/Default_Organization?reporter_id=gprfc023.sbu.lab.eng.bos.redhat.com-e542c5cdcd5a42b9afddad8b4529e928&cloaked=False&env=Library"
2018-04-06 05:40:12,398 [virtwho.destination_8759358284855132892 DEBUG] MainProcess(1213):Thread-3 @subscriptionmanager.py:_connect:128 - Authenticating with RHSM username admin
2018-04-06 05:40:12,398 [rhsm.connection INFO] MainProcess(1213):Thread-3 @connection.py:__init__:822 - Connection built: host=gprfc018.sbu.lab.eng.bos.redhat.com port=443 handler=/rhsm auth=basic username=admin
2018-04-06 05:40:12,399 [rhsm.connection DEBUG] MainProcess(1213):Thread-3 @connection.py:_request:516 - Making request: GET /rhsm/status/
2018-04-06 05:40:12,444 [rhsm.connection INFO] MainProcess(1213):Thread-3 @connection.py:_request:552 - Response: status=200, request="GET /rhsm/status/"
2018-04-06 05:40:12,444 [virtwho.destination_8759358284855132892 DEBUG] MainProcess(1213):Thread-3 @subscriptionmanager.py:check_report_state:263 - Checking status of job hypervisor_update_9e23fe58-4f70-4902-9ac7-326ae9c8f6f8
2018-04-06 05:40:12,446 [rhsm.connection DEBUG] MainProcess(1213):Thread-3 @connection.py:_request:516 - Making request: GET /rhsm/jobs/hypervisor_update_9e23fe58-4f70-4902-9ac7-326ae9c8f6f8?result_data=True
2018-04-06 05:40:12,498 [rhsm.connection INFO] MainProcess(1213):Thread-3 @connection.py:_request:552 - Response: status=200, request="GET /rhsm/jobs/hypervisor_update_9e23fe58-4f70-4902-9ac7-326ae9c8f6f8?result_data=True"
2018-04-06 05:40:12,498 [virtwho.destination_8759358284855132892 DEBUG] MainProcess(1213):Thread-3 @subscriptionmanager.py:check_report_state:277 - Job hypervisor_update_9e23fe58-4f70-4902-9ac7-326ae9c8f6f8 not finished

On quering the API endpoint for the same job, the following output was generated:

{
  "id": "hypervisor_update_9e23fe58-4f70-4902-9ac7-326ae9c8f6f8",
  "state": "FINISHED",
  "startTime": "2018-04-06T09:40:12+0000",
  "finishTime": "2018-04-06T09:40:16+0000",
  "result": "Created: 0, Updated: 0, Unchanged:100, Failed: 0",
  "principalName": "foreman_admin",
  "targetType": "owner",
  "targetId": "Default_Organization",
  "ownerId": "Default_Organization",
  "correlationId": "6b0d058c",
  "resultData": "[cloaked]",
  "statusPath": "/jobs/hypervisor_update_9e23fe58-4f70-4902-9ac7-326ae9c8f6f8",
  "done": true,
  "group": "async group",
  "created": "2018-04-06T09:40:12+0000",
  "updated": "2018-04-06T09:40:16+0000"
}

Indicating the Job has already finished.

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:
The job was stuck at waiting state

Expected results:
virt-who one shot should finish automatically

Additional info:

Comment 1 Marek Hulan 2018-04-06 10:24:34 UTC
This does not seem to be related to configuration plugin, but virt-who itself. Moving of configuration plugin component. Chris, could you please take a look?

Comment 2 Eko 2018-04-09 06:27:04 UTC
What is your virt-who package version? the latest virt-who for RHEL7.5 is virt-who-0.21.5-1.el7

Comment 3 sbadhwar 2018-04-09 06:33:32 UTC
(In reply to Eko from comment #2)
> What is your virt-who package version? the latest virt-who for RHEL7.5 is
> virt-who-0.21.5-1.el7

The package version for virt-who is: virt-who-0.20.4-1.el7sat.noarch running on RHEL 7.4

Comment 6 Kevin Howell 2018-04-09 14:36:09 UTC
Closing CURRENTRELEASE per comment 5.

Comment 7 sthirugn@redhat.com 2018-04-09 19:57:20 UTC
Satellite 6.3 currently ships virt-who version 0.20.4-1.  To resolve this specific bug reported here, use virt-who version 0.21.5-1.el7 which is shipped with rhel 7.5.


Note You need to log in before you can comment on or make changes to this bug.