Bug 1724312 - Unable to receive "generalize" event from Azure after generalizing an instance
Summary: Unable to receive "generalize" event from Azure after generalizing an instance
Keywords:
Status: CLOSED EOL
Alias: None
Product: Red Hat CloudForms Management Engine
Classification: Red Hat
Component: Providers
Version: 5.10.6
Hardware: Unspecified
OS: Unspecified
medium
high
Target Milestone: GA
: 5.12.0
Assignee: Daniel Berger
QA Contact: John Dupuy
Red Hat CloudForms Documentation
URL:
Whiteboard:
Depends On:
Blocks: 1733383 1767784
TreeView+ depends on / blocked
 
Reported: 2019-06-26 18:18 UTC by John Dupuy
Modified: 2020-03-25 18:02 UTC (History)
12 users (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1733383 1767784 (view as bug list)
Environment:
Last Closed: 2020-03-25 18:02:34 UTC
Category: Bug
Cloudforms Team: Azure
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)

Description John Dupuy 2019-06-26 18:18:28 UTC
Description of problem:
We have an automated test case that will search for the events generated after generalizing and capturing a VM in Azure, testing that CFME is able to correctly parse and receive Azure events. After provisioning and capturing the VM, the test looks for the events "virtualMachines_generalize_EndRequest" and "virtualMachines_capture_EndRequest". I can see the latter being written to the DB in the event_streams table. However, the former is not received and never written to the DB. Both events appear in the activity log of the Azure web portal.  

Version-Release number of selected component (if applicable):
5.10.6.1

How reproducible:
100%

Steps to Reproduce:
1. Add Azure as a provider in CFME
2. Provision an instance
3. In Azure, generalize and capture the instance
4. Verify that the events appear in the Azure portal

Actual results:
CFME receives the "virtualMachines_capture_EndRequest" event but not the "virtualMachines_generalize_EndRequest" event

Expected results:
CFME should receive both events. 

Additional info:
This test is passing in 5.10.6.0 and 5.10.5.1, so I believe the changes were introduced in 5.10.6.1. I am marking this as a regression. This PR may be related? https://github.com/ManageIQ/manageiq-providers-azure/pull/335/files


Here is a comparison of the events received in 5.10.6.1 and 5.10.5.1 for the same VM:

5.10.6.1:
 8955 | virtualMachines_write_EndRequest        | AZURE  |         | test-captu-mynk
 8956 | virtualMachines_write_EndRequest        | AZURE  |         | test-captu-mynk
 8957 | virtualMachines_generalize_BeginRequest | AZURE  |         | test-captu-mynk
 8958 | virtualMachines_capture_BeginRequest    | AZURE  |         | test-captu-mynk
 8959 | virtualMachines_capture_EndRequest      | AZURE  |         | test-captu-mynk
 8960 | virtualMachines_capture_EndRequest      | AZURE  |         | test-captu-mynk

5.10.5.1:
 165 | virtualMachines_write_EndRequest      | AZURE  |         | test-captu-mynk
 166 | virtualMachines_deallocate_EndRequest | AZURE  |         | test-captu-mynk
 168 | virtualMachines_write_EndRequest      | AZURE  |         | test-captu-mynk
 175 | publicIPAddresses_delete_BeginRequest | AZURE  |         | 
 176 | publicIPAddresses_delete_EndRequest   | AZURE  |         | 
 169 | virtualMachines_generalize_EndRequest | AZURE  |         | test-captu-mynk
 170 | virtualMachines_capture_BeginRequest  | AZURE  |         | test-captu-mynk
 171 | virtualMachines_capture_EndRequest    | AZURE  |         | test-captu-mynk
 172 | virtualMachines_capture_EndRequest    | AZURE  |         | test-captu-mynk

Comment 4 dmetzger 2019-06-27 16:45:14 UTC
are there logs available?

Comment 5 John Dupuy 2019-06-27 18:38:08 UTC
I just ran this test 5 times in a row in order to generate logs on a fresh appliance, and wanted to share my results here. Of those 5 tests, only 2 passed (test-captu-rlkv, test-captu-wmcv). 1 test failed to generate the events in Azure, and the VM was deleted prematurely (test-captu-i1nc). The other 2 remaining tests generated the events in Azure, but only one or None were received by CFME (test-captu-jdrb, test-captu-fexo). 

So this is not 100% reproducible as I suggested yesterday, appears to be 50% or so at best. 

Here is the a DB query from 5 test runs, a change in VM name signifies a different test run. I will also attach the logs from this new appliance. 

vmdb_production=# select id, event_type, vm_name, source from event_streams where source = 'AZURE' and vm_name is not NULL;
 id  |               event_type                |     vm_name     | source 
-----+-----------------------------------------+-----------------+--------
 105 | virtualMachines_write_EndRequest        | test-captu-rlkv | AZURE
 106 | virtualMachines_deallocate_BeginRequest | test-captu-rlkv | AZURE
 107 | virtualMachines_deallocate_EndRequest   | test-captu-rlkv | AZURE
 109 | virtualMachines_generalize_EndRequest   | test-captu-rlkv | AZURE
 110 | virtualMachines_deallocate_EndRequest   | test-captu-rlkv | AZURE
 112 | virtualMachines_capture_EndRequest      | test-captu-rlkv | AZURE
 113 | virtualMachines_delete_EndRequest       | test-captu-rlkv | AZURE
 134 | virtualMachines_write_EndRequest        | test-captu-wmcv | AZURE
 135 | virtualMachines_deallocate_BeginRequest | test-captu-wmcv | AZURE
 136 | virtualMachines_deallocate_EndRequest   | test-captu-wmcv | AZURE
 138 | virtualMachines_generalize_BeginRequest | test-captu-wmcv | AZURE
 139 | virtualMachines_generalize_EndRequest   | test-captu-wmcv | AZURE
 140 | virtualMachines_capture_EndRequest      | test-captu-wmcv | AZURE
 143 | virtualMachines_delete_BeginRequest     | test-captu-wmcv | AZURE
 144 | virtualMachines_delete_EndRequest       | test-captu-wmcv | AZURE
 170 | virtualMachines_write_EndRequest        | test-captu-i1nc | AZURE
 173 | virtualMachines_deallocate_BeginRequest | test-captu-i1nc | AZURE
 174 | virtualMachines_deallocate_EndRequest   | test-captu-i1nc | AZURE
 176 | virtualMachines_delete_EndRequest       | test-captu-i1nc | AZURE
 181 | virtualMachines_write_EndRequest        | test-captu-i1nc | AZURE
 182 | virtualMachines_write_EndRequest        | test-captu-i1nc | AZURE
 183 | virtualMachines_write_EndRequest        | test-captu-i1nc | AZURE
 184 | virtualMachines_write_EndRequest        | test-captu-i1nc | AZURE
 185 | virtualMachines_write_EndRequest        | test-captu-i1nc | AZURE
 186 | virtualMachines_write_EndRequest        | test-captu-i1nc | AZURE
 187 | virtualMachines_write_EndRequest        | test-captu-i1nc | AZURE
 188 | virtualMachines_delete_EndRequest       | test-captu-i1nc | AZURE
 203 | virtualMachines_write_EndRequest        | test-captu-jdrb | AZURE
 205 | virtualMachines_deallocate_EndRequest   | test-captu-jdrb | AZURE
 207 | virtualMachines_deallocate_EndRequest   | test-captu-jdrb | AZURE
 209 | virtualMachines_delete_BeginRequest     | test-captu-jdrb | AZURE
 210 | virtualMachines_delete_EndRequest       | test-captu-jdrb | AZURE
 233 | virtualMachines_write_BeginRequest      | test-captu-fexo | AZURE
 234 | virtualMachines_write_BeginRequest      | test-captu-fexo | AZURE
 235 | virtualMachines_write_EndRequest        | test-captu-fexo | AZURE
 236 | virtualMachines_deallocate_BeginRequest | test-captu-fexo | AZURE
 237 | virtualMachines_deallocate_EndRequest   | test-captu-fexo | AZURE
 244 | virtualMachines_write_EndRequest        | test-captu-fexo | AZURE
 245 | virtualMachines_write_EndRequest        | test-captu-fexo | AZURE
 246 | virtualMachines_write_EndRequest        | test-captu-fexo | AZURE
 247 | virtualMachines_write_EndRequest        | test-captu-fexo | AZURE
 248 | virtualMachines_write_EndRequest        | test-captu-fexo | AZURE
 249 | virtualMachines_write_EndRequest        | test-captu-fexo | AZURE
 250 | virtualMachines_write_EndRequest        | test-captu-fexo | AZURE
 251 | virtualMachines_write_EndRequest        | test-captu-fexo | AZURE
 252 | virtualMachines_write_EndRequest        | test-captu-fexo | AZURE
 253 | virtualMachines_capture_EndRequest      | test-captu-fexo | AZURE
 254 | virtualMachines_capture_EndRequest      | test-captu-fexo | AZURE

Comment 11 Daniel Berger 2019-07-08 19:42:15 UTC
Just curious, what happens if you disable targeted refresh? Look for:

:allow_targeted_refresh: true

Change it to false, and restart the app.

Comment 12 Daniel Berger 2019-07-08 19:47:53 UTC
You could also try tinkering with the event catcher poll settings and see if that helps.

Comment 13 Alexander Zagaynov 2019-07-08 22:14:43 UTC
John, which command do you use to check if event is coming to MiQ?

I did trace all event processing chain and found that it works well, I see this event stored in `EmsEvent` table 
(using command `EmsEvent.where(event_type: 'virtualMachines_generalize_EndRequest').last` in rails console).

P.S. I don't see this event in vm's timeline, do you mean that? In that case this might be something in UI.

Comment 19 Sudhir Mallamprabhakara 2019-07-10 14:19:57 UTC
moving to 5.10.8

Comment 20 Alexander Zagaynov 2019-07-10 15:00:44 UTC
John, can you please modify your query according to Adam's comment #17 and check is that still a bug or not?

About `vm_name` - this might needs to be fixed, but it is something very different. 
Citing our discussion with Adam in Gitter's room:
> when we reconnect events we dont' set the vm_name like we do if the vm is already there
> https://github.com/ManageIQ/manageiq/blob/ae5f50feb2efcc5b552fe9b358c2a7a2539d0f4b/app/models/vm_or_template.rb#L836-L862

Comment 21 Alexander Zagaynov 2019-07-10 15:34:03 UTC
Here is commands I use in the Rails console:

`EmsEvent.where(event_type: 'virtualMachines_generalize_EndRequest').where(vm_or_template_id: nil).count`
- returns zero (it means all those events are attached to vms)

`EmsEvent.where(event_type: 'virtualMachines_generalize_EndRequest').where.not(vm_or_template_id: nil).count`
- returns those events amount

`EmsEvent.where(event_type: 'virtualMachines_generalize_EndRequest').map { |r| [r.vm_or_template_id, r.vm_name] }`
- shows mentioned properties of those events

Comment 22 John Dupuy 2019-07-10 18:00:01 UTC
I've run this test 4 times again now and modified my query and I am still seeing the same issue, sometimes the events are received and sometimes not. Below are the results of my query after running the test 4 times. In the Azure UI, I can see that the "Generalize" event completed successfully. But CFME is sometimes receiving only the "BeginRequest" or neither the "BeginRequest" or the "EndRequest". The "virtualMachines_generalize_EndRequest" event was received 2/4 times. 

vmdb_production=#  select id, vm_or_template_id, event_type, vm_name, source from event_streams where source = 'AZURE' and vm_or_template_id is not NULL;
 id  | vm_or_template_id |               event_type                |     vm_name     | source 
-----+-------------------+-----------------------------------------+-----------------+--------
 105 |                28 | virtualMachines_write_EndRequest        |                 | AZURE
 104 |                28 | virtualMachines_write_EndRequest        |                 | AZURE
 103 |                28 | virtualMachines_write_BeginRequest      |                 | AZURE
 102 |                28 | virtualMachines_write_BeginRequest      |                 | AZURE
 107 |                28 | virtualMachines_deallocate_EndRequest   | test-captu-ghoj | AZURE
 109 |                28 | virtualMachines_generalize_BeginRequest | test-captu-ghoj | AZURE
 110 |                28 | virtualMachines_generalize_EndRequest   | test-captu-ghoj | AZURE
 111 |                28 | virtualMachines_capture_EndRequest      | test-captu-ghoj | AZURE
 112 |                28 | virtualMachines_capture_EndRequest      | test-captu-ghoj | AZURE
 124 |                29 | virtualMachines_write_EndRequest        | test-captu-nhtc | AZURE
 125 |                29 | virtualMachines_deallocate_BeginRequest | test-captu-nhtc | AZURE
 126 |                29 | virtualMachines_deallocate_EndRequest   | test-captu-nhtc | AZURE
 128 |                29 | virtualMachines_deallocate_EndRequest   | test-captu-nhtc | AZURE
 130 |                29 | virtualMachines_capture_EndRequest      | test-captu-nhtc | AZURE
 140 |                29 | virtualMachines_write_EndRequest        | test-captu-nhtc | AZURE
 141 |                29 | virtualMachines_write_EndRequest        | test-captu-nhtc | AZURE
 142 |                29 | virtualMachines_write_EndRequest        | test-captu-nhtc | AZURE
 143 |                29 | virtualMachines_write_EndRequest        | test-captu-nhtc | AZURE
 144 |                29 | virtualMachines_write_EndRequest        | test-captu-nhtc | AZURE
 145 |                29 | virtualMachines_write_EndRequest        | test-captu-nhtc | AZURE
 146 |                29 | virtualMachines_write_EndRequest        | test-captu-nhtc | AZURE
 147 |                29 | virtualMachines_write_EndRequest        | test-captu-nhtc | AZURE
 148 |                29 | virtualMachines_write_EndRequest        | test-captu-nhtc | AZURE
 165 |                30 | virtualMachines_write_EndRequest        | test-captu-8rae | AZURE
 167 |                30 | virtualMachines_write_EndRequest        | test-captu-8rae | AZURE
 168 |                30 | virtualMachines_deallocate_BeginRequest | test-captu-8rae | AZURE
 169 |                30 | virtualMachines_deallocate_EndRequest   | test-captu-8rae | AZURE
 173 |                30 | virtualMachines_write_EndRequest        | test-captu-8rae | AZURE
 174 |                30 | virtualMachines_write_EndRequest        | test-captu-8rae | AZURE
 175 |                30 | virtualMachines_write_EndRequest        | test-captu-8rae | AZURE
 176 |                30 | virtualMachines_write_EndRequest        | test-captu-8rae | AZURE
 177 |                30 | virtualMachines_write_EndRequest        | test-captu-8rae | AZURE
 178 |                30 | virtualMachines_write_EndRequest        | test-captu-8rae | AZURE
 179 |                30 | virtualMachines_generalize_BeginRequest | test-captu-8rae | AZURE
 180 |                30 | virtualMachines_write_EndRequest        | test-captu-8rae | AZURE
 181 |                30 | virtualMachines_write_EndRequest        | test-captu-8rae | AZURE
 182 |                30 | virtualMachines_capture_EndRequest      | test-captu-8rae | AZURE
 199 |                32 | virtualMachines_write_EndRequest        |                 | AZURE
 201 |                32 | virtualMachines_deallocate_BeginRequest | test-captu-zwi8 | AZURE
 202 |                32 | virtualMachines_deallocate_EndRequest   | test-captu-zwi8 | AZURE
 204 |                30 | virtualMachines_delete_BeginRequest     | test-captu-8rae | AZURE
 205 |                30 | virtualMachines_delete_EndRequest       | test-captu-8rae | AZURE
 210 |                32 | virtualMachines_write_EndRequest        | test-captu-zwi8 | AZURE
 211 |                32 | virtualMachines_write_EndRequest        | test-captu-zwi8 | AZURE
 212 |                32 | virtualMachines_write_EndRequest        | test-captu-zwi8 | AZURE
 213 |                32 | virtualMachines_write_EndRequest        | test-captu-zwi8 | AZURE
 214 |                32 | virtualMachines_write_EndRequest        | test-captu-zwi8 | AZURE
 215 |                32 | virtualMachines_write_EndRequest        | test-captu-zwi8 | AZURE
 216 |                32 | virtualMachines_write_EndRequest        | test-captu-zwi8 | AZURE
 217 |                32 | virtualMachines_write_EndRequest        | test-captu-zwi8 | AZURE
 218 |                32 | virtualMachines_write_EndRequest        | test-captu-zwi8 | AZURE
 219 |                32 | virtualMachines_write_EndRequest        | test-captu-zwi8 | AZURE
 220 |                32 | virtualMachines_write_EndRequest        | test-captu-zwi8 | AZURE
 221 |                32 | virtualMachines_write_EndRequest        | test-captu-zwi8 | AZURE
 222 |                32 | virtualMachines_write_EndRequest        | test-captu-zwi8 | AZURE
 223 |                32 | virtualMachines_write_EndRequest        | test-captu-zwi8 | AZURE
 224 |                32 | virtualMachines_generalize_BeginRequest | test-captu-zwi8 | AZURE
 225 |                32 | virtualMachines_generalize_EndRequest   | test-captu-zwi8 | AZURE
 226 |                32 | virtualMachines_capture_EndRequest      | test-captu-zwi8 | AZURE
 227 |                32 | virtualMachines_capture_EndRequest      | test-captu-zwi8 | AZURE

Comment 25 CFME Bot 2019-07-25 21:52:06 UTC
New commit detected on ManageIQ/manageiq-providers-azure/ivanchuk:

https://github.com/ManageIQ/manageiq-providers-azure/commit/8ebfb8ade2cbc8051dd6c357ff19644df23adf43
commit 8ebfb8ade2cbc8051dd6c357ff19644df23adf43
Author:     Adam Grare <agrare>
AuthorDate: Wed Jul 24 12:03:45 2019 -0400
Commit:     Adam Grare <agrare>
CommitDate: Wed Jul 24 12:03:45 2019 -0400

    Merge pull request #338 from AlexanderZagaynov/BZ-1724312_unable_to_receive_events

    fix wrong timestamp format

    (cherry picked from commit 6b90aaac83e228a4fa52a31123dadf43f8ce42e2)

    Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1724312

 app/models/manageiq/providers/azure/cloud_manager/event_catcher/stream.rb | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comment 29 Alexander Zagaynov 2019-08-14 14:29:59 UTC
Moving to POST as there is already merged PR which should solve that problem: https://github.com/ManageIQ/manageiq-providers-azure/pull/343

Comment 30 John Dupuy 2019-08-15 18:30:00 UTC
Moving this back to ON_DEV. After generalizing 6 VM's I received only 1 "virtualMachines_generalize_EndRequest" even though all of them are present in the Azure web portal. I thought this would be fixed by the fix for BZ 1730819, but it does not appear so. In BZ 1730819's case, the events were received on CFME but just disassociated from the VM name. Here we are only sometimes receiving the "virtualMachines_generalize_EndRequest", no matter whether it is associated with the VM.

Comment 31 Daniel Berger 2019-09-06 12:49:52 UTC
Working with John on this, digging into the logs, it appears to be an ordering issue with the way Azure marks the eventTimestamp property.

To (mostly) quote John, "The capture event appears slightly before the generalize in the Azure UI, even though the generalize event has a timestamp before the capture event. It may be that we pick up the capture event, the "@since" instance variable is updated to the timestamp of that event, then the generalize event shows up a little later with a timestamp before that of the capture event, but we don't catch it because the "@since" instance variable has been updated to the later timestamp."

Comment 32 John Dupuy 2019-09-06 13:45:41 UTC
This example scenario may help to explain what we think is happening:

E1-O -> E2-O -> E2-S -> E1-S

E: Event
O: Occurs -> Event Timestamp
S: Submitted -> Submit Timestamp

So right now our query gets E2 because it is submitted first, and then bases the next query on E2-O. So we will never get E1-O in our query, because we are searching for events with an eventTimestamp > E2-O, and E1-O < E2-O. 

We think this would apply to events in general, not just specifically the "generalize" event. I happened to see it here because our test case happened to trigger this scenario.

Comment 33 Daniel Berger 2019-09-06 14:26:55 UTC
We could have solved this using a different timestamp field called submissionTimestamp, but unfortunately Azure won't let us filter on it, only on eventTimestamp:

https://docs.microsoft.com/en-us/rest/api/monitor/activitylogs/list

Comment 34 Daniel Berger 2019-09-17 12:56:28 UTC
In progress: https://github.com/ManageIQ/manageiq-providers-azure/pull/350


Note You need to log in before you can comment on or make changes to this bug.