Bug 950199 - Correlation ids are being re-used and don't appear to match up correctly in the engine and vdsm logs.
Summary: Correlation ids are being re-used and don't appear to match up correctly in t...
Keywords:
Status: CLOSED DUPLICATE of bug 844335
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 3.1.3
Hardware: x86_64
OS: Linux
low
low
Target Milestone: ---
: 3.2.0
Assignee: Nobody's working on this, feel free to take it
QA Contact:
URL:
Whiteboard: infra
: 950192 (view as bug list)
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2013-04-09 20:14 UTC by Gordon Watson
Modified: 2016-02-10 19:25 UTC (History)
10 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-04-24 07:02:31 UTC
oVirt Team: Infra
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)

Description Gordon Watson 2013-04-09 20:14:46 UTC
Description of problem:

While analysing a customer's problem, I tried to use a correlation id in the engine logs to find relevant entries in the vdsm logs. What I was found was;

1) The correlation id appears to have been re-used.
2) Specific entries in the engine logs with a specific correlation id do not match anything appropiate in the vdsm logs, however other entries with this correlation id do exist in the vdsm logs.

See below for an example.


Version-Release number of selected component (if applicable):

RHEV-M:  rhevm-3.1.0-50.el6 (3.1.3)

Hosts:

There are four RHEL6.4 hosts in a cluster.

- Host-A;

Mar 11 13:19:29 Installed: vdsm-4.10.2-1.6.el6.x86_64
Mar 15 11:39:21 Updated: vdsm-4.10.2-1.8.el6ev.x86_64
Mar 28 09:32:50 Updated: libvirt-0.10.2-18.el6_4.2.x86_64


- Host-B;

Mar 11 13:14:28 Installed: vdsm-4.10.2-1.6.el6.x86_64
Mar 11 12:23:06 Updated: libvirt-0.10.2-18.el6.x86_64
Mar 15 11:54:21 Updated: vdsm-4.10.2-1.8.el6ev.x86_64


- Host-C;

current = vdsm-4.10.2-1.8.el6ev.x86_64
Mar 14 03:43:16 Updated: libvirt-0.10.2-18.el6.x86_64


- Host-D;

current = vdsm-4.10.2-1.8.el6ev.x86_64
Mar 22 13:54:59 Updated: libvirt-0.10.2-18.el6_4.2.x86_64


How reproducible:

Not.

Steps to Reproduce:
1. n/a
2.
3.
 
 
Actual results:

An example of this was when looking in the engine logs for a snapshot problem related to VM with id '18ed3b02-0a8f-4f25-87df-ce16922f01c3'. The engine log contains relevant entries with correlation id (69567332) on 2013-03-27, as shown immediately below, but none of the vdsm logs do. The vdsm logs contain entries with this correlation id for other dates for other activities, e.g. on Host-A there is an entry for 'call vmGetMigrationStatus with ('3fbd3711-05fe-47cb-aa44-1a300398760f',)". This is a different VM than the one that the snapshot was being created for. 


-  engine.log;

2013-03-27 16:21:29,362 ERROR [org.ovirt.engine.core.bll.SPMAsyncTask] (QuartzScheduler_Worker-84) [69567332] BaseAsyncTask::LogEndTaskFailure: Task 93a60605-f562-40de-95ce-c64faf8c21c4 (Parent Command CreateAllSnapshotsFromVm, Parameters Type org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters) ended with failure:
2013-03-27 16:21:29,362 INFO  [org.ovirt.engine.core.bll.SPMAsyncTask] (QuartzScheduler_Worker-84) [69567332] SPMAsyncTask::PollTask: Polling task 4b676118-dba7-40fa-9496-4e30cc117208 (Parent Command CreateAllSnapshotsFromVm, Parameters Type org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters) returned status finished, result 'cleanSuccess'.
2013-03-27 16:21:29,368 ERROR [org.ovirt.engine.core.bll.SPMAsyncTask] (QuartzScheduler_Worker-84) [69567332] BaseAsyncTask::LogEndTaskFailure: Task 4b676118-dba7-40fa-9496-4e30cc117208 (Parent Command CreateAllSnapshotsFromVm, Parameters Type org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters) ended with failure:
2013-03-27 16:21:29,368 INFO  [org.ovirt.engine.core.bll.SPMAsyncTask] (QuartzScheduler_Worker-84) [69567332] SPMAsyncTask::PollTask: Polling task 985b9f5d-9dfb-489f-af00-f5256dced657 (Parent Command CreateAllSnapshotsFromVm, Parameters Type org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters) returned status finished, result 'cleanSuccess'.
2013-03-27 16:21:29,374 ERROR [org.ovirt.engine.core.bll.SPMAsyncTask] (QuartzScheduler_Worker-84) [69567332] BaseAsyncTask::LogEndTaskFailure: Task 985b9f5d-9dfb-489f-af00-f5256dced657 (Parent Command CreateAllSnapshotsFromVm, Parameters Type org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters) ended with failure:
2013-03-27 16:21:29,374 INFO  [org.ovirt.engine.core.bll.EntityAsyncTask] (QuartzScheduler_Worker-84) [69567332] EntityAsyncTask::EndAction: Ending action for 5 tasks (entity ID: 18ed3b02-0a8f-4f25-87df-ce16922f01c3): calling EndAction for action type CreateAllSnapshotsFromVm.
2013-03-27 16:21:29,374 INFO  [org.ovirt.engine.core.bll.EntityAsyncTask] (pool-3-thread-44) [69567332] EntityAsyncTask::EndCommandAction [within thread] context: Attempting to EndAction CreateAllSnapshotsFromVm, executionIndex: 0
2013-03-27 16:21:29,398 ERROR [org.ovirt.engine.core.bll.CreateAllSnapshotsFromVmCommand] (pool-3-thread-44) [69567332] Ending command with failure: org.ovirt.engine.core.bll.CreateAllSnapshotsFromVmCommand
2013-03-27 16:21:29,374 INFO  [org.ovirt.engine.core.bll.EntityAsyncTask] (QuartzScheduler_Worker-84) [69567332] EntityAsyncTask::EndActionIfNecessary: All tasks of entity 18ed3b02-0a8f-4f25-87df-ce16922f01c3 has ended -> executing EndAction



-  Host-A;

$ grep 69567332 vdsm.log*
vdsm.log:Thread-762100::DEBUG::2013-03-28 09:25:12,361::BindingXMLRPC::903::vds::(wrapper) client [<ip_addr>]::call vmGetStats with ('3fbd3711-05fe-47cb-aa44-1a300398760f',) {} flowID [69567332]
vdsm.log:Thread-762101::DEBUG::2013-03-28 09:25:12,408::BindingXMLRPC::903::vds::(wrapper) client [<ip_addr>]::call vmGetMigrationStatus with ('3fbd3711-05fe-47cb-aa44-1a300398760f',) {} flowID [69567332]
vdsm.log:Thread-762515::DEBUG::2013-03-28 09:36:16,367::BindingXMLRPC::903::vds::(wrapper) client [<ip_addr>]::call getCapabilities with () {} flowID [69567332]
vdsm.log.36:Thread-346098::DEBUG::2013-03-22 04:04:31,273::BindingXMLRPC::903::vds::(wrapper) client [<ip_addr>]::call vmGetStats with ('96f27e3e-e9d9-4e99-8804-315921aaea71',) {} flowID [69567332]
vdsm.log.42:Thread-213938::DEBUG::2013-03-19 14:41:44,418::BindingXMLRPC::903::vds::(wrapper) client [<ip_addr>]::call vmGetStats with ('8e07e1bd-0591-4f21-8dc6-8f41abb77d16',) {} flowID [69567332]
vdsm.log.42:Thread-214135::DEBUG::2013-03-19 14:45:49,151::BindingXMLRPC::903::vds::(wrapper) client [<ip_addr>]::call vmGetStats with ('8e07e1bd-0591-4f21-8dc6-8f41abb77d16',) {} flowID [69567332]
vdsm.log.43:Thread-201692::DEBUG::2013-03-19 09:11:35,985::BindingXMLRPC::903::vds::(wrapper) client [<ip_addr>]::call vmGetStats with ('3fbd3711-05fe-47cb-aa44-1a300398760f',) {} flowID [69567332]
vdsm.log.43:Thread-204145::DEBUG::2013-03-19 10:19:22,128::BindingXMLRPC::903::vds::(wrapper) client [<ip_addr>]::call vmGetStats with ('3048f649-3208-4c8e-8f77-b88b813edc5d',) {} flowID [69567332]
vdsm.log.43:Thread-210333::DEBUG::2013-03-19 13:03:08,389::BindingXMLRPC::903::vds::(wrapper) client [<ip_addr>]::call vmGetStats with ('8e07e1bd-0591-4f21-8dc6-8f41abb77d16',) {} flowID [69567332]
vdsm.log.44:Thread-165176::DEBUG::2013-03-18 16:01:52,318::BindingXMLRPC::903::vds::(wrapper) client [<ip_addr>]::call vmGetStats with ('6a7a25be-007a-4a41-b226-2814039a20f6',) {} flowID [69567332]
vdsm.log.45:Thread-156135::DEBUG::2013-03-18 11:56:27,479::BindingXMLRPC::903::vds::(wrapper) client [<ip_addr>]::call vmGetStats with ('6a7a25be-007a-4a41-b226-2814039a20f6',) {} flowID [69567332]
vdsm.log.45:Thread-158840::DEBUG::2013-03-18 13:09:43,272::BindingXMLRPC::903::vds::(wrapper) client [<ip_addr>]::call vmGetStats with ('8e07e1bd-0591-4f21-8dc6-8f41abb77d16',) {} flowID [69567332]
vdsm.log.45:Thread-160346::DEBUG::2013-03-18 13:48:57,638::BindingXMLRPC::903::vds::(wrapper) client [<ip_addr>]::call vmGetStats with ('cbfb1a77-8acf-4eb5-9ce1-7b088a1379aa',) {} flowID [69567332]



-  Host-B;

$ grep 69567332 vdsm.log
Thread-665089::DEBUG::2013-03-28 09:48:53,879::BindingXMLRPC::903::vds::(wrapper) client [<ip_addr>]::call vmCreate with ({'custom': {'device_29febd85-8d77-49ab-af64-1d4a6ef58233device_0290795c-e504-497e-8369-7c3237eb8e09device_c8afd480-1c6d-4bef-9b81-1b0cbf625f99': 'VmDevice {vmId=22a2c712-49e9-4efa-9b2b-d5971a1a6727, deviceId=c8afd480-1c6d-4bef-9b81-1b0cbf625f99, device=unix, type=channel, bootOrder=0, specParams={}, address={port=1, bus=0, controller=0, type=virtio-serial}, managed=false, plugged=true, readOnly=false, alias=channel0}', 'device_29febd85-8d77-49ab-af64-1d4a6ef58233device_0290795c-e504-497e-8369-7c3237eb8e09device_c8afd480-1c6d-4bef-9b81-1b0cbf625f99device_edf01b84-3ad8-4d99-b0ea-282dd8c5ee84device_75045c31-0aab-477d-843e-8e42c19c7f18': 'VmDevice {vmId=22a2c712-49e9-4efa-9b2b-d5971a1a6727, deviceId=75045c31-0aab-477d-843e-8e42c19c7f18, device=spicevmc, type=channel, bootOrder=0, specParams={}, address={port=3, bus=0, controller=0, type=virtio-serial}, managed=false, plugged=true, readOnly=false, alias=channel2}', 'device_29febd85-8d77-49ab-af64-1d4a6ef58233device_0290795c-e504-497e-8369-7c3237eb8e09': 'VmDevice {vmId=22a2c712-49e9-4efa-9b2b-d5971a1a6727, deviceId=0290795c-e504-497e-8369-7c3237eb8e09, device=virtio-serial, type=controller, bootOrder=0, specParams={}, address={bus=0x00, domain=0x0000, type=pci, slot=0x04, function=0x0}, managed=false, plugged=true, readOnly=false, alias=virtio-serial0}', 'device_29febd85-8d77-49ab-af64-1d4a6ef58233': 'VmDevice {vmId=22a2c712-49e9-4efa-9b2b-d5971a1a6727, deviceId=29febd85-8d77-49ab-af64-1d4a6ef58233, device=ide, type=controller, bootOrder=0, specParams={}, address={bus=0x00, domain=0x0000, type=pci, slot=0x01, function=0x1}, managed=false, plugged=true, readOnly=false, alias=ide0}', 'device_29febd85-8d77-49ab-af64-1d4a6ef58233device_0290795c-e504-497e-8369-7c3237eb8e09device_c8afd480-1c6d-4bef-9b81-1b0cbf625f99device_edf01b84-3ad8-4d99-b0ea-282dd8c5ee84': 'VmDevice {vmId=22a2c712-49e9-4efa-9b2b-d5971a1a6727, deviceId=edf01b84-3ad8-4d99-b0ea-282dd8c5ee84, device=unix, type=channel, bootOrder=0, specParams={}, address={port=2, bus=0, controller=0, type=virtio-serial}, managed=false, plugged=true, readOnly=false, alias=channel1}'}, 'keyboardLayout': 'en-us', 'kvmEnable': 'true', 'pitReinjection': 'false', 'acpiEnable': 'true', 'emulatedMachine': 'rhel6.3.0', 'vmId': '22a2c712-49e9-4efa-9b2b-d5971a1a6727', 'devices': [{'device': 'qxl', 'specParams': {'vram': '65536'}, 'type': 'video', 'deviceId': '11d3ce8b-e43b-433a-8a96-82abfc5db44c', 'address': {'bus': '0x00', ' slot': '0x02', ' domain': '0x0000', ' type': 'pci', ' function': '0x0'}}, {'index': '2', 'iface': 'ide', 'address': {' controller': '0', ' target': '0', 'unit': '0', ' bus': '1', ' type': 'drive'}, 'specParams': {'path': ''}, 'readonly': 'true', 'deviceId': 'ed2eb0c1-e850-411c-b11a-5faee5a85059', 'path': '', 'device': 'cdrom', 'shared': 'false', 'type': 'disk'}, {'index': 0, 'iface': 'virtio', 'format': 'cow', 'bootOrder': '1', 'poolID': '6de1c3c1-9e75-4c61-8bc6-b8a59472f6d9', 'volumeID': '0a120f01-685e-4703-9dcc-139b735c6bbc', 'imageID': 'e439941c-6d42-488d-bde0-52f3bc9bb11f', 'specParams': {}, 'readonly': 'false', 'domainID': '1cfd5b71-6d87-4e89-ae9c-9f4cf06330bd', 'optional': 'false', 'deviceId': 'e439941c-6d42-488d-bde0-52f3bc9bb11f', 'address': {'bus': '0x00', ' slot': '0x05', ' domain': '0x0000', ' type': 'pci', ' function': '0x0'}, 'device': 'disk', 'shared': 'false', 'propagateErrors': 'off', 'type': 'disk'}, {'nicModel': 'pv', 'macAddr': '00:1a:4a:a8:10:1f', 'network': 'rhevm', 'filter': 'vdsm-no-mac-spoofing', 'specParams': {}, 'deviceId': 'b04e0f30-7560-47f3-b410-ed5be20ccbae', 'address': {'bus': '0x00', ' slot': '0x03', ' domain': '0x0000', ' type': 'pci', ' function': '0x0'}, 'device': 'bridge', 'type': 'interface'}, {'device': 'memballoon', 'specParams': {'model': 'virtio'}, 'type': 'balloon', 'deviceId': '3bce38f4-7e0f-4563-b494-5775e12bfa0e'}], 'smp': '2', 'vmType': 'kvm', 'timeOffset': '-2', 'memSize': 1024, 'spiceSslCipherSuite': 'DEFAULT', 'cpuType': 'Nehalem', 'spiceSecureChannels': 'smain,sinputs,scursor,splayback,srecord,sdisplay,susbredir,ssmartcard', 'smpCoresPerSocket': '1', 'vmName': 'xxxxxx', 'display': 'qxl', 'transparentHugePages': 'true', 'nice': '0'},) {} flowID [69567332]

$ xzgrep 69567332 vdsm.log*xz
vdsm.log.16.xz:Thread-370015::DEBUG::2013-03-22 15:42:54,003::BindingXMLRPC::903::vds::(wrapper) client [<ip_addr>]::call vmGetStats with ('3048f649-3208-4c8e-8f77-b88b813edc5d',) {} flowID [69567332]
vdsm.log.16.xz:Thread-370016::DEBUG::2013-03-22 15:42:54,022::BindingXMLRPC::903::vds::(wrapper) client [<ip_addr>]::call vmGetStats with ('5680f631-5ce4-4e98-8e8d-bc4ff64be588',) {} flowID [69567332]
vdsm.log.16.xz:Thread-370860::DEBUG::2013-03-22 16:02:26,039::BindingXMLRPC::903::vds::(wrapper) client [<ip_addr>]::call vmGetStats with ('3048f649-3208-4c8e-8f77-b88b813edc5d',) {} flowID [69567332]
vdsm.log.16.xz:Thread-370924::DEBUG::2013-03-22 16:03:27,131::BindingXMLRPC::903::vds::(wrapper) client [<ip_addr>]::call vmGetStats with ('3048f649-3208-4c8e-8f77-b88b813edc5d',) {} flowID [69567332]
vdsm.log.16.xz:Thread-370954::DEBUG::2013-03-22 16:03:58,087::BindingXMLRPC::903::vds::(wrapper) client [<ip_addr>]::call vmGetStats with ('3048f649-3208-4c8e-8f77-b88b813edc5d',) {} flowID [69567332]
vdsm.log.16.xz:Thread-371797::DEBUG::2013-03-22 16:22:46,434::BindingXMLRPC::903::vds::(wrapper) client [<ip_addr>]::call vmGetStats with ('22a2c712-49e9-4efa-9b2b-d5971a1a6727',) {} flowID [69567332]
vdsm.log.25.xz:Thread-212800::DEBUG::2013-03-19 14:36:08,105::BindingXMLRPC::903::vds::(wrapper) client [<ip_addr>]::call vmGetStats with ('e7a281ee-1f65-4332-8746-3652d6490d66',) {} flowID [69567332]
vdsm.log.25.xz:Thread-212830::DEBUG::2013-03-19 14:36:39,374::BindingXMLRPC::903::vds::(wrapper) client [<ip_addr>]::call vmGetStats with ('e7a281ee-1f65-4332-8746-3652d6490d66',) {} flowID [69567332]
vdsm.log.27.xz:Thread-166542::DEBUG::2013-03-18 16:53:22,082::BindingXMLRPC::903::vds::(wrapper) client [<ip_addr>]::call vmGetStats with ('3fbd3711-05fe-47cb-aa44-1a300398760f',) {} flowID [69567332]
vdsm.log.28.xz:Thread-155942::DEBUG::2013-03-18 12:07:09,401::BindingXMLRPC::903::vds::(wrapper) client [<ip_addr>]::call vmGetStats with ('5680f631-5ce4-4e98-8e8d-bc4ff64be588',) {} flowID [69567332]
vdsm.log.28.xz:Thread-162027::DEBUG::2013-03-18 14:52:58,918::BindingXMLRPC::903::vds::(wrapper) client [<ip_addr>]::call vmGetStats with ('22a2c712-49e9-4efa-9b2b-d5971a1a6727',) {} flowID [69567332]



-  Host-C;

$ grep 69567332 vdsm.log
$

$ xzgrep 69567332 vdsm.log*xz
vdsm.log.27.xz:Thread-415968::DEBUG::2013-03-21 12:05:58,654::BindingXMLRPC::903::vds::(wrapper) client [<ip_addr>]::call vmGetStats with ('9252666e-2987-4357-8546-a8e1240c7de3',) {} flowID [69567332]
vdsm.log.28.xz:Thread-411311::DEBUG::2013-03-21 10:28:38,175::BindingXMLRPC::903::vds::(wrapper) client [<ip_addr>]::call vmGetStats with ('9252666e-2987-4357-8546-a8e1240c7de3',) {} flowID [69567332]



-  Host-D;

$ grep 69567332 vdsm.log
Thread-144940::DEBUG::2013-03-28 09:48:53,454::BindingXMLRPC::903::vds::(wrapper) client [<ip_addr>]::call vmGetStats with ('22a2c712-49e9-4efa-9b2b-d5971a1a6727',) {} flowID [69567332]
Thread-144941::DEBUG::2013-03-28 09:48:53,469::BindingXMLRPC::903::vds::(wrapper) client [<ip_addr>]::call vmDestroy with ('22a2c712-49e9-4efa-9b2b-d5971a1a6727',) {} flowID [69567332]

$ xzgrep 69567332 vdsm.log*xz
vdsm.log.4.xz:Thread-41095::DEBUG::2013-03-26 09:49:50,168::BindingXMLRPC::903::vds::(wrapper) client [<ip_addr>]::call vmGetStats with ('5680f631-5ce4-4e98-8e8d-bc4ff64be588',) {} flowID [69567332]
vdsm.log.5.xz:Thread-1706::DEBUG::2013-03-25 15:42:04,502::BindingXMLRPC::903::vds::(wrapper) client [<ip_addr>]::call vmGetStats with ('8e07e1bd-0591-4f21-8dc6-8f41abb77d16',) {} flowID [69567332]



Expected results:

I was expecting to be able to find appropriate "errors" in the vdsm logs containing the correlation id found in the engine log.


Additional info:

Comment 3 Itamar Heim 2013-04-14 06:41:27 UTC
bug 844335 may be relevant here

Comment 4 Itamar Heim 2013-04-14 06:42:40 UTC
*** Bug 950192 has been marked as a duplicate of this bug. ***

Comment 8 Yair Zaslavsky 2013-04-22 10:34:18 UTC
From what I saw - Thre are two cases of re-use -
a. re-use is due to the fact that the child commands of CreateAllSnapshotsFromVmCommand exist.
There's no problem with that.
b. Re-use in getVmStatis (vdsm side)

Can we get engine.log and vdsm.log?

In reply to comment #3 -
Yes, this bug is relevant.


Note You need to log in before you can comment on or make changes to this bug.