Bug 1477600 - [REST] diskattachment doesn't always show the logical_name value
Status: NEW
Product: ovirt-guest-agent
Classification: oVirt
Component: General
---
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ovirt-4.1.8
Target Release: ---
Assigned To: Tomáš Golembiovský
QA Contact: Lilach Zitnitski
Keywords: Automation, Regression
Duplicates: 1465488
Depends On:
Blocks:
Reported: 2017-08-02 08:44 EDT by Lilach Zitnitski
Modified: 2017-09-17 12:00 EDT (History)
CC List: 9 users

See Also:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed:
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Virt
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
rule-engine: ovirt-4.1+
rule-engine: ovirt-4.2+
rule-engine: blocker+


Attachments
logs (406.68 KB, application/zip)
2017-08-02 08:44 EDT, Lilach Zitnitski
no flags
vdsm logs (135.45 KB, application/x-gzip)
2017-08-30 03:49 EDT, Raz Tamir
no flags
ovirt-guest-agent log (22.92 KB, text/plain)
2017-08-30 08:15 EDT, Raz Tamir
no flags

Description Lilach Zitnitski 2017-08-02 08:44:17 EDT
Description of problem:
/api/vms/.../diskattachments should show, for each disk, its logical name as it appears in the VM, but sometimes this value is missing from the disk attachments.

Version-Release number of selected component (if applicable):
ovirt-engine-4.1.4.2-0.1.el7.noarch
vdsm-4.19.23-1.el7ev.x86_64

How reproducible:
~50%

Steps to Reproduce:
1. create vm with disks
2. start the vm
3. go to /api/vms/vm_id/diskattachments
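
For reference, step 3 can be run from a shell like this (a sketch; ENGINE_FQDN, the credentials, and VM_ID are placeholders):

# Query the VM's disk attachments over REST; each <disk_attachment>
# should eventually contain a <logical_name> element such as /dev/vda
curl -s -k -u admin@internal:PASSWORD \
     "https://ENGINE_FQDN/ovirt-engine/api/vms/VM_ID/diskattachments"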

Actual results:
sometimes the logical name of the disk is missing 

Expected results:
the logical name should be under diskattachments 

Additional info:
I saw it mainly in the automation tests; this is the response to a GET on diskattachments:

<disk_attachments>
    <disk_attachment href="/ovirt-engine/api/vms/df490cd2-e19c-4c20-a5a2-5a99968285e9/diskattachments/529f6ed2-adda-48e4-aac0-0c92259e1bf1" id="529f6ed2-adda-48e4-aac0-0c92259e1bf1">
        <active>true</active>
        <bootable>true</bootable>
        <interface>virtio</interface>
        <pass_discard>false</pass_discard>
        <read_only>false</read_only>
        <uses_scsi_reservation>false</uses_scsi_reservation>
        <disk href="/ovirt-engine/api/disks/529f6ed2-adda-48e4-aac0-0c92259e1bf1" id="529f6ed2-adda-48e4-aac0-0c92259e1bf1"/>
        <vm href="/ovirt-engine/api/vms/df490cd2-e19c-4c20-a5a2-5a99968285e9" id="df490cd2-e19c-4c20-a5a2-5a99968285e9"/>
    </disk_attachment>
</disk_attachments>

Also, the VM was started at 11:52 and at 11:59 the logical_name still hadn't shown up.
Comment 1 Lilach Zitnitski 2017-08-02 08:44 EDT
Created attachment 1308205 [details]
logs
Comment 2 Allon Mureinik 2017-08-02 11:49:07 EDT
Lilach, can you please explain the difference between this bug and bug 1465488? Isn't bug 1465488 just a specific case (direct lun) of this general case (any disk)?
Comment 3 Lilach Zitnitski 2017-08-03 03:05:46 EDT
(In reply to Allon Mureinik from comment #2)
> Lilach, can you please explain the difference between this bug and bug
> 1465488? Isn't bug 1465488 just a specific case (direct lun) of this general
> case (any disk)?

Yes, I guess so. I thought it should be a separate bug because it doesn't always reproduce.
I'll add this bug's description to the first bug as a comment.
Comment 4 Raz Tamir 2017-08-06 15:19:37 EDT
Raising severity as this fails random cases in our automation
Comment 5 Tal Nisan 2017-08-29 11:43:47 EDT
Tried to reproduce; it does indeed take time until the logical name data is populated in the engine, but this info comes from the guest agent. From the storage aspect there is not much we can do about it: the data comes in from the guest agent, is stored in the DB, and is displayed through REST.
Moving to guest agent to investigate their aspect of the bug.
Comment 6 Raz Tamir 2017-08-30 03:49 EDT
Created attachment 1319890 [details]
vdsm logs
Comment 7 Yaniv Kaul 2017-08-30 04:01:18 EDT
Can you attach guest agent logs, to ensure it sees the disks?
If we add a minute to the timeout, does it eliminate the issue? 5 minutes? (only if you fail of course)?
Comment 8 Tomáš Golembiovský 2017-08-30 07:04:10 EDT
(In reply to Yaniv Kaul from comment #7)
> Can you attach guest agent logs, to ensure it sees the disks?

You'll have to enable debug output to actually see anything useful.
I don't see any communication between VDSM and the GA in vdsm.log, so the GA log would definitely help.

> If we add a minute to the timeout, does it eliminate the issue? 5 minutes?
> (only if you fail of course)?

You can also try changing the refresh timeout in GA configuration to see if it
helps. In /etc/ovirt-guest-agent.conf change 'report_disk_usage' in section
'[general]'. The default is 300 seconds (5 minutes).
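
For example, to poll every minute instead of the default five (a sketch; the interval value is only for testing):

# /etc/ovirt-guest-agent.conf
[general]
report_disk_usage = 60

and then restart the agent (e.g. 'systemctl restart ovirt-guest-agent') so the new interval takes effect.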
Comment 9 Raz Tamir 2017-08-30 08:15:07 EDT
oVirt guest agent logs attached.

In the logs I can see that the VM sees the devices:
Dummy-1::DEBUG::2017-08-30 15:09:14,741::VirtIoChannel::209::root::Written {"__name__": "disks-usage", "disks": [{"path": "/", "total": 8914993152, "used": 1280544768, "fs": "xfs"}, {"path": "/boot", "total": 705667072, "used": 150638592, "fs": "ext3"}], "mapping": {"b54eb41c-496b-4270-a": {"name": "/dev/vda"}, "QEMU_DVD-ROM_QM00003": {"name": "/dev/sr0"}}}

Also, from inside the guest:
#lsblk
NAME                MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sr0                  11:0    1 1024M  0 rom  
vda                 252:0    0   10G  0 disk 
├─vda1              252:1    0  700M  0 part /boot
├─vda2              252:2    0    1G  0 part [SWAP]
└─vda3              252:3    0  8.3G  0 part 
  └─VolGroup01-root 253:0    0  8.3G  0 lvm  /

The output of GET command to /api/vms/VM_ID/diskattachments:
<disk_attachments>
<disk_attachment href="/ovirt-engine/api/vms/df70f7ae-3071-4bc3-956d-ecfee93385ad/diskattachments/b54eb41c-496b-4270-a834-26476c884411" id="b54eb41c-496b-4270-a834-26476c884411">
<active>true</active>
<bootable>true</bootable>
<interface>virtio</interface>
<pass_discard>false</pass_discard>
<read_only>false</read_only>
<uses_scsi_reservation>false</uses_scsi_reservation>
<disk href="/ovirt-engine/api/disks/b54eb41c-496b-4270-a834-26476c884411" id="b54eb41c-496b-4270-a834-26476c884411"/>
<vm href="/ovirt-engine/api/vms/df70f7ae-3071-4bc3-956d-ecfee93385ad" id="df70f7ae-3071-4bc3-956d-ecfee93385ad"/>
</disk_attachment>
</disk_attachments>

The VM has been running for more than 30 minutes now.
Comment 10 Raz Tamir 2017-08-30 08:15 EDT
Created attachment 1320054 [details]
ovirt-guest-agent log
Comment 11 Raz Tamir 2017-08-30 09:49:45 EDT
As requested by Tomáš, I executed 'vdsm-client Host getVMFullList' on the host that is running the VM:
...
        "vmId": "df70f7ae-3071-4bc3-956d-ecfee93385ad", 
        "guestDiskMapping": {
            "b54eb41c-496b-4270-a": {
                "name": "/dev/vda"
            }, 
            "QEMU_DVD-ROM_QM00003": {
                "name": "/dev/sr0"
            }
...
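
For convenience, the same mapping can be extracted in one step (assuming vdsm-client emits a JSON list and jq is installed):

# Print just the guest disk mapping of the first VM in the list
vdsm-client Host getVMFullList | jq '.[0].guestDiskMapping'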
Comment 12 Tomáš Golembiovský 2017-08-30 10:04:19 EDT
Apparently GA works properly and the disk mapping is returned to VDSM. This seems to me more like an engine bug.
Comment 13 Tal Nisan 2017-09-04 04:34:35 EDT
(In reply to Raz Tamir from comment #11)
> As requested by Tomáš, I executed 'vdsm-client Host getVMFullList' on the
> host that is running the VM:
> ...
>         "vmId": "df70f7ae-3071-4bc3-956d-ecfee93385ad", 
>         "guestDiskMapping": {
>             "b54eb41c-496b-4270-a": {
>                 "name": "/dev/vda"
>             }, 
>             "QEMU_DVD-ROM_QM00003": {
>                 "name": "/dev/sr0"
>             }
> ...

Raz, after getting the correct result, can you check whether 'select device_id, logical_name from vm_device where vm_id=VM_GUID;' shows the correct logical names?
Comment 14 Raz Tamir 2017-09-04 07:32:44 EDT
No,

The logical_name column is empty:
engine=# select device_id, logical_name from vm_device where vm_id='df70f7ae-3071-4bc3-956d-ecfee93385ad';
              device_id               | logical_name 
--------------------------------------+--------------
 0e471884-09a8-4d1c-9997-12b1d220df16 | 
 94757b8e-8050-423a-814b-ada369012a30 | 
 c06b62b0-dbd3-4e3d-a2b0-bdcaee9cfce4 | 
 224ca422-181d-4b47-a601-8d6587eb9ce6 | 
 24c1f41e-00ad-4806-8900-566f75f9b9eb | 
 329ad607-f890-4886-9e88-c254444d58a6 | 
 52b17e05-f1bd-46e3-b716-15a6a07196b9 | 
 73fc3277-d973-4657-ae45-72f4824667db | 
 8a315737-74a1-43c9-9ae0-a25436d20f9f | 
 a62a7bf9-7c03-45a2-9790-058836dd4ce3 | 
 a8e5ec7c-a1f5-4ad4-a64b-85b64fef1575 | 
 b54eb41c-496b-4270-a834-26476c884411 | 
 d14045a4-2dfa-475f-99d9-1a42683c03b3 | 
 edfbde30-4f7a-431b-8248-d7344f0d04ee | 
 0059b1cf-3ef8-49b0-9387-c8d7bf93164d | 
(15 rows)
Comment 15 Tal Nisan 2017-09-04 08:35:22 EDT
Seems like VDSM doesn't report it to the engine then. Tomáš, any idea why?
Comment 16 Michal Skrivanek 2017-09-04 08:38:22 EDT
No, it is reported by VDSM, as checked in comment #11; it seems it's not properly processed on the engine side.
Comment 17 Tal Nisan 2017-09-04 08:52:05 EDT
Yes, VDSM knows about it; the question is whether the stats are passed correctly to the engine, or whether the engine doesn't process them right. Either way, my question is why.
Comment 18 Tomáš Golembiovský 2017-09-12 15:41:05 EDT
*** Bug 1465488 has been marked as a duplicate of this bug. ***
Comment 19 Michal Skrivanek 2017-09-12 17:01:57 EDT
I do not understand, is the problem about a delay in reporting or that some disk logical names for some VMs are never populated, despite being seen reported by vdsm at that time?
The matching may be wrong, but the vdsm logs are missing creation xml.

In either case, please add timelines, especially when you try to compare data from different sources
Comment 20 Raz Tamir 2017-09-13 08:30:36 EDT
(In reply to Michal Skrivanek from comment #19)
> I do not understand, is the problem about a delay in reporting or that some
> disk logical names for some VMs are never populated, despite being seen
> reported by vdsm at that time?
The disk logical name is never presented in REST API
> The matching may be wrong, but the vdsm logs are missing creation xml.
> 
> In either case, please add timelines, especially when you try to compare
> data from different sources

The logical device name didn't show up even after 48 hours, so I'm not sure what timeline you are expecting?
Comment 21 Michal Skrivanek 2017-09-14 06:02:39 EDT
(In reply to Raz Tamir from comment #20)
> (In reply to Michal Skrivanek from comment #19)
> > I do not understand, is the problem about a delay in reporting or that some
> > disk logical names for some VMs are never populated, despite being seen
> > reported by vdsm at that time?
> The disk logical name is never presented in REST API
> > The matching may be wrong, but the vdsm logs are missing creation xml.
> > 
> > In either case, please add timelines, especially when you try to compare
> > data from different sources
> 
> The logical device name didn't show up even after 48 hours

Good. Yeah, minutes would be the worst case; anything more than that qualifies as "never" :) So the problem is that it never appears. Thanks.

> so I'm not sure what timeline you are expecting?
Here I expected a more concrete timeline of the actions you describe, but since the answer above is "never" it doesn't matter much anymore.
Then we are only missing the vdsm.log containing the creation of that VM, to verify the device id.
Comment 22 Arik 2017-09-14 08:05:50 EDT
I suspect that the engine didn't poll the logical name from VDSM because the computed hash of the devices didn't change. Could you please reproduce this state and then add a NIC to the running VM and see if the logical name is updated?
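
For reference, a NIC can be hotplugged through the same REST API (a sketch; ENGINE_FQDN, the credentials, VM_ID, and VNIC_PROFILE_ID are placeholders):

# POST a new NIC to the running VM to force a device change
curl -s -k -u admin@internal:PASSWORD -X POST \
     -H 'Content-Type: application/xml' \
     -d '<nic><name>nic_test</name><vnic_profile id="VNIC_PROFILE_ID"/></nic>' \
     "https://ENGINE_FQDN/ovirt-engine/api/vms/VM_ID/nics"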
Comment 23 Raz Tamir 2017-09-17 04:17:54 EDT
Arik,

I hotplugged a disk and a NIC and nothing changed. I still cannot see the disk logical name in the REST API, but the guest sees the new devices.
Comment 24 Arik 2017-09-17 08:54:18 EDT
(In reply to Raz Tamir from comment #23)
And is it still missing in the database (using the query used in comment 14)?
Comment 25 Raz Tamir 2017-09-17 12:00:20 EDT
Yes, it is still missing from the DB:

engine=# select device_id, logical_name from vm_device where vm_id='ddd3a804-e3dc-4710-9769-8db4b5e18157';
              device_id               | logical_name 
--------------------------------------+--------------
 7387ead8-dba3-4e65-a7f6-3fc79ff860f2 | 
 7858e20a-999c-4b92-ba37-6a0e0719b3f7 | 
 683c0e36-c6cb-48bc-9262-bb979f6ba0b9 | 
 751aeb96-2810-490e-9a69-e3b8332aebb8 | 
 7a132485-c67f-4990-bafa-56e7f205fd01 | 
 7c7ca816-ca2a-4a61-83fe-336c4508c526 | 
 9a7cf75c-6acf-455b-b978-1ee23ece5c05 | 
 abff4c60-8464-4250-8545-029dc9c9035d | 
 c01324c8-72f4-4097-b2ae-029ca0d38345 | 
 c1ba522b-4923-49d1-b0c3-0436e0f48214 | 
 e7b790f3-a7c6-4945-abaa-8931be49a05e | 
 ed714e68-8d6f-47c8-bf05-2c13bc5338ba | 
 f8dfdf65-289a-4cd4-9ee9-d3f868844001 | 
 06b04bd5-b3fa-4b95-8f71-9c5168e08c4d | 
 3ed7d3e5-579a-4299-a834-a9409064b72b | 
 47151e0c-8652-4873-a3eb-37de3eb15756 | 
 50f41895-e719-454d-84c1-beec0581f1f8 | 
(17 rows)
