Bug 1432847 - [downstream clone] no bootable device found with more than one virtio-scsi disk
Summary: [downstream clone] no bootable device found with more than one virtio-scsi disk
Keywords:
Status: CLOSED DEFERRED
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: vdsm
Version: 4.0.6
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: Maor
QA Contact: Raz Tamir
URL:
Whiteboard:
Depends On: 1020622 1420869
Blocks:
 
Reported: 2017-03-16 08:57 UTC by nijin ashok
Modified: 2022-02-17 17:23 UTC
CC List: 16 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1420869
Environment:
Last Closed: 2017-11-15 14:55:59 UTC
oVirt Team: Storage
Target Upstream Version:
Embargoed:


Attachments


Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker RHV-43308 0 None None None 2021-08-30 13:21:36 UTC
Red Hat Knowledge Base (Solution) 2973041 0 None None None 2017-03-18 08:55:51 UTC

Description nijin ashok 2017-03-16 08:57:08 UTC
+++ This bug was initially created as a clone of Bug #1420869 +++

Description of problem:
When creating a new VM with more than one virtio-scsi disk, the VM does not boot.
At first it tries to boot from PXE; after a couple of minutes you see "no bootable device" in the console.

One disk is marked as bootable, and that disk does contain a bootable OS. Changing the boot disk to virtio fixes the problem.

Version-Release number of selected component (if applicable):
4.1.0.4-1.el7.centos

The problem already exists in version 4.0.

How reproducible:


Steps to Reproduce:
1. Create a new VM.
2. Attach a bootable image (e.g. the Ubuntu Xenial cloud image), mark it as bootable, and use the virtio-scsi interface.
3. Create and attach a blank virtio-scsi disk.

Actual results:
The VM does not find a bootable disk.

Expected results:
The VM finds the disk and boots from it.

Additional info:

--- Additional comment from Tomas Jelinek on 2017-02-10 02:47:21 EST ---

can you please provide engine and vdsm logs?

--- Additional comment from Stefan Wandl on 2017-02-10 04:15 EST ---

The related machine is named test2; I have only added the vdsm logs that are related to the test2 VM. The VM was created at 2017-02-09 11:31.

--- Additional comment from Tomas Jelinek on 2017-02-17 06:19:26 EST ---

Hi Stefan,

I was not able to reproduce this, nor have I found anything suspicious in the logs, so I need to ask some questions:

- please make sure that in the boot options of the VM the Hard Disk is the first (edit vm -> boot options)
- please make sure in Resource Allocation tab the "VirtIO-SCSI Enabled" checkbox is checked
- if you have both, please try to run the VM via "run once"; in "Boot Options", check the "Enable menu to select boot device" checkbox and also "Start in Pause Mode". This makes the VM pause at the very beginning. Connect the console, unpause it, and enter the boot menu (press ESC in the SPICE window), then let us know what is listed there.

Thank you.

--- Additional comment from Stefan Wandl on 2017-02-17 09:51 EST ---

screenshot of the boot problem after disk selection from the boot menu

--- Additional comment from Stefan Wandl on 2017-02-17 09:58:51 EST ---

Hi Tomas,

VirtIO-SCSI is enabled
Hard Disk is first in the boot options (I did not touch this value)

It does not matter whether the disks are in one storage domain or in different ones.
I tried creating a new VM with one attached disk (the Ubuntu cloud image) plus one newly created disk, and also a new VM from a template (with identical settings), with no difference.

I have tested it on oVirt 4.0 and 4.1 (different installations)


Could it be somehow connected to the naming of the disks?

BR
Stefan

--- Additional comment from Tomas Jelinek on 2017-02-22 06:32:51 EST ---

I have been trying various setups with various disks and I'm still unable to reproduce this...

but still, there are two options:
1: either engine sends the wrong boot order, or
2: some issue with the lower layers

To check a bit more the first:
Looking at the logs, I only see attempts with two virtio-scsi disks. Could you please provide, for comparison, an engine log from a run with only one bootable disk where the VM boots properly, and one with two disks where it does not?

To check a bit more the second:
Please provide for both runs (with one disk and with two) the output of 
virsh -r dumpxml <vm name> 
from the host when the VM is running.

Also please attach the libvirt and qemu logs (I don't expect much there, but...)

Thank you
Tomas
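
The two facts Tomas is after live in each `<disk>` element of the dump: the `<boot order='N'/>` element and the `<address ... unit='N'/>` element. A minimal grep/sed sketch of pulling them out; the XML below is a hand-written sample (element ordering and values are assumptions, not taken from the attached logs), and in practice you would feed in the real `virsh -r dumpxml` output instead:

```shell
# Hand-written sample of a libvirt domain dump; in practice use:
#   virsh -r dumpxml testboot > /tmp/testboot.xml
cat > /tmp/testboot.xml <<'EOF'
<domain type='kvm'>
  <devices>
    <disk type='file' device='disk'>
      <alias name='scsi0-0-0-1'/>
      <boot order='1'/>
      <address type='drive' controller='0' bus='0' target='0' unit='1'/>
    </disk>
    <disk type='file' device='disk'>
      <alias name='scsi0-0-0-0'/>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
    </disk>
  </devices>
</domain>
EOF

# Which disk carries boot order 1, and on which SCSI unit does it sit?
alias=$(grep -B1 "<boot order='1'/>" /tmp/testboot.xml \
        | sed -n "s/.*alias name='\([^']*\)'.*/\1/p")
unit=$(grep -A1 "<boot order='1'/>" /tmp/testboot.xml \
       | sed -n "s/.*unit='\([0-9]*\)'.*/\1/p")
echo "boot disk: $alias (unit $unit)"
# Per this bug, the unit-0 disk boots regardless of boot order, so a
# non-zero unit here reproduces the "no bootable device" symptom.
```
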

--- Additional comment from Stefan Wandl on 2017-02-22 10:34 EST ---

virsh dumpxml, enginelog and qemu log

--- Additional comment from Stefan Wandl on 2017-02-22 10:44:17 EST ---

Hi Tomas,

this time it was quite hard to reproduce the problem, so you will find a lot of entries in the logs with the same machine names. In the end I was "successful".

here are the timestamps:

2017-02-22 14:47:26
copy root disk finished

2017-02-22 14:48:13
create a new VM called testboot with the attached root disk and a newly created blank disk


2017-02-22 14:51:43
clone machine testboot2

2017-02-22 14:52:28,026
"run once" testboot2 -> machine is booting
"run once" testboot -> machine is NOT booting


As far as I can see, the problem appears whenever the boot disk is not on scsi0-0-0-0. I have double-checked the results with my 3-disk template and got the same result: whenever the boot disk is on scsi0-0-0-0, it boots.

Stefan

--- Additional comment from Tomas Jelinek on 2017-02-23 08:07:30 EST ---

So, to determine which disk is the boot disk we have two "flags":
1: the "isBoot" flag on the disk
2: the "bootOrder" on the device

If the disk (attachment) has isBoot==true (in the engine DB), we send "index=0" to VDSM.
If its device has bootOrder > 0 (in the engine DB), we send "bootOrder=N" to VDSM.
(simplifying here, but more or less...)
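
For orientation, this is roughly what the bootOrder flag becomes in the generated libvirt domain XML. The fragment below is a hand-written illustration only; the source path placeholders, alias, and address values are made up, not taken from the attached dumps:

```xml
<disk type='file' device='disk' snapshot='no'>
  <driver name='qemu' type='qcow2'/>
  <source file='/rhev/data-center/<pool-id>/<domain-id>/images/<image-id>/<volume-id>'/>
  <target dev='sda' bus='scsi'/>
  <!-- the engine's bootOrder=1 ends up as this element -->
  <boot order='1'/>
  <!-- the SCSI address; per this bug, only unit='0' actually boots -->
  <address type='drive' controller='0' bus='0' target='0' unit='1'/>
</disk>
```
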

so, what I see in the logs:
-------------------------------------------------------
testboot2:
the boot image ID is 74766bda-8c9c-4c61-aa9b-149da942c9df because:

- The isBoot is true for it since the logs contain:
Bootable disk '74766bda-8c9c-4c61-aa9b-149da942c9df' set to index '0'

- The bootOrder == 1 since we send bootOrder=1

I.e., for this VM the boot disk is 74766bda-8c9c-4c61-aa9b-149da942c9df, and it is correctly configured on both the device and the disk (attachment).
-------------------------------------------------------
testboot:
the boot image ID is 5c1c497a-20f2-49b9-a961-dae95516a411 because:

- The isBoot is true for it since the logs contain:
Bootable disk '5c1c497a-20f2-49b9-a961-dae95516a411f' set to index '0'

- The bootOrder == 1 since we send bootOrder=1

I.e., for this VM the boot disk is 5c1c497a-20f2-49b9-a961-dae95516a411, and it is correctly configured on both the device and the disk (attachment).
-------------------------------------------------------

So, long story short, it really looks like these two VMs have different disks configured as their boot disks, yet one VM boots properly and the other does not.
Can you please double-check in the VM main tab -> pick the VM -> Disks sub-tab which disk is configured as bootable? For testboot and testboot2 it should be different disks (at least that is what the logs are telling me).

--- Additional comment from Stefan Wandl on 2017-02-23 08:23:18 EST ---

Double-checked: the larger 10 GB disks with the alias "root" have the OS (is bootable) flag set.

--- Additional comment from Tomas Jelinek on 2017-03-02 08:45:44 EST ---

@Karen:
it seems that if you have multiple virtio-scsi disks, the bootindex is ignored and the one on scsi0-0-0-0 always boots. Do you know if there is some limitation or bug around this in qemu?

--- Additional comment from Karen Noel on 2017-03-02 09:51:54 EST ---

(In reply to Tomas Jelinek from comment #11)
> @Karen:
> it seems that if you have multiple virtio-scsi disks, the bootindex is
> ignored and the one on scsi0-0-0-0 always boots. Do you know
> if there is some limitation or bug around this in qemu?

Paolo?

--- Additional comment from Tomas Jelinek on 2017-03-14 05:43:00 EDT ---

Hey Paolo, any news?

Comment 1 nijin ashok 2017-03-16 10:06:05 UTC
Cloned this bug to downstream as we have a customer case.

The issue was easily reproducible with the below versions.

qemu-kvm-rhev-2.6.0-28.el7_3.6.x86_64
vdsm-4.18.21.1-1.el7ev.x86_64

Only the disk with the alias scsi0-0-0-0 was selected as the boot device; the boot order provided by RHV/libvirt is completely ignored when the VM has multiple virtio-scsi disks. It works fine with virtio disks.

Comment 2 Michal Skrivanek 2017-03-17 10:48:19 UTC
pending upstream investigation

Comment 3 Tomas Jelinek 2017-03-28 11:34:51 UTC
The issue is in seabios, which does not recognize a virtio-scsi boot device if it is placed on a LUN other than 0.

We can either wait for the seabios bug to be fixed, or in the meantime change the way the virtio-scsi disks are attached (see https://bugzilla.redhat.com/show_bug.cgi?id=1420869#c18).

To work around this issue on an existing setup, the address in the vm_device table has to be modified.

Moving to storage to decide how to proceed.
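
The workaround mentioned above could look something like the following. This is purely a hypothetical sketch: the column names, the address string format, and the WHERE keys are assumptions about the engine database schema, so verify them against your engine version and back up the database first.

```sql
-- HYPOTHETICAL sketch, not a tested procedure; back up the engine DB first.
-- Move the boot disk's stored SCSI address to unit 0 (scsi0-0-0-0),
-- the only slot seabios will boot from per this bug.
UPDATE vm_device
   SET address = '{type=drive, bus=0, controller=0, target=0, unit=0}'
 WHERE device_id = '<boot-disk-device-id>'
   AND vm_id     = '<vm-id>';
```
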

Comment 5 Allon Mureinik 2017-04-09 15:50:33 UTC
This bug depends on the platform bug 1020622.
Pushing out to RHV 4.2, by which time we will hopefully have this fix.

Comment 6 Allon Mureinik 2017-09-12 13:19:08 UTC
To quote my comment on bug 1420869:

I'm not sure I understand the issue here, but:
If the host should require it, please send a patch to vdsm's spec file.
If the guest should require it, it's up to the guest to use an updated bios, and this BZ should be closed.

Comment 7 Allon Mureinik 2017-11-15 14:55:59 UTC
There's no action item on RHV here.
If anything is to require this new seabios version, it's qemu-kvm[-rhev].
Bug 1420869 tracks that work.

