Bug 1420869 - no bootable device found with more than one virtio-scsi disc
Summary: no bootable device found with more than one virtio-scsi disc
Keywords:
Status: CLOSED DUPLICATE of bug 1020622
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: seabios
Version: 7.0
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: pre-dev-freeze
Target Release: ---
Assignee: Paolo Bonzini
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On: 1020622
Blocks: 1432847
 
Reported: 2017-02-09 17:05 UTC by Stefan Wandl
Modified: 2021-06-10 11:55 UTC (History)
CC List: 18 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Cloned To: 1432847 (view as bug list)
Environment:
Last Closed: 2017-11-17 14:03:03 UTC
Target Upstream Version:
Embargoed:


Attachments
VDSM and Engine Logs (16.81 MB, application/x-gzip) - 2017-02-10 09:15 UTC, Stefan Wandl
boot problem screenshot (90.37 KB, image/png) - 2017-02-17 14:51 UTC, Stefan Wandl
virsh xml dump, enginelog and qemu log (576.80 KB, application/x-gzip) - 2017-02-22 15:34 UTC, Stefan Wandl

Description Stefan Wandl 2017-02-09 17:05:36 UTC
Description of problem:
When creating a new VM with more than one virtio-scsi disk, the VM does not boot.
At first it tries to boot from PXE; after a couple of minutes you see "no bootable device" in the console.

One disk is marked as bootable, and that disk actually is bootable. Changing the boot disk's interface to virtio fixes the problem.

Version-Release number of selected component (if applicable):
4.1.0.4-1.el7.centos

The problem already exists in version 4.0.

How reproducible:


Steps to Reproduce:
1. Create a new VM
2. Attach e.g. the Ubuntu Xenial cloud image as a virtio-scsi disk and mark it as bootable
3. Create and attach a blank virtio-scsi disk

Actual results:
The VM does not find a bootable disk.

Expected results:
The VM finds and boots from the disk.

Additional info:

Comment 1 Tomas Jelinek 2017-02-10 07:47:21 UTC
Can you please provide engine and VDSM logs?

Comment 2 Stefan Wandl 2017-02-10 09:15:52 UTC
Created attachment 1249002 [details]
VDSM and Engine Logs

The related machine is named test2; I have only added those VDSM logs which are somehow related to the test2 VM. The VM was created at 2017-02-09 11:31.

Comment 3 Tomas Jelinek 2017-02-17 11:19:26 UTC
Hi Stefan,

I was not able to reproduce this, nor have I found anything suspicious in the logs, so I need to ask some questions:

- please make sure that in the boot options of the VM the Hard Disk is first (edit VM -> Boot Options)
- please make sure that in the Resource Allocation tab the "VirtIO-SCSI Enabled" checkbox is checked
- if both are set, please run the VM as "Run Once" and in "Boot Options" check the "Enable menu to select boot device" checkbox and also "Start in Pause Mode". This makes the VM pause at the very beginning. Connect the console to it, unpause it, and enter the boot menu (press ESC in the SPICE window). Let us know what is listed there.

Thank you.

Comment 4 Stefan Wandl 2017-02-17 14:51:31 UTC
Created attachment 1252784 [details]
boot problem screenshot

screenshot of the boot problem after disk selection from the boot menu

Comment 5 Stefan Wandl 2017-02-17 14:58:51 UTC
Hi Tomas,

VirtIO-SCSI is enabled
Hard Disk is first in the boot options (I did not touch this value)

It does not matter whether the disks are in one storage domain or in different ones.
I tried to create a new VM with one attached disk (the Ubuntu cloud image) and one newly created disk, and also a new VM from a template (the template has identical settings), with no difference.

I have tested it on oVirt 4.0 and 4.1 (different installations).


Could it be somehow connected with the naming of the disks?

BR
Stefan

Comment 6 Tomas Jelinek 2017-02-22 11:32:51 UTC
I was trying various setups with various disks and I'm still unable to reproduce this...

But still, there are two options:
1: either the engine sends the wrong boot order, or
2: there is some issue with the lower layers

To check the first a bit more:
When looking at the logs I see only attempts with two virtio-scsi disks. Could you please provide, for comparison, an engine log from a run with only one bootable disk where the VM boots properly, and one from a run with two disks where it does not?

To check the second a bit more:
Please provide for both runs (with one disk and with two) the output of
virsh -r dumpxml <vm name>
from the host while the VM is running.

Also please attach the libvirt and qemu logs (I don't expect much there, but...)
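
For reference, a minimal sketch of how this could be collected on the host (the VM name "testvm" is a placeholder, and the log locations are the libvirt defaults, which may differ on your setup):

  virsh -r dumpxml testvm > testvm.xml         # domain XML of the running VM
  cp /var/log/libvirt/qemu/testvm.log .        # per-VM qemu log (default location)
  journalctl -u libvirtd > libvirtd.log        # libvirt daemon log, if logging goes to the journal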

Thank you
Tomas

Comment 7 Stefan Wandl 2017-02-22 15:34:24 UTC
Created attachment 1256503 [details]
virsh xml dump, enginelog and qemu log

virsh dumpxml, enginelog and qemu log

Comment 8 Stefan Wandl 2017-02-22 15:44:17 UTC
Hi Tomas,

This time it was quite hard to reproduce the problem, so you will find a lot of entries in the logs with the same machine names. In the end I was "successful".

here are the timestamps:

2017-02-22 14:47:26
copy root disk finished

2017-02-22 14:48:13
create new VM called testboot with the attached root disk and a newly created log disk


2017-02-22 14:51:43
clone machine testboot2

2017-02-22 14:52:28,026
"run once" testboot2 -> machine is booting
"run once" testboot -> machine is NOT booting


As far as I can see, the problem appears whenever the boot disk is not on scsi0-0-0-0. I have double-checked the results with my 3-disk template and got the same result: whenever the boot disk is on scsi0-0-0-0, it boots.

Stefan

Comment 9 Tomas Jelinek 2017-02-23 13:07:30 UTC
So, to determine which disk is the boot disk, we have 2 "flags":
1: the "isBoot" flag on disk
2: the "bootOrder" on device

If the disk (attachment) has isBoot==true (in the engine DB), we send "index=0" to VDSM.
If its device has bootOrder > 0 (in the engine DB), we send "bootOrder=N" to VDSM.
(simplifying here, but more or less...)

so, what I see in the logs:
-------------------------------------------------------
testboot2:
the boot image ID is 74766bda-8c9c-4c61-aa9b-149da942c9df because:

- The isBoot is true for it since the logs contain:
Bootable disk '74766bda-8c9c-4c61-aa9b-149da942c9df' set to index '0'

- The bootOrder == 1 since we send bootOrder=1

I.e., for this VM the boot disk is 74766bda-8c9c-4c61-aa9b-149da942c9df and it is correctly configured on both the device and the disk (attachment).
-------------------------------------------------------
testboot:
the boot image ID is 5c1c497a-20f2-49b9-a961-dae95516a411 because:

- The isBoot is true for it since the logs contain:
Bootable disk '5c1c497a-20f2-49b9-a961-dae95516a411f' set to index '0'

- The bootOrder == 1 since we send bootOrder=1

I.e., for this VM the boot disk is 5c1c497a-20f2-49b9-a961-dae95516a411 and it is correctly configured on both the device and the disk (attachment).
-------------------------------------------------------

So, long story short, it really looks like for these two VMs you have different disks configured as boot disks, while one of the VMs boots properly and the other does not.
Can you please double-check in the VMs main tab -> pick the VM -> Disks sub-tab which disk is configured as bootable? For testboot and testboot2 it should be different disks (at least that is what the logs are telling me).

Comment 10 Stefan Wandl 2017-02-23 13:23:18 UTC
Double-checked: the larger 10 GB disks with the alias "root" have the OS (is bootable) flag set.

Comment 11 Tomas Jelinek 2017-03-02 13:45:44 UTC
@Karen:
it seems that if you have more than one virtio-scsi disk, the bootindex is ignored and it is always the one on scsi0-0-0-0 that boots. Do you know if there is some limitation or bug around this in QEMU?

Comment 12 Karen Noel 2017-03-02 14:51:54 UTC
(In reply to Tomas Jelinek from comment #11)
> @Karen:
> it seems that if you have more virtio-scsi disks than the bootindex is
> ignored and always the one which is on scsi0-0-0-0 is booting. Do you know
> if there is some limitation or bug around this in qemu?

Paolo?

Comment 13 Tomas Jelinek 2017-03-14 09:43:00 UTC
Hey Paolo, any news?

Comment 14 Paolo Bonzini 2017-03-16 09:49:35 UTC
What is the QEMU command line?

Comment 15 Tomas Jelinek 2017-03-17 08:45:33 UTC
Paolo, you can look at the attached "virsh xml dump, enginelog and qemu log".

Comment 16 Paolo Bonzini 2017-03-17 13:34:15 UTC
Of the two virtio-scsi disks, only one has a <boot order="1"> element. Therefore, the other disk is not part of the boot order.

Comment 17 Tomas Jelinek 2017-03-21 08:26:10 UTC
Right. And the one which has the <boot order="1"> is the disk we need to boot from. Yet, it does not boot from it unless it is on scsi0-0-0-0.

Comment 18 Paolo Bonzini 2017-03-22 09:27:16 UTC
Please try putting it on target 1 LUN 0 instead of target 0 LUN 1; that should work. Usually with real hardware, separate disks are assigned to separate targets rather than separate LUNs.
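
For illustration, a rough sketch of the relevant <disk> fragment in the domain XML (the file path and device name here are hypothetical, not taken from the attached dump); the difference is only in the <address> element, where "unit" is the LUN:

  <disk type='file' device='disk'>
    <driver name='qemu' type='qcow2'/>
    <source file='/var/lib/libvirt/images/root.qcow2'/>
    <target dev='sdb' bus='scsi'/>
    <boot order='1'/>
    <!-- current layout: target 0, LUN 1 -->
    <address type='drive' controller='0' bus='0' target='0' unit='1'/>
  </disk>

  <!-- suggested layout: target 1, LUN 0 -->
  <address type='drive' controller='0' bus='0' target='1' unit='0'/>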

If that works, this is bug 1020622.

Comment 19 Yaniv Kaul 2017-03-27 07:56:33 UTC
(In reply to Paolo Bonzini from comment #18)
> Please try putting it on target 1 LUN 0 instead of target 0 LUN 1, that
> should work.  Usually with real hardware separate disks will be assigned to
> separate targets rather than separate LUNs.
> 
> If that works, this is bug 1020622.

Tomas, is that the case?

Comment 20 Tomas Jelinek 2017-03-28 11:31:20 UTC
Indeed, it is the case.

Moving to storage to decide if changing the address assignment as per comment 18 is something we want to do or not.

@Stefan: to work around this issue, you can go to the DB, into the vm_device table, and update the address field.
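
For illustration, a rough sketch of that workaround on the engine host (the "engine" database name, the psql invocation, and the exact key/value format of the address column are assumptions; copy the format from an existing row, and shut the VM down before editing):

  sudo -u postgres psql engine -c "SELECT device_id, device, address FROM vm_device WHERE vm_id = '<vm uuid>' AND type = 'disk';"
  # then update the boot disk's row to a free target with LUN 0, e.g.:
  #   UPDATE vm_device SET address = '{type=drive, bus=0, controller=0, target=1, unit=0}' WHERE device_id = '<device uuid>';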

Comment 21 Tal Nisan 2017-03-28 16:51:54 UTC
I don't see any reason to change it, since the bug is in the fact that LUN 1 cannot be booted from; thus it should be made dependent on bug 1020622.

Comment 22 Allon Mureinik 2017-04-09 15:48:25 UTC
This bug depends on the platform bug 1020622, which is targeted to RHEL 7.4.
Pushing out to oVirt 4.2 when a proper fix from the platform should hopefully be available.

Comment 23 Allon Mureinik 2017-09-03 14:50:08 UTC
(In reply to Allon Mureinik from comment #22)
> This bug depends on the platform bug 1020622, which is targeted to RHEL 7.4.
> Pushing out to oVirt 4.2 when a proper fix from the platform should
> hopefully be available.
Tal, is there any action item (AI) for us besides requiring this fix?

Comment 24 Tal Nisan 2017-09-04 08:31:01 UTC
(In reply to Allon Mureinik from comment #23)
> (In reply to Allon Mureinik from comment #22)
> > This bug depends on the platform bug 1020622, which is targeted to RHEL 7.4.
> > Pushing out to oVirt 4.2 when a proper fix from the platform should
> > hopefully be available.
> Tal, is there any AI for us besides requiring this fix?

No, that should fix the bug

Comment 25 Allon Mureinik 2017-09-12 12:48:08 UTC
(In reply to Tal Nisan from comment #24)
> (In reply to Allon Mureinik from comment #23)
> > (In reply to Allon Mureinik from comment #22)
> > > This bug depends on the platform bug 1020622, which is targeted to RHEL 7.4.
> > > Pushing out to oVirt 4.2 when a proper fix from the platform should
> > > hopefully be available.
> > Tal, is there any AI for us besides requiring this fix?
> 
> No, that should fix the bug

I'm not sure I understand the issue here, but:
If the host should require it, please send a patch to vdsm's spec file.
If the guest should require it, it's up to the guest to use an updated BIOS, and this BZ should be closed.

Comment 26 Robert Scheck 2017-09-20 08:31:55 UTC
Do I get it right that RHV 4.2 should address this issue? We are currently using RHV 4.1 (GSS ticket #01908243) and are also affected.

Comment 27 Tal Nisan 2017-09-25 12:05:37 UTC
It's not about the guest; QEMU uses the SeaBIOS binary when running the VM, so I guess QEMU should require it.
I've checked, and for RHEL/CentOS 7.4 the fixed package is available, but it seems it is not in Fedora.

Comment 28 Allon Mureinik 2017-11-15 14:46:49 UTC
(In reply to Tal Nisan from comment #27)
> It's not about the guest, QEMU uses the SeaBIOS binary when running the VM
> so I guess QEMU should require it.

Agreed.
Moving this bug to qemu-kvm-rhev so their devs can decide if they want to bump the requirement to seabios>=1.10.2-3.el7
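
For illustration only, such a bump would be a one-line change in the package spec file (a sketch; whether qemu-kvm-rhev should carry it at all is exactly what is being decided here):

  Requires: seabios >= 1.10.2-3.el7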

Comment 30 Michal Skrivanek 2017-11-15 19:10:13 UTC
(In reply to Allon Mureinik from comment #28)
> (In reply to Tal Nisan from comment #27)
> > It's not about the guest, QEMU uses the SeaBIOS binary when running the VM
> > so I guess QEMU should require it.
> 
> Agreed.
> Moving this bug to qemu-kvm-rhev so their devs can decide if they want to
> bump the requirement to seabios>=1.10.2-3.el7

I do not believe anything is needed. That seabios version was released in 7.4 GA, so all 7.4 hosts have it, and therefore all RHV 4.1 hosts have it.
F26 is not a supported host OS, so that is irrelevant.

Comment 31 Robert Scheck 2017-11-15 20:24:10 UTC
Isn't RHV 4.1 based on 7.4 GA?

Comment 32 Paolo Bonzini 2017-11-17 14:03:03 UTC

*** This bug has been marked as a duplicate of bug 1020622 ***

