Bug 1737268 - Disk on gluster 4k storage not bootable after successful installation
Summary: Disk on gluster 4k storage not bootable after successful installation
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Fedora
Classification: Fedora
Component: qemu
Version: rawhide
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Fedora Virtualization Maintainers
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: 1592916
 
Reported: 2019-08-04 21:49 UTC by Nir Soffer
Modified: 2019-08-05 16:15 UTC
CC: 10 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-08-05 16:15:47 UTC
Type: Bug
Embargoed:


Attachments
test scripts, test output, traces, vm xml, qemu log (884.11 KB, application/x-xz)
2019-08-04 21:49 UTC, Nir Soffer

Description Nir Soffer 2019-08-04 21:49:37 UTC
Created attachment 1600488 [details]
test scripts, test output, traces, vm xml, qemu log

Description of problem:

Using <blockio logical_block_size="4096" physical_block_size="4096"> in the libvirt
XML seems to work around the issue described in bug 1737256, and provisioning
a VM succeeds. However, the disk is not bootable on the next boot.

Reproduced with Fedora 29 and Alpine 3.10.1 iso.

Version-Release number of selected component (if applicable):
See bug 1737256.

How reproducible:
Always.

Steps to Reproduce:
1. Add a <blockio> element to the libvirt XML (see the sketch after this list), or add
   logical_block_size,physical_block_size to the qemu command line.
2. Start the VM from iso
3. Install
4. Boot
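
For reference, a minimal sketch of the libvirt disk element for step 1. The image
path and target device below are placeholders; only the <blockio> line is the
relevant part:

  <disk type='file' device='disk'>
    <driver name='qemu' type='raw' cache='none'/>
    <source file='/path/to/disk.img'/>
    <target dev='sda' bus='scsi'/>
    <blockio logical_block_size='4096' physical_block_size='4096'/>
  </disk>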

Actual results:
Boot fails with: No bootable device

Expected results:
Boot successful.

Additional info:

For the storage and host details, see bug 1737256.

Here is an example flow using the Alpine iso, tested by running qemu directly,
based on the qemu command line created by libvirt when running the VM using oVirt.

name=qemu-fuse
images=/rhev/data-center/mnt/glusterSD/voodoo4.tlv.redhat.com\:_gv0/de566475-5b67-4987-abf3-3dc98083b44c/images
disk=4163cc03-c5ef-4956-ac0a-6e04699ce2e5/51de3a28-5c0a-4db1-a51c-b09c437b6f39
cdrom=3119fe7e-f576-46c3-880b-9590cb4619da/46efd27f-3bbe-4183-91f7-912c163b8eac

truncate -s 0 $images/$disk
truncate -s 1g $images/$disk

# Provision vm from iso.
strace -f -o $name-provision.trace qemu-kvm \
        -object iothread,id=iothread1 \
        -device virtio-scsi-pci,iothread=iothread1,id=bus1,bus=pci.0,addr=0x5 \
        -drive file=$images/$disk,format=raw,cache=none,if=none,id=drive1 \
        -device scsi-hd,bus=bus1.0,drive=drive1,id=disk1,logical_block_size=4096,physical_block_size=4096,write-cache=on \
        -cdrom $images/$cdrom \
        -m 1024 \
        -nographic

# Start the vm.
strace -f -o $name-run.trace qemu-kvm \
        -object iothread,id=iothread1 \
        -device virtio-scsi-pci,iothread=iothread1,id=bus1,bus=pci.0,addr=0x5 \
        -drive file=$images/$disk,format=raw,cache=none,if=none,id=drive1 \
        -device scsi-hd,bus=bus1.0,drive=drive1,id=disk1,logical_block_size=4096,physical_block_size=4096,write-cache=on \
        -m 1024 \
        -nographic

Here is output from the guest:

1. Checking devices

We can see that the guest sees the expected block sizes.

localhost:~# grep -s "" /sys/block/sda/queue/*
/sys/block/sda/queue/add_random:1
/sys/block/sda/queue/chunk_sectors:0
/sys/block/sda/queue/dax:0
/sys/block/sda/queue/discard_granularity:4096
/sys/block/sda/queue/discard_max_bytes:1073741824
/sys/block/sda/queue/discard_max_hw_bytes:1073741824
/sys/block/sda/queue/discard_zeroes_data:0
/sys/block/sda/queue/fua:0
/sys/block/sda/queue/hw_sector_size:4096
/sys/block/sda/queue/io_poll:1
/sys/block/sda/queue/io_poll_delay:-1
/sys/block/sda/queue/iostats:1
/sys/block/sda/queue/logical_block_size:4096
/sys/block/sda/queue/max_discard_segments:1
/sys/block/sda/queue/max_hw_sectors_kb:32767
/sys/block/sda/queue/max_integrity_segments:0
/sys/block/sda/queue/max_sectors_kb:1280
/sys/block/sda/queue/max_segment_size:65536
/sys/block/sda/queue/max_segments:126
/sys/block/sda/queue/minimum_io_size:4096
/sys/block/sda/queue/nomerges:0
/sys/block/sda/queue/nr_requests:256
/sys/block/sda/queue/optimal_io_size:0
/sys/block/sda/queue/physical_block_size:4096
/sys/block/sda/queue/read_ahead_kb:128
/sys/block/sda/queue/rotational:1
/sys/block/sda/queue/rq_affinity:1
/sys/block/sda/queue/scheduler:[mq-deadline] kyber none
/sys/block/sda/queue/write_cache:write back
/sys/block/sda/queue/write_same_max_bytes:2147479552
/sys/block/sda/queue/write_zeroes_max_bytes:2147479552
/sys/block/sda/queue/zoned:none


2. Installing


localhost:~# setup-alpine
...

localhost:~# apk add sfdisk syslinux
...

localhost:~# setup-disk
Available disks are:
  sda   (1.1 GB QEMU     QEMU HARDDISK   )
Which disk(s) would you like to use? (or '?' for help or 'none') [sda]
The following disk is selected:
  sda   (1.1 GB QEMU     QEMU HARDDISK   )
How would you like to use it? ('sys', 'data', 'lvm' or '?' for help) [?] sys
WARNING: The following disk(s) will be erased:
  sda   (1.1 GB QEMU     QEMU HARDDISK   )
WARNING: Erase the above disk(s) and continue? [y/N]: y
Creating file systems...
Installing system on /dev/sda3:
/mnt/boot is device /dev/sda1
100% ████████████████████████████████████████████
==> initramfs: creating /boot/initramfs-virt
/boot is device /dev/sda1

Installation is complete. Please reboot.

localhost:~# sfdisk -l /dev/sda
Disk /dev/sda: 1 GiB, 1073741824 bytes, 262144 sectors
Disk model: QEMU HARDDISK   
Units: sectors of 1 * 4096 = 4096 bytes
Sector size (logical/physical): 4096 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: dos
Disk identifier: 0x9773962b

Device     Boot Start    End Sectors  Size Id Type
/dev/sda1  *      256  25855   25600  100M 83 Linux
/dev/sda2       25856  91391   65536  256M 82 Linux swap / Solaris
/dev/sda3       91392 262143  170752  667M 83 Linux

localhost:~# poweroff
...


3. Next boot fails

SeaBIOS (version ?-20190712_051036-bde29493c9324747a3ac6dd355c75f9e-2.fc29)


iPXE (http://ipxe.org) 00:03.0 C980 PCI2.10 PnP PMM+3FF91220+3FED1220 C980



Booting from Hard Disk...
Boot failed: could not read the boot disk

Booting from Floppy...
Boot failed: could not read the boot disk

Booting from ROM...
iPXE (PCI 00:03.0) starting execution...ok
iPXE initialising devices...ok



iPXE 1.0.0+ -- Open Source Network Boot Firmware -- http://ipxe.org
Features: DNS HTTP iSCSI TFTP AoE ELF MBOOT PXE bzImage Menu PXEXT

net0: 52:54:00:12:34:56 using 82540em on 0000:00:03.0 (open)
  [Link:up, TX:0 TXE:0 RX:0 RXE:0]
Configuring (net0 52:54:00:12:34:56).............. ok
net0: 10.0.2.15/255.255.255.0 gw 10.0.2.2
net0: fec0::5054:ff:fe12:3456/64 gw fe80::2
net0: fe80::5054:ff:fe12:3456/64
Nothing to boot: No such file or directory (http://ipxe.org/2d03e13b)
No more network devices

No bootable device.


The same flow was reproduced using the Fedora 29 iso in oVirt.
See the attached vm.xml and qemu.log for the oVirt vm details.

I tested both fuse and libgfapi, with the same results in both cases. Only output
from the fuse tests is included.

Comment 1 Kevin Wolf 2019-08-05 16:15:47 UTC
The BIOS interfaces simply don't support 4k-native disks, so there is no booting from 4k-native disks with BIOS, neither in VMs nor on real hardware. There's nothing we can do about this.

If you need the BIOS to access a disk, you need to keep it a logical 512-byte disk. (I understand that bug 1737256 may mean we have a problem there, but unfortunately, switching to a virtual 4k-native disk isn't the easy solution/workaround.)
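
For reference, a sketch of the 512-byte-logical / 4k-physical configuration described
above, adapted from the reproducer command line in the description ($images and $disk
are the same placeholders used there). Whether this also avoids the I/O errors from
bug 1737256 is not verified here:

# Keep the guest-visible logical sector size at 512 bytes so the BIOS can read
# the disk, while still advertising a 4k physical sector size.
qemu-kvm \
        -object iothread,id=iothread1 \
        -device virtio-scsi-pci,iothread=iothread1,id=bus1,bus=pci.0,addr=0x5 \
        -drive file=$images/$disk,format=raw,cache=none,if=none,id=drive1 \
        -device scsi-hd,bus=bus1.0,drive=drive1,id=disk1,logical_block_size=512,physical_block_size=4096,write-cache=on \
        -m 1024 \
        -nographic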

