Created attachment 1600488
test scripts, test output, traces, vm xml, qemu log

Description of problem:

Using <blockio logical_block_size="4096" physical_block_size="4096"> in the libvirt xml seems to work around the issue described in bug 1737256, and provisioning a VM is successful. However, the disk is not bootable on the next boot.

Reproduced with Fedora 29 and Alpine 3.10.1 iso.

Version-Release number of selected component (if applicable):

bug 1737256

How reproducible:

Always.

Steps to Reproduce:
1. Add <blockio> element to libvirt xml, or add logical_block_size,physical_block_size to qemu command line.
2. Start the VM from iso
3. Install
4. Boot

Actual results:

Boot fails with:

No bootable device

Expected results:

Boot successful.

Additional info:

For the storage and host details, see bug 1737256.

Here is an example flow using the Alpine iso, tested by running qemu directly, based on the qemu command line created by libvirt when running the VM using oVirt.

name=qemu-fuse
images=/rhev/data-center/mnt/glusterSD/voodoo4.tlv.redhat.com\:_gv0/de566475-5b67-4987-abf3-3dc98083b44c/images
disk=4163cc03-c5ef-4956-ac0a-6e04699ce2e5/51de3a28-5c0a-4db1-a51c-b09c437b6f39
cdrom=3119fe7e-f576-46c3-880b-9590cb4619da/46efd27f-3bbe-4183-91f7-912c163b8eac

truncate -s 0 $images/$disk
truncate -s 1g $images/$disk

# Provision vm from iso.
strace -f -o $name-provision.trace qemu-kvm \
    -object iothread,id=iothread1 \
    -device virtio-scsi-pci,iothread=iothread1,id=bus1,bus=pci.0,addr=0x5 \
    -drive file=$images/$disk,format=raw,cache=none,if=none,id=drive1 \
    -device scsi-hd,bus=bus1.0,drive=drive1,id=disk1,logical_block_size=4096,physical_block_size=4096,write-cache=on \
    -cdrom $images/$cdrom \
    -m 1024 \
    -nographic

# Start the vm.
strace -f -o $name-run.trace qemu-kvm \
    -object iothread,id=iothread1 \
    -device virtio-scsi-pci,iothread=iothread1,id=bus1,bus=pci.0,addr=0x5 \
    -drive file=$images/$disk,format=raw,cache=none,if=none,id=drive1 \
    -device scsi-hd,bus=bus1.0,drive=drive1,id=disk1,logical_block_size=4096,physical_block_size=4096,write-cache=on \
    -m 1024 \
    -nographic

Here is output from the guest:

1. Checking devices

We can see that the guest sees the expected block size.

localhost:~# grep -s "" /sys/block/sda/queue/*
/sys/block/sda/queue/add_random:1
/sys/block/sda/queue/chunk_sectors:0
/sys/block/sda/queue/dax:0
/sys/block/sda/queue/discard_granularity:4096
/sys/block/sda/queue/discard_max_bytes:1073741824
/sys/block/sda/queue/discard_max_hw_bytes:1073741824
/sys/block/sda/queue/discard_zeroes_data:0
/sys/block/sda/queue/fua:0
/sys/block/sda/queue/hw_sector_size:4096
/sys/block/sda/queue/io_poll:1
/sys/block/sda/queue/io_poll_delay:-1
/sys/block/sda/queue/iostats:1
/sys/block/sda/queue/logical_block_size:4096
/sys/block/sda/queue/max_discard_segments:1
/sys/block/sda/queue/max_hw_sectors_kb:32767
/sys/block/sda/queue/max_integrity_segments:0
/sys/block/sda/queue/max_sectors_kb:1280
/sys/block/sda/queue/max_segment_size:65536
/sys/block/sda/queue/max_segments:126
/sys/block/sda/queue/minimum_io_size:4096
/sys/block/sda/queue/nomerges:0
/sys/block/sda/queue/nr_requests:256
/sys/block/sda/queue/optimal_io_size:0
/sys/block/sda/queue/physical_block_size:4096
/sys/block/sda/queue/read_ahead_kb:128
/sys/block/sda/queue/rotational:1
/sys/block/sda/queue/rq_affinity:1
/sys/block/sda/queue/scheduler:[mq-deadline] kyber none
/sys/block/sda/queue/write_cache:write back
/sys/block/sda/queue/write_same_max_bytes:2147479552
/sys/block/sda/queue/write_zeroes_max_bytes:2147479552
/sys/block/sda/queue/zoned:none

2. Installing

localhost:~# setup-alpine
...

localhost:~# apk add sfdisk syslinux
...

localhost:~# setup-disk
Available disks are:
  sda (1.1 GB QEMU QEMU HARDDISK )
Which disk(s) would you like to use? (or '?' for help or 'none') [sda]
The following disk is selected:
  sda (1.1 GB QEMU QEMU HARDDISK )
How would you like to use it? ('sys', 'data', 'lvm' or '?' for help) [?] sys
WARNING: The following disk(s) will be erased:
  sda (1.1 GB QEMU QEMU HARDDISK )
WARNING: Erase the above disk(s) and continue? [y/N]: y
Creating file systems...
Installing system on /dev/sda3:
/mnt/boot is device /dev/sda1
100% ████████████████████████████████████████████==> initramfs: creating /boot/initramfs-virt
/boot is device /dev/sda1

Installation is complete. Please reboot.

localhost:~# sfdisk -l /dev/sda
Disk /dev/sda: 1 GiB, 1073741824 bytes, 262144 sectors
Disk model: QEMU HARDDISK
Units: sectors of 1 * 4096 = 4096 bytes
Sector size (logical/physical): 4096 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: dos
Disk identifier: 0x9773962b

Device     Boot Start    End Sectors  Size Id Type
/dev/sda1  *      256  25855   25600  100M 83 Linux
/dev/sda2       25856  91391   65536  256M 82 Linux swap / Solaris
/dev/sda3       91392 262143  170752  667M 83 Linux

localhost:~# poweroff
...

3. Next boot fails

SeaBIOS (version ?-20190712_051036-bde29493c9324747a3ac6dd355c75f9e-2.fc29)

iPXE (http://ipxe.org) 00:03.0 C980 PCI2.10 PnP PMM+3FF91220+3FED1220 C980

Booting from Hard Disk...
Boot failed: could not read the boot disk

Booting from Floppy...
Boot failed: could not read the boot disk

Booting from ROM...
iPXE (PCI 00:03.0) starting execution...ok
iPXE initialising devices...ok

iPXE 1.0.0+ -- Open Source Network Boot Firmware -- http://ipxe.org
Features: DNS HTTP iSCSI TFTP AoE ELF MBOOT PXE bzImage Menu PXEXT

net0: 52:54:00:12:34:56 using 82540em on 0000:00:03.0 (open)
  [Link:up, TX:0 TXE:0 RX:0 RXE:0]
Configuring (net0 52:54:00:12:34:56).............. ok
net0: 10.0.2.15/255.255.255.0 gw 10.0.2.2
net0: fec0::5054:ff:fe12:3456/64 gw fe80::2
net0: fe80::5054:ff:fe12:3456/64
Nothing to boot: No such file or directory (http://ipxe.org/2d03e13b)
No more network devices

No bootable device.

The same flow was reproduced using the Fedora 29 iso in oVirt. See the attached vm.xml and qemu.log for the oVirt vm details.

I tested both fuse and libgfapi, with the same results in both cases. I included only output from the fuse tests.

The BIOS interfaces simply don't support 4k native disks, so booting from a 4k native disk with BIOS is not possible, neither in VMs nor on real hardware. There's nothing we can do about this. If you need BIOS to access a disk, you need to keep it a disk with 512-byte logical blocks.

(I understand that bug 1737256 may mean we have a problem there, but unfortunately, switching to virtual 4k native disks isn't an easy solution/workaround.)
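
To illustrate what a 512-byte logical disk means for the command lines in the description, here is a minimal sketch that builds the scsi-hd device argument with a configurable logical and physical block size. The helper name scsi_hd_device is hypothetical (it is not part of qemu or libvirt); the bus, drive and id values are copied from the description. Keeping physical_block_size at 4096 is an assumption, and whether the storage from bug 1737256 copes with the resulting 512-byte guest I/O is not established here.

```shell
# Hypothetical helper, for illustration only: print the -device argument
# for the scsi-hd disk used in the description.
# $1 = logical block size in bytes, $2 = physical block size in bytes.
scsi_hd_device() {
    printf '%s\n' "scsi-hd,bus=bus1.0,drive=drive1,id=disk1,logical_block_size=$1,physical_block_size=$2,write-cache=on"
}

# 512-byte logical blocks so that BIOS can address the disk; the 4096-byte
# physical block size is only a hint to the guest (assumed, adjust as needed).
device_arg=$(scsi_hd_device 512 4096)
echo "$device_arg"
```

The result would be passed as -device "$device_arg" in place of the scsi-hd line in the "Start the vm" command; the image would have to be reinstalled, since the partition table shown by sfdisk was written in units of 4096-byte sectors.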