Bug 1090713

Summary:

Error messages appear while the guest OS is booting from an NBD device

Product:

Red Hat Enterprise Linux 7

Reporter:

Ruifeng <rbian>

Component:

qemu-kvm

Assignee:

Hanna Czenczek <hreitz>

Status:

CLOSED WONTFIX

QA Contact:

Virtualization Bugs <virt-bugs>

Severity:

medium

Docs Contact:

Priority:

unspecified

Version:

7.0

CC:

dyuan, eblake, hhuang, hreitz, juzhang, mzhan, pkrempa, rbalakri, sluo, virt-maint

Target Milestone:

Keywords:

Reopened

Target Release:

7.1

Hardware:

x86_64

OS:

Linux

Whiteboard:

Fixed In Version:

Doc Type:

Bug Fix

Last Closed:

2014-10-16 13:59:39 UTC

Type:

Bug

Attachments:

Description	Flags
Console screen-shot and log messages	none
guest-kernel-log.txt	none

Description Ruifeng 2014-04-24 02:19:18 UTC

Created attachment 889154 [details]
Console screen-shot and log messages

Description:
Start a guest whose only disk is an NBD device with a Linux OS installed on it. "virsh create" returns successfully, but error messages appear while the guest OS is booting.


Version:
libvirt-client-1.1.1-29.el7.x86_64
qemu-kvm-1.5.3-60.el7.x86_64
kernel-3.10.0-121.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Start the NBD server on the server host.
# nbd-server 10809 /var/lib/libvirt/images/r7g-qcow2.img

2. Create a domain with the XML shown below.
# virsh create r7g-nbd.xml
Domain r7g-qcow2 created from r7g-nbd.xml

# virsh dumpxml r7g-qcow2| grep disk -A 7
    <disk type='network' device='disk'>
      <driver name='qemu' type='qcow2'/>
      <source protocol='nbd'>
        <host name='10.66.106.34' port='10809'/>
      </source>
      <target dev='sda' bus='scsi'/>
      <alias name='scsi0-0-0-0'/>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
    </disk>

# ps -ef| grep qemu-kvm| grep nbd
qemu     15724     1 99 09:58 ?        00:00:47 /usr/libexec/qemu-kvm -name r7g-qcow2
......
-drive file=nbd:10.66.106.34:10809,if=none,id=drive-scsi0-0-0-0,format=qcow2
-device scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,bootindex=1
......

3. Log in to the guest; the console outputs error messages and the guest fails to boot.


Actual results:
As described.

Expected results:
The guest can boot up successfully.

Additional info:
Attached screen-shot and log files.

Comment 1 Peter Krempa 2014-04-24 06:44:33 UTC

The machine log file hints at an I/O error when accessing the disk, as does the screenshot.

libvirt's machine log file:
2014-04-18 02:50:54.370+0000: starting up
LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin QEMU_AUDIO_DRV=spice /usr/libexec/qemu-kvm -name rhel7 -S -machine pc-i440fx-rhel7.0.0,accel=kvm,usb=off -cpu SandyBridge -m 1024 -realtime mlock=off -smp 1,sockets=1,cores=1,threads=1 -uuid b1410a75-f447-4dcc-bec6-a3b8b2e09539 -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/rhel7.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -no-kvm-pit-reinjection -no-hpet -no-shutdown -global PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1 -boot menu=on,strict=on -device pci-bridge,chassis_nr=1,id=pci.1,bus=pci.0,addr=0x9 -device ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x5.0x7 -device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x5 -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x5.0x1 -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,addr=0x5.0x2 -device virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0xa -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x6 -drive file=nbd:10.66.106.26:10809,if=none,id=drive-scsi0-0-0-0,format=qcow2 -device scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,bootindex=1 -netdev tap,fd=24,id=hostnet0,vhost=on,vhostfd=25 -device virtio-net-pci,event_idx=off,netdev=hostnet0,id=net0,mac=52:54:00:95:79:af,bus=pci.0,addr=0x3 -chardev socket,id=charchannel0,path=/var/lib/libvirt/qemu/channel/target/rhel7.org.qemu.guest_agent.0,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0 -chardev spicevmc,id=charchannel1,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=com.redhat.spice.0 -chardev pty,id=charconsole0 -device virtconsole,chardev=charconsole0,id=console0 -device usb-tablet,id=input0 -spice port=5900,addr=0.0.0.0,disable-ticketing,seamless-migration=on -vga qxl -global 
qxl-vga.ram_size=67108864 -global qxl-vga.vram_size=67108864 -device intel-hda,id=sound0,bus=pci.0,addr=0x4 -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -chardev spicevmc,id=charredir0,name=usbredir -device usb-redir,chardev=charredir0,id=redir0 -chardev spicevmc,id=charredir1,name=usbredir -device usb-redir,chardev=charredir1,id=redir1 -chardev spicevmc,id=charredir2,name=usbredir -device usb-redir,chardev=charredir2,id=redir2 -chardev spicevmc,id=charredir3,name=usbredir -device usb-redir,chardev=charredir3,id=redir3 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x8
Warning: option deprecated, use lost_tick_policy property of kvm-pit instead.
char device redirected to /dev/pts/1 (label charconsole0)
main_channel_link: add main channel client
main_channel_handle_parsed: net test: latency 0.568000 ms, bitrate 454101995 bps (433.065410 Mbps)
red_dispatcher_set_cursor_peer: 
inputs_connect: inputs channel client create
block I/O error in device 'drive-scsi0-0-0-0': Invalid argument (22)
block I/O error in device 'drive-scsi0-0-0-0': Input/output error (5)
block I/O error in device 'drive-scsi0-0-0-0': Input/output error (5)
block I/O error in device 'drive-scsi0-0-0-0': Input/output error (5)
nbd.c:nbd_receive_reply():L746: read failed
nbd.c:nbd_receive_reply():L746: read failed
nbd.c:nbd_receive_reply():L746: read failed
nbd.c:nbd_receive_reply():L746: read failed
nbd.c:nbd_receive_reply():L746: read failed
nbd.c:nbd_receive_reply():L746: read failed
nbd.c:nbd_receive_reply():L746: read failed
nbd.c:nbd_receive_reply():L746: read failed
nbd.c:nbd_receive_reply():L746: read failed
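
For triage, the error lines in a log like the one above can be tallied with a short script. This is only a sketch: the two regular expressions are derived from the lines quoted in this report and are an assumption about the general log format, and the function name `summarize` is made up for illustration.

```python
import re
from collections import Counter

# Pattern for qemu's block-layer error lines as quoted in this report.
BLOCK_ERR = re.compile(r"block I/O error in device '([^']+)': (.+) \((\d+)\)")
# Pattern for the NBD client read failures as quoted in this report.
NBD_ERR = re.compile(r"nbd\.c:nbd_receive_reply\(\):L\d+: read failed")


def summarize(log_text):
    """Count block I/O errors per (device, errno) and NBD read failures."""
    block_errors = Counter()
    nbd_failures = 0
    for line in log_text.splitlines():
        match = BLOCK_ERR.search(line)
        if match:
            block_errors[(match.group(1), int(match.group(3)))] += 1
        elif NBD_ERR.search(line):
            nbd_failures += 1
    return block_errors, nbd_failures


# Sample lines copied from the machine log in this report.
sample = """\
block I/O error in device 'drive-scsi0-0-0-0': Invalid argument (22)
block I/O error in device 'drive-scsi0-0-0-0': Input/output error (5)
block I/O error in device 'drive-scsi0-0-0-0': Input/output error (5)
nbd.c:nbd_receive_reply():L746: read failed
"""
errors, nbd_failures = summarize(sample)
print(errors, nbd_failures)
```

Grouping by errno makes the sequence in the log easier to see: a single "Invalid argument (22)" is followed by repeated "Input/output error (5)" lines and then by the NBD read failures.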

Comment 2 Peter Krempa 2014-04-24 06:46:05 UTC

Moving to qemu to investigate further.

Comment 3 juzhang 2014-04-25 02:18:33 UTC

Hi Sluo,

Could you try this with the qemu-kvm command line directly?

Best Regards,
Junyi

Comment 4 Sibiao Luo 2014-04-25 03:27:44 UTC

(In reply to juzhang from comment #3)
> Hi Sluo,
> 
> Could you try this with the qemu-kvm command line directly?
> 
Yes, I hit this issue with the qemu-kvm command line directly.

host info:
# uname -r && rpm -q qemu-kvm-rhev
3.10.0-121.el7.x86_64
qemu-kvm-rhev-1.5.3-60.el7ev.x86_64
# rpm -q nbd
nbd-2.9.20-3.el7.x86_64
guest info:
# uname -r
3.10.0-121.el7.x86_64

How reproducible:
3/5

Steps:
1. Start nbd-server on the NBD server host to export the system image, given by its absolute path.
# nbd-server 12345 /home/RHEL-7.0-20140409.0_Server_x86_64.qcow2bk

** (process:20912): WARNING **: Specifying an export on the command line is deprecated.

** (process:20912): WARNING **: Please use a configuration file instead.

2. Launch a KVM guest with the exported system image.
e.g:# /usr/libexec/qemu-kvm...-drive file=nbd:10.66.83.171:12345,if=none,id=drive-system-disk,format=qcow2,cache=none,aio=native -device virtio-scsi-pci,id=scsi0,addr=0x4,bus=pci.0 -device scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-system-disk,id=system-disk,bootindex=1

Results:
After step 2, the KVM guest fails to boot and the HMP is flooded with "nbd.c:nbd_receive_reply():L746: read failed".
(qemu) block I/O error in device 'drive-system-disk': Invalid argument (22)
block I/O error in device 'drive-system-disk': Input/output error (5)
block I/O error in device 'drive-system-disk': Input/output error (5)
block I/O error in device 'drive-system-disk': Input/output error (5)
nbd.c:nbd_receive_reply():L746: read failed
nbd.c:nbd_receive_reply():L746: read failed
nbd.c:nbd_receive_reply():L746: read failed
nbd.c:nbd_receive_reply():L746: read failed
nbd.c:nbd_receive_reply():L746: read failed
nbd.c:nbd_receive_reply():L746: read failed
nbd.c:nbd_receive_reply():L746: read failed
nbd.c:nbd_receive_reply():L746: read failed
nbd.c:nbd_receive_reply():L746: read failed
......

The guest kernel log also shows "XFS (dm-0): metadata I/O error: block 0xdf1b70 ("xfs_trans_read_buf_map") error". I will attach the guest kernel log later.

Best Regards,
sluo

Comment 5 Sibiao Luo 2014-04-25 03:28:15 UTC

Created attachment 889518 [details]
guest-kernel-log.txt

Comment 6 Sibiao Luo 2014-04-25 03:28:47 UTC

My qemu-kvm command line:
# /usr/libexec/qemu-kvm -S -M pc -cpu host -enable-kvm -m 2048 -smp 2,sockets=2,cores=1,threads=1 -no-kvm-pit-reinjection -usb -device usb-tablet,id=input0 -name sluo -uuid 990ea161-6b67-41b2-b803-19fb01d30d30 -rtc base=localtime,clock=host,driftfix=slew -device virtio-serial-pci,id=virtio-serial0,max_ports=16,vectors=0,bus=pci.0,addr=0x3 -chardev socket,id=channel1,path=/tmp/helloworld1,server,nowait -device virtserialport,chardev=channel1,name=com.redhat.rhevm.vdsm,bus=virtio-serial0.0,id=port1 -chardev socket,id=channel2,path=/tmp/helloworld2,server,nowait -device virtserialport,chardev=channel2,name=com.redhat.rhevm.vdsm,bus=virtio-serial0.0,id=port2 -drive file=nbd:10.66.83.171:12345,if=none,id=drive-system-disk,format=qcow2,cache=none,aio=native -device virtio-scsi-pci,id=scsi0,addr=0x4,bus=pci.0 -device scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-system-disk,id=system-disk,bootindex=1 -netdev tap,id=hostnet0,vhost=on,script=/etc/qemu-ifup -device virtio-net-pci,netdev=hostnet0,id=virtio-net-pci0,mac=00:01:02:B6:40:11,bus=pci.0,addr=0x5 -device virtio-balloon-pci,id=ballooning,bus=pci.0,addr=0x6 -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0 -serial unix:/tmp/ttyS0,server,nowait -k en-us -boot menu=on -qmp tcp:0:4444,server,nowait -vnc :1 -monitor stdio

Comment 8 Hanna Czenczek 2014-06-18 21:37:15 UTC

Hi Sluo,

I get the same errors with upstream qemu when trying to boot from a non-preallocated qcow2 image. I don't know the internals of nbd-server, but from Wireshark and qemu's NBD debug output it appears to me that nbd-server does not allow writes beyond the end of the disk image (and stalls all other requests after such a write).

A non-preallocated qcow2 image may of course grow on write accesses. That is why, as soon as a new cluster is allocated (at the end of the disk image) and data is written there, nbd-server returns an error and subsequently stalls. Converting the image to a preallocated one (qemu-img convert -O qcow2 -o preallocation=metadata rhel7.qcow2 rhel7-preallocated.qcow2) and then using that result works for me.

Could you try that, too? If it works, I guess this is not really qemu's fault but rather nbd-server's, which does not allow growing disk images. The best thing qemu could do is to try to sense whether the NBD server allows growing and, if it doesn't, somehow pass that through to the image format drivers, which must then prevent new allocation of clusters. But the result would practically be the same (EIO or ENOSPC on boot), therefore it's probably not worth it.

If this (preallocation) does not fix the problem for you, I'll try again with RHEL6's qemu.

Thank you,
Max
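
The failure mode described in comment 8 can be sketched with a toy model. Everything here is hypothetical and for illustration only: the class names, the choice of EINVAL as the error, and the simple "append a cluster at end of file" allocator are modelling assumptions, not nbd-server's or qemu's actual code. The 65536-byte cluster size is the one qemu-img reports elsewhere in this report.

```python
import errno

CLUSTER = 65536  # cluster size as reported by qemu-img in this report


class FixedSizeExport:
    """Toy model of an export whose size is fixed when the server starts."""

    def __init__(self, size):
        self.size = size

    def write(self, offset, length):
        # Model assumption: any write that ends past the export size fails.
        if offset + length > self.size:
            raise OSError(errno.EINVAL, "write beyond end of export")


class GrowingImage:
    """Toy model of an image format that appends new clusters on demand."""

    def __init__(self, export, file_length):
        self.export = export
        self.file_length = file_length  # current length of the image file
        self.mapping = {}  # guest cluster index -> offset in the image file

    def guest_write(self, guest_cluster):
        if guest_cluster not in self.mapping:
            # First write to this cluster: allocate it at the end of the file.
            self.mapping[guest_cluster] = self.file_length
            self.file_length += CLUSTER
        self.export.write(self.mapping[guest_cluster], CLUSTER)


def try_write(image, guest_cluster):
    """Return "ok" or the symbolic errno name of the failure."""
    try:
        image.guest_write(guest_cluster)
        return "ok"
    except OSError as exc:
        return errno.errorcode[exc.errno]


# Sparse image: the file and the export are both 4 clusters long, so the
# first allocation lands past the end of the export.
sparse = GrowingImage(FixedSizeExport(4 * CLUSTER), 4 * CLUSTER)
sparse_result = try_write(sparse, 0)

# Preallocated image: every guest cluster already maps inside the file.
full = GrowingImage(FixedSizeExport(8 * CLUSTER), 8 * CLUSTER)
full.mapping = {i: (4 + i) * CLUSTER for i in range(4)}
full_result = try_write(full, 0)

print(sparse_result, full_result)
```

In this model the sparse image fails on its first allocating write, while the preallocated image never writes past the end of the export, which matches the workaround suggested above.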

Comment 9 Sibiao Luo 2014-06-19 05:09:14 UTC

(In reply to Max Reitz from comment #8)
> Hi Sluo,
> 
> I get the same errors with upstream qemu when trying to boot from a
> non-preallocated qcow2 image. I don't know the internals of nbd-server, but
> from Wireshark and qemu's NBD debug output it appears to me that nbd-server
> does not allow writes beyond the end of the disk image (and stalls all other
> requests after such a write).
> 
> A non-preallocated qcow2 image of course may grow on write accesses, that's
> why as soon as a new cluster is allocated (at the end of the disk image) and
> data is written there, nbd-server returns an error and subsequently stalls.
> Converting the image to a preallocated one (qemu-img convert -O qcow2 -o
> preallocation=metadata rhel7.qcow2 rhel7-preallocated.qcow2) and then using
> that result works for me.
> 
> Could you try that, too? If it works, I guess this is not really qemu's fault
> but rather nbd-server's which does not allow growing disk images. The best
> thing qemu could do is to try to sense whether the NBD server allows growing
> and if it doesn't, somehow pass that through to the image format drivers
> which must then prevent new allocation of clusters. But the result would
> practically be the same (EIO or ENOSPC on boot), therefore it's probably not
> worth it.
> 
> If this (preallocation) does not fix the problem for you, I'll try again
> with RHEL6's qemu.
> 
Thanks for your analysis. I retried following your instructions, and it works well when preallocation=metadata is specified for the disk image.

hostinfo:
3.10.0-123.4.2.el7.x86_64
qemu-kvm-rhev-1.5.3-60.el7ev.x86_64
nbd-2.9.20-3.el7.x86_64

# qemu-img create -f qcow2 -o preallocation=metadata metadata-data-disk.qcow2 10G
Formatting 'metadata-data-disk.qcow2', fmt=qcow2 size=10737418240 encryption=off cluster_size=65536 preallocation='metadata' lazy_refcounts=off 

# nbd-server 12345 /home/metadata-data-disk.qcow2

** (process:28742): WARNING **: Specifying an export on the command line is deprecated.

** (process:28742): WARNING **: Please use a configuration file instead.

e.g.: ...-drive file=nbd:10.66.104.53:12345,format=qcow2,if=none,id=drive-scsi-disk -device virtio-scsi-pci,id=scsi0,addr=0x8 -device scsi-hd,drive=drive-scsi-disk,bus=scsi0.0,scsi-id=0,lun=0,id=data-disk1

Best Regards,
sluo
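
Before exporting an image with nbd-server, a rough check of whether the file may still have to grow can be made by comparing its length on disk with the virtual size stored in the qcow2 header. The sketch below is a heuristic, not a qemu tool: the function names are made up, and it only reads the magic bytes and the 64-bit big-endian size field at byte offset 24 of the header. A file shorter than its virtual size cannot hold a fully written disk without growing. The reverse does not prove the image is safe, so treat a negative result with care.

```python
import os
import struct
import tempfile

QCOW2_MAGIC = b"QFI\xfb"


def qcow2_virtual_size(path):
    """Return the virtual disk size stored in a qcow2 header."""
    with open(path, "rb") as f:
        header = f.read(32)
    if len(header) < 32 or header[:4] != QCOW2_MAGIC:
        raise ValueError("not a qcow2 image")
    # Bytes 24..31 hold the virtual size as a big-endian 64-bit integer.
    (size,) = struct.unpack(">Q", header[24:32])
    return size


def may_need_to_grow(path):
    """Heuristic: a file shorter than its virtual size must grow when filled."""
    return os.path.getsize(path) < qcow2_virtual_size(path)


# Demo on a fabricated 32-byte header (not a usable image): version 3,
# no backing file, cluster_bits 16, virtual size 10 GiB.
with tempfile.NamedTemporaryFile(suffix=".qcow2", delete=False) as tmp:
    tmp.write(QCOW2_MAGIC + struct.pack(">IQIIQ", 3, 0, 0, 16, 10 * 1024**3))
demo_path = tmp.name
demo_size = qcow2_virtual_size(demo_path)
demo_grows = may_need_to_grow(demo_path)
os.unlink(demo_path)
print(demo_size, demo_grows)
```

For real images, `qemu-img info` reports the same virtual size together with the disk usage, and is the more reliable source.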

Comment 10 Sibiao Luo 2014-06-19 05:30:58 UTC

(In reply to Max Reitz from comment #8)
> 
> If this (preallocation) does not fix the problem for you, I'll try again
> with RHEL6's qemu.
> 
rhel6: Bug 1111028 - QEMU fail to open NBD device image which exported by the nbd-server

Comment 11 Hanna Czenczek 2014-06-20 17:26:20 UTC

Oh, I somehow thought this bug was about RHEL 6; sorry. Please read the "RHEL6" as "RHEL7", then.

Anyway, considering that my reasoning seems to be correct (nbd-server simply cannot grow image files), I think qemu cannot do anything to work around this behavior. The best thing we could do is to document it, but I don't know where to do so.

Therefore I think this is definitely not a bug in qemu and most probably not even a bug in nbd-server (only a kind of (perhaps even documented) behavior which makes it difficult to export non-raw images over NBD using nbd-server), but simply a problem inherent to NBD.

Maybe the best qemu could do is to emit a warning if an image format which needs its files to be able to grow (such as qcow2) is used non-read-only over NBD. I feel like this would not be completely trivial to implement, though, thus I'll close this BZ for now. If you feel like qemu should indeed emit a warning and it would significantly improve its behavior in this case, feel free to reopen it and I'll do my best to implement it (or if you have any other suggestion).

Max

Comment 12 Sibiao Luo 2014-06-30 02:30:42 UTC

(In reply to Max Reitz from comment #11)
> Oh, I somehow thought this bug was about RHEL 6; sorry. Please read the
> "RHEL6" as "RHEL7", then.
> 
> Anyway, considering that my reasoning seems to be correct (nbd-server simply
> cannot grow image files), I think qemu cannot do anything to work around
> this behavior. The best thing we could do is to document it, but I don't
> know where to do so.
> 
> Therefore I think this is definitely not a bug in qemu and most probably not
> even a bug in nbd-server (only a kind of (perhaps even documented) behavior
> which makes it difficult to export non-raw images over NBD using
> nbd-server), but simply a problem inherent to NBD.
> 
> Maybe the best qemu could do is to emit a warning if an image format which
> needs its files to be able to grow (such as qcow2) is used non-read-only
> over NBD. I feel like this would not be completely trivial to implement,
> though, thus I'll close this BZ for now. If you feel like qemu should indeed
> emit a warning and it would significantly improve its behavior in this case,
> feel free to reopen it and I'll do my best to implement it (or if you have
> any other suggestion).
> 
From KVM QE's point of view, we should add a friendly warning message that tells the KVM user the reason. This is the same as bug 1111028#c4; please correct me if I am mistaken. Thanks in advance.

Best Regards,
sluo

Comment 13 Hanna Czenczek 2014-07-01 12:26:58 UTC

Okay, I'll take a look into it. I'll try to output a warning if an image format which may require growth on write accesses is used non-read-only over a non-growable protocol. If it's too ugly to implement, I probably won't do it, though.
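
The check proposed in comment 13 can be sketched as a simple predicate. This is a hypothetical illustration, not qemu code: the function name, the two sets, and the message text are all assumptions, and the sets contain only the format and protocol named in this report.

```python
# Formats that allocate clusters on write; only qcow2 is named in this report.
GROWING_FORMATS = {"qcow2"}
# Protocols whose backing storage the client cannot extend; only NBD is
# named in this report.
NON_GROWABLE_PROTOCOLS = {"nbd"}


def growth_warning(image_format, protocol, read_only):
    """Return a warning string for a risky combination, else None."""
    if read_only:
        # Read-only access never allocates new clusters.
        return None
    if image_format in GROWING_FORMATS and protocol in NON_GROWABLE_PROTOCOLS:
        return (
            f"warning: format '{image_format}' may need to grow its file, "
            f"but protocol '{protocol}' cannot grow; writes may fail"
        )
    return None


print(growth_warning("qcow2", "nbd", read_only=False))
```

A static table like this is the simplest possible design. It cannot tell a preallocated image from a sparse one, and it would also flag setups where the storage is grown by other means, such as the LVM case raised in comment 17.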

Comment 14 Sibiao Luo 2014-07-02 08:13:43 UTC

(In reply to Max Reitz from comment #13)
> Okay, I'll take a look into it. I'll try to output a warning if an image
> format which may require growth on write accesses is used non-read-only over
> a non-growable protocol. If it's too ugly to implement, I probably won't do
> it, though.
Thanks a lot. I think you could have QEMU quit with a clear warning or error message for users instead of just an 'I/O error'. As Paolo Bonzini said in bug 1111028#c5: RHEL6 is not able to use good error messages because it only uses errno to generate errors; the RHEL7 changes to improve the error messages were huge and cannot be backported.

Best Regards,
sluo

Comment 15 Hanna Czenczek 2014-10-16 13:59:39 UTC

As it turned out, fixing this bug goes along with the restructuring of (at least) some parts of the block layer which is currently taking place. Therefore, this will still take some time to be implemented even upstream, and I cannot make any promises on when it will be there. Due to how many dependencies the solution will probably have in the end, I will not bring this to RHEL (both 6 and 7), however. Therefore, closing this BZ as WONTFIX.

Max

Comment 16 Eric Blake 2017-11-17 18:12:44 UTC

(In reply to Max Reitz from comment #11)

> Anyway, considering that my reasoning seems to be correct (nbd-server simply
> cannot grow image files), I think qemu cannot do anything to work around
> this behavior. The best thing we could do is to document it, but I don't
> know where to do so.
> 
> Therefore I think this is definitely not a bug in qemu and most probably not
> even a bug in nbd-server (only a kind of (perhaps even documented) behavior
> which makes it difficult to export non-raw images over NBD using
> nbd-server), but simply a problem inherent to NBD.

There is a proposal to add NBD_CMD_RESIZE to the NBD spec; which WILL allow resizing non-raw images over NBD; but it will not land any sooner than upstream qemu 2.12.  Until that time, I agree that the inability to grow an image can lead to enough obscure errors that...

> 
> Maybe the best qemu could do is to emit a warning if an image format which
> needs its files to be able to grow (such as qcow2) is used non-read-only
> over NBD. I feel like this would not be completely trivial to implement,
> though, thus I'll close this BZ for now. If you feel like qemu should indeed
> emit a warning and it would significantly improve its behavior in this case,
> feel free to reopen it and I'll do my best to implement it (or if you have
> any other suggestion).

...issuing a warning about use of NBD with non-raw format may indeed be worthwhile.

Comment 17 Hanna Czenczek 2017-11-20 16:04:12 UTC

(In reply to Eric Blake from comment #16)
> There is a proposal to add NBD_CMD_RESIZE to the NBD spec; which WILL allow
> resizing non-raw images over NBD; but it will not land any sooner than
> upstream qemu 2.12.  Until that time, I agree that the inability to grow an
> image can lead to enough obscure errors that...

But is there a point in exporting non-raw images over NBD?

> ...issuing a warning about use of NBD with non-raw format may indeed be
> worthwhile.

I think I once sent some patches for that.  The warning was to be emitted generally when you try to use a format which requires implicit growth of the underlying file, but one of the issues was the good old LVM thing (using qcow2 directly in LVM volumes and growing them manually when necessary).

Max