Bug 1495090

Summary: Transfer a file about 10M failed from host to guest through spapr-vty device
Product: Red Hat Enterprise Linux 7
Reporter: Min Deng <mdeng>
Component: qemu-kvm-rhev
Assignee: David Gibson <dgibson>
Status: CLOSED ERRATA
QA Contact: Min Deng <mdeng>
Severity: medium
Docs Contact:
Priority: medium
Version: 7.4-Alt
CC: dgibson, knoel, lmiksik, lvivier, michen, mrezanin, qzhang, rbalakri, virt-maint
Target Milestone: rc
Target Release: ---
Hardware: ppc64le
OS: Unspecified
Whiteboard:
Fixed In Version: qemu-kvm-rhev-2.10.0-9.el7
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-04-11 00:36:03 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Min Deng 2017-09-25 07:42:15 UTC
Description of problem:
Transferring a file of about 10M from host to guest through the spapr-vty device fails.

Version-Release number of selected component (if applicable):
qemu-kvm-2.9.0-23.el7a.ppc64le
SLOF-20170303-4.git66d250e.el7.noarch

How reproducible:
3/3

Steps to Reproduce:
1. Boot a guest with the following command line:
  /usr/libexec/qemu-kvm -name mdeng -sandbox off -machine pseries -nodefaults -chardev socket,id=serial_id_serial0,path=/tmp/tt2,server,nowait -device spapr-vty,reg=0x30000000,chardev=serial_id_serial0 -device nec-usb-xhci,id=usb1,bus=pci.0,addr=0x3 -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pci.0,addr=0x4 -drive id=drive_image1,if=none,snapshot=off,aio=native,cache=none,format=qcow2,file=rhel741-new.qcow2 -device scsi-hd,id=image1,drive=drive_image1,bootindex=0 -drive file=test1.img,if=none,id=drive-scsi0-0-0,media=disk,cache=writethrough,format=qcow2,werror=stop,rerror=stop,aio=native,cache.direct=on -device scsi-hd,drive=drive-scsi0-0-0,bus=virtio_scsi_pci0.0,scsi-id=0,lun=12,id=mm -device virtio-net-pci,mac=9a:2b:2c:2d:2f:00,id=id6b5tKj,vectors=4,netdev=idXB7qte,bus=pci.0,addr=0x5 -netdev tap,id=idXB7qte,vhost=on,script=/etc/qemu-ifup,downscript=/etc/qemu-down,id=hostnet1 -m 6G,maxmem=200G,slots=256 -smp 8,maxcpus=16,cores=4,threads=1,sockets=2 -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 -vnc :2 -rtc base=utc,clock=host -enable-kvm -monitor stdio -qmp tcp:0:4445,server,nowait -chardev socket,id=console0,path=/tmp/console0,server,nowait -device spapr-vty,chardev=console0 -chardev socket,id=serial0,path=/tmp/serial0,server,nowait -device spapr-vty,chardev=serial0

2. Transfer a small file from host to guest. This succeeds.

3. Create a file of about 10M with dd and send it from host to guest:
Host:  # cat a.file | nc -U /tmp/serial0
Guest: # cat /dev/hvc1 > big

Actual results:
On the host, the transfer appeared to complete.
On the guest, the size of the received file was zero.

Expected results:
The file can be transferred entirely.

Additional info:
The problem could not be reproduced with a small file.

Comment 1 Min Deng 2017-09-25 07:43:39 UTC
The issue can also be reproduced on P8 + RHEL7.4.

Comment 2 Laurent Vivier 2017-09-25 16:41:12 UTC
I'm able to reproduce the problem with nc/cat, but I'm not sure they are the right tools to transfer data over a serial line.

For instance, if I use minicom I'm able to transfer the 10MB file from host to guest:

- in the guest:

    # minicom -D /dev/hvc1

- in the host

    # minicom -D unix#/tmp/serial0

    Then: Ctrl-A Z
          Send files -> 'S'
          Select "zmodem"
          Select your file(s) to send ("Space to tag")
          'Enter' to select '[Okay]'

On both sides, 'Enter', Ctrl-A x, 'Yes' to exit
Then check the file has been copied on the guest.

I think spapr-vty cannot manage a data stream if no flow control protocol is used. Not sure we can consider that a bug.

Comment 3 David Gibson 2017-09-25 23:29:13 UTC
So, modem transfer protocols are important to handle error correction on a physical serial line as well as flow control.  In the case of the hypervisor console there shouldn't be transfer errors though.

It's rather odd that the file ends up zero size with nc/cat, though, rather than getting at least some data.

However, whether or not it's a real bug, it's not urgent for Pegas 1.0.  Postponing.

Comment 4 David Gibson 2017-11-20 03:19:56 UTC
I'm able to successfully transfer a large file of text.  However, when I try transferring binary files things go wrong.  This may be a bug (or unexpected behaviour in netcat), still looking.

Comment 5 David Gibson 2017-11-20 04:12:27 UTC
Ok.

The major problem here is that the hvc device on the guest side defaults to a mode which is not safe for binary data.  It reverts to that mode whenever closed, so a simple 'cat' will never work properly for binary data.

To deal with this the guest side command should instead be:
     # (stty raw -echo; cat > outfile) < /dev/hvc1

The stty command puts the hvc1 terminal into raw mode which should be safe for binary transfers, and disables echo (so it doesn't spew garbage on the nc side).
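
For illustration only, the same idea can be sketched in Python. This is not part of the original report: the helper names make_raw and read_exact are hypothetical, and tty.setraw() is assumed to be roughly comparable to `stty raw -echo` (it turns off canonical line processing, CR/NL translation, signal characters and echo).

```python
import os
import tty


def make_raw(fd):
    """Switch the terminal behind fd to raw mode (assumed similar to `stty raw -echo`)."""
    # Note: by default tty.setraw() applies the change with TCSAFLUSH,
    # which discards input that is already queued on the terminal.
    tty.setraw(fd)


def read_exact(fd, nbytes):
    """Read from fd until nbytes bytes have arrived or the stream ends."""
    chunks = []
    remaining = nbytes
    while remaining > 0:
        chunk = os.read(fd, remaining)
        if not chunk:
            # Zero-length read: the other side closed the stream.
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```

On a guest this might be used by opening /dev/hvc1 with os.open(), calling make_raw() on the descriptor before the host starts sending, and then reading the expected number of bytes. The descriptor has to stay open for the whole transfer, since the device reverts to its default mode when closed.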

On one of my testcases I'm still getting a single character dropped in the middle of the file.  It appears to be deterministic, but I haven't yet worked out why that's happening.

Comment 6 David Gibson 2017-11-20 05:50:04 UTC
Specifically, transferring host to guest, in stty raw mode, the vty seems to drop 0x00 (\0) bytes that immediately follow 0x0d (\r) bytes.  That's kind of weird.

I'm not sure if there's some stty setting I've missed which would affect this, or if it's an actual bug in the vty code.

I don't yet know if the character is lost passing through qemu, or in the guest side vty code, but I strongly suspect the guest.

Comment 7 David Gibson 2017-11-20 07:03:20 UTC
Turns out the problem is in qemu... sort of.

The guest side hvc driver, in hvterm_raw_get_chars() explicitly removes \0 characters after \r characters.  This is apparently due to a bug in the PowerVM hypervisor which erroneously inserted a \0 after every \r.

Because this workaround is baked into existing guests, we should probably make qemu's implementation bug-for-bug compatible.

I've made a draft implementation for upstream.
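
As an illustration of the guest-side behaviour described in this comment, here is a simplified Python model. It is only a sketch: it is neither the kernel's hvterm_raw_get_chars() code nor the qemu change, and the function name strip_nul_after_cr is hypothetical.

```python
def strip_nul_after_cr(data: bytes) -> bytes:
    """Drop every NUL (0x00) byte that immediately follows a CR (0x0d) byte."""
    out = bytearray()
    prev = None
    for byte in data:
        if byte == 0x00 and prev == 0x0D:
            # Modelled workaround: a NUL right after a CR is treated as
            # hypervisor noise and is not delivered to the reader.
            prev = byte
            continue
        out.append(byte)
        prev = byte
    return bytes(out)


# A binary payload that happens to contain CR followed by NUL loses one byte.
sent = b"\x01\x02\r\x00\x03"
received = strip_nul_after_cr(sent)
lost = len(sent) - len(received)
```

Text files rarely contain a NUL after a carriage return, which would be consistent with text transferring cleanly while binary data lost single bytes.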

Comment 8 David Gibson 2017-11-27 04:40:28 UTC
Fix is merged upstream, brewing a backport at:

https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=14627550

Comment 11 Miroslav Rezanina 2017-11-28 10:51:46 UTC
Fix included in qemu-kvm-rhev-2.10.0-9.el7

Comment 13 Min Deng 2017-12-06 07:37:45 UTC
Hi David,
   QE re-tested the bug following comment 0, with the builds below:
   kernel-4.14.0-11.el7a.ppc64le
   qemu-kvm-rhev-2.10.0-11.el7.ppc64le
   SLOF-20170724-2.git89f519f.el7.noarch

   QE can still reproduce the issue when transferring a 12M file from host to guest (the transfer did complete on the host side).
  #cat a.file |nc -U /tmp/serial0
  #cat /dev/hvc1 > big
  #ll
  #-rw-r--r--. 1 root root  0 Oct 24 04:29 big  -> size was still zero.
  If the transfer was done like this:
  echo caaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa |nc -U /tmp/serial0
  #-rw-r--r--. 1 root root 4096 Oct 24 04:40 big  -> size increased

  Could you please double-check this issue? Thanks a lot.

  Min

Comment 14 David Gibson 2017-12-06 09:22:50 UTC
Note you'll still need the stty commands from comment 5, or the hvc won't be in a suitable mode for binary transfers.

Comment 15 Min Deng 2017-12-11 08:58:26 UTC
Verified the bug on the following builds:
kernel-3.10.0-820.el7.ppc64le (host and guest)
qemu-kvm-rhev-2.10.0-11.el7.ppc64le
SLOF-20170303-4.git66d250e.el7.noarch
For the steps and command line, please refer to comment 0.
  1. # dd if=/dev/zero of=12M bs=1M count=12
     (about 12M)
  2. # cat 12M | nc -U /tmp/serial0    (host)

  3. # (stty raw -echo; cat > outfile) < /dev/hvc2    (guest)

  Actual results:
  Compared the checksums of the two files on host and guest:
  [root@ibm-p8-rhevm-04 mdeng]# md5sum 12M  - host 
  efeebdda98ec1d7fb2ad83d23f0713bf  12M
  [root@dhcp47-137 home]# md5sum outfile  - guest 
  efeebdda98ec1d7fb2ad83d23f0713bf  outfile

  Expected results:
  The file is transferred successfully and the size is the same on both sides.

  Based on the above test results, the bug has been fixed. Thanks.
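
A file created from /dev/zero contains no \r bytes, so a payload that includes \r\0 sequences might exercise the case from comment 6 more directly. The following Python sketch builds such a payload and computes its MD5 for comparison with md5sum on the other side. The helper names build_test_payload and md5_hex and the payload layout are assumptions made for illustration, not part of the test that was actually run.

```python
import hashlib


def build_test_payload(repeats: int = 1024) -> bytes:
    """Return binary data covering all byte values plus explicit CR/NUL sequences."""
    all_bytes = bytes(range(256))
    # Each repeat adds one "\r\0" pair and one "\r\0\0" run after the byte sweep.
    return (all_bytes + b"\r\x00" + b"\r\x00\x00") * repeats


def md5_hex(data: bytes) -> str:
    """Hex MD5 digest, in the same form that md5sum prints."""
    return hashlib.md5(data).hexdigest()
```

The payload could be written to a file on the host, sent with the commands from the steps above, and the digest compared against md5sum of the received file on the guest.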

Comment 17 errata-xmlrpc 2018-04-11 00:36:03 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:1104