Bug 1416202

Summary: virt-install parser error : Input is not proper UTF-8, indicate encoding
Product: [Community] Virtualization Tools
Component: libvirt
Reporter: Richard W.M. Jones <rjones>
Assignee: Libvirt Maintainers <libvirt-maint>
Status: NEW
Severity: unspecified
Priority: unspecified
Version: unspecified
CC: berrange, crobinso, gscrivan, libvirt-maint
Hardware: Unspecified
OS: Unspecified
Type: Bug
Attachments: --debug output

Description Richard W.M. Jones 2017-01-24 20:52:24 UTC
Description of problem:

$ virt-install --name undercloud --ram 4096 --disk path=undercloud.qcow2,size=6 --vcpus 4 --os-type linux --os-variant rhel7 --graphics none --console pty,target_type=serial --location http://download.devel.redhat.com/released/RHEL-7/7.3/Server/x86_64/os/ --noreboot --noautoconsole --initrd-inject=/var/tmp/ks.cfg --extra-args "ks=file:/ks.cfg console=ttyS0,115200n8 serial"

Entity: line 2: parser error : Input is not proper UTF-8, indicate encoding !
Bytes: 0xD4 0x3C 0x2F 0x6E
  <name>�</name>
        ^

Starting install...
Retrieving file vmlinuz...                                  | 5.1 MB  00:00     
Retrieving file initrd.img...                               |  43 MB  00:03     
Allocating 'undercloud.qcow2'                               | 6.0 GB  00:00     

It seems to continue as normal despite the strange error.

Also the domain does have the correct name:

$ virsh list --all
 Id    Name                           State
----------------------------------------------------
 2     undercloud                     running
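
For reference, the four bytes quoted in the parser error can be checked by hand. The following is only an illustrative Python sketch, not code from virt-install or libvirt. It shows that 0xD4 has the bit pattern of a UTF-8 two-byte lead byte, while the next byte, 0x3C ('<'), is not a continuation byte, so strict UTF-8 decoding fails. Reading the same bytes as ISO-8859-1 is an assumption made here purely to make them printable.

```python
# Bytes quoted in the parser error message from the report.
reported = bytes([0xD4, 0x3C, 0x2F, 0x6E])


def is_two_byte_lead(byte: int) -> bool:
    """True if byte has the 110xxxxx pattern of a UTF-8 two-byte lead."""
    return byte & 0b1110_0000 == 0b1100_0000


def is_continuation(byte: int) -> bool:
    """True if byte has the 10xxxxxx pattern of a UTF-8 continuation byte."""
    return byte & 0b1100_0000 == 0b1000_0000


def decodes_as_utf8(data: bytes) -> bool:
    """Strict UTF-8 check using the standard codec."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


lead_ok = is_two_byte_lead(reported[0])  # 0xD4 announces a two-byte sequence
next_ok = is_continuation(reported[1])   # 0x3C is '<', not a continuation byte
valid = decodes_as_utf8(reported)

# Assumption: ISO-8859-1 is used only to render the bytes readably.
as_latin1 = reported.decode("iso-8859-1")

print(lead_ok, next_ok, valid, as_latin1)
```

The trailing bytes 0x3C 0x2F 0x6E are simply the start of the closing </name> tag, which suggests the element content is a single stray byte.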

Version-Release number of selected component (if applicable):

virt-install-1.4.0-5.fc25.noarch

How reproducible:

100%

Steps to Reproduce:
1. See description above.

Comment 1 Richard W.M. Jones 2017-01-24 20:54:13 UTC
$ locale
LANG=en_GB.UTF-8
LC_CTYPE="en_GB.UTF-8"
LC_NUMERIC="en_GB.UTF-8"
LC_TIME="en_GB.UTF-8"
LC_COLLATE="en_GB.UTF-8"
LC_MONETARY="en_GB.UTF-8"
LC_MESSAGES="en_GB.UTF-8"
LC_PAPER="en_GB.UTF-8"
LC_NAME="en_GB.UTF-8"
LC_ADDRESS="en_GB.UTF-8"
LC_TELEPHONE="en_GB.UTF-8"
LC_MEASUREMENT="en_GB.UTF-8"
LC_IDENTIFICATION="en_GB.UTF-8"
LC_ALL=

Comment 2 Cole Robinson 2017-02-05 20:34:54 UTC
I can't reproduce with latest rawhide and:

# locale
LANG=en_US.utf8
LC_CTYPE="en_US.utf8"
LC_NUMERIC="en_US.utf8"
LC_TIME="en_US.utf8"
LC_COLLATE="en_US.utf8"
LC_MONETARY="en_US.utf8"
LC_MESSAGES="en_US.utf8"
LC_PAPER="en_US.utf8"
LC_NAME="en_US.utf8"
LC_ADDRESS="en_US.utf8"
LC_TELEPHONE="en_US.utf8"
LC_MEASUREMENT="en_US.utf8"
LC_IDENTIFICATION="en_US.utf8"
LC_ALL=


Not really sure what could be causing it; I expect something is busted elsewhere in the stack. Maybe virt-install --debug will give a hint.

Comment 3 Richard W.M. Jones 2017-02-05 21:24:48 UTC
It's very strange - I *can* reproduce it even with
virt-install 1.4.0-5.fc26

I'll attach the --debug output ...

Comment 4 Richard W.M. Jones 2017-02-05 21:25:22 UTC
Created attachment 1247862 [details]
--debug output

Comment 5 Cole Robinson 2017-02-09 21:54:30 UTC
The culprit seems to be:

[Sun, 05 Feb 2017 21:23:26 virt-install 32649] DEBUG (virt-install:183) Distilled --disk options: ['path=undercloud.qcow2,size=6']
Entity: line 2: parser error : Input is not proper UTF-8, indicate encoding !
Bytes: 0xD4 0x3C 0x2F 0x6E
  <name>Ô</name>
        ^
[Sun, 05 Feb 2017 21:23:26 virt-install 32649] DEBUG (xmlbuilder:669) Error parsing xml=
<volume type='file'>
  <name>Ô</name>
  <key>/tmp/Ô</key>
  <source>
  </source>
  <capacity unit='bytes'>0</capacity>
  <allocation unit='bytes'>0</allocation>
  <target>
    <path>/tmp/Ô</path>
    <format type='raw'/>
    <permissions>
      <mode>0664</mode>
      <owner>1000</owner>
      <group>1000</group>
      <label>unconfined_u:object_r:user_tmp_t:s0</label>
    </permissions>
    <timestamps>
      <atime>1486329730.221568703</atime>
      <mtime>1484586393.429055281</mtime>
      <ctime>1484586393.429055281</ctime>
    </timestamps>
  </target>
</volume>


Libvirt is generating non-UTF-8 XML. It should be skipping that file.
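
The parser behaviour can be imitated outside virt-install. The sketch below is a stand-in only: it uses Python's xml.etree.ElementTree (expat) rather than the libxml2 parser that produced the message in the log, and a cut-down version of the volume XML. It shows that a document with a raw 0xD4 byte and no encoding declaration is rejected, whereas the same bytes parse once an encoding is declared. The ISO-8859-1 declaration is an assumption for illustration, not a proposed fix.

```python
import xml.etree.ElementTree as ET

# Cut-down volume XML from the log, with the raw 0xD4 byte in <name> and
# no encoding declaration, so the parser assumes UTF-8.
raw_xml = b"<volume type='file'><name>\xd4</name></volume>"


def try_parse(data: bytes):
    """Return the <name> text, or None if the document is rejected."""
    try:
        root = ET.fromstring(data)
    except ET.ParseError:
        return None
    return root.findtext("name")


without_decl = try_parse(raw_xml)

# Assumption for illustration only: declaring ISO-8859-1 makes the same
# bytes well-formed. This is not what libvirt does.
with_decl = try_parse(b"<?xml version='1.0' encoding='ISO-8859-1'?>" + raw_xml)

print(without_decl, with_decl)
```

This matches the "indicate encoding" hint in the error text: the bytes are only a problem because the document is implicitly treated as UTF-8.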

Does that file have weird control characters in it that aren't reflected in the log?

Comment 6 Richard W.M. Jones 2017-02-09 21:59:02 UTC
Oh you're right, I've got a file in /tmp which has a non-UTF-8 name
(I think I was testing something for another bug).

cd /tmp
ls ?
''$'\324'

Anyway, as you say, libvirt is generating incorrect XML.

A reproducer is:

touch ''$'\324'
and in the same directory run a virt-install command.
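
To check whether a directory contains such files before running virt-install, something like the following Python sketch could be used. The helper names non_utf8_names and scan_directory are hypothetical and are not part of virt-install or libvirt. The example recreates the reproducer in a scratch directory instead of /tmp, and it assumes a filesystem that accepts arbitrary bytes in file names, as is typical on Linux.

```python
import os
import tempfile


def non_utf8_names(names):
    """Return the byte-string names that are not valid UTF-8."""
    bad = []
    for name in names:
        try:
            name.decode("utf-8")
        except UnicodeDecodeError:
            bad.append(name)
    return bad


def scan_directory(path):
    """List entries in path (as bytes) whose names are not valid UTF-8."""
    # Passing a bytes path makes os.listdir return raw bytes names.
    return non_utf8_names(os.listdir(os.fsencode(path)))


# The shell escape $'\324' is octal 324, i.e. the single byte 0xD4.
weird_name = bytes([0o324])

if __name__ == "__main__":
    # Assumption: the filesystem allows a file name consisting of byte 0xD4.
    with tempfile.TemporaryDirectory() as scratch:
        target = os.path.join(os.fsencode(scratch), weird_name)
        with open(target, "wb"):
            pass
        print(scan_directory(scratch))
```

Working with bytes paths avoids any locale-dependent decoding of the names, so the check sees exactly the bytes that end up in the generated XML.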