Description of problem:
Provisioning of a Xen paravirt guest through virt-install hangs when kernel command line arguments are provided in the 'x' option of virt-install.

Version-Release number of selected component (if applicable):
python-virtinst-0.400.3-5.el5-noarch
libvirt-0.6.3-20.1.el5_4-x86_64
libvirt-python-0.6.3-20.1.el5_4-x86_64

How reproducible:
Consistently

Steps to Reproduce:
1. Run the following command:

virt-install --vnc -n s0301 -r 1024 --vcpus=1 -f /dev/VolGroup01/s0301-xvda -f /dev/VolGroup01/s0301-xvdb -f /dev/VolGroup01/s0301-xvdc -f /dev/VolGroup01/s0301-xvdd -b virbr0 -l http://gss-sat.gsslab.pnq.redhat.com/ks/dist/ks-rhel-i386-server-5-u1 -p -x "ks=http://gss-sat.gsslab.pnq.redhat.com/ks/cfg/org/1/label/linto-rhel5U1 ip=192.168.122.121 netmask=255.255.255.0 gateway=192.168.122.1 dns=192.168.122.1 t=abcd1234"

2. virt-install hangs as described in 'Actual results'.

Actual results:
virt-install hangs at the following stage:

Starting install...
Retrieving file vmlinuz... | 2.0 MB 00:00
Retrieving file initrd.img... | 5.0 MB 00:00

At this point, virt-install can be seen to consume almost 100% CPU.

Expected results:
Guest provisioning should continue without hanging, as follows:

Starting install...
Retrieving file vmlinuz... | 2.0 MB 00:00
Retrieving file initrd.img... | 5.0 MB 00:00
Creating domain... | 0 B 00:01

At this stage the virt-viewer screen pops up to continue with the guest installation. virt-install should run smoothly without consuming much CPU.

Additional info:
1. Guest installation works fine using virt-manager with the same options that showed the problem with virt-install.
2. The customer sees the issue when any arbitrary string of length 10 characters is specified in the 't' option. Guest provisioning through virt-install succeeds if the string specified in the 't' option is longer or shorter than 10 characters.
Red Hat Support has been able to reproduce it when any arbitrary string of length 8 characters is specified in the 't' option. However, the length of the string in the 't' option or the length of the string specified in the 'x' option of virt-install may not be relevant.
That looks weird, a priori not libvirt related since the equivalent is working fine with virt-manager, Daniel
Hmm, I can't really reproduce. I need a bit more info.

Is virt-install really what takes up 100% CPU? Please verify with top or similar that it's not libvirt or xen.

Can a reproducer try to simplify the command line? For example, are 4 block devices required to reproduce this? What about one block dev, or one flat file, or even using --nodisks? Is an explicit -b option required, or does omitting it and using the default xen bridge work fine?

Please also attach the full output of virt-install --debug of a failed run.
Created attachment 442210 [details] Output of virt-install --debug
Created attachment 442448 [details]
XML that reproduces issue

Okay, with Linto's help I've managed to reproduce on my own machine. This appears to be either a libvirt or xen issue. With the attached XML, virsh create $XMLFILE hangs. It doesn't even seem to require the kernel, initrd, or disk images to be present on the machine.
Here's the backtrace:

Thread 1 (Thread 0x2aaaac3ea610 (LWP 15918)):
#0  0x00002aaaabaad236 in _IO_default_xsputn (f=0x7fffffffb1d0, data=0x651410, n=1082) at genops.c:457
#1  0x00002aaaaba85503 in _IO_vfprintf_internal (s=0x7fffffffb1d0, format=<value optimized out>, ap=0x7fffffffb360) at vfprintf.c:1587
#2  0x00002aaaabb25f38 in ___vsnprintf_chk (s=0x648691 "", maxlen=<value optimized out>, flags=1, slen=<value optimized out>, format=0x2aaaaad52fd1 "%s", args=0x7fffffffb360) at vsnprintf_chk.c:65
#3  0x00002aaaaacdcfb6 in virBufferVSprintf (buf=0x7fffffffb470, format=0x2aaaaad52fd1 "%s") at buf.c:229
#4  0x00002aaaaad2bfe8 in xend_op_ext (xend=0x624b30, name=<value optimized out>, key=0x2aaaaad52ce8 "config") at xend_internal.c:521
#5  xend_op (xend=0x624b30, name=<value optimized out>, key=0x2aaaaad52ce8 "config") at xend_internal.c:567
#6  0x00002aaaaad2f74e in xenDaemonDomainCreateXML (xend=0x624b30, sexpr=0x648280 "(vm (name 's0301')(memory 1024)(maxmem 1024)(vcpus 1)(uuid 'b48911fb-b655-6958-4da6-496b5f5c1ef9')(on_poweroff 'destroy')(on_reboot 'destroy')(on_crash 'destroy')(image (linux (kernel '/var/lib/xen/vi"...) at xend_internal.c:968
#7  0x00002aaaaad2f875 in xenDaemonCreateXML (conn=0x624b30, xmlDesc=<value optimized out>, flags=<value optimized out>) at xend_internal.c:3986
#8  0x00002aaaaad28344 in xenUnifiedDomainCreateXML (conn=0x648691, xmlDesc=0x646270 "<domain type='xen'>\n <name>s0301</name>\n <currentMemory>1048576</currentMemory>\n <memory>1048576</memory>\n <uuid>b48911fb-b655-6958-4da6-496b5f5c1ef9</uuid>\n <os>\n <type arch='x86_64'>linux</t"..., flags=0) at xen_unified.c:579
#9  0x00002aaaaacf2eb3 in virDomainCreateXML (conn=0x624b30, xmlDesc=0x646270 "<domain type='xen'>\n <name>s0301</name>\n <currentMemory>1048576</currentMemory>\n <memory>1048576</memory>\n <uuid>b48911fb-b655-6958-4da6-496b5f5c1ef9</uuid>\n <os>\n <type arch='x86_64'>linux</t"..., flags=0) at libvirt.c:1582
#10 0x000000000041075a in cmdCreate (ctl=0x7fffffffc1a0, cmd=0x61f5c0) at virsh.c:941
#11 0x0000000000410bdc in vshCommandRun (ctl=0x7fffffffc1a0, cmd=0x61f5c0) at virsh.c:6556
#12 0x00000000004114c4 in main (argc=1, argv=0x7fffffffc308) at virsh.c:7509

(gdb) l buf.c:229
224             return;
225
226         size = buf->size - buf->use - 1;
227         va_start(argptr, format);
228         va_copy(locarg, argptr);
229         while (((count = vsnprintf(&buf->content[buf->use], size, format,
230                                    locarg)) < 0) || (count >= size - 1)) {
231             buf->content[buf->use] = 0;
232             va_end(locarg);
233
(gdb)
234             grow_size = (count > 1000) ? count : 1000;
235             if (virBufferGrow(buf, grow_size) < 0)
236                 return;
237
238             size = buf->size - buf->use - 1;
239             va_copy(locarg, argptr);
240         }
241         va_end(locarg);
242         buf->use += count;
243         buf->content[buf->use] = '\0';

So something wrong with the while() check that makes us spin forever?
Created attachment 442484 [details]
Fix off-by-one error that caused an infinite loop

The patch fixes things when applied against the 5.5 package. The reason this was so picky is that the generated Xen Sexpr had to be _exactly_ the right size to trigger this bug :)
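For readers following along, here is a rough sketch of why only particular lengths hang. It replays the retry loop listed at buf.c:229 in the backtrace comment using plain arithmetic instead of a real vsnprintf() call. The names model_grow and retry_passes, the SPIN_CAP limit, and the sizes used are all made up for illustration. The important assumption is that virBufferGrow() returns without resizing when the requested length already fits in the free space; its source is not quoted in this report, so treat that behaviour and the growth policy as guesses. The vsnprintf() error case (count < 0) is left out.

```c
#include <assert.h>
#include <stddef.h>

#define SPIN_CAP 1000   /* illustrative cap so the model cannot hang */

/* Hypothetical stand-in for virBufferGrow(). ASSUMPTION: it returns
 * without resizing when 'len' more bytes already fit (len + use < size). */
static size_t model_grow(size_t size, size_t use, size_t len)
{
    if (len + use < size)
        return size;             /* no-op: buffer considered big enough */
    return use + len + 1000;     /* assumed growth policy */
}

/* Replays the retry loop quoted from buf.c with plain arithmetic.
 * 'count' stands for vsnprintf()'s return value: the formatted length,
 * excluding the terminating NUL. Returns the number of passes through
 * the loop body, or -1 if it is still retrying after SPIN_CAP passes. */
static int retry_passes(size_t size, size_t use, size_t count)
{
    size_t avail;
    int passes = 0;

    assert(use + 2 <= size);     /* keep the unsigned subtractions safe */
    avail = size - use - 1;      /* mirrors line 226 of the listing */
    while (count >= avail - 1) { /* mirrors the check on line 230 */
        size_t grow_size = (count > 1000) ? count : 1000;

        if (++passes > SPIN_CAP)
            return -1;           /* treat as "spins forever" */
        size = model_grow(size, use, grow_size);
        avail = size - use - 1;  /* mirrors line 238 of the listing */
    }
    return passes;
}
```

Under these assumptions, with a 1184-byte buffer of which 100 bytes are used, retry_passes(1184, 100, 1082) and retry_passes(1184, 100, 1083) return -1: the loop condition asks for a retry, but the grow step sees enough room and changes nothing. A length of 1081 needs no retry and 1084 finishes after one grow, which fits the observation that slightly longer or shorter strings worked.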
Patch was sent upstream as https://www.redhat.com/archives/libvir-list/2010-September/msg00020.html
Sent to rhvirt-patches as http://post-office.corp.redhat.com/archives/rhvirt-patches/2010-September/msg00371.html (18af6f4e64e97095bc95df25fb4a092cbbd6474c upstream)
Fix built into libvirt-0.8.2-4.el5
Verified this bug with libvirt-0.8.2-8.el5 on RHEL5u6 Server X86_64 (KVM and Xenpv), RHEL5u6 Client i386 Xenpv and RHEL5u6 Server IA64 Xenpv, and PASSED.

Reproduced on old libvirt package, such as: libvirt-0.6.3-20.el5

# virt-install -n s0301 -r 1024 -f /dev/VolGroup01/s0301-xvda -f /dev/VolGroup01/s0301-xvdb -f /dev/VolGroup01/s0301-xvdc -f /dev/VolGroup01/s0301-xvdd -b virbr0 -l http://download.englab.nay.redhat.com/pub/rhel/released/RHEL-5-Server/U4/i386/os/ -p -x "ks=http://home.englab.nay.redhat.com/~nzhang/http/test.cfg ip=192.168.122.121 netmask=255.255.255.0 gateway=192.168.122.1 dns=192.168.122.1 t=abcd1234"
Starting install...
Retrieving file .treeinfo... | 437 B 00:00
Retrieving file vmlinuz... | 2.1 MB 00:01
Retrieving file initrd.img... | 6.6 MB 00:00
Creating domain...

The guest is installed successfully on libvirt-0.8.2-8.el5, so this bug is fixed.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHEA-2011-0060.html