Created attachment 832644 [details]
output of vdsClient -s 0 getVdsCaps
Description of problem:
adding cloud-init config to ubuntu 12.04 vm sometimes results in boot-loop
Version-Release number of selected component (if applicable):
vdsm version 4.12.1-4.el6.x86_64
Steps to Reproduce:
1. make an ubuntu server 12.04 x64 minimal install with just ssh in a vm on a local dc
2. shut down the vm either from inside or via webadmin
3. start the vm via "run once" through webadmin with the following options:
set hostname, fill in network details:
IP:10.0.1.22 Netmask: 255.255.255.252 Gateway: 10.0.1.21
Fill in 3 IPs for DNS-Servers
set a new root password
select an additional file to be placed under /root/myfile
encoded as plain text
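To rule out a typo in the network details above, the values can be checked for consistency. This is a minimal sketch using Python's standard ipaddress module; check_static_config is a hypothetical helper name and is not part of oVirt or cloud-init.

```python
import ipaddress


def check_static_config(ip, netmask, gateway):
    """Check that a static IP and its gateway fit the given netmask."""
    iface = ipaddress.ip_interface(f"{ip}/{netmask}")
    net = iface.network
    gw = ipaddress.ip_address(gateway)
    return {
        "network": str(net),
        "gateway_in_subnet": gw in net,
        # The network and broadcast addresses are not usable as host addresses.
        "ip_is_usable_host": iface.ip not in (net.network_address,
                                              net.broadcast_address),
    }


if __name__ == "__main__":
    # Values taken from the "run once" dialog described above.
    print(check_static_config("10.0.1.22", "255.255.255.252", "10.0.1.21"))
```

With the values from this report, the IP and the gateway are the two usable hosts of the same /30, so the network settings themselves look consistent.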
according to vdsm.log, the cd-rom.iso with the meta-data does not get correctly
transmitted via the export-iso domain, if I interpret the log correctly.
ubuntu 12.04 boots just fine with the cd attached
the vm hangs during boot, most times during the message:
fsck from util-linux 2.20.1
/dev/vda1: clean; 55941/1179648 files, 380701/4718336 blocks
it is accessible via VNC.
pay attention to the following line:
VM Channels Listener::DEBUG::2013-12-04 14:54:08,395::vmChannels::91::vds::(_handle_timeouts) Timeout on fileno 111.
to me this looks like a file can't get transferred? or is this message misleading?
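To see how often this timeout shows up in vdsm.log, the lines can be parsed with a small Python sketch. The field layout (thread, level, timestamp, module, line number, logger, function, message) is inferred only from the single line quoted above, so treat it as an assumption. parse_vdsm_line and channel_timeouts are hypothetical helper names, not part of vdsm. The sketch only counts occurrences and says nothing about what the message means.

```python
import re
from datetime import datetime

# Assumed layout: thread::LEVEL::timestamp::module::lineno::logger::(func) message
LINE_RE = re.compile(
    r"^(?P<thread>.+?)::(?P<level>[A-Z]+)::"
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})::"
    r"(?P<module>[^:]+)::(?P<lineno>\d+)::(?P<logger>[^:]+)::"
    r"\((?P<func>[^)]*)\)\s*(?P<msg>.*)$"
)


def parse_vdsm_line(line):
    """Return a dict of fields, or None if the line does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    rec = m.groupdict()
    rec["ts"] = datetime.strptime(rec["ts"], "%Y-%m-%d %H:%M:%S,%f")
    rec["lineno"] = int(rec["lineno"])
    return rec


def channel_timeouts(lines):
    """Yield (timestamp, fileno) for every channel timeout message."""
    for line in lines:
        rec = parse_vdsm_line(line)
        if rec and rec["func"] == "_handle_timeouts":
            m = re.search(r"Timeout on fileno (\d+)", rec["msg"])
            if m:
                yield rec["ts"], int(m.group(1))


if __name__ == "__main__":
    sample = ("VM Channels Listener::DEBUG::2013-12-04 14:54:08,395::"
              "vmChannels::91::vds::(_handle_timeouts) Timeout on fileno 111.")
    for ts, fileno in channel_timeouts([sample]):
        print(ts, fileno)
```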
here is the output of vdsClient -s 0 getVdsCaps in the cloudinit-vdsClient.log
Created attachment 832645 [details]
Created attachment 832647 [details]
output of:"vdsClient -s 0 list long vms test1234" (the vm)
I noticed "emulatedMachine = rhel6.4.0"
but in webadmin "Operating System" is set to "Ubuntu Precise Pangolin LTS"
I don't know if this matters?
I'm not 100% sure, but I think I can reproduce this with a clean fresh
ubuntu server 12.04.3 amd64 iso install
when installation completes and you reboot out of the setup, you end
up in the fsck during boot of ubuntu.
seems to have nothing to do with cloud-init afaik.
maybe the installed ubuntu does not release the cd-rom after reboot?
the installer says in the end, that you should remove the cd, but how
to do this in ovirt the right way? should I shutdown the vm through an external command?
I assume the following:
if I start a vm via "run once" and attach a cd in the run once dialog, then the system gets rebooted (from inside
or outside shouldn't matter), when the system reboots, the attached cd-rom
should get automatically removed from ovirt, shouldn't it?
I uploaded the virsh dumpxml from this new vm (vr00001)
and you can see, that the ubuntu.iso is still attached
furthermore the <os> <type machine='rhel6.4.0'> bothers me, as I selected
the same goes for:
Is this a bug?
Created attachment 832694 [details]
virsh dumpxml from new vm from scratch
I installed a new ubuntu 12.04 vm
I documented every action during the installation, so I can provide additional
details if needed.
The System hangs after reboot after installation.
The CD-ROM is still attached, according to the xml.
it hangs at "fsck" during boot process.
please set needinfo on a specific person.
Created attachment 832729 [details]
Screenshot of the fsck hang
This is how the vm was created through webadmin, step-by-step:
Based on Template: Blank (default)
Operating System: Ubuntu Precise Pangolin LTS
Optimized for: Server
Description: ubuntu12.04.3 grundimage001
Stateless: No (default) Start in Pause Mode: No (default) Delete Protection: Yes
Memory Size: 2048 MB
Total Virtual CPUs: 2
Cores per Virtual Socket: 1 (default)
Virtual Sockets: 2 (default)
Time Zone:default (GMT) (grayed out) (default)
Domain: (grayed out) (default)
VNC Keyboard Layout: de
USB Support: Disabled (grayed out) (default)
Monitors: 1 (grayed out) Single PCI: no (grayed out) (default)
Smartcard Enabled: no (grayed out) (default)
Disable strict user checking: yes (default)
Soundcard enabled: no (default)
VirtIO Console Device Enabled: no (default)
Highly Available: no (default)
Priority for Run/Migration queue: low (default)
Watchdog Model: empty (default)
Watchdog Action: none (default)
CPU Shares: Disabled (default)
CPU Pinning topology: (grayed out) (default)
Physical Memory Guaranteed: 1024 MB
Memory Balloon Device Enabled: yes (default)
Storage Allocation: (Available only when a template is selected)
Thin (grayed out)
Clone (grayed out)
VirtIO-SCSI Enabled: yes (default)
First Device: Hard Disk (default)
Second Device: [None] (default)
Attach CD: no (default)
Linux Boot Options:
kernel path: empty (default)
initrd path: empty (default)
kernel parameters: empty (default)
Configure Virtual Disks:
Attach Disk: no (default)
Internal or External(Direct Lun): Internal (default)
Description: empty (default)
Allocation Policy: Thin Provision (default)
Storage Domain: DataMasterLc1 (default) -> local storage DC
Wipe After Delete: (grayed out) (default)
Is Bootable: yes (default)
Is Shareable: no (default)
Attach Floppy: no (default)
Attach CD: yes ubuntu-12.04.3-server-amd64.iso
Run Stateless: no (default)
Start in Pause Mode: no (default)
All other options (linux boot options, initial run, host(any host in cluster), display protocol(vnc), custom properties) stay on default (empty if not noted otherwise)
Created attachment 833027 [details]
virsh dumpxml after reboot
I rebooted the new vm via webadmin ("shutdown" and then "run").
but as you can see, the cd-rom was still attached.
the boot still hangs at fsck.
I did another shutdown on the vm via webadmin and then booted it again
via "run once" but without specifying a cd-rom, now the system boots successfully.
The dumpxml differs in the following way:
diff -y --suppress-common-lines vr00001dumpxml_after_run_through_webadmin vr00001dumpxml_after_run_once_through_webadmin
<domain type='kvm' id='20'> | <domain type='kvm' id='21'>
<boot dev='hd'/> <
> <boot order='1'/>
> <boot order='2'/>
<graphics type='vnc' port='5904' autoport='yes' listen='0 | <graphics type='vnc' port='5904' autoport='yes' listen='0
<label>system_u:system_r:svirt_t:s0:c762,c886</label> | <label>system_u:system_r:svirt_t:s0:c279,c348</label>
<imagelabel>system_u:object_r:svirt_image_t:s0:c762,c886< | <imagelabel>system_u:object_r:svirt_image_t:s0:c279,c348<
so my conclusion so far:
a simple "run" does state <boot dev='hd'/>
but does not state any boot order.
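To compare dumps without reading the whole XML, the boot-related parts can be pulled out with a short Python sketch. It assumes the usual libvirt layout: <boot dev=.../> under <os>, <boot order=.../> inside individual devices, and CD-ROM drives as <disk device='cdrom'>. boot_summary is a hypothetical helper name, and the sample XML is made up for illustration, not copied from the attachments.

```python
import xml.etree.ElementTree as ET


def boot_summary(domain_xml):
    """Summarise boot-related settings from libvirt domain XML text."""
    root = ET.fromstring(domain_xml)

    os_type = root.find("./os/type")
    machine = os_type.get("machine") if os_type is not None else None

    # <boot dev='...'/> entries directly under <os>
    os_boot = [b.get("dev") for b in root.findall("./os/boot")]

    # <boot order='N'/> entries nested inside individual devices
    device_boot = []
    for dev in root.findall("./devices/*"):
        boot = dev.find("boot")
        if boot is not None and boot.get("order") is not None:
            device_boot.append((int(boot.get("order")), dev.tag, dev.get("device")))
    device_boot.sort(key=lambda entry: entry[0])

    # Image files currently backing any CD-ROM drive (None means empty drive)
    cdroms = []
    for disk in root.findall("./devices/disk[@device='cdrom']"):
        source = disk.find("source")
        cdroms.append(source.get("file") if source is not None else None)

    return {"machine": machine, "os_boot": os_boot,
            "device_boot": device_boot, "cdrom_sources": cdroms}


# Hypothetical, heavily trimmed domain XML; not copied from the attachments.
SAMPLE = """
<domain type='kvm'>
  <os>
    <type arch='x86_64' machine='rhel6.4.0'>hvm</type>
    <boot dev='hd'/>
  </os>
  <devices>
    <disk type='file' device='cdrom'>
      <source file='/path/to/ubuntu-12.04.3-server-amd64.iso'/>
      <target dev='hdc' bus='ide'/>
    </disk>
    <disk type='file' device='disk'>
      <source file='/path/to/disk.img'/>
      <target dev='vda' bus='virtio'/>
    </disk>
  </devices>
</domain>
"""

if __name__ == "__main__":
    print(boot_summary(SAMPLE))
```

Running it on the output of virsh dumpxml after a plain "run" and after a "run once" would show at a glance whether an iso is still attached and which boot style was used.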
I'll dig into the ubuntu logs. maybe it's an issue on the ubuntu side.
Created attachment 833042 [details]
ubuntu hangs on boot, after cloud-init cd attached via run once
this time, the same vm progressed a little further than "fsck".
but it may be that these messages got suppressed during the earlier boots.
I added "debug --verbose" to GRUB_CMDLINE_LINUX_DEFAULT="" in /etc/default/grub
and updated grub via update-grub (as root).
I started the vm via "run once" and submitted cloud-init configuration data.
I also enabled the virtio serial console to try to connect via virsh.
The connection gets started, but I can't do anything with it:
the terminal hangs after displaying:
Connected to domain vr00001
Escape character is ^]
Another Test with Ubuntu 13.10 in the same local storage DC does not show
If this should be kernel related, here are the kernels:
ubuntu 12.04.3 LTS: 3.8.0-29-generic #42~precise1-Ubuntu SMP x86_64 (does not work)
ubuntu 13.10: 3.11.0-14-generic #21-Ubuntu SMP x86_64 (does work)
After manually ejecting the cd via webadmin, the vm can be rebooted and starts
correctly(but without the needed cd!)
I added "console=ttyS0,38400n8" to GRUB_CMDLINE_LINUX_DEFAULT="" in /etc/default/grub
and updated grub via update-grub (as root).
This solved the problem for me.
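For anyone who has to apply this workaround to several templates, the edit can be scripted. The Python sketch below works on the text of /etc/default/grub and appends a parameter to GRUB_CMDLINE_LINUX_DEFAULT only if it is not already present. add_kernel_param is a hypothetical helper name, and the sketch assumes the variable is set on a single double-quoted line. update-grub still has to be run afterwards, as described above.

```python
import re

# Matches a single-line, double-quoted GRUB_CMDLINE_LINUX_DEFAULT assignment.
CMDLINE_RE = re.compile(
    r'^(GRUB_CMDLINE_LINUX_DEFAULT=")([^"]*)(")[ \t]*$', re.MULTILINE
)


def add_kernel_param(grub_text, param):
    """Return grub_text with param appended to the default kernel command line.

    The text is returned unchanged if param is already present.
    """
    def repl(match):
        params = match.group(2).split()
        if param not in params:
            params.append(param)
        return match.group(1) + " ".join(params) + match.group(3)

    new_text, count = CMDLINE_RE.subn(repl, grub_text)
    if count == 0:
        raise ValueError("GRUB_CMDLINE_LINUX_DEFAULT not found")
    return new_text


if __name__ == "__main__":
    original = 'GRUB_DEFAULT=0\nGRUB_CMDLINE_LINUX_DEFAULT=""\nGRUB_TIMEOUT=2\n'
    print(add_kernel_param(original, "console=ttyS0,38400n8"))
```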
Should oVirt handle the use of ttys in a different way to prevent this happening?
Well, not in a generic way. It's up to the distro what the default kernel command line is and where it redirects the console to. The serial console typically needs an explicit redirection (by console=ttySx), AFAIK for all serial consoles. It's completely different from "real" console access at QEMU display level via VNC or SPICE.
I'm curious, is the port speed specification needed?
Regarding port speed: I'm atm testing this.
If I don't add a reasonably high port speed, the system does not boot; instead the cpu usage in webadmin for this vm goes to 100%. So I'd say yes, it is indeed needed.
setting target release to current version for consideration and review. please do not push non-RFE bugs to an undefined target release to make sure bugs are reviewed for relevancy, fix, closure, etc.
Regarding target release 3.4.0:
I doubt that it will work in 3.4, since I'm not aware of anyone doing any
more testing or even bugfixing on this.
Also, there's still the problem that the attached "run once" iso is still
attached after a reboot.
In my opinion, iso should not get loaded when the vm reboots.
Should I create a separate BZ for this?
(In reply to Sven Kieske from comment #17)
> Also, there's still the problem that the attached "run once" iso is still
> attached after a reboot.
> In my opinion, iso should not get loaded when the vm reboots.
> Should I create a separate BZ for this?
The problem is that qemu doesn't know how to handle detaching the iso as part of a reboot, so either qemu needs this logic, or vdsm needs to detect the qemu reboot, detach the iso, and restart qemu. This requires a separate RFE.
I did file an RFE as BZ1054070
RFE from comment #19 should get into 3.6, the original issue is guest OS specific setting
Sven, I didn't understand... so you had the "serial console" option enabled all the time? Does it work ok without it? Were the CD issues related to comment #12?
(In reply to Michal Skrivanek from comment #20)
> RFE from comment #19 should get into 3.6, the original issue is guest OS
> specific setting
> Sven, I didn't understand... so you had the "serial console" option enabled all
> the time? Does it work ok without it? Was the CD issues related to comment
As per Comment #12:
No, I did test it without the serial console option,
which leads to the vm not booting successfully.
When I add this option it boots.
This is the case when you fire up a vm, cloned from a template
and start it the very first time with cloud-init metadata (which gets attached as a cd-rom).
It seems to be only the case for ubuntu 12.04 (at least centos6.5 and debian7 work okay, I didn't test anything else).
Please ask if you need more information.
I hope I can re-test this soon with a newer ovirt-engine version.
hm, weird. likely some kernel/virtio issue on that specific ubuntu version.
this low prio bug didn't make it for 3.6 beta cutoff, moving to 4.0
This is an automated message.
This Bugzilla report has been opened on a version which is not maintained anymore.
Please check if this bug is still relevant in oVirt 3.5.4.
If it's not relevant anymore, please close it (you may use EOL or CURRENT RELEASE resolution)
If it's an RFE please update the version to 4.0 if still relevant.
This is an automated message.
This Bugzilla report has been opened on a version which is not maintained
Please check if this bug is still relevant in oVirt 3.5.4 and reopen if still
I don't know if this still is an issue in 4.0.
Given that ubuntu 12.04 is also not the latest release anymore I'm fine with this being closed.