Created attachment 832644 [details]
output of vdsClient -s 0 getVdsCaps

Description of problem:
adding a cloud-init config to an ubuntu 12.04 vm sometimes results in a boot loop

Version-Release number of selected component (if applicable):
ovirt-engine 3.3.1-2
vdsm version 4.12.1-4.el6.x86_64

How reproducible:

Steps to Reproduce:
1. make an ubuntu server 12.04 x64 minimal install with just ssh in a vm on a local dc
2. shut down the vm, either from inside or via webadmin
3. start the vm via "run once" through webadmin with the following options:
   - set hostname
   - fill in network details: IP: 10.0.1.22, Netmask: 255.255.255.252, Gateway: 10.0.1.21
   - fill in 3 IPs for DNS servers
   - set a new root password
   - select an additional file to be placed under /root/myfile, encoded as plain text

Actual results:
according to vdsm.log, the cd-rom.iso with the meta-data does not get correctly transmitted via the export-iso domain, if I interpret the log correctly.

Expected results:
ubuntu 12.04 boots just fine with the cd attached

Additional info:
the vm hangs during boot, most times at the message:
fsck from util-linux 2.20.1
/dev/Vda1: clean; 55941/1179648 files, 380701/4718336 blocks
it is accessible via VNC.
pay attention to the following line:
VM Channels Listener::DEBUG::2013-12-04 14:54:08,395::vmChannels::91::vds::(_handle_timeouts) Timeout on fileno 111.
to me this looks like a file can't get transferred? or is this message misleading?
the output of vdsClient -s 0 getVdsCaps is in the attached cloudinit-vdsClient.log
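To rule out a plain addressing mistake in the "run once" network details, the values from step 3 can be checked with a few lines of Python. This is only a sanity check of the numbers given in the report, not something oVirt or cloud-init runs; the variable names are made up for illustration.

```python
import ipaddress

# Values copied from the "run once" dialog described in step 3.
iface = ipaddress.ip_interface("10.0.1.22/255.255.255.252")
gateway = ipaddress.ip_address("10.0.1.21")

# A 255.255.255.252 netmask is a /30, which leaves two usable host addresses.
network = iface.network
hosts = list(network.hosts())

print("network:", network)
print("usable hosts:", [str(h) for h in hosts])
print("gateway in same subnet:", gateway in network)
```

With these values the VM address and the gateway fall into the same /30, so the network details themselves look consistent.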
Created attachment 832645 [details] vdsm.log (excerpt)
Created attachment 832647 [details]
output of: "vdsClient -s 0 list long vms test1234" (the vm)

I noticed "emulatedMachine = rhel6.4.0", but in webadmin "Operating System" is set to "Ubuntu Precise Pangolin LTS". I don't know if this matters?
I'm not 100% sure, but I think I can reproduce this with a clean, fresh install from the ubuntu server 12.04.3 amd64 iso: when the installation completes and you reboot out of the setup, you end up in the fsck during boot of ubuntu. seems to have nothing to do with cloud-init afaik.

maybe the installed ubuntu does not release the cd-rom after reboot? the installer says at the end that you should remove the cd, but how do you do this the right way in ovirt? should I shut down the vm through an external command?

I assume the following: if I start a vm via "run once" and attach a cd in the run once dialog, and the system then gets rebooted (from inside or outside shouldn't matter), the attached cd-rom should get removed automatically by ovirt, shouldn't it?

I uploaded the virsh dumpxml from this new vm (vr00001) and you can see that the ubuntu.iso is still attached.

furthermore, the <os> <type machine='rhel6.4.0'> bothers me, as I selected Ubuntu. the same goes for:
<system> .. <entry name='version'>6-4.el6.centos.10</entry> .. </system>

Is this a bug?
Created attachment 832694 [details]
virsh dumpxml from new vm from scratch

I installed a new ubuntu 12.04 vm. I documented every action during the installation, so I can provide additional details if needed. The system hangs after the reboot that follows the installation. The CD-ROM is still attached, according to the xml. It hangs at "fsck" during the boot process.
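For anyone who wants to check their own dump for the same two symptoms (a cd-rom image still attached, and the machine type), here is a minimal sketch that reads libvirt domain XML with Python's standard library. The sample XML is a hypothetical, trimmed-down stand-in and the file paths in it are invented; the real dump is in the attachment. The function names are also just for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed domain XML. The paths are placeholders,
# not the ones from the attached dump.
SAMPLE_XML = """
<domain type='kvm'>
  <name>vr00001</name>
  <os>
    <type arch='x86_64' machine='rhel6.4.0'>hvm</type>
  </os>
  <devices>
    <disk type='file' device='cdrom'>
      <source file='/hypothetical/path/ubuntu-12.04.3-server-amd64.iso'/>
      <target dev='hdc' bus='ide'/>
    </disk>
    <disk type='file' device='disk'>
      <source file='/hypothetical/path/vr00001_Disk1'/>
      <target dev='vda' bus='virtio'/>
    </disk>
  </devices>
</domain>
"""


def attached_cdrom_images(domain_xml):
    """Return the image paths of all cd-rom drives that have media loaded."""
    root = ET.fromstring(domain_xml.strip())
    images = []
    for disk in root.findall("./devices/disk[@device='cdrom']"):
        source = disk.find("source")
        # An empty drive has no <source> element (or no file attribute).
        if source is not None and source.get("file"):
            images.append(source.get("file"))
    return images


def machine_type(domain_xml):
    """Return the machine attribute of <os><type>, or None if absent."""
    root = ET.fromstring(domain_xml.strip())
    os_type = root.find("./os/type")
    return None if os_type is None else os_type.get("machine")


print(attached_cdrom_images(SAMPLE_XML))
print(machine_type(SAMPLE_XML))
```

To use it on a real VM, the output of "virsh dumpxml" for the domain can be saved to a file and its contents passed in instead of SAMPLE_XML.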
please set needinfo on a specific person.
Created attachment 832729 [details] Screenshot of the fsck hang Screenshot of the fsck hang
This is how the vm was created through webadmin, step-by-step:

General:
  Cluster: localcluster1/localcenter1
  Based on Template: Blank (default)
  Operating System: Ubuntu Precise Pangolin LTS
  Optimized for: Server
  Name: vr00001
  Description: ubuntu12.04.3 grundimage001
  Comment: (default)
  Stateless: No (default)
  Start in Pause Mode: No (default)
  Delete Protection: Yes
  nic1: datanet15/datanet15
System:
  Memory Size: 2048 MB
  Total Virtual CPUs: 2
  Cores per Virtual Socket: 1 (default)
  Virtual Sockets: 2 (default)
Initial Run:
  General:
    Time Zone: default (GMT) (grayed out) (default)
  Windows:
    Domain: (grayed out) (default)
Console:
  Protocol: VNC
  VNC Keyboard Layout: de
  USB Support: Disabled (grayed out) (default)
  Monitors: 1 (grayed out)
  Single PCI: no (grayed out) (default)
  Smartcard Enabled: no (grayed out) (default)
  Disable strict user checking: yes (default)
  Soundcard enabled: no (default)
  VirtIO Console Device Enabled: no (default)
High Availability:
  Highly Available: no (default)
  Priority for Run/Migration queue: low (default)
  Watchdog:
    Watchdog Model: empty (default)
    Watchdog Action: none (default)
Resource Allocation:
  CPU Allocation:
    CPU Shares: Disabled (default)
    CPU Pinning topology: (grayed out) (default)
  Memory Allocation:
    Physical Memory Guaranteed: 1024 MB
    Memory Balloon Device Enabled: yes (default)
  Storage Allocation: (Available only when a template is selected)
    Template Provisioning: Thin (grayed out) Clone (grayed out)
    VirtIO-SCSI Enabled: yes (default)
Boot Options:
  Boot Sequence:
    First Device: Hard Disk (default)
    Second Device: [None] (default)
    Attach CD: no (default)
  Linux Boot Options:
    kernel path: empty (default)
    initrd path: empty (default)
    kernel parameters: empty (default)
Custom Properties: none (default)
[OK]

Configure Virtual Disks:
  Attach Disk: no (default)
  Internal or External(Direct Lun): Internal (default)
  Size(GB): 50
  Alias: vr00001_Disk1
  Description: empty (default)
  Interface: VirtIO
  Allocation Policy: Thin Provision (default)
  Storage Domain: DataMasterLc1 (default) -> local storage DC
  Wipe After Delete: (grayed out) (default)
  Is Bootable: yes (default)
  Is Shareable: no (default)
[OK]
[Configure Later]

Run Once:
  Boot Options:
    Attach Floppy: no (default)
    Attach CD: yes, ubuntu-12.04.3-server-amd64.iso
    Boot Sequence: CD-ROM, Hard Disk, Network (PXE)
  Run Stateless: no (default)
  Start in Pause Mode: no (default)
  All other options (linux boot options, initial run, host (any host in cluster), display protocol (vnc), custom properties) stay on default (empty if not noted otherwise)
[OK]
Created attachment 833027 [details]
virsh dumpxml after reboot

I rebooted the new vm via webadmin ("shutdown" and then "run"), but as you can see, the cd-rom was still attached. the boot still hangs at fsck.
I did another shutdown on the vm via webadmin and then booted it again via "run once", but without specifying a cd-rom; now the system boots successfully.

The dumpxml differs in the following way:

diff -y --suppress-common-lines vr00001dumpxml_after_run_through_webadmin vr00001dumpxml_after_run_once_through_webadmin
<domain type='kvm' id='20'> | <domain type='kvm' id='21'>
<boot dev='hd'/> <
> <boot order='1'/>
> <boot order='2'/>
<graphics type='vnc' port='5904' autoport='yes' listen='0 | <graphics type='vnc' port='5904' autoport='yes' listen='0
<label>system_u:system_r:svirt_t:s0:c762,c886</label> | <label>system_u:system_r:svirt_t:s0:c279,c348</label>
<imagelabel>system_u:object_r:svirt_image_t:s0:c762,c886< | <imagelabel>system_u:object_r:svirt_image_t:s0:c279,c348<

so my conclusion so far: a simple "run" does state "<boot dev='hd'/>" but does not state any per-device boot order, while "run once" states boot orders 1 and 2 instead.

I'll dig into the ubuntu logs. maybe it's an issue on the ubuntu side.
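The two boot styles in that diff live in different places in libvirt domain XML: "<boot dev='hd'/>" sits inside <os>, while "<boot order='N'/>" sits inside an individual device. A minimal sketch for pulling both out of a dump, so two dumps can be compared without diffing by eye, could look like this. The two sample documents are hypothetical reductions; in particular, which devices carried order 1 and 2 is an assumption here, since the diff above does not show it.

```python
import xml.etree.ElementTree as ET

# Hypothetical reductions of the two dumps. Which devices got boot order
# 1 and 2 in the real "run once" dump is NOT visible in the diff; the
# disk/interface assignment below is only an assumption for the example.
RUN_XML = """
<domain type='kvm'>
  <os><boot dev='hd'/></os>
  <devices>
    <disk type='file' device='disk'><target dev='vda' bus='virtio'/></disk>
  </devices>
</domain>
"""

RUN_ONCE_XML = """
<domain type='kvm'>
  <os/>
  <devices>
    <disk type='file' device='disk'>
      <target dev='vda' bus='virtio'/>
      <boot order='1'/>
    </disk>
    <interface type='bridge'>
      <target dev='vnet0'/>
      <boot order='2'/>
    </interface>
  </devices>
</domain>
"""


def boot_config(domain_xml):
    """Collect <os>-level boot devices and per-device boot orders."""
    root = ET.fromstring(domain_xml.strip())
    os_boot = [b.get("dev") for b in root.findall("./os/boot")]
    per_device = {}
    devices = root.find("devices")
    if devices is not None:
        for dev in devices:
            boot = dev.find("boot")
            if boot is None:
                continue
            # Name the device by its <target dev=...>, else by its tag.
            target = dev.find("target")
            name = target.get("dev") if target is not None else dev.tag
            per_device[name] = int(boot.get("order"))
    return {"os_boot": os_boot, "per_device": per_device}


print(boot_config(RUN_XML))
print(boot_config(RUN_ONCE_XML))
```

Running it on both real dumps should show which device each boot order belongs to, which the suppressed diff output hides.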
Created attachment 833042 [details]
ubuntu hangs on boot, after cloud-init cd attached via run once

this time, the same vm progressed a little further than "fsck", but it may be that these messages got suppressed during the earlier boots. I added "debug --verbose" to GRUB_CMDLINE_LINUX_DEFAULT="" in /etc/default/grub and updated grub via update-grub (as root). I started the vm via "run once" and submitted cloud-init configuration data. I also enabled the virtio serial console to try to connect via virsh. The connection gets started, but I can't do anything with it: the terminal hangs after displaying:
Connected to domain vr00001
Escape character is ^]
Another test with Ubuntu 13.10 in the same local storage DC does not show this behaviour. If this should be kernel related, here are the kernels:
ubuntu 12.04.3 LTS: 3.8.0-29-generic #42~precise1-Ubuntu SMP x86_64 (does not work)
ubuntu 13.10: 3.11.0-14-generic #21-Ubuntu SMP x86_64 (does work)

After manually ejecting the cd via webadmin, the vm can be rebooted and starts correctly (but without the needed cd!)
I added "console=ttyS0,38400n8" to GRUB_CMDLINE_LINUX_DEFAULT="" in /etc/default/grub and updated grub via update-grub (as root). This solved the problem for me. Should oVirt handle the use of ttys in a different way to prevent this happening?
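For preparing a template with this workaround, the edit to /etc/default/grub can be scripted. Below is a minimal sketch that works on the file contents as a string; the function name add_kernel_param and the sample file contents are made up for illustration, and it assumes the variable is written on one line with double quotes, as in a stock Ubuntu file. update-grub still has to be run afterwards, as described above.

```python
import re


def add_kernel_param(grub_defaults, param):
    """Append param to GRUB_CMDLINE_LINUX_DEFAULT unless it is already there.

    Assumes the variable is on a single line and uses double quotes.
    """
    pattern = re.compile(
        r'^(GRUB_CMDLINE_LINUX_DEFAULT=")([^"]*)(")', re.MULTILINE
    )

    def repl(match):
        params = match.group(2).split()
        if param not in params:
            params.append(param)
        return match.group(1) + " ".join(params) + match.group(3)

    new_text, count = pattern.subn(repl, grub_defaults)
    if count == 0:
        raise ValueError("GRUB_CMDLINE_LINUX_DEFAULT not found")
    return new_text


# Hypothetical minimal file contents for the example.
before = 'GRUB_DEFAULT=0\nGRUB_CMDLINE_LINUX_DEFAULT=""\nGRUB_CMDLINE_LINUX=""\n'
after = add_kernel_param(before, "console=ttyS0,38400n8")
print(after)
```

The function leaves GRUB_CMDLINE_LINUX untouched and is safe to run twice, since an already present parameter is not added again.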
well, not in a generic way, it's up to the distro what is the default kernel line and where does it redirect console to. The serial console typically needs an explicit redirection (by console=ttySx) AFAIK for all serial consoles. It's completely different from "real" console access at QEMU display level via VNC or SPICE. I'm curious, is the port speed specification needed?
Regarding port speed: I'm testing this atm. If I don't add a reasonably high port speed, the system does not boot; instead the cpu usage in webadmin for this vm goes to 100%. so I'd say yes, it is indeed needed.
setting target release to current version for consideration and review. please do not push non-RFE bugs to an undefined target release to make sure bugs are reviewed for relevancy, fix, closure, etc.
Regarding target release 3.4.0: I doubt that it will work in 3.4, since I'm not aware of anyone doing any more testing or even bugfixing on this.
Also, there's still the problem that the attached "run once" iso is still attached after a reboot. In my opinion, iso should not get loaded when the vm reboots. Should I create a separate BZ for this?
(In reply to Sven Kieske from comment #17)
> Also, there's still the problem that the attached "run once" iso is still
> attached after a reboot.
>
> In my opinion, iso should not get loaded when the vm reboots.
>
> Should I create a separate BZ for this?

the problem is qemu doesn't know how to handle detaching the iso as part of a reboot, so either qemu needs this logic, or vdsm needs to detect the qemu reboot, detach the iso, and re-start qemu. this requires a separate RFE.
I did file an RFE as BZ1054070
RFE from comment #19 should get into 3.6; the original issue is a guest OS specific setting.

Sven, I didn't understand.. so you had the "serial console" option enabled all the time? Does it work ok without it? Was the CD issue related to comment #12?
(In reply to Michal Skrivanek from comment #20)
> RFE from comment #19 should get into 3.6, the original issue is guest OS
> specific setting
>
> Sven, I did't understand..so you had the "serial console" option enabled all
> the time? Does it work ok without it? Was the CD issues related to comment
> #12?

As per Comment #12: No, I did test it without the serial console option, which leads to the vm not booting successfully. When I add this option, it boots.

This is the case when you fire up a vm cloned from a template and start it the very first time with cloud-init metadata (which gets attached as a cd-rom). It seems to be the case only for ubuntu 12.04 (at least centos6.5 and debian7 work okay, I didn't test anything else).

Please ask if you need more information. I hope I can re-test this soon with a newer ovirt-engine version.
hm, weird. likely some kernel/virtio issue on that specific ubuntu version.
this low prio bug didn't make it for 3.6 beta cutoff, moving to 4.0
This is an automated message. This Bugzilla report has been opened on a version which is not maintained anymore. Please check if this bug is still relevant in oVirt 3.5.4. If it's not relevant anymore, please close it (you may use EOL or CURRENT RELEASE resolution) If it's an RFE please update the version to 4.0 if still relevant.
This is an automated message. This Bugzilla report has been opened on a version which is not maintained anymore. Please check if this bug is still relevant in oVirt 3.5.4 and reopen if still an issue.
I don't know if this still is an issue in 4.0. Given that ubuntu 12.04 is also not the latest release anymore I'm fine with this being closed.