Bug 787023 - aeolus image build is having trouble w/ multiple targets
Summary: aeolus image build is having trouble w/ multiple targets
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: CloudForms Cloud Engine
Classification: Retired
Component: imagefactory
Version: 1.0.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: beta6
Assignee: Ian McLeod
QA Contact: Martin Kočí
URL:
Whiteboard:
Duplicates: 782076
Depends On:
Blocks:
Reported: 2012-02-02 22:49 UTC by dgao
Modified: 2012-05-15 20:21 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-05-15 20:21:47 UTC


Attachments
image_tmpl (742 bytes, application/octet-stream)
2012-02-02 22:49 UTC, dgao
failed_screenshot (6.04 KB, image/png)
2012-02-02 22:50 UTC, dgao


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHEA-2012:0588 | 0 | normal | SHIPPED_LIVE | new packages: imagefactory | 2012-05-15 22:31:27 UTC

Description dgao 2012-02-02 22:49:15 UTC
Building w/ --target ec2,rhevm,vsphere produces this traceback in imagefactory.log:

2012-02-02 16:09:13,157 DEBUG imgfac.builders.BaseBuilder.RHEL6_vsphere_Builder pid(5822) Message: Exception caught in ImageFactory
2012-02-02 16:09:14,443 DEBUG imgfac.builders.BaseBuilder.RHEL6_vsphere_Builder pid(5822) Message: Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/imgfac/builders/Fedora_vsphere_Builder.py", line 111, in build_upload
    libvirt_xml = self.guest.install(self.app_config["timeout"])
  File "/usr/lib/python2.6/site-packages/oz/Guest.py", line 1456, in install
    return self._do_install(timeout, force, 0)
  File "/usr/lib/python2.6/site-packages/oz/Guest.py", line 1438, in _do_install
    self._wait_for_install_finish(dom, timeout)
  File "/usr/lib/python2.6/site-packages/oz/Guest.py", line 546, in _wait_for_install_finish
    raise oz.OzException.OzException(exc_str)
OzException: No disk activity in 300 seconds, failing.  Check screenshot at /factory-build-fe16493d-2b67-418a-9472-b672d297b911-1328216923.95.png for more detail

2012-02-02 16:09:14,579 INFO oz.Guest.RHEL6Guest pid(5822) Message: Cleaning up guest named factory-build-fe16493d-2b67-418a-9472-b672d297b911
2012-02-02 16:09:22,672 DEBUG imgfac.BuildJob.BuildJob pid(5822) Message: Builder (fe16493d-2b67-418a-9472-b672d297b911) changed status from BUILDING to FAILED
2012-02-02 16:09:22,777 DEBUG imgfac.BuildJob.BuildJob pid(5822) Message: fe16493d-2b67-418a-9472-b672d297b911 for vsphere about to exit local queue...
2012-02-02 16:09:22,813 DEBUG imgfac.ReservationManager.ReservationManager pid(5822) Message: EXITING queue: (local)
2012-02-02 16:09:22,908 DEBUG imgfac.ReservationManager.ReservationManager pid(5822) Message: SUCCESS EXITING queue: (local)
2012-02-02 16:09:23,017 INFO oz.Guest.RHEL6Guest pid(5822) Message: Cleaning up after install
2012-02-02 16:09:23,051 DEBUG imgfac.builders.BaseBuilder.RHEL6_vsphere_Builder pid(5822) Message: Exception caught in ImageFactory
2012-02-02 16:09:23,051 DEBUG imgfac.builders.BaseBuilder.RHEL6_vsphere_Builder pid(5822) Message: Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/imgfac/builders/Fedora_vsphere_Builder.py", line 49, in build_image
    self.build_upload(build_id)
  File "/usr/lib/python2.6/site-packages/imgfac/builders/Fedora_vsphere_Builder.py", line 111, in build_upload
    libvirt_xml = self.guest.install(self.app_config["timeout"])
  File "/usr/lib/python2.6/site-packages/oz/Guest.py", line 1456, in install
    return self._do_install(timeout, force, 0)
  File "/usr/lib/python2.6/site-packages/oz/Guest.py", line 1438, in _do_install
    self._wait_for_install_finish(dom, timeout)
  File "/usr/lib/python2.6/site-packages/oz/Guest.py", line 546, in _wait_for_install_finish
    raise oz.OzException.OzException(exc_str)
OzException: No disk activity in 300 seconds, failing.  Check screenshot at /factory-build-fe16493d-2b67-418a-9472-b672d297b911-1328216923.95.png for more detail
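When triaging a log like the one above, it can help to pull out just the Oz timeout failures together with the build ID and screenshot path. The following is a minimal sketch, not part of imagefactory or Oz: the function name find_disk_timeouts and both regular expressions are assumptions modelled only on the message format shown in this report.

```python
import re

# Assumed pattern, modelled on the OzException message quoted in this report.
TIMEOUT_RE = re.compile(
    r"OzException: No disk activity in (?P<seconds>\d+) seconds, failing\.\s+"
    r"Check screenshot at (?P<screenshot>\S+) for more detail"
)
# Build IDs in the log appear as a UUID prefixed with "factory-build-".
BUILD_RE = re.compile(
    r"factory-build-(?P<uuid>[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})"
)


def find_disk_timeouts(lines):
    """Return one dict per distinct (build ID, screenshot) timeout found."""
    seen = set()
    results = []
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if not match:
            continue
        screenshot = match.group("screenshot")
        build = BUILD_RE.search(screenshot)
        build_id = build.group("uuid") if build else None
        key = (build_id, screenshot)
        if key in seen:
            continue  # the same failure is logged once per traceback
        seen.add(key)
        results.append(
            {
                "seconds": int(match.group("seconds")),
                "build_id": build_id,
                "screenshot": screenshot,
            }
        )
    return results
```

Passing the lines of imagefactory.log to find_disk_timeouts would collapse the two tracebacks above into a single entry for build fe16493d-2b67-418a-9472-b672d297b911.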

Comment 1 dgao 2012-02-02 22:49:33 UTC
Created attachment 559157
image_tmpl

Comment 2 dgao 2012-02-02 22:50:38 UTC
Created attachment 559158
failed_screenshot

Comment 3 Dave Johnson 2012-02-03 00:45:58 UTC
Just curious about the rpm versions... I installed from the nightly and was not able to reproduce this. I had already tried with a template of my own when I first saw this report. I also tried the template from attachment 1 on qeblade28, with the same result: no failure.

Comment 4 Martin Kočí 2012-02-03 07:38:42 UTC
There is a similar open bug, 782076, which is still under investigation.
https://bugzilla.redhat.com/show_bug.cgi?id=782076

Comment 5 wes hayutin 2012-02-03 13:39:24 UTC
dgao is going to try a recreate using <iso> instead of <url> .. that should help eliminate any possible networking issues during the build..  It does not seem to me that 782076 and this bug are related.

Comment 6 Dave Johnson 2012-02-03 14:28:27 UTC
Re-reading comment 3, I just want to clarify because it sounds a little ambiguous... I was unable to reproduce this on qeblade28 with either dgao's template or one of my own.

Comment 7 Martin Kočí 2012-02-03 15:16:26 UTC
(In reply to comment #5)
> dgao is going to try a recreate using <iso> instead of <url> .. that should
> help eliminate any possible networking issues during the build..  It does not
> seem to me that 782076 and this bug are related.

Just a note that when I had this issue, I was able to reproduce it even with an .iso installation.

Comment 8 dgao 2012-02-03 15:43:04 UTC
Reproduced w/ <iso>. Seems related to bug 782076. Template I used:


<template>
  <name>RHEL6_2 configserver</name>
  <os>
    <name>RHEL-6</name>
    <version>2</version>
    <arch>x86_64</arch>
    <install type='iso'>
      <iso>http://download.lab.bos.redhat.com/released/RHEL-6/6.2/Server/x86_64/iso/RHEL6.2-20111117.0-Server-x86_64-DVD1.iso</iso>
    </install>
    <rootpw>dog8code</rootpw>
  </os>
  <repositories>
    <repository name="rhel">
      <url>http://download.lab.bos.redhat.com/released//RHEL-6/6.2/Server/x86_64/os/</url>
    </repository>
    <repository name="aeolus">
      <url>http://repos.fedorapeople.org/repos/aeolus/conductor/testing/6Server/x86_64/</url>
    </repository>
  </repositories>
  <packages>
    <package name="aeolus-audrey-agent"/>
  </packages>
  <description>RHEL 6.2 w/ Audrey Client for all (bos-iso)</description>
</template>
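Since the discussion turns on whether a template installs from <iso> or <url>, a quick way to confirm what a given template actually uses is to parse it. This is a minimal sketch using only the Python standard library; the function name summarize_template is hypothetical, and it assumes the install source element is named after the install type, as in the template above.

```python
import xml.etree.ElementTree as ET


def summarize_template(tdl_text):
    """Summarize the install source, repositories and packages of a template."""
    root = ET.fromstring(tdl_text)
    install = root.find("os/install")
    install_type = install.get("type") if install is not None else None
    # Assumption: the child element is named after the type (<iso> or <url>).
    source = None
    if install is not None and install_type:
        source = install.findtext(install_type)
    repos = {
        repo.get("name"): (repo.findtext("url") or "").strip()
        for repo in root.findall("repositories/repository")
    }
    packages = [pkg.get("name") for pkg in root.findall("packages/package")]
    return {
        "name": root.findtext("name"),
        "install_type": install_type,
        "install_source": source,
        "repositories": repos,
        "packages": packages,
    }
```

Run against the template above, the summary would report an install type of iso, two repositories (rhel and aeolus), and one package (aeolus-audrey-agent).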

Comment 9 Martin Kočí 2012-02-07 10:05:30 UTC
Just FYI: to reproduce the issue you do not need to build with multiple targets. Running just # oz-install -d 4 template.tdl reproduces it as well. Restarting libvirtd or even rebooting the whole server does not help; the issue is still reproducible. Only after I re-provisioned the server did Oz work as it should.
The multiple-target build might have triggered this odd behaviour, but that is not 100% certain. It is currently reproducible on qeblade21. You will need to contact segfault for server access (I presume he is the owner of this server).
I will try to reproduce it on qeblade22 and then give you access there.
Closing bug https://bugzilla.redhat.com/show_bug.cgi?id=782076 as a duplicate of this bug.

Comment 10 Martin Kočí 2012-02-07 10:06:03 UTC
*** Bug 782076 has been marked as a duplicate of this bug. ***

Comment 11 Ian McLeod 2012-02-07 15:21:49 UTC
Martin,

The specific error in your log indicates that Oz detected 5 minutes (300 seconds) without any disk activity on the part of the installer.  This normally indicates that the install has failed due to an error and that error is normally visible on the screen capture that Oz provides as part of its error output.

However, the screenshot indicates that Anaconda was in the middle of an active install when this error occurred.  That's unusual.

You're suggesting it is repeatable on qeblade21, yes?  Does every build fail in this way or is it intermittent?
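The inactivity check described above can be illustrated with a small watchdog loop. This is a sketch of the general technique only, not the code in oz/Guest.py: the names wait_for_finish, read_counter and is_finished are hypothetical, and the counter is assumed to be any cumulative disk I/O figure that changes whenever the installer writes. It is written for Python 3.

```python
import time


class NoDiskActivity(Exception):
    """Raised when the disk counter stays unchanged for the whole timeout."""


def wait_for_finish(read_counter, is_finished, timeout=300, poll=1,
                    clock=time.monotonic, sleep=time.sleep):
    """Poll until is_finished() is true, failing after `timeout` idle seconds.

    read_counter: hypothetical callable returning a cumulative disk I/O count.
    clock/sleep are injectable so the loop can be exercised without waiting.
    """
    last_value = read_counter()
    last_change = clock()
    while not is_finished():
        sleep(poll)
        value = read_counter()
        now = clock()
        if value != last_value:
            # Any change counts as activity and resets the idle window.
            last_value = value
            last_change = now
        elif now - last_change >= timeout:
            raise NoDiskActivity(
                "No disk activity in %d seconds, failing." % timeout
            )
```

A check of this shape cannot tell a crashed installer from one that is alive but blocked (for example, waiting on the network), which fits the observation that the screenshot showed Anaconda apparently mid-install.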

Comment 12 Martin Kočí 2012-02-07 15:38:25 UTC
Yes, it was repeatable on qeblade21 and every build failed in this way. Let me ask segfault if qeblade21 is still available.

Comment 13 Martin Kočí 2012-02-08 15:24:52 UTC
At this point it looks like this is all network related. Something about the KVM setup seems to be preventing Anaconda from getting an IP address (the screenshots show Anaconda stuck at "configure network"). However, in the initial report the failure was reproduced in the middle of the installation process. I am therefore moving this bug to the ON_QA state (so that it is not tracked by dev) while we try to reproduce the issue somewhere other than the "configure network" step.
If we reproduce the issue, we will move the bug back to the ASSIGNED/NEW state.

Comment 14 dgao 2012-02-22 16:24:58 UTC
Cannot be reproduced w/ the current puddle build of imagefactory. Closing...

[root@dell-pem600-01 ~]# rpm -qa | grep "imagefactory"
imagefactory-1.0.0rc5-1.el6.noarch
imagefactory-jeosconf-ec2-fedora-1.0.0rc5-1.el6.noarch
rubygem-imagefactory-console-0.4.0-1.el6.noarch
imagefactory-jeosconf-ec2-rhel-1.0.0rc5-1.el6.noarch

Comment 15 errata-xmlrpc 2012-05-15 20:21:47 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHEA-2012-0588.html

