Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 787023

Summary:

aeolus image build is having trouble w/ multiple targets

Product:

[Retired] CloudForms Cloud Engine

Reporter:

dgao

Component:

imagefactory

Assignee:

Ian McLeod <imcleod>

Status:

CLOSED ERRATA

QA Contact:

Martin Kočí <mkoci>

Severity:

high

Docs Contact:

Priority:

unspecified

Version:

1.0.0

CC:

akarol, brad, dajohnso, deltacloud-maint, dgao, hbrock, imcleod, mkoci, morazi, ssachdev, whayutin

Target Milestone:

beta6

Target Release:

---

Hardware:

Unspecified

OS:

Unspecified

Whiteboard:

Fixed In Version:

Doc Type:

Bug Fix

Doc Text:

Story Points:

---

Clone Of:

Environment:

Last Closed:

2012-05-15 20:21:47 UTC

Attachments:

Description	Flags
image_tmpl	none
failed_screenshot	none

Description dgao 2012-02-02 22:49:15 UTC

Building w/ --target ec2,rhevm,vsphere produces this traceback in imagefactory.log:

2012-02-02 16:09:13,157 DEBUG imgfac.builders.BaseBuilder.RHEL6_vsphere_Builder pid(5822) Message: Exception caught in ImageFactory
2012-02-02 16:09:14,443 DEBUG imgfac.builders.BaseBuilder.RHEL6_vsphere_Builder pid(5822) Message: Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/imgfac/builders/Fedora_vsphere_Builder.py", line 111, in build_upload
    libvirt_xml = self.guest.install(self.app_config["timeout"])
  File "/usr/lib/python2.6/site-packages/oz/Guest.py", line 1456, in install
    return self._do_install(timeout, force, 0)
  File "/usr/lib/python2.6/site-packages/oz/Guest.py", line 1438, in _do_install
    self._wait_for_install_finish(dom, timeout)
  File "/usr/lib/python2.6/site-packages/oz/Guest.py", line 546, in _wait_for_install_finish
    raise oz.OzException.OzException(exc_str)
OzException: No disk activity in 300 seconds, failing.  Check screenshot at /factory-build-fe16493d-2b67-418a-9472-b672d297b911-1328216923.95.png for more detail

2012-02-02 16:09:14,579 INFO oz.Guest.RHEL6Guest pid(5822) Message: Cleaning up guest named factory-build-fe16493d-2b67-418a-9472-b672d297b911
2012-02-02 16:09:22,672 DEBUG imgfac.BuildJob.BuildJob pid(5822) Message: Builder (fe16493d-2b67-418a-9472-b672d297b911) changed status from BUILDING to FAILED
2012-02-02 16:09:22,777 DEBUG imgfac.BuildJob.BuildJob pid(5822) Message: fe16493d-2b67-418a-9472-b672d297b911 for vsphere about to exit local queue...
2012-02-02 16:09:22,813 DEBUG imgfac.ReservationManager.ReservationManager pid(5822) Message: EXITING queue: (local)
2012-02-02 16:09:22,908 DEBUG imgfac.ReservationManager.ReservationManager pid(5822) Message: SUCCESS EXITING queue: (local)
2012-02-02 16:09:23,017 INFO oz.Guest.RHEL6Guest pid(5822) Message: Cleaning up after install
2012-02-02 16:09:23,051 DEBUG imgfac.builders.BaseBuilder.RHEL6_vsphere_Builder pid(5822) Message: Exception caught in ImageFactory
2012-02-02 16:09:23,051 DEBUG imgfac.builders.BaseBuilder.RHEL6_vsphere_Builder pid(5822) Message: Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/imgfac/builders/Fedora_vsphere_Builder.py", line 49, in build_image
    self.build_upload(build_id)
  File "/usr/lib/python2.6/site-packages/imgfac/builders/Fedora_vsphere_Builder.py", line 111, in build_upload
    libvirt_xml = self.guest.install(self.app_config["timeout"])
  File "/usr/lib/python2.6/site-packages/oz/Guest.py", line 1456, in install
    return self._do_install(timeout, force, 0)
  File "/usr/lib/python2.6/site-packages/oz/Guest.py", line 1438, in _do_install
    self._wait_for_install_finish(dom, timeout)
  File "/usr/lib/python2.6/site-packages/oz/Guest.py", line 546, in _wait_for_install_finish
    raise oz.OzException.OzException(exc_str)
OzException: No disk activity in 300 seconds, failing.  Check screenshot at /factory-build-fe16493d-2b67-418a-9472-b672d297b911-1328216923.95.png for more detail
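
For anyone triaging a similar imagefactory.log, the excerpt above can be boiled down to builder status changes plus the distinct OzException messages. This is a minimal sketch in Python 3, assuming every log line follows the "timestamp LEVEL logger pid(N) Message: ..." layout seen above. The function summarize_log and both regular expressions are hypothetical helpers, not part of imagefactory or Oz.

```python
import re

# Hypothetical helper; the layout is inferred from the log excerpt in this report.
LOG_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<level>[A-Z]+) (?P<logger>\S+) pid\((?P<pid>\d+)\) "
    r"Message: (?P<msg>.*)$"
)
STATUS_CHANGE = re.compile(
    r"Builder \((?P<build_id>[0-9a-f-]+)\) changed status from "
    r"(?P<old>\w+) to (?P<new>\w+)"
)


def summarize_log(lines):
    """Return (status_changes, exceptions) found in an iterable of log lines.

    status_changes is a list of (timestamp, build_id, old, new) tuples.
    exceptions is a list of distinct "OzException: ..." lines in first-seen
    order; the same traceback is logged once per stack frame that catches it,
    so duplicates are dropped.
    """
    changes = []
    exceptions = []
    for raw in lines:
        line = raw.rstrip("\n")
        match = LOG_LINE.match(line)
        if match:
            status = STATUS_CHANGE.search(match.group("msg"))
            if status:
                changes.append((match.group("ts"), status.group("build_id"),
                                status.group("old"), status.group("new")))
        elif line.startswith("OzException:") and line not in exceptions:
            exceptions.append(line)
    return changes, exceptions
```

Usage would be along the lines of summarize_log(open(path)), where path is the location of the log file on the affected host.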

Comment 1 dgao 2012-02-02 22:49:33 UTC

Created attachment 559157
image_tmpl

Comment 2 dgao 2012-02-02 22:50:38 UTC

Created attachment 559158
failed_screenshot

Comment 3 Dave Johnson 2012-02-03 00:45:58 UTC

Just curious about the rpm versions... I installed from nightly and was not able to reproduce this. I had already tried with a template of my own when I first saw this report. I also tried the template from attachment 1, with the same results on qeblade28.

Comment 4 Martin Kočí 2012-02-03 07:38:42 UTC

There is a similar open bug, 782076, but it is still under investigation.
https://bugzilla.redhat.com/show_bug.cgi?id=782076

Comment 5 wes hayutin 2012-02-03 13:39:24 UTC

dgao is going to try a recreate using <iso> instead of <url> .. that should help eliminate any possible networking issues during the build..  It does not seem to me that 782076 and this bug are related.

Comment 6 Dave Johnson 2012-02-03 14:28:27 UTC

Re-reading comment 3, I just want to clarify because it sounds a little ambiguous... I was unable to reproduce this on qeblade28 with either dgao's template or one of my own.

Comment 7 Martin Kočí 2012-02-03 15:16:26 UTC

(In reply to comment #5)
> dgao is going to try a recreate using <iso> instead of <url> .. that should
> help eliminate any possible networking issues during the build..  It does not
> seem to me that 782076 and this bug are related.

Just a note that when I had this issue I was able to reproduce it even with an .iso installation.

Comment 8 dgao 2012-02-03 15:43:04 UTC

Reproduced w/ <iso>. Seems related to bug 782076. Template I used:


<template>
  <name>RHEL6_2 configserver</name>
  <os>
    <name>RHEL-6</name>
    <version>2</version>
    <arch>x86_64</arch>
    <install type='iso'>
      <iso>http://download.lab.bos.redhat.com/released/RHEL-6/6.2/Server/x86_64/iso/RHEL6.2-20111117.0-Server-x86_64-DVD1.iso</iso>
    </install>
    <rootpw>dog8code</rootpw>
  </os>
  <repositories>
    <repository name="rhel">
      <url>http://download.lab.bos.redhat.com/released//RHEL-6/6.2/Server/x86_64/os/</url>
    </repository>
    <repository name="aeolus">
      <url>http://repos.fedorapeople.org/repos/aeolus/conductor/testing/6Server/x86_64/</url>
    </repository>
  </repositories>
  <packages>
    <package name="aeolus-audrey-agent"/>
  </packages>
  <description>RHEL 6.2 w/ Audrey Client for all (bos-iso)</description>
</template>
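
Since part of this thread hinges on whether a template installs from <iso> or <url>, a quick check of what a given template actually declares can help. The following is a rough sketch in Python 3 using only the standard library. It assumes the template has the same element layout as the one above, and describe_template is a hypothetical name, not an Oz or imagefactory function.

```python
import xml.etree.ElementTree as ET


def describe_template(tdl_text):
    """Summarize install source, repositories and packages of a TDL string.

    Assumes the layout shown in this report: <os>/<install type='...'> with a
    child element named after the type, plus optional <repositories> and
    <packages> sections.
    """
    root = ET.fromstring(tdl_text)
    install = root.find("os/install")
    install_type = install.get("type") if install is not None else None
    # The source location lives in a child named after the type (<iso> or <url>).
    source = None
    if install is not None and install_type:
        source = install.findtext(install_type)
    repositories = {
        repo.get("name"): repo.findtext("url")
        for repo in root.findall("repositories/repository")
    }
    packages = [pkg.get("name") for pkg in root.findall("packages/package")]
    return {
        "name": root.findtext("name"),
        "install_type": install_type,
        "install_source": source,
        "repositories": repositories,
        "packages": packages,
    }
```

For the template above, the expectation is an install_type of "iso", two repositories and a single package, which makes it easy to confirm that a retest really switched away from a <url> install.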

Comment 9 Martin Kočí 2012-02-07 10:05:30 UTC

Just FYI: once the issue has appeared, you do not need a build with multiple targets to reproduce it. Running just # oz-install -d 4 template.tdl reproduces it as well. Restarting libvirtd, or even rebooting the whole server, does not help; the issue stays reproducible. Oz worked as it should again only after I re-provisioned the server.
This odd behaviour might have been triggered by the multiple-target build, but that is not 100% certain. It is currently reproducible on qeblade21. We need to contact segfault for server access (I presume he is the owner of this server).
I will try to reproduce it on qeblade22 and then give you access there.
Closing bug https://bugzilla.redhat.com/show_bug.cgi?id=782076 as a duplicate of this bug.

Comment 10 Martin Kočí 2012-02-07 10:06:03 UTC

*** Bug 782076 has been marked as a duplicate of this bug. ***

Comment 11 Ian McLeod 2012-02-07 15:21:49 UTC

Martin,

The specific error in your log indicates that Oz detected 5 minutes (300 seconds) without any disk activity on the part of the installer.  This normally indicates that the install has failed due to an error and that error is normally visible on the screen capture that Oz provides as part of its error output.

However, the screenshot indicates that Anaconda was in the middle of an active install when this error occurred.  That's unusual.

You're suggesting it is repeatable on qeblade21, yes?  Does every build fail in this way or is it intermittent?
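
To illustrate the kind of check described in this comment, here is a minimal sketch of an inactivity watchdog in Python 3. It is not Oz's actual implementation: the names wait_for_install, InstallStalled, read_disk_counter and is_finished are all hypothetical, and the disk counter is assumed to be any cumulative value that changes whenever the guest touches its disk.

```python
import time


class InstallStalled(Exception):
    """Raised when the disk counter has not changed for the whole timeout."""


def wait_for_install(read_disk_counter, is_finished, inactivity_timeout=300,
                     poll_interval=1, clock=time.monotonic, sleep=time.sleep):
    """Poll until is_finished() is true, failing on prolonged disk inactivity.

    read_disk_counter: callable returning a cumulative activity counter.
    is_finished: callable returning True once the install has completed.
    clock and sleep are injectable so the loop can be tested without waiting.
    """
    last_value = read_disk_counter()
    last_change = clock()
    while not is_finished():
        sleep(poll_interval)
        value = read_disk_counter()
        now = clock()
        if value != last_value:
            # Any change counts as activity and resets the inactivity window.
            last_value = value
            last_change = now
        elif now - last_change >= inactivity_timeout:
            raise InstallStalled(
                "No disk activity in %d seconds" % inactivity_timeout)
```

A watchdog of this shape only sees the counter, so it cannot tell a crashed installer from one that is alive but waiting on something else, such as the network. That would be consistent with a screenshot showing an apparently active Anaconda at the moment the timeout fires.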

Comment 12 Martin Kočí 2012-02-07 15:38:25 UTC

Yes, it was repeatable on qeblade21, and every build failed in this way. Let me ask segfault whether qeblade21 is still available.

Comment 13 Martin Kočí 2012-02-08 15:24:52 UTC

So at this point it looks like this is all network related. It seems that something about the KVM setup is preventing Anaconda from getting an IP address (screenshots show Anaconda at a "configure network" screen); however, in the initial report we reproduced this in the middle of the installation process. I am therefore moving this bug to the ON_QA state (so that it is not tracked by dev) and will try to reproduce the issue in a form other than being stuck at "configure network".
If we reproduce the issue, we will move the bug back to the ASSIGNED (or NEW) state.

Comment 14 dgao 2012-02-22 16:24:58 UTC

Cannot be reproduced w/ the current puddle build of imagefactory. Closing...

[root@dell-pem600-01 ~]# rpm -qa | grep "imagefactory"
imagefactory-1.0.0rc5-1.el6.noarch
imagefactory-jeosconf-ec2-fedora-1.0.0rc5-1.el6.noarch
rubygem-imagefactory-console-0.4.0-1.el6.noarch
imagefactory-jeosconf-ec2-rhel-1.0.0rc5-1.el6.noarch

Comment 15 errata-xmlrpc 2012-05-15 20:21:47 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHEA-2012-0588.html