Bug 752581

Summary: EC2 push fails executing euca-bundle-vol
Product: [Retired] CloudForms Cloud Engine Reporter: Dave Maley <dmaley>
Component: rubygem-aeolus-cliAssignee: Mike Orazi <morazi>
Status: CLOSED NOTABUG QA Contact: wes hayutin <whayutin>
Severity: high Docs Contact:
Priority: high    
Version: 1.0.0   
Target Milestone: rc   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2011-11-10 20:30:34 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:

Description Dave Maley 2011-11-09 22:22:19 UTC
Description of problem:
I'm seeing the below failure (in imagefactory.log) when trying to push to EC2:

2011-11-08 16:14:32,914 DEBUG imgfac.builders.BaseBuilder.Fedora_ec2_Builder pid(1887) Message: Executing bundle vol command: euca-bundle-vol -c /tmp/tmpHm5vQY -k /tmp/tmpBV4yAk -u 5258-6269-5383 -e /mnt,/tmp,/
root/.ssh --arch x86_64 -d /mnt/bundles --kernel aki-427d952b -p e2e91cee-2d1b-43da-af54-0014775af521 -s 10240 --ec2cert /tmp/cert-ec2.pem --fstab /etc/fstab -v /
2011-11-08 16:18:42,930 DEBUG imgfac.builders.BaseBuilder.Fedora_ec2_Builder pid(1887) Message: Terminating EC2 instance and deleting temp security group
2011-11-08 16:18:43,857 WARNING boto pid(1887) Message: Error converting code (Client.UserInitiatedShutdown) to int
2011-11-08 16:18:43,858 DEBUG imgfac.builders.BaseBuilder.Fedora_ec2_Builder pid(1887) Message: Instance status (shutting-down) - waiting for 'terminated'. [0 of 300 seconds elapsed]
2011-11-08 16:18:48,939 WARNING boto pid(1887) Message: Error converting code (Client.UserInitiatedShutdown) to int
2011-11-08 16:18:48,940 DEBUG imgfac.builders.BaseBuilder.Fedora_ec2_Builder pid(1887) Message: Instance status (shutting-down) - waiting for 'terminated'. [5 of 300 seconds elapsed]
2011-11-08 16:18:54,021 WARNING boto pid(1887) Message: Error converting code (Client.UserInitiatedShutdown) to int
2011-11-08 16:18:54,088 DEBUG imgfac.builders.BaseBuilder.Fedora_ec2_Builder pid(1887) Message: Instance status (shutting-down) - waiting for 'terminated'. [10 of 300 seconds elapsed]
2011-11-08 16:18:59,164 WARNING boto pid(1887) Message: Error converting code (Client.UserInitiatedShutdown) to int
2011-11-08 16:18:59,252 DEBUG imgfac.builders.BaseBuilder.Fedora_ec2_Builder pid(1887) Message: Instance status (shutting-down) - waiting for 'terminated'. [15 of 300 seconds elapsed]
2011-11-08 16:19:04,329 WARNING boto pid(1887) Message: Error converting code (Client.UserInitiatedShutdown) to int
2011-11-08 16:19:04,394 DEBUG imgfac.builders.BaseBuilder.Fedora_ec2_Builder pid(1887) Message: Instance status (shutting-down) - waiting for 'terminated'. [20 of 300 seconds elapsed]
2011-11-08 16:19:09,704 WARNING boto pid(1887) Message: Error converting code (Client.UserInitiatedShutdown) to int
2011-11-08 16:19:09,705 DEBUG imgfac.builders.BaseBuilder.Fedora_ec2_Builder pid(1887) Message: Instance status (shutting-down) - waiting for 'terminated'. [25 of 300 seconds elapsed]
2011-11-08 16:19:14,785 WARNING boto pid(1887) Message: Error converting code (Client.UserInitiatedShutdown) to int
2011-11-08 16:19:14,861 DEBUG imgfac.builders.BaseBuilder.Fedora_ec2_Builder pid(1887) Message: Instance status (shutting-down) - waiting for 'terminated'. [30 of 300 seconds elapsed]
2011-11-08 16:19:20,148 WARNING boto pid(1887) Message: Error converting code (Client.UserInitiatedShutdown) to int
2011-11-08 16:19:20,441 DEBUG imgfac.builders.BaseBuilder.Fedora_ec2_Builder pid(1887) Message: Removed temporary security group (imagefactory-e2e91cee-2d1b-43da-af54-0014775af521)
2011-11-08 16:19:20,442 DEBUG imgfac.builders.BaseBuilder.Fedora_ec2_Builder pid(1887) Message: Exception caught in ImageFactory
2011-11-08 16:19:20,583 DEBUG imgfac.builders.BaseBuilder.Fedora_ec2_Builder pid(1887) Message: Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/imgfac/builders/Fedora_ec2_Builder.py", line 443, in push_image
    credentials)
  File "/usr/lib/python2.6/site-packages/imgfac/builders/Fedora_ec2_Builder.py", line 646, in push_image_snapshot_ec2
    stdout, stderr, retcode = self.guest.guest_execute_command(guestaddr, command)
  File "/usr/lib/python2.6/site-packages/imgfac/builders/Fedora_ec2_Builder.py", line 72, in guest_execute_command
    return super(FedoraRemoteGuest, self).guest_execute_command(guestaddr, command, timeout, tunnels)
  File "/usr/lib/python2.6/site-packages/oz/RedHat.py", line 469, in guest_execute_command
    command, timeout, tunnels)
  File "/usr/lib/python2.6/site-packages/oz/ozutil.py", line 342, in ssh_execute_command
    return subprocess_check_output(cmd)
  File "/usr/lib/python2.6/site-packages/oz/ozutil.py", line 310, in subprocess_check_output
    raise SubprocessException("'%s' failed(%d): %s" % (cmd, retcode, stderr), retcode)
SubprocessException: 'ssh -i /tmp/tmpmIct0V -F /dev/null -o ServerAliveInterval=30 -o StrictHostKeyChecking=no -o ConnectTimeout=30 -o UserKnownHostsFile=/dev/null -o PasswordAuthentication=no root.amazonaws.com euca-bundle-vol -c /tmp/tmpHm5vQY -k /tmp/tmpBV4yAk -u 5258-6269-5383 -e /mnt,/tmp,/root/.ssh --arch x86_64 -d /mnt/bundles --kernel aki-427d952b -p e2e91cee-2d1b-43da-af54-0014775af521 -s 10240 --ec2cert /tmp/cert-ec2.pem --fstab /etc/fstab -v /' failed(1): Warning: Permanently added 'ec2-107-22-33-17.compute-1.amazonaws.com,107.22.33.17' (RSA) to the list of known hosts.
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.00338112 s, 310 MB/s
mke2fs 1.41.14 (22-Dec-2010)
Traceback (most recent call last):
  File "/usr/bin/euca-bundle-vol", line 492, in <module>
    main()
  File "/usr/bin/euca-bundle-vol", line 467, in main
    ancestor_ami_ids,
  File "/usr/lib/python2.7/site-packages/euca2ools/__init__.py", line 1008, in generate_manifest
    user_priv_key = RSA.load_key(private_key_path)
  File "/usr/lib64/python2.7/site-packages/M2Crypto/RSA.py", line 351, in load_key
    return load_key_bio(bio, callback)
  File "/usr/lib64/python2.7/site-packages/M2Crypto/RSA.py", line 372, in load_key_bio
    rsa_error()
  File "/usr/lib64/python2.7/site-packages/M2Crypto/RSA.py", line 302, in rsa_error
    raise RSAError, m2.err_reason_error_string(m2.err_get_error())
M2Crypto.RSA.RSAError: no start line


Version-Release number of selected component (if applicable):
rubygem-aeolus-cli-0.1.0-3
imagefactory-0.8.0-1
euca2ools-1.3.1-4.el6_0 / euca2ools-1.3.1-12.fc15


How reproducible:
always


Steps to Reproduce:
1. build image for EC2:

  aeolus-image build --target ec2 --template <template>

2. attempt to push image to EC2:

  aeolus-image push --provider ec2-us-east-1 --account <acct> \
   --image <image> --build <build> --targetimage <targetimage>

  
Actual results:
push fails w/ above errors


Expected results:
push succeeds


Additional info:
I've reproduced this on rhel6.1 & F15 (x86_64) on bare-metal & kvm guests

Comment 1 wes hayutin 2011-11-09 23:44:58 UTC
This could be a timing issue w/ ec2 we've hit quite often where the instance is not yet ready for ssh connections.. having dev take a look to confirm.

Comment 2 wes hayutin 2011-11-10 16:22:01 UTC
<weshay_hm> sloranz, hey.. got ur email... so you make a great point.. is that error due to bad ec2 creds?
<sloranz> I'm not entirely sure, but that seems like a more likely place to look than ssh

Comment 3 Dave Maley 2011-11-10 20:24:19 UTC
(In reply to comment #2)
> <weshay_hm> sloranz, hey.. got ur email... so you make a great point.. is that
> error due to bad ec2 creds?
> <sloranz> I'm not entirely sure, but that seems like a more likely place to
> look than ssh

This does appear to be related to my account as I was able to successfully push using spr's ec2 account.  However I'm able to add the provider account in conductor so the creds are working for that piece.

Any pointers on what might need fixing in my acct?

Comment 4 Dave Maley 2011-11-16 17:33:04 UTC
For the benefit of anybody else who might run into this ... the problem was that I was mistakenly using the CloudFront keypair instead of the EC2 keypair.