Bug 698689 - Instances: periodic ssh failures
Summary: Instances: periodic ssh failures
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: CloudForms Cloud Engine
Classification: Retired
Component: aeolus-conductor
Version: 0.3.1
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: rc
Assignee: Ian McLeod
QA Contact: wes hayutin
URL: http://dell-pe1955-01.rhts.eng.bos.re...
Whiteboard:
Depends On:
Blocks: ce-beta ce-ami
Reported: 2011-04-21 14:31 UTC by wes hayutin
Modified: 2012-01-26 12:28 UTC
CC List: 7 users

Fixed In Version:
Clone Of:
Environment:
Last Closed:
Embargoed:


Attachments
ss (20.34 KB, image/png)
2011-04-21 14:31 UTC, wes hayutin

Description wes hayutin 2011-04-21 14:31:41 UTC
Created attachment 493865
ss

[root@dell-pe1955-01 html]# rpm -qa | grep aeolus
aeolus-conductor-0.0.3-6.el6.x86_64
aeolus-configure-2.0.0-8.el6.noarch
aeolus-conductor-doc-0.0.3-6.el6.x86_64
aeolus-conductor-daemons-0.0.3-6.el6.x86_64
[root@dell-pe1955-01 html]# 


I have figured out why this *starts* to happen, but once the keys are out of sync, every instance you create with Cloud Engine will have a key mismatch with EC2, and the user will not be able to ssh into the instance with the key provided by Cloud Engine.

recreate:
1. This bug only seems to rear its ugly head after *some* amount of Cloud Engine use: creating templates, instances, and provider accounts, restarting services, and running aeolus-cleanup/aeolus-configure a few times.

2. Create an instance.
PROPERTIES FOR TESTKEY04

Name: testkey04
Status: running
Public Addresses: ec2-50-17-141-23.compute-1.amazonaws.com
Private Addresses: ip-10-202-23-91.ec2.internal
Operating system: Fedora 13
Provider: ec2-us-east-1
Base Template: template_simplenddn6
Architecture: x86_64
Memory: 7680.0
Storage: 850.0
Instantiation Time: 21-Apr-2011 13:34:49
Uptime: 56.182108
Current Alerts: 0
Console Connection: via SSH
SSH key: Download
Owner: aeolus user
Shared to: N/A


[whayutin@minidoe Downloads]$ cat testkey04_rsa_1303393034.09075.pem
-----BEGIN RSA PRIVATE KEY-----
MIIEpAIBAAKCAQEA8QpFGXO/fEBr9gLzFwJwNhmmG+/DK/XoN1vGY4CHnhovQrOCTappJN/fSzGT
QgVR+B8ckTDgqdPBvT7UJRYaWfAasZCgECT8az67G/mG0F0ZbjUwonddiqmWxsnsPoNLYGgdDFrx
0LbhRxgmgcJXXRGPveECGL8soo2Pm5e44sm1gWgUrJmang0ZAXwuQYJX1drsmGD5lpstvS1HNMZJ
3TIL1g6R9oDGEVK8+KDqDeqcC5U7dqidVCCkvYshlZ0b1N/Me6oHps672NnXz6TQuh4CyuMpwO92
heS18FPGlXB5GEFOUZ9PBGvvvc0Pp4rTk5i38WC8FoCfSraOQWfKVQIDAQABAoIBAQDT445IyIp8
5HjDY2ZkRL11oWh3SHaOj3YaK/AjChqtriD7hqe2NGaHhtFY3XSw5dJfxqGrNIdaYw79mFyPWXXC
uCIPE67RYmIOuK9s8RZE1oKDcfiV6U5ulZQ4ncqyTWKWlg9rDAtkyU8R2zYGfVulMcnlRgY80Nmg
78ZdJavel3O1GRxqPhR7oJXOF1ZgBsDA7gRCGhYgC4c5Qq5ORyEHG7UalwuaxYgxqWzFZWM/ePei
lbiiy4Up/yq62a0wPXif0zXfjznj6SeOfCWlfeg3OWvNjUIdhneJpstNdJxXOEzm30EUipgU/1Qd
sB7wgr10Ub7CiE7lOWdSZ3Tl9MylAoGBAP8qUiLq9JenYabqLrjUFo3veOMQ8UnF62IZ4pqmUWUr
GGcWWmyXQTySPaoO2u7Iv0UpR6sB3In6GjaXPO/cM4BsavhvrX3zNy/0rQreW32oTsuEpEYsrbvP
Q0PunoiqxgMRwVl+D6z5lgsWZslajpNwE6wVc2aoslNrxOtGek37AoGBAPHUHtRLN1KYFmRn+/u5
f9EisUUoOew2QoxoC7AhGivpg1McGmj4Bvzwrz4VYK61KQheHiVL8AdmiWybjWd6k8AvtTsOCfpP
fs6PZ+U16lE4b04jBoO6L0igxEUe7M0RXwWFi8Wnc3CmLWekY+BxGrUEtPl/Ng5CB9jvBZUlO2fv
AoGBALNVHx0DXJwpO2yAMg4coS1oHOIZSju7Kk9sOeLO+W3M9/2brDmdpG/ZqBUZE6220RbeiEwb
ptAiQsITUPSTIm8jw5qPgrN+eE7v+54j4NFTtO08b+gSBph0dqYL0sfingASPn2TJ5k+YMGyINNr
HcFph6nt+YkxDwOqPl/MzLB5AoGAdKw1r5EWgOfVrd2pajqGG12Uj1woDfnjw6ATO4fM+7Cu5nMh
ntFDddedhOOFgOTwhhP6kV4A0WE8HkUyROGT1V5vHq3YTIb8FCaGJsULZuJGeTlW1EkItQ6zgvG3
p/ygjqZu2A7BGHFkaKOceFW5X+qEcfdZGinrZVN0qw+KiMsCgYAqJ4UQOo9l7vqVaav3R4slVIp0
L5J2vGiF/HqMcKCwA/P5whtQPlYlKNrp2XH+cRY2ByY3YOTj4xVDZSezBhY1ANbgqpNMXFIMevXu
fhbRJUo51m0myN9cGZAkTBiJhiKDS2cb4fc5ezoWxHnSThYeY9km52R6FnPKOXrASLIW1g==
-----END RSA PRIVATE KEY-----
[whayutin@minidoe Downloads]$ ssh-keygen -lf testkey04_rsa_1303393034.09075.pem
line 23 too long: -----END RSA PRIVATE KEY-----...
testkey04_rsa_1303393034.09075.pem is not a public key file.
[whayutin@minidoe Downloads]$ man ssh-keygen 
[whayutin@minidoe Downloads]$ ssh-keygen  -y -f testkey04_rsa_1303393034.09075.pem 
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDxCkUZc798QGv2AvMXAnA2GaYb78Mr9eg3W8ZjgIeeGi9Cs4JNqmkk399LMZNCBVH4HxyRMOCp08G9PtQlFhpZ8BqxkKAQJPxrPrsb+YbQXRluNTCid12KqZbGyew+g0tgaB0MWvHQtuFHGCaBwlddEY+94QIYvyyijY+bl7jiybWBaBSsmZqeDRkBfC5BglfV2uyYYPmWmy29LUc0xkndMgvWDpH2gMYRUrz4oOoN6pwLlTt2qJ1UIKS9iyGVnRvU38x7qgemzrvY2dfPpNC6HgLK4ynA73aF5LXwU8aVcHkYQU5Rn08Ea++9zQ+nitOTmLfxYLwWgJ9Kto5BZ8pV
[whayutin@minidoe Downloads]$ ssh-keygen  -y -f testkey04_rsa_1303393034.09075.pem > testkey04_public.pem
[whayutin@minidoe Downloads]$ cat testkey04_public.pem 
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDxCkUZc798QGv2AvMXAnA2GaYb78Mr9eg3W8ZjgIeeGi9Cs4JNqmkk399LMZNCBVH4HxyRMOCp08G9PtQlFhpZ8BqxkKAQJPxrPrsb+YbQXRluNTCid12KqZbGyew+g0tgaB0MWvHQtuFHGCaBwlddEY+94QIYvyyijY+bl7jiybWBaBSsmZqeDRkBfC5BglfV2uyYYPmWmy29LUc0xkndMgvWDpH2gMYRUrz4oOoN6pwLlTt2qJ1UIKS9iyGVnRvU38x7qgemzrvY2dfPpNC6HgLK4ynA73aF5LXwU8aVcHkYQU5Rn08Ea++9zQ+nitOTmLfxYLwWgJ9Kto5BZ8pV
[whayutin@minidoe Downloads]$ ssh-keygen -lf testkey04_public.pem 
2048 40:bc:c3:2d:31:1b:42:87:0e:63:2d:04:e8:ba:0a:07 testkey04_public.pem (RSA)
[whayutin@minidoe Downloads]$ 


3. The Amazon EC2 console reports that the running instance's key pair is:

Key Pair Name: ec2_account_AKIAJ557U7P7OIHRV2EQ_1303330068_key_70346002961800

1 Key Pair selected

Key Pair Name: ec2_account_AKIAJ557U7P7OIHRV2EQ_1303330068_key_70346002961800
Fingerprint: a8:27:ce:94:df:a0:47:aa:62:2f:c3:f6:a6:ec:4d:4a:f1:0b:d9:37


NOTICE: THE FINGERPRINTS DO NOT MATCH

Comment 1 Jan Provaznik 2011-04-26 09:28:38 UTC
This bug is not related to copying the unique instance key to the running instance.
Most probably the problem is with setting up the initial ssh key, which is downloaded from the EC2 config server when the instance boots. I think the best way to debug this is to enable ssh by password when building an image; then, when we hit this problem, we can ssh in with the password and analyze it.

Comment 2 Jan Provaznik 2011-04-26 10:09:44 UTC
I forgot to mention that the fingerprints differ even for instances where ssh works.
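
One likely reason the two values can never agree: the `ssh-keygen -lf` output in the description is a 16-byte MD5 digest of the public key blob, while the fingerprint shown in the EC2 console is 20 bytes long, which is the length of a SHA-1 digest. As far as I know, EC2 computes that SHA-1 over the DER-encoded private key for key pairs it generates, so the two strings would differ even when the keys match. A minimal Python sketch of the check follows; the function names are hypothetical and not part of Cloud Engine.

```python
import base64
import hashlib


def md5_fingerprint(pubkey_line):
    """MD5 fingerprint of an OpenSSH public key line ('ssh-rsa AAAA... [comment]')."""
    # The second whitespace-separated field is the base64-encoded key blob.
    blob = base64.b64decode(pubkey_line.split()[1])
    digest = hashlib.md5(blob).hexdigest()
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))


def digest_kind(fingerprint):
    """Guess the hash function from the number of colon-separated bytes."""
    return {16: "md5", 20: "sha1"}.get(len(fingerprint.split(":")), "unknown")


# Fingerprints copied from the description of this bug.
print(digest_kind("40:bc:c3:2d:31:1b:42:87:0e:63:2d:04:e8:ba:0a:07"))              # md5
print(digest_kind("a8:27:ce:94:df:a0:47:aa:62:2f:c3:f6:a6:ec:4d:4a:f1:0b:d9:37"))  # sha1
```

To compare like with like, the AWS documentation describes `openssl pkcs8 -in key.pem -inform PEM -outform DER -topk8 -nocrypt | openssl sha1 -c` for key pairs generated by EC2; here `key.pem` stands for the downloaded private key file.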

Comment 3 wes hayutin 2011-05-02 11:05:42 UTC
modify /usr/lib/python2.6/site-packages/imagefactory/builders/FedoraBuilder.py

Line 585...
This:
----------------------
self.guest.guest_execute_command(guestaddr,
"[ -f /etc/init.d/firstboot ] && /sbin/chkconfig firstboot off")

To:
----------------------
self.guest.guest_execute_command(guestaddr,
"[ -f /etc/init.d/firstboot ] && /sbin/chkconfig firstboot off || /bin/true")
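
Why the trailing `|| /bin/true` matters: when `/etc/init.d/firstboot` does not exist, the `[ -f ... ]` test fails and the whole `&&` chain exits non-zero, which a caller that checks exit codes would treat as a failed command. Whether `guest_execute_command` raises on a non-zero status is an assumption here. The sketch below only demonstrates the shell behaviour, using a hypothetical path, a local `sh` instead of a guest, and the `true` builtin in place of `/bin/true`.

```python
import subprocess

# Hypothetical path standing in for a missing init script.
MISSING = "/nonexistent/firstboot"


def exit_status(command):
    """Run a command line through sh and return its exit status."""
    return subprocess.run(["sh", "-c", command]).returncode


before = exit_status(f"[ -f {MISSING} ] && echo would-disable")
after = exit_status(f"[ -f {MISSING} ] && echo would-disable || true")

print(before)  # 1: the file test fails, so the chain fails
print(after)   # 0: the trailing '|| true' masks the failure
```

Note that the same suffix also hides a genuine `chkconfig` failure, so it trades error visibility for a build that does not abort on images without firstboot.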

Comment 4 wes hayutin 2011-05-02 11:07:46 UTC
*** Bug 700811 has been marked as a duplicate of this bug. ***

Comment 5 Jan Provaznik 2011-05-02 11:26:20 UTC
Ian sent a patch that should fix this bug; assigning this task to him.

Comment 6 Ian McLeod 2011-05-02 14:27:51 UTC
My patches deal with a situation where the running instance is unable to extract the authorized key from the EC2 infrastructure due to network startup timing.

The issue Wes is describing above seems to be different. Specifically, he appears to have discovered a situation where the SSH key fingerprint within EC2 changes.

The Image Factory does not manipulate SSH keys within a user's EC2 account at all. It simply creates images that use the runtime HTTP-based infrastructure inside EC2 to inject authorized keys.

I'd suggest we try to get a reliable reproducer for this "changing key" behaviour.
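
For readers unfamiliar with the timing problem mentioned at the top of this comment, here is a hedged Python sketch of retrying the key download until the network is up. It is not the actual patch, which is not shown in this bug. The metadata URL is the standard EC2 location for the first launch key; the function names, attempt count, and delay are assumptions chosen for illustration.

```python
import time
import urllib.request

# EC2 instance metadata path for the first launch key (plain HTTP, no token).
KEY_URL = "http://169.254.169.254/latest/meta-data/public-keys/0/openssh-key"


def http_get(url=KEY_URL, timeout=2.0):
    """Fetch the public key text from the metadata service."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("ascii")


def fetch_with_retry(fetch=http_get, attempts=10, delay=3.0, sleep=time.sleep):
    """Call fetch() until it succeeds or the attempts run out."""
    last_error = None
    for attempt in range(attempts):
        try:
            return fetch()
        except OSError as err:  # URLError and socket timeouts derive from OSError
            last_error = err
            if attempt < attempts - 1:
                sleep(delay)
    raise RuntimeError("metadata service never answered") from last_error
```

On an instance, the returned line would be appended to the login user's `authorized_keys`. Passing `fetch` and `sleep` as parameters keeps the retry loop testable without network access.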

Comment 7 wes hayutin 2011-05-02 14:37:14 UTC
Ian, all the ssh issues are currently resolved. The title of this bug should be changed.

moving bug to on_qa

Comment 8 wes hayutin 2011-06-23 21:41:10 UTC
verified

Comment 9 wes hayutin 2011-08-01 19:56:29 UTC
release pending...

Comment 10 wes hayutin 2011-08-01 19:57:58 UTC
release pending...

Comment 12 wes hayutin 2011-12-08 13:52:01 UTC
perm close

Comment 13 wes hayutin 2011-12-08 13:55:02 UTC
closing out old bugs
