Red Hat Bugzilla – Bug 800251
Instance launched with http service (through configserver) doesn't allow ssh into the ec2 instance
Last modified: 2012-05-15 14:44:48 EDT
Description of problem:
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Added configserver for ec2
2. Launched an instance with it, downloaded the key, and tried to ssh: successful
3. Edited the XML of the Blueprint and added services:
<deployable version="1.0" name="RHEL6_2 configserver">
<assembly hwp="small" name="RHEL6-2-configserver">
<parameter name="wp_name" type="scalar">
<parameter name="wp_user" type="scalar">
<parameter name="wp_pw" type="scalar">
<parameter name="mysql_ip" type="scalar">
<reference assembly="mysql" parameter="ipaddress"/>
<parameter name="mysql_hostname" type="scalar">
<reference assembly="mysql" parameter="hostname"/>
<parameter name="mysql_dbup" type="scalar">
<reference assembly="mysql" parameter="dbup"/>
4. Launched an instance, downloaded the key, and tried to ssh; I could not ssh.
ssh: connect to host ec2-107-22-97-254.compute-1.amazonaws.com port 22: Connection refused
Used this Audrey template to build:
<description>RHEL 6.2 w/ Audrey Client</description>
rpm -qa|grep aeolus
Just a note, I can only reproduce this bug going from one service to multiple services. If going from multiple services to more services, ssh works fine.
After some initial testing, this seems like a timing issue. There are a few observed stages after initializing an instance launch in conductor:
1) The instance is powered on but it's not completely up (none or only a few of the SysV services have started). At this stage, both the EC2 console and conductor will report the instance as "running". This is probably a little premature.
2) The instance is fully up and all of the SysV services have started. EC2 would mark this as "2/2 status check complete".
Any ssh attempt during stage #1 will result in a hang, but an ssh attempt once stage #2 is reached should work fine.
However we have observed cases where EC2 would report "2/2 status check complete" yet we get a
ssh: connect to host ec2-184-72-93-44.compute-1.amazonaws.com port 22: Connection refused
This error is consistent with what Shveta ran into in comment #1. But after waiting a few more minutes, further ssh attempts would work properly.
So overall, I would recommend that the conductor and deltacloud devs make a change so that conductor does not flip the status to running until it receives a "2/2 status check complete" from EC2.
Waiting for ec2 instances to be ready for ssh access is a known issue w/ ec2 itself. We're simply waiting for the ssh daemon to launch even though the various ec2 tools report it's running.
The defect is w/ ec2.
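To illustrate the "waiting for the ssh daemon" point, here is a minimal sketch of a client-side poll that retries a TCP connect to port 22 until something accepts it. The function name `wait_for_port` and its defaults are hypothetical, not part of conductor, deltacloud, or any ec2 tool, and a successful connect only shows that the port is open, not that a login will succeed.

```python
import socket
import time


def wait_for_port(host, port=22, timeout=300.0, interval=5.0):
    """Return True once a TCP connect to host:port succeeds, False after timeout.

    Hypothetical helper: it treats "Connection refused" and connect
    timeouts the same way and simply retries until the deadline passes.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means some daemon is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # ConnectionRefusedError and socket timeouts are OSError subclasses.
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                return False
            time.sleep(min(interval, remaining))
```

For example, `wait_for_port("ec2-107-22-97-254.compute-1.amazonaws.com")` would keep retrying for up to five minutes before giving up. If sshd never starts on the instance, no amount of polling helps, so a False result is still worth investigating on the instance side.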
This is seen with vsphere also. Re-opening the bug.
Steps to reproduce on vsphere:
1. Added configserver for vsphere
2. Built and pushed an Audrey-agent-enabled template to vsphere (PFA: template)
3. Created two deployables, one with "wordpress" services and the other without those services (PFA: with services, without services)
Observed that I was unable to ssh to the instance having services:
ssh: connect to host 10.10.77.107 port 22: Connection refused
whereas I was able to ssh to the machine which doesn't have the services:
The authenticity of host '10.10.77.117 (10.10.77.117)' can't be established.
RSA key fingerprint is 87:9d:e8:c6:51:b7:96:6d:a5:3f:eb:1a:e9:b8:8e:c4.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '10.10.77.117' (RSA) to the list of known hosts.
[root@dhcp77-117 ~]# hostname
Created attachment 569417
Created attachment 569418
Created attachment 569419
I tried this scenario.. works for me.. closing not a bug
sry.. this is an issue..
the ip is available, but you can not ssh in..
this is a bug.. tracking down
root cause here..
[root@dhcp77-144 rc.d]# cd rc3.d/
[root@dhcp77-144 rc3.d]# ls
K01smartd K89rdisc S13cpuspeed S26udev-post S90crond
K10psacct S01sysstat S13irqbalance S50vmtoolsd S95atd
K10saslauthd S02lvm2-monitor S15mdmonitor S55audrey S97rhnsd
K50netconsole S08ip6tables S20kdump S55sshd S97rhsmcertd
K74ntpd S08iptables S22messagebus S80postfix S99local
K75ntpdate S10network S25netfs S82abrt-ccpp
K75quota_nld S11auditd S26acpid S82abrtd
K87restorecond S12rsyslog S26haldaemon S82abrt-oops
S55audrey and S55sshd have the same start value.. ssh needs to start before audrey
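The tie can be shown with a small sketch. It assumes init runs the `S*` links in plain sorted order of their names (as a shell glob over the directory would), so two links with the same number are ordered by the service name. The list is a hand-copied subset of the rc3.d listing above, and the function names are hypothetical.

```python
import re
from collections import defaultdict

# Subset of the S* links from the rc3.d listing (copied by hand).
RC3_START_SCRIPTS = [
    "S10network", "S12rsyslog", "S50vmtoolsd",
    "S55sshd", "S55audrey", "S80postfix", "S99local",
]


def start_order(names):
    """Return start links in assumed run order (lexical sort of the names)."""
    return sorted(n for n in names if n.startswith("S"))


def priority_ties(names):
    """Map each start priority shared by several links to its services, in run order."""
    groups = defaultdict(list)
    for name in start_order(names):
        m = re.match(r"S(\d{2})(.+)$", name)
        if m:
            groups[int(m.group(1))].append(m.group(2))
    return {prio: svcs for prio, svcs in groups.items() if len(svcs) > 1}


order = start_order(RC3_START_SCRIPTS)
ties = priority_ties(RC3_START_SCRIPTS)
print(ties)  # {55: ['audrey', 'sshd']}
print(order.index("S55audrey") < order.index("S55sshd"))  # True
```

Under that ordering assumption, audrey is started before sshd at priority 55, so if the audrey start script blocks, sshd is never reached and port 22 refuses connections.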
thanks for tracking this down Wes.
Assigning to Dan.
package version aeolus-audrey-agent-0.4.4-4
[root@dhcp77-109 ~]# rpm -qa | grep "audrey"
[root@dhcp77-109 ~]# head /etc/init.d/audrey
# chkconfig: 345 99 55
# description: The audrey agent.
# processname: audrey
# Source function library.
# Check that networking is up.
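For readers checking the header above, a `# chkconfig:` line carries three fields: the runlevels, the start priority, and the stop priority. The sketch below parses the line shown in the `head` output and compares its start priority with the 55 of the S55sshd link from the earlier rc3.d listing. The helper name `parse_chkconfig` and the constants are hypothetical and only illustrate how to read the header.

```python
import re


def parse_chkconfig(header_text):
    """Return (runlevels, start_priority, stop_priority) or None if no header line."""
    m = re.search(
        r"^#\s*chkconfig:\s*(\S+)\s+(\d+)\s+(\d+)\s*$",
        header_text,
        re.MULTILINE,
    )
    if m is None:
        return None
    return m.group(1), int(m.group(2)), int(m.group(3))


# Header lines as shown in the head output of /etc/init.d/audrey.
AUDREY_HEADER = """\
# chkconfig: 345 99 55
# description: The audrey agent.
# processname: audrey
"""

# Start priority of sshd, taken from the S55sshd link in the rc3.d listing.
SSHD_START_PRIORITY = 55

runlevels, start, stop = parse_chkconfig(AUDREY_HEADER)
print(runlevels, start, stop)        # 345 99 55
print(start > SSHD_START_PRIORITY)   # True
```

Read this way, the header asks for audrey to start at priority 99 in runlevels 3, 4, and 5, which would place it after sshd instead of tying with it at 55.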
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.