Bug 800251 - Instance launched with http service (through configserver) doesn't allow ssh into ec2 instance
Status: CLOSED ERRATA
Product: CloudForms Cloud Engine
Classification: Red Hat
Component: aeolus-audrey-agent
Version: 1.0.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: medium
Target Milestone: beta5
Target Release: ---
Assigned To: Dan Radez
QA Contact: wes hayutin
Keywords: Reopened, Triaged
Depends On:
Blocks:
Reported: 2012-03-06 01:43 EST by Shveta
Modified: 2012-05-15 14:44 EDT
Doc Type: Bug Fix
Last Closed: 2012-05-15 14:44:48 EDT

Attachments
template (1.23 KB, application/octet-stream)
2012-03-12 10:21 EDT, Rehana
with_services (1.15 KB, application/octet-stream)
2012-03-12 10:22 EDT, Rehana
without_services (274 bytes, application/octet-stream)
2012-03-12 10:22 EDT, Rehana

Description Shveta 2012-03-06 01:43:57 EST
Description of problem:


Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. Added a configserver for ec2.
2. Launched an instance with it, downloaded the key, and tried to ssh: successful.
3. Then I edited the XML of the Blueprint and added services:

<?xml version="1.0"?>
<deployable version="1.0" name="RHEL6_2 configserver">
  <description/>
  <assemblies>
    <assembly hwp="small" name="RHEL6-2-configserver">
      <image id="d4f331b2-674f-11e1-807d-00215e203092"/>
      <services>
        <service name="http">
          <executable url="http://radez.fedorapeople.org/wordpress-http.sh"/>
          <parameters>
            <parameter name="wp_name" type="scalar">
              <value>wordpress</value>
            </parameter>
            <parameter name="wp_user" type="scalar">
              <value>wordpress</value>
            </parameter>
            <parameter name="wp_pw" type="scalar">
              <value>wordpress</value>
            </parameter>
            <parameter name="mysql_ip" type="scalar">
              <reference assembly="mysql" parameter="ipaddress"/>
            </parameter>
            <parameter name="mysql_hostname" type="scalar">
              <reference assembly="mysql" parameter="hostname"/>
            </parameter>
            <parameter name="mysql_dbup" type="scalar">
              <reference assembly="mysql" parameter="dbup"/>
            </parameter>
          </parameters>
        </service>
      </services>
      <returns>
        <return name="hostname"/>
        <return name="ipaddress"/>
      </returns>
    </assembly>
  </assemblies>
</deployable>


4. Launched an instance, downloaded the key, and tried to ssh; I could not ssh in.
ssh: connect to host ec2-107-22-97-254.compute-1.amazonaws.com port 22: Connection refused

=============================================================

Used this Audrey template to build the image:

<template>
  <name>RHEL6_2 configserver</name>
  <os>
    <name>RHEL-6</name>
    <version>2</version>
    <arch>x86_64</arch>
    <install type='url'>
      <url>http://download.devel.redhat.com/released/RHEL-6/6.2/Server/x86_64/os/</url>
    </install>
    <rootpw>dog8code</rootpw>
  </os>
  <repositories>
    <repository name="rhel">
      <url>http://download.devel.redhat.com/released/RHEL-6/6.2/Server/x86_64/os/</url>
    </repository>
    <repository name="aeolus">
      <url>http://repos.fedorapeople.org/repos/aeolus/conductor/testing/6Server/x86_64/</url>
    </repository>
  </repositories>
  <packages>
    <package name="aeolus-audrey-agent"/>
  </packages>
  <description>RHEL 6.2 w/ Audrey Client</description>
</template>
  
Actual results:


Expected results:


Additional info:


rpm -qa|grep aeolus
aeolus-conductor-daemons-0.8.0-40.el6.noarch
aeolus-configure-2.5.0-17.el6.noarch
rubygem-aeolus-image-0.3.0-12.el6.noarch
aeolus-conductor-0.8.0-40.el6.noarch
rubygem-aeolus-cli-0.3.0-12.el6.noarch
aeolus-all-0.8.0-40.el6.noarch
aeolus-conductor-doc-0.8.0-40.el6.noarch
Comment 1 dgao 2012-03-06 11:30:47 EST
Just a note: I can only reproduce this bug when going from one service to multiple services. When going from multiple services to more services, ssh works fine.
Comment 2 dgao 2012-03-08 12:40:24 EST
After some initial testing, this looks like a timing issue. There are two observed stages after an instance launch is initiated in conductor:

1) The instance is powered on but not completely up (few or no System V services have started). At this stage, both the EC2 console and conductor report the instance as "running". This is probably a little premature.
2) The instance is fully up and all of the System V services have started. EC2 marks this as "2/2 status check complete".

Any ssh attempt after stage #1 will hang, but an ssh attempt after stage #2 should work fine.

However, we have observed cases where EC2 reports "2/2 status check complete" yet we get:

ssh: connect to host ec2-184-72-93-44.compute-1.amazonaws.com port 22: Connection refused

This error is consistent with what Shveta ran into in the original report. After waiting a few more minutes, further ssh attempts work properly.

So overall, I would recommend that the conductor and deltacloud devs make a change so that conductor does not flip the status to running until it receives "2/2 status check complete" from EC2.
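
A minimal sketch of the "wait until sshd actually answers" idea, in Python. It only checks that TCP port 22 accepts a connection; it does not query EC2 status checks and is not conductor or deltacloud code. The function names, retry count, and delay are assumptions chosen for illustration.

```python
import socket
import time


def port_is_open(host, port=22, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "Connection refused", timeouts, and unreachable hosts.
        return False


def wait_for_ssh(host, port=22, attempts=30, delay=10.0,
                 probe=port_is_open, sleep=time.sleep):
    """Poll until the port accepts connections; return False if it never does.

    probe and sleep are injectable (hypothetical parameters) so the loop
    can be exercised without a real instance.
    """
    for attempt in range(attempts):
        if probe(host, port):
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

A caller would treat the instance as usable only once wait_for_ssh(hostname) returns True. Polling of this kind helps only when sshd is slow to start; it cannot help if sshd never starts.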
Comment 3 wes hayutin 2012-03-08 12:45:32 EST
Waiting for EC2 instances to be ready for ssh access is a known issue with EC2 itself. We're simply waiting for the ssh daemon to launch even though the various EC2 tools report the instance as running.

The defect is with EC2.
Comment 4 Shveta 2012-03-12 10:00:55 EDT
This is seen with vSphere also. Re-opening the bug.
Comment 5 Rehana 2012-03-12 10:20:47 EDT
Steps to reproduce on vSphere:

1. Added a configserver for vSphere.
2. Built and pushed an Audrey-agent-enabled template to vSphere (see attachment: template).
3. Created two deployables, one with the "wordpress" services and one without them (see attachments: with_services, without_services).

Actual behaviour:

I was unable to ssh to the instance that has the services:

ssh root@10.10.77.107
ssh: connect to host 10.10.77.107 port 22: Connection refused

whereas I was able to ssh to the machine that doesn't have the services:

ssh root@10.10.77.117
The authenticity of host '10.10.77.117 (10.10.77.117)' can't be established.
RSA key fingerprint is 87:9d:e8:c6:51:b7:96:6d:a5:3f:eb:1a:e9:b8:8e:c4.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '10.10.77.117' (RSA) to the list of known hosts.
root@10.10.77.117's password: 
[root@dhcp77-117 ~]# hostname
dhcp77-117.rhndev.redhat.com
Comment 6 Rehana 2012-03-12 10:21:28 EDT
Created attachment 569417
template
Comment 7 Rehana 2012-03-12 10:22:11 EDT
Created attachment 569418
with_services
Comment 8 Rehana 2012-03-12 10:22:41 EDT
Created attachment 569419
without_services
Comment 9 wes hayutin 2012-03-13 11:23:52 EDT
I tried this scenario and it works for me. Closing as not a bug.
Comment 10 wes hayutin 2012-03-13 11:36:55 EDT
Sorry, this is an issue.
The IP is available, but you cannot ssh in.

This is a bug; tracking it down.
Comment 11 wes hayutin 2012-03-13 11:47:49 EDT
Root cause here:

[root@dhcp77-144 rc.d]# cd rc3.d/
[root@dhcp77-144 rc3.d]# ls
K01smartd       K89rdisc         S13cpuspeed    S26udev-post  S90crond
K10psacct       S01sysstat       S13irqbalance  S50vmtoolsd   S95atd
K10saslauthd    S02lvm2-monitor  S15mdmonitor   S55audrey     S97rhnsd
K50netconsole   S08ip6tables     S20kdump       S55sshd       S97rhsmcertd
K74ntpd         S08iptables      S22messagebus  S80postfix    S99local
K75ntpdate      S10network       S25netfs       S82abrt-ccpp
K75quota_nld    S11auditd        S26acpid       S82abrtd
K87restorecond  S12rsyslog       S26haldaemon   S82abrt-oops
[root@dhcp77-144 rc3.d]#
Comment 12 wes hayutin 2012-03-13 11:49:47 EDT
S55audrey and S55sshd have the same start value. sshd needs to start before audrey.
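
A small Python sketch of why the tie matters, assuming the runlevel's S* links are started in plain name order, so that equal priorities fall back to alphabetical order. The BEFORE_FIX names are a subset of the rc3.d listing above; AFTER_FIX is a hypothetical layout with audrey moved to priority 99.

```python
import re

# Subset of the rc3.d listing from comment 11.
BEFORE_FIX = ["K74ntpd", "S10network", "S12rsyslog", "S50vmtoolsd",
              "S55audrey", "S55sshd", "S80postfix", "S90crond"]

# Hypothetical layout once audrey is given a later start priority.
AFTER_FIX = ["K74ntpd", "S10network", "S12rsyslog", "S50vmtoolsd",
             "S55sshd", "S80postfix", "S90crond", "S99audrey"]


def start_order(names):
    """Return the S* links in name order, with the SNN prefix stripped."""
    starts = sorted(n for n in names if re.fullmatch(r"S\d\d.+", n))
    return [n[3:] for n in starts]


def starts_before(names, first, second):
    """True if service `first` is started earlier than service `second`."""
    order = start_order(names)
    return order.index(first) < order.index(second)


print(starts_before(BEFORE_FIX, "sshd", "audrey"))  # False: audrey sorts first
print(starts_before(AFTER_FIX, "sshd", "audrey"))   # True
```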
Comment 13 Greg Blomquist 2012-03-13 11:54:14 EDT
Thanks for tracking this down, Wes.

Assigning to Dan.
Comment 14 Dan Radez 2012-03-13 12:41:10 EDT
https://brewweb.devel.redhat.com/taskinfo?taskID=4147874
https://brewweb.devel.redhat.com/taskinfo?taskID=4147881

package version aeolus-audrey-agent-0.4.4-4
Comment 17 dgao 2012-03-14 14:41:23 EDT
[root@dhcp77-109 ~]# rpm -qa | grep "audrey"
aeolus-audrey-agent-0.4.4-4.el6.noarch
[root@dhcp77-109 ~]# head /etc/init.d/audrey 
#! /bin/sh
#
# chkconfig: 345 99 55
# description: The audrey agent.
# processname: audrey

# Source function library.
. /etc/init.d/functions

# Check that networking is up.
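
For reference, the "# chkconfig: 345 99 55" header above reads as runlevels 3, 4, and 5, start priority 99, stop priority 55, which puts audrey after sshd (S55). The sketch below parses such a header and derives the link names; the function names are hypothetical and this is not how chkconfig itself is implemented.

```python
import re


def parse_chkconfig(script_text):
    """Return runlevels, start and stop priority from a chkconfig header."""
    for line in script_text.splitlines():
        m = re.match(r"#\s*chkconfig:\s*(\S+)\s+(\d+)\s+(\d+)", line)
        if m:
            return {"runlevels": m.group(1),
                    "start": int(m.group(2)),
                    "stop": int(m.group(3))}
    return None


def link_names(service, header):
    """Build the S/K link names implied by the parsed priorities."""
    return ("S%02d%s" % (header["start"], service),
            "K%02d%s" % (header["stop"], service))


HEADER = """#! /bin/sh
#
# chkconfig: 345 99 55
# description: The audrey agent.
"""

info = parse_chkconfig(HEADER)
print(info)
print(link_names("audrey", info))
```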
Comment 18 errata-xmlrpc 2012-05-15 14:44:48 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHEA-2012-0669.html
