Bug 800251
Summary: Instance launched with http service (through configserver) doesn't allow to ssh into ec2 instance
Product: [Retired] CloudForms Cloud Engine
Component: aeolus-audrey-agent
Version: 1.0.0
Status: CLOSED ERRATA
Severity: medium
Priority: high
Reporter: Shveta <ssachdev>
Assignee: Dan Radez <dradez>
QA Contact: wes hayutin <whayutin>
CC: akarol, cpelland, deltacloud-maint, dgao, dradez, hbrock, jrd, redakkan, ssachdev, whayutin
Target Milestone: beta5
Keywords: Reopened, Triaged
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Last Closed: 2012-05-15 18:44:48 UTC

Description
Shveta
2012-03-06 06:43:57 UTC
Just a note: I can only reproduce this bug when going from one service to multiple services. When going from multiple services to more services, ssh works fine.

After some initial testing, this looks like a timing issue. There are two observed stages after initiating an instance launch in Conductor:

1) The instance is powered on but not completely up (none or only a few of the SysV services have started). At this stage, both the EC2 console and Conductor report the instance as "running". This is probably a little premature.
2) The instance is fully up and all of the SysV services have started. EC2 marks this as "2/2 status check complete".

Any ssh attempt after stage #1 will result in a hang, but an ssh attempt after stage #2 should work fine. However, we have observed cases where EC2 reports "2/2 status check complete" yet we get:

    ssh: connect to host ec2-184-72-93-44.compute-1.amazonaws.com port 22: Connection refused

This error is consistent with what Shveta ran into in comment #1. After waiting a few more minutes, further ssh attempts work properly.

So overall, I would recommend that the Conductor and Deltacloud devs make a change so that Conductor does not flip the status to "running" until it receives "2/2 status check complete" from EC2.

Waiting for EC2 instances to be ready for ssh access is a known issue with EC2 itself. We are simply waiting for the ssh daemon to launch even though the various EC2 tools report that the instance is running. The defect is with EC2.

This is seen with vSphere also. Re-opening the bug.

Steps to reproduce on vSphere:
1. Added a configserver for vSphere.
2. Built and pushed an Audrey agent enabled template to vSphere (PFA: template).
3. Created two deployables, one with "wordpress" services and the other without those services (PFA: with_services, without_services).

Actual behaviour:
I was unable to ssh to the instance that has the services:

    ssh root@10.10.77.107
    ssh: connect to host 10.10.77.107 port 22: Connection refused

whereas I was able to ssh to the machine that doesn't have the services:

    ssh root@10.10.77.117
    The authenticity of host '10.10.77.117 (10.10.77.117)' can't be established.
    RSA key fingerprint is 87:9d:e8:c6:51:b7:96:6d:a5:3f:eb:1a:e9:b8:8e:c4.
    Are you sure you want to continue connecting (yes/no)? yes
    Warning: Permanently added '10.10.77.117' (RSA) to the list of known hosts.
    root@10.10.77.117's password:
    [root@dhcp77-117 ~]# hostname
    dhcp77-117.rhndev.redhat.com

Created attachment 569417 [details]
template
Created attachment 569418 [details]
with_services
Created attachment 569419 [details]
without_services
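The timing problem described in the earlier comments (the provider reports "running" before sshd accepts connections) can be worked around on the client side by polling port 22 before attempting ssh. The following is a minimal Python sketch under that assumption; the function name `wait_for_port` and its default timeouts are hypothetical and not part of Conductor, Deltacloud, or Audrey. A successful TCP connect only shows that something is listening on the port, not that an ssh login will succeed.

```python
import socket
import time


def wait_for_port(host, port=22, timeout=300.0, interval=5.0):
    """Return True once a TCP connection to host:port succeeds.

    Returns False if no connection succeeds before `timeout` seconds
    have elapsed. `interval` is both the per-attempt connect timeout
    and the pause between attempts.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError on "Connection refused"
            # and on a connect timeout (the "hang" case).
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A caller would wait on the instance address first, for example `wait_for_port("10.10.77.107")`, and only then start the ssh session.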
I tried this scenario and it works for me. Closing as not a bug.

Sorry, this is an issue. The IP is available, but you cannot ssh in. This is a bug. Tracking down the root cause here:

    [root@dhcp77-144 rc.d]# cd rc3.d/
    [root@dhcp77-144 rc3.d]# ls
    K01smartd       K89rdisc         S13cpuspeed    S26udev-post  S90crond
    K10psacct       S01sysstat       S13irqbalance  S50vmtoolsd   S95atd
    K10saslauthd    S02lvm2-monitor  S15mdmonitor   S55audrey     S97rhnsd
    K50netconsole   S08ip6tables     S20kdump       S55sshd       S97rhsmcertd
    K74ntpd         S08iptables      S22messagebus  S80postfix    S99local
    K75ntpdate      S10network       S25netfs       S82abrt-ccpp
    K75quota_nld    S11auditd        S26acpid       S82abrtd
    K87restorecond  S12rsyslog       S26haldaemon   S82abrt-oops
    [root@dhcp77-144 rc3.d]#

S55audrey and S55sshd have the same start value. sshd needs to start before audrey.

Thanks for tracking this down, Wes. Assigning to Dan.

https://brewweb.devel.redhat.com/taskinfo?taskID=4147874
https://brewweb.devel.redhat.com/taskinfo?taskID=4147881

Package version: aeolus-audrey-agent-0.4.4-4

    [root@dhcp77-109 ~]# rpm -qa | grep "audrey"
    aeolus-audrey-agent-0.4.4-4.el6.noarch
    [root@dhcp77-109 ~]# head /etc/init.d/audrey
    #! /bin/sh
    #
    # chkconfig: 345 99 55
    # description: The audrey agent.
    # processname: audrey
    # Source function library.
    . /etc/init.d/functions
    # Check that networking is up.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHEA-2012-0669.html
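The ordering issue diagnosed in this report (S55audrey and S55sshd sharing a start priority, with the fixed init script declaring `chkconfig: 345 99 55`) can be checked mechanically. The following is a rough Python sketch, not part of the Audrey agent; the helper names `parse_chkconfig`, `start_priorities`, and `starts_after` are hypothetical. It reads the start priority from a `# chkconfig:` header and compares start priorities taken from rc directory entry names.

```python
import re


def parse_chkconfig(script_text):
    """Return (runlevels, start_priority, stop_priority) from a
    '# chkconfig: <levels> <start> <stop>' header line, or None."""
    match = re.search(
        r"^#[ \t]*chkconfig:[ \t]*(\S+)[ \t]+(\d+)[ \t]+(\d+)",
        script_text,
        re.MULTILINE,
    )
    if match is None:
        return None
    return match.group(1), int(match.group(2)), int(match.group(3))


def start_priorities(rc_entries):
    """Map service name -> start priority for entries such as 'S55sshd'.
    Kill entries ('K..') and anything else are ignored."""
    priorities = {}
    for entry in rc_entries:
        match = re.fullmatch(r"S(\d\d)(.+)", entry)
        if match:
            priorities[match.group(2)] = int(match.group(1))
    return priorities


def starts_after(rc_entries, later, earlier):
    """True only if `later` has a strictly higher start priority than
    `earlier`. Raises KeyError if either service has no S entry."""
    priorities = start_priorities(rc_entries)
    return priorities[later] > priorities[earlier]
```

With the entries from the listing in this report, `starts_after(["S55audrey", "S55sshd"], "audrey", "sshd")` is false, which matches the diagnosis; with a start priority of 99 for audrey the same check passes.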