Bug 688608

Summary: /etc/rc.local is executed after ssh started, making some changes useless
Product: Red Hat Enterprise Linux 6
Reporter: Pierre Carrier <prc>
Component: releng
Assignee: Jay Greguske <jgregusk>
Status: CLOSED ERRATA
QA Contact: wes hayutin <whayutin>
Severity: urgent
Priority: urgent
Version: 6.1
CC: dmach, jfenal, kbidarka, mikeb, prc, sghai, syeghiay
Target Milestone: rc
Keywords: EC2
Hardware: All
OS: Linux
Fixed In Version: GA
Doc Type: Bug Fix
Last Closed: 2011-06-09 15:07:07 EDT
Bug Blocks: 700583
Attachments: Proposed patch (flags: none)

Description Pierre Carrier 2011-03-17 09:43:27 EDT
Description of problem:
- Changes to the ssh config and deployment of the EC2 ssh key are done through /etc/rc.local
- rc.local is run from /etc/rc.d/rc3.d/S99local
- That's after /etc/rc.d/rc3.d/S55sshd
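
The ordering described above can be checked on a running instance. The following is a minimal sketch, not part of the kickstart: start_order is a hypothetical helper name, and it only relies on rc running the SNN links of a runlevel directory in lexical order.

```shell
# Print the start links of the named services in the order rc runs them.
# Usage: start_order <rc directory> <service> [<service> ...]
start_order() {
    rc_dir=$1
    shift
    for name in "$@"; do
        # Start links are named S<two digits><service>.
        for link in "$rc_dir"/S[0-9][0-9]"$name"; do
            if [ -e "$link" ]; then
                basename "$link"
            fi
        done
    done | LC_ALL=C sort
}
```

For example, "start_order /etc/rc.d/rc3.d sshd local" lists the sshd link before the local link when sshd is started first.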

Conclusion:
- ssh clients return immediately at first, because the connection is refused while the instance is still booting
- Then sshd starts and clients are prompted for a password even if the private key is provided
- Then the key is made available and following connection attempts succeed directly
- The sshd_config changes are not taken into account by sshd

Version-Release number of selected component (if applicable):
- http://cvs.devel.redhat.com/cgi-bin/cvsweb.cgi/utility-scripts/cloud/kickstarts/RHEL-5.5-starter-ec2.ks?rev=1.14;content-type=text%2Fplain;cvsroot=RH-RelEng;f=h
- Apparently that would be the currently released RHEL5.5 image

How reproducible:
100%

Steps to Reproduce:
1. Boot an instance
2. Try to ssh every few seconds, with -i $key
  
Actual results:
- Failure for the first attempts
- Then password prompt
- Finally, success without a password prompt

Expected results:
- Failure for the first attempts
- Then success without a password prompt

Additional info:
A patch for this ks file is attached. Please note that it has not been tested.
Comment 1 Pierre Carrier 2011-03-17 09:44:27 EDT
Created attachment 486015 [details]
Proposed patch
Comment 3 wes hayutin 2011-03-21 11:09:04 EDT
Adding Jay to the bug
Comment 4 Jay Greguske 2011-03-28 12:15:41 EDT
Patch applied to the 5.5, 5.6, 6.0, and 6.1 kickstart files.

For 6.1 beta the fixes should be evident in snapshot 2.
Comment 5 wes hayutin 2011-03-28 20:35:07 EDT
Jay, what's a good way to test for the patch?
Comment 6 Jay Greguske 2011-03-29 08:32:00 EDT
Tough question. As soon as the instance comes back as "running" from AWS and not "pending" you have to slam it with ssh connections hoping for a password prompt even if you supply it with the private key.

I don't think you can test this easily from within the instance.
Comment 7 Jay Greguske 2011-04-01 15:26:22 EDT
After some review I'm not a fan of this patch. There is still a race condition: if sshd completes before the ec2config "service" completes you'll still have this problem. If you put '-o "PreferredAuthentications publickey"' in the ssh command you won't get the password prompt.
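
A minimal sketch of how that option could be used in a retry loop, so that an early attempt fails outright instead of stopping at a password prompt. The function name, the retry counts, and the SSH_BIN override (there only so the loop can be exercised without a network) are assumptions; the first-connection host key prompt is not handled here.

```shell
# Retry a key-only login until it succeeds or the attempts run out.
# Usage: wait_for_key_login <host> <private key> [<tries>] [<delay seconds>]
wait_for_key_login() {
    host=$1 key=$2 tries=${3:-30} delay=${4:-5}
    n=0
    while [ "$n" -lt "$tries" ]; do
        # With only publickey allowed, ssh exits non-zero instead of
        # falling back to a password prompt when the key is not yet installed.
        if "${SSH_BIN:-ssh}" -i "$key" \
            -o PreferredAuthentications=publickey \
            -o ConnectTimeout=5 \
            "root@$host" true; then
            return 0
        fi
        n=$((n + 1))
        sleep "$delay"
    done
    return 1
}
```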
Comment 8 Mike Bonnet 2011-04-01 15:33:14 EDT
Also, your comment about the sshd_config changes not taking effect isn't correct.  sshd_config and firstboot changes are being executed during the kickstart %post, which happens at image creation time.  So when the image boots for the first time, those changes are already there.  Moving those changes into a service script is not correct.
Comment 9 Pierre Carrier 2011-04-03 17:24:11 EDT
(In reply to comment #7)
> After some review I'm not a fan of this patch. There is still a race condition:
> if sshd completes before the ec2config "service" completes you'll still have
> this problem.

A race condition between initscripts? I *really* don't see how this could happen...

ec2config is S54 and sshd is S55.
AFAIK we do not parallelize anything.

> If you put '-o "PreferredAuthentications publickey"' in the ssh
> command you won't get the password prompt.

Yes, this workaround was already provided to the original bug reporter.

(In reply to comment #8)
> Also, your comment about the sshd_config changes not taking effect isn't
> correct.  sshd_config and firstboot changes are being executed during the
> kickstart %post, which happens at image creation time.  So when the image boots
> for the first time, those changes are already there.  Moving those changes into
> a service script is not correct.

I didn't know about initscripts being executed then, in which case you can effectively drop the remark.

However, this makes me wonder: if this script is executed during the kickstart and not the first boot, how come we have this ssh key issue?

-- 
Pierre
Comment 10 Mike Bonnet 2011-04-04 10:21:39 EDT
(In reply to comment #9)
> (In reply to comment #8)
> I didn't know about initscripts being executed then, in which case you can
> effectively drop the remark.

The initscripts are not being executed in %post, only the script in the kickstart.

Take a look at the original kickstart linked above.  Everything between %post and %end is a script that gets executed at image creation time.  It edits grub.conf, sets up fstab, appends to sshd_config, etc.  All of these changes are made to the image when it's created, before it has ever been booted.  The script also appends *another* script to /etc/rc.local.  That second script is executed at boot time, and is responsible for downloading the ssh key and installing it to /root/.ssh/authorized_keys.

> However, this makes me wonder: if this script is executed during the kickstart
> and not the first boot, how come we have this ssh key issue?

The script that downloads the ssh keys needs to be executed at boot time, not image creation time, because the key is coming from the metadata associated with that particular instance.  Each instance booted from the same disk image will have its own metadata, including its own ssh key.  EC2 users can set the ssh key associated with an instance via the web UI, and that user-specific key will be downloaded and installed into the instance when it is first booted, by the script in /etc/rc.local.  If we installed an ssh key at image creation time, then everyone would need to use the same ssh private key to access all Red Hat images, which is obviously not appropriate.
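
For illustration only, the boot-time step could look roughly like the sketch below. This is not the script from the kickstart: install_key is a hypothetical name, and the metadata URL in the trailing comment is an assumption about the EC2 metadata service rather than something quoted from the ks file.

```shell
# Install a public key read from stdin into <home>/.ssh/authorized_keys.
# The key is appended only if the exact line is not already present,
# so running this on every boot does not duplicate it.
install_key() {
    ssh_dir=$1/.ssh
    key=$(cat)
    [ -n "$key" ] || return 1
    mkdir -p "$ssh_dir" && chmod 700 "$ssh_dir" || return 1
    touch "$ssh_dir/authorized_keys"
    if ! grep -qxF -- "$key" "$ssh_dir/authorized_keys"; then
        printf '%s\n' "$key" >> "$ssh_dir/authorized_keys"
    fi
    chmod 600 "$ssh_dir/authorized_keys"
}

# Hypothetical boot-time invocation (URL is an assumption):
# curl -fs http://169.254.169.254/latest/meta-data/public-keys/0/openssh-key | install_key /root
```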

All that being said, an answer to this issue is to set "PasswordAuthentication no" in sshd_config at image creation time.
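
A rough sketch of what that change could look like in a %post-style script, assuming the config file path is passed in. disable_password_auth is a hypothetical helper, not the change that was actually committed. Because sshd uses the first value it obtains for a keyword, the sketch comments out any active PasswordAuthentication line before appending the new one; it only handles the keyword spelled with this capitalization.

```shell
# Force "PasswordAuthentication no" in the given sshd_config file.
disable_password_auth() {
    conf=$1
    tmp=$(mktemp) || return 1
    # Comment out active PasswordAuthentication lines; already-commented
    # lines do not match because the pattern is anchored at line start.
    sed 's/^[[:space:]]*PasswordAuthentication[[:space:]]/#&/' "$conf" > "$tmp" || return 1
    echo 'PasswordAuthentication no' >> "$tmp"
    # Write back through cat so the original file keeps its owner and mode.
    cat "$tmp" > "$conf" && rm -f "$tmp"
}
```

At image creation time this would be called as "disable_password_auth /etc/ssh/sshd_config".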
Comment 11 Pierre Carrier 2011-04-04 16:02:04 EDT
(In reply to comment #10)
> Take a look at the original kickstart linked above.  Everything between %post
> and %end is a script that gets executed at image creation time.  It edits
> grub.conf, sets up fstab, appends to sshd_config, etc.  All of these changes
> are made to the image when it's created, before it has ever been booted.  The
> script also appends *another* script to /etc/rc.local.  That second script is
> executed at boot time, and is responsible downloading the ssh key and
> installing it to /root/.ssh/authorized_keys.

Sorry, I got completely confused by my patch: I didn't realise it was merging the scriptlets with the initscript. We definitely should only keep the ssh key operation in the initscript.

> The script that downloads the ssh keys needs to be executed at boot time, not
> image creation time, because the key is coming from the metadata associated
> with that particular instance. [...]

Back to my original understanding on this :)

> All that being said, an answer to this issue is to set "PasswordAuthentication
> no" in sshd_config at image creation time.

It would have more effects than simply grabbing the key before starting ssh.
Sure "PasswordAuthentication no" is a good policy, but can we change it?
And given that some admins might want to have "PasswordAuthentication yes", shouldn't we also fix the boot order anyway?

-- 
Pierre
Comment 12 Mike Bonnet 2011-04-04 16:15:06 EDT
(In reply to comment #11)
> (In reply to comment #10)
> It would have more effects than simply grabbing the key before starting ssh.
> Sure "PasswordAuthentication no" is a good policy, but can we change it?
> And given that some admins might want to have "PasswordAuthentication yes",
> shouldn't we also fix the boot order anyway?

This really only affects how you access the instance the very first time it boots.  At that point there are no local users created, and root has no password set, so ssh keys are the only possible way to access the instance.  Setting "PasswordAuthentication no" would prevent ssh clients from hanging at password prompts, which I believe was the original complaint.  Once the admin is able to log in as root, they can change the ssh config in whatever way they'd like.  Of course, this change to the default would have to be documented to avoid surprising admins who create new users and expect them to be able to log in with their password.

If you'd really rather pursue the initscript approach I'm not going to object, but it does seem more complicated, for only minor improvement in functionality.
Comment 13 Jay Greguske 2011-04-20 12:13:52 EDT
I'm going to work towards the "PasswordAuthentication no" approach. Any objections? If a customer wants it to be yes, they can log in and change it in their instance easily. Our images do not have valid passwords in the first place anyway.
Comment 14 Pierre Carrier 2011-04-22 10:01:28 EDT
No objection from me.

I would still prefer putting the keys in place before we start sshd; it makes more sense to me not to offer a service until it's actually usable. But I've given my opinion more than enough times, so I'll keep quiet now.


Thanks!

-- 
Pierre Carrier
Comment 16 RHEL Product and Program Management 2011-04-29 02:00:15 EDT
Since RHEL 6.1 External Beta has begun, and this bug remains
unresolved, it has been rejected as it is not proposed as
exception or blocker.

Red Hat invites you to ask your support representative to
propose this request, if appropriate and relevant, in the
next release of Red Hat Enterprise Linux.
Comment 18 wes hayutin 2011-05-25 11:20:42 EDT
Connection to ec2-122-248-207-169.ap-southeast-1.compute.amazonaws.com closed.
[whayutin@minidoe ~]$ ssh -i ~/.ec2/WESHAYUTIN/asia-cloudekey.pem root@ec2-122-248-207-169.ap-southeast-1.compute.amazonaws.com
Last login: Wed May 25 10:53:06 2011 from 99.39.212.236
[root@ip-10-130-43-97 ~]# cat /etc/ssh/sshd_config  | grep -i password
# To disable tunneled clear text passwords, change to no here!
#PasswordAuthentication yes
#PermitEmptyPasswords no
PasswordAuthentication no
# Change to no to disable s/key passwords
# PasswordAuthentication.  Depending on your PAM configuration,
# the setting of "PermitRootLogin without-password".
# PAM authentication, then enable this but set PasswordAuthentication
PermitRootLogin without-password
[root@ip-10-130-43-97 ~]# 



adding test to test scripts..
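
A check along these lines could be scripted as in the sketch below. It is not the test that was actually added: effective_password_auth is a hypothetical name, Match blocks are ignored, and the fallback of "yes" assumes the OpenSSH default when the keyword is absent.

```shell
# Print the PasswordAuthentication value sshd would use from the given file:
# the first uncommented occurrence wins, and "yes" is assumed if there is none.
effective_password_auth() {
    awk 'tolower($1) == "passwordauthentication" { print tolower($2); found = 1; exit }
         END { if (!found) print "yes" }' "$1"
}
```

Run against the file shown above, "effective_password_auth /etc/ssh/sshd_config" should print "no" on a fixed image.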
Comment 19 errata-xmlrpc 2011-06-09 15:07:07 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHEA-2011-0540.html