Bug 699428 - slow connection to f15 beta sshd
Summary: slow connection to f15 beta sshd
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Fedora
Classification: Fedora
Component: openssh
Version: 15
Hardware: Unspecified
OS: Unspecified
unspecified
unspecified
Target Milestone: ---
Assignee: Jan F. Chadima
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2011-04-25 15:38 UTC by Carl Byington
Modified: 2011-04-28 16:15 UTC (History)
4 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-04-28 16:15:29 UTC
Type: ---
Embargoed:


Attachments (Terms of Use)

Description Carl Byington 2011-04-25 15:38:39 UTC
Description of problem:
Connecting from centos5 openssh-clients-4.3p2-72.el5_6.3 to f15 openssh-server-5.6p1-30.fc15.x86_64, the initial prompt is slow to appear. This does not seem to be the common 'no reverse dns for client machine' problem.

Version-Release number of selected component (if applicable):
openssh-server-5.6p1-30.fc15.x86_64

How reproducible:
always

Steps to Reproduce:
1. ssh -v media2
2.
3.
  
Actual results:
multi second delays after two lines in the debug log:
debug1: read PEM private key done: type RSA
debug1: Sending env LANG = en_US.UTF-8

Expected results:
no delays.

Additional info:
Two machines (centos5 and media2(f15)), both with proper reverse dns names for their ipv4 addresses, connecting over ipv4. 

From centos5, ssh -v media2 has two multi-second delays as above. 
From media2, ssh -v centos5 has no such delays.

In media2:/var/log/secure we have:
Apr 25 08:28:57 media2 sshd[7812]: Accepted publickey for root from 192.168.3.2 port 57646 ssh2
Apr 25 15:28:57 media2 sshd[7813]: fatal: mm_request_receive: read: Connection reset by peer
Apr 25 08:29:02 media2 sshd[7812]: pam_systemd(sshd:session): Moving new user session for root into control group /user/root/367.
Apr 25 08:29:02 media2 sshd[7812]: pam_unix(sshd:session): session opened for user root by (uid=0)

It looks like some sshd process is running in UTC, where everything else is in PDT.

Comment 1 Tomas Mraz 2011-04-27 10:40:33 UTC
(In reply to comment #0)
> 
> In media2:/var/log/secure we have:
> Apr 25 08:28:57 media2 sshd[7812]: Accepted publickey for root from 192.168.3.2
> port 57646 ssh2
> Apr 25 15:28:57 media2 sshd[7813]: fatal: mm_request_receive: read: Connection
> reset by peer
> Apr 25 08:29:02 media2 sshd[7812]: pam_systemd(sshd:session): Moving new user
> session for root into control group /user/root/367.
> Apr 25 08:29:02 media2 sshd[7812]: pam_unix(sshd:session): session opened for
> user root by (uid=0)
> 
> It looks like some sshd process is running in UTC, where everything else is in
> PDT.

The UTC time in the syslog is a regression of an old bug - the /etc/localtime should be bind mounted into the unpriviledged child sshd chroot.

But does the fatal message appear on all connection attempts to the F15 sshd?

Comment 2 Jan F. Chadima 2011-04-27 10:49:35 UTC
please create /var/empty/sshd/etc; and make hardlink /etc/localtime into that
new dir (or copy it if it is different fs). This probably repair the timezones,
but probably not delays.
provide the same log if possible, but running sshd thru strace -fff -o ....
-ttt
(maximum f and maximum t)
Thx

Comment 3 Carl Byington 2011-04-27 17:35:29 UTC
mkdir /var/empty/sshd/etc
cd /var/empty/sshd/etc
ln /etc/localtime localtime
kill -HUP 5539 #restart sshd

Apr 27 10:27:38 media2 sshd[31469]: Accepted publickey for root from 192.168.3.2 port 58298 ssh2
Apr 27 10:27:38 media2 sshd[31470]: fatal: mm_request_receive: read: Connection reset by peer
Apr 27 10:27:43 media2 sshd[31469]: pam_systemd(sshd:session): Moving new user session for root into control group /user/root/762.
Apr 27 10:27:43 media2 sshd[31469]: pam_unix(sshd:session): session opened for user root by (uid=0)

Yes, the fatal: mm_request... appears on every ssh connection from centos5->media2.

What is the cleanest way to run sshd under strace on media2? Modify /etc/rc.d/init.d/sshd and 'service sshd restart', or something else? I only have remote ssh access to media2 - the machine is physically hours away (but close to centos5 for which I also only have ssh access).

Comment 4 Jan F. Chadima 2011-04-27 17:39:25 UTC
service sshd stop
strace -ff -o my_trace -ttt /usr/sbin/sshd

You may add -d -d -d   as an argument to the end for getting extra debug

Comment 5 Carl Byington 2011-04-27 18:28:27 UTC
strace -fff -o my_trace -ttt /usr/sbin/sshd

Apr 27 11:13:42 media2 sshd[32258]: Server listening on 0.0.0.0 port 22.
Apr 27 11:13:42 media2 sshd[32258]: Server listening on :: port 22.
Apr 27 11:14:41 media2 sshd[32270]: Accepted publickey for root from 192.168.3.2 port 56576 ssh2
Apr 27 11:14:41 media2 sshd[32271]: fatal: mm_request_receive: read: Connection reset by peer
Apr 27 11:14:46 media2 sshd[32270]: pam_systemd(sshd:session): Moving new user session for root into control group /user/root/771.
Apr 27 11:14:46 media2 sshd[32270]: pam_unix(sshd:session): session opened for user root by (uid=0)

trace files in http://www.five-ten-sg.com/699428.tar.gz

Comment 6 Tomas Mraz 2011-04-27 21:26:40 UTC
Note that the strace files probably contain parts of the media2 server private host keys if you did not explicitly sanitized them - I did not look at them.

Comment 7 Jan F. Chadima 2011-04-28 04:58:09 UTC
you may delete the traces from the internet. thans

Comment 8 Jan F. Chadima 2011-04-28 05:07:01 UTC
I've found there DNS timeouts cca 5 sec longs. Note that rhel5's openssh and f15's openssh handles DNS differently.

Comment 9 Carl Byington 2011-04-28 14:18:10 UTC
After looking at tcpdump for dns packets on media2, and the intermediate routers between media2 and the dns server, something is causing some (but not all) of the AAAA queries to be dropped before they arrive at the dns server, leading to 5 second delays before the dns resolver on media2 retries the query. Clearly not a problem with sshd on media2.

Comment 10 Jan F. Chadima 2011-04-28 16:15:29 UTC
thx, closing as notabug


Note You need to log in before you can comment on or make changes to this bug.