Bug 1296305
Summary: | scp hangs to other fedora hosts | |
---|---|---|---
Product: | [Fedora] Fedora | Reporter: | Steve Dickson <steved>
Component: | openssh | Assignee: | Jakub Jelen <jjelen>
Status: | CLOSED DUPLICATE | QA Contact: | Fedora Extras Quality Assurance <extras-qa>
Severity: | unspecified | Docs Contact: |
Priority: | unspecified | |
Version: | 23 | CC: | jjelen, mattdm, mattias.ellert, mgrepl, plautrba, steved, tmraz
Target Milestone: | --- | |
Target Release: | --- | |
Hardware: | Unspecified | |
OS: | Unspecified | |
Whiteboard: | | |
Fixed In Version: | | Doc Type: | Bug Fix
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2016-01-11 17:19:29 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |

Attachments:
Created attachment 1112245 [details]
debuging for scp on rawhide
The openssh versions:

    openssh-7.1p1-6.fc23.x86_64
    openssh-7.1p1-6.fc24.x86_64

What happens if f23-server is localhost? (Just to make a nice one-system reproducer.)

Hi Steve.
Can you try to make f23 or rawhide reproducer as proposed by Matthew? I tried to reproduce it with my machine and virtual machines I have around but without any success (f23 -> f23).

Weird enough that I can't see anything unexpected in the logs. It looks like the server is waiting for the data from client. I would expect message

    Sending file modes: C0644 19516 passwd

Can you give it a try by hand if normal command execution works? For example

    ssh rawhide "read X; echo test: \$X"

Are you aware of some special configuration on your client? Server seems to be quite standard. Does it behave the same with different user (newly created). Do you have any profile scripts?

(In reply to Matthew Miller from comment #3)
> What happens if f23-server is localhost? (Just to make a nice one-system
> reproducer.)

The same thing. I'll attach the client debug.

Created attachment 1113198 [details]
log of 'scp -vvvvvv /etc/passwd localhost:/tmp'
(In reply to Jakub Jelen from comment #4)
> Hi Steve.
> Can you try to make f23 or rawhide reproducer as proposed by Matthew? I
> tried to reproduce it with my machine and virtual machines I have around but
> without any success (f23 -> f23).
>
> Weird enough that I can't see anything unexpected in the logs. It looks like
> the server is waiting for the data from client. I would expect message
>
> Sending file modes: C0644 19516 passwd
>
> Can you give it a try by hand if normal command execution works? For example
>
> ssh rawhide "read X; echo test: \$X"

The command hangs:

    fedora$ ssh fedora "read X; echo test: \$X"
    steved@fedora's password:

but two strange things are happening
1) I'm asked for my password (which should not happen)
2) the title of the terminal changes to "fedora not a tty"

> Are you aware of some special configuration on your client? Server seems to
> be quite standard. Does it behave the same with different user (newly
> created). Do you have any profile scripts?

No. actually I removed the openssh-client rpm completely and reinstalled it.

wireshark traces from both rawhide client and f23 server are under
http://people.redhat.com/steved/.bz1296305/
Note: packet 37 is where the hang occurs in both traces.

>> Does it behave the same with different user (newly created)?
> No. actually I removed the openssh-client rpm completely and reinstalled it.

I meant the files /etc/profile, /etc/profile.d/*, ~/.bash_profile ~/.bashrc or respective to your shell? Any changes?

What is the shell for your user? This really looks like some problem with initialization files.

Rich,

Did you try disabling NIC's offload engines? GSO, LRO, SG, etc.?

- RF

(In reply to Jakub Jelen from comment #9)
> >> Does it behave the same with different user (newly created)?
>
> > No. actually I removed the openssh-client rpm completely and
> reinstalled it.
> I meant the files /etc/profile, /etc/profile.d/*, ~/.bash_profile ~/.bashrc
> or respective to your shell? Any changes?

Nope! Although I do have a ~/.bashrc that I have been using for years...

> What is the shell for your user?

bash

(In reply to Rodrigo A B Freire from comment #10)
> Rich,
>
> Did you try disabling NIC's offload engines?

No, how do I do that?

(In reply to Steve Dickson from comment #12)
> (In reply to Rodrigo A B Freire from comment #10)
> > Rich,
> >
> > Did you try disabling NIC's offload engines?
> No, how do I do that?

Hi Steve!

NOTICE: This is a RHEL6 system ok!

1. Enumerate the offload engines that are currently active in your interface:

    [root@rf ~]# ethtool -k eth0 | grep \ on
    rx-checksumming: on
    tx-checksumming: on
    scatter-gather: on
    tcp-segmentation-offload: on
    generic-segmentation-offload: on

2. Disable the offload engines that are active (use man ethtool for the engine names). For my scenario:

    [root@rf ~]# ethtool -K eth0 rx off tx off sg off tso off gso off

3. Ensure to do that on both hosts!

4. Redo your tests!

Good luck!

Steve,
once more, can you give it a try with newly created user? I believe you were using your .bashrc for years, but it is not a proof that there is nothing wrong in it. Bash might have introduced some non-compatible change, regression or whatever.
Otherwise I will have to ask you for a strace of server.

(In reply to Rodrigo A B Freire from comment #13)
> (In reply to Steve Dickson from comment #12)
> > (In reply to Rodrigo A B Freire from comment #10)
> > > Rich,
> > >
> > > Did you try disabling NIC's offload engines?
> > No, how do I do that?
>
> Hi Steve!
>
> NOTICE: This is a RHEL6 system ok!
>
> 1. Enumerate the offload engines that are currently active in your interface:
>
> [root@rf ~]# ethtool -k eth0 | grep \ on
> rx-checksumming: on
> tx-checksumming: on
> scatter-gather: on
> tcp-segmentation-offload: on
> generic-segmentation-offload: on
>
>
> 2. Disable the offload engines that are active (use man ethtool for the
> engine names). For my scenario:
> [root@rf ~]# ethtool -K eth0 rx off tx off sg off tso off gso off
>
> 3. Ensure to do that on both hosts!

Thanks for the info... I didn't know ethtool could do this sort of thing... but... it turns out that removing my .bashrc file on the f23 server causes the rawhide client to succeed. Which means this bz is a duplicate of bz1271394

Again, thanks for the cycles!

(In reply to Jakub Jelen from comment #14)
> Steve,
> once more, can you give it a try with newly created user? I believe you were
> using your .bashrc for years, but it is not a proof that there is nothing
> wrong in it. Bash might have introduced some non-compatible change,
> regression or whatever.
> Otherwise I will have to ask you for a strace of server.

It is my .bashrc... doing a 'mv .bashrc .bashrc.old' on the f23 server allows the rawhide client to work. Since this is very reproducible, is there anything I can do to help debug this?

It depends if it is the problem with the .bashrc itself (wrong permissions, SELinux context?) or something in it.

Certainly I would check the permissions

    ls -lZ ~/.bashrc

Then you can either share your .bashrc and I can try to reproduce on my box, or you can rather bisect which part is causing the trouble (hang?, wait for input? Expect TTY?).

You can give it a try with set -x and then source your ~/.bashrc or if you can reproduce it also with bash -l

(In reply to Jakub Jelen from comment #17)
> It depends if it is the problem with the .bashrc itself (wrong permissions,
> SELinux context?) or something in it.
>
> Certainly I would check the permissions ls -lZ ~/.bashrc

    f23$ ls -lZ ~/.bashrc
    -rw-r--r--. 1 steved 1000 unconfined_u:object_r:user_home_t:s0 2844 Dec 12 07:00 /home/steved/.bashrc

> Then you can either share your .bashrc and I can try to reproduce on my box,
> or you can rather bisect which part is causing the trouble (hang?, wait
> for input? Expect TTY?).

I'll attach it

> You can give it a try with set -x and then source your ~/.bashrc

I'll attach that as well.

> or if you can reproduce it also with bash -l

No.
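
For reference, the ethtool steps quoted above can be sketched as a small dry-run helper. This is only a sketch under stated assumptions, not something posted in this bug: the function name `offload_off_cmd` is hypothetical, it only knows the five feature names listed in the comment above, and it prints the `ethtool -K` command instead of running it.

```shell
#!/bin/sh
# Hypothetical helper (not from this bug): read `ethtool -k` output on stdin
# and print the `ethtool -K` command that would turn off every listed engine
# that is currently "on". Nothing is executed; the command is only printed.
offload_off_cmd() {
    iface=$1
    args=""
    while IFS= read -r line; do
        # Map the long names printed by `ethtool -k` to the short names
        # used in the comment above. Only these five are handled.
        case $line in
            "rx-checksumming: on")              args="$args rx off" ;;
            "tx-checksumming: on")              args="$args tx off" ;;
            "scatter-gather: on")               args="$args sg off" ;;
            "tcp-segmentation-offload: on")     args="$args tso off" ;;
            "generic-segmentation-offload: on") args="$args gso off" ;;
        esac
    done
    # Nothing to disable: print nothing and report failure.
    [ -n "$args" ] || return 1
    printf 'ethtool -K %s%s\n' "$iface" "$args"
}
```

Something like `ethtool -k eth0 | offload_off_cmd eth0` would then print a command to review and run as root on both hosts. Features reported with extra annotations, or under other names, are deliberately ignored by this sketch.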
Created attachment 1113616 [details]
my ~/.bashrc
Created attachment 1113617 [details]
The output from adding a 'set -x' to the top of my ~/.bashrc
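
One way to narrow down a startup file like this is to check whether it prints anything when sourced by a non-interactive shell, which is the situation scp puts the remote shell in. The following is a minimal sketch, assuming bash is installed; the function name `rc_is_quiet` is hypothetical and not part of this bug.

```shell
#!/bin/sh
# Hypothetical check (not from this bug): succeed only if sourcing the given
# file in a non-interactive bash writes nothing to stdout.
# --norc/--noprofile keep the system and user startup files out of the test,
# so only the file passed as $1 is exercised.
rc_is_quiet() {
    # wc -c counts every byte, including a lone newline that command
    # substitution on the script output itself would have stripped.
    [ $(bash --norc --noprofile -c '. "$1"' _ "$1" </dev/null 2>/dev/null | wc -c) -eq 0 ]
}
```

Running `rc_is_quiet ~/.bashrc && echo quiet || echo noisy` on the server, and then again on copies of the file with parts commented out, is one way to do the bisection suggested earlier in the thread.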
I can't reproduce it with my updated Fedora 23, so it looks like it is something referenced in your bashrc, but defined somewhere else? I see you have got there pretty much of magic.

You will probably have to give it a minute of commenting out parts of the bashrc and connecting to find out what is actually the trigger. I am afraid I will not be much helpful here any more.

Another blind shot. Isn't there some SELinux denial recorded in audit?

    ausearch -m AVC

(In reply to Jakub Jelen from comment #21)
> I can't reproduce it with my updated Fedora 23, so it looks like it is
> something referenced in your bashrc, but defined somewhere else? I see you
> have got there pretty much of magic.

I'll poke around to see what's going on...

> You will probably have to give it a minute of commenting out parts of the
> bashrc and connecting to find out what is actually the trigger. I am afraid
> I will not be much helpful here any more.

Thanks for your help!!!

> Another blind shot. Isn't there some SELinux denial recorded in audit?
>
> ausearch -m AVC

does this mean anything to you? It means nothing to me... ;-)

    f23# ausearch -m AVC
    ----
    time->Mon Nov 23 03:38:01 2015
    type=PROCTITLE msg=audit(1448267881.253:337): proctitle=2F7573722F62696E2F73797374656D63746C006B696C6C002D730048555000727379736C6F672E73657276696365
    type=SYSCALL msg=audit(1448267881.253:337): arch=c000003e syscall=2 success=no exit=-13 a0=560763012544 a1=80101 a2=0 a3=7f2a60d56b98 items=0 ppid=4537 pid=4538 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="systemctl" exe="/usr/bin/systemctl" subj=system_u:system_r:logrotate_t:s0-s0:c0.c1023 key=(null)
    type=AVC msg=audit(1448267881.253:337): avc: denied { write } for pid=4538 comm="systemctl" name="kmsg" dev="devtmpfs" ino=7761 scontext=system_u:system_r:logrotate_t:s0-s0:c0.c1023 tcontext=system_u:object_r:kmsg_device_t:s0 tclass=chr_file permissive=0

And the answer is.... Don't do echoes in noninteractive shell

I was doing an echo to set the terminal's title. Why it started to hang in f23 since the problem has been around since f3, I'll never know...

*** This bug has been marked as a duplicate of bug 20527 ***
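
A common way to keep such an echo out of non-interactive sessions is to guard it on the shell's option flags in `$-`, which contain `i` only for interactive shells. The snippet below is a minimal sketch of that idea, not the reporter's actual ~/.bashrc: the function names `is_interactive` and `set_title` are hypothetical, and the standard xterm title escape sequence is assumed.

```shell
# Sketch for a shell startup file; names are illustrative only.

# Succeed if the given flag string (default: this shell's "$-")
# contains "i", i.e. the shell is interactive.
is_interactive() {
    case ${1-$-} in
        *i*) return 0 ;;
        *)   return 1 ;;
    esac
}

# Set the terminal title, but stay silent in non-interactive shells
# such as the one sshd starts for scp or "ssh host command".
set_title() {
    is_interactive || return 0
    # ESC ] 0 ; text BEL is the xterm "set window title" sequence.
    printf '\033]0;%s\007' "$1"
}
```

With this guard in place, a call such as `set_title "$(hostname)"` in the startup file prints the escape sequence in a login terminal and nothing at all when the shell is started for scp.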
Created attachment 1112244 [details]
debuging for sshd on Fedora 23

Description of problem:
I just upgraded a couple of machines to rawhide and f23 and now scp hangs to *any* fedora host. scp to non-fedora hosts (rhel6, rhel7) works fine.

Version-Release number of selected component (if applicable):

How reproducible:
100%

Steps to Reproduce:
1. scp -vvvvvv /etc/passwd f23-server:/tmp
2.
3.

Actual results:

Expected results:

Additional info:
Attached are the debugging logs for both the client and server