Bug 1203900 - ansible failing file copies to rawhide machine because of sshd error: mm_answer_audit_end_command: invalid handle
Summary: ansible failing file copies to rawhide machine because of sshd error: mm_answ...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: openssh
Version: 22
Hardware: Unspecified
OS: Unspecified
unspecified
unspecified
Target Milestone: ---
Assignee: Jakub Jelen
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
: 1204494 (view as bug list)
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2015-03-19 22:10 UTC by Josh Boyer
Modified: 2017-08-28 11:21 UTC (History)
12 users (show)

Fixed In Version: openssh-6.8p1-4.fc22
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-04-21 19:06:14 UTC
Type: Bug


Attachments (Terms of Use)

Description Josh Boyer 2015-03-19 22:10:44 UTC
Description of problem:

I have a simply ansible playbook that is failing to copy a file to one machine that is running rawhide.  An scp of the same file to the same machine works fine.  I've disabled the firewall and set SELinux to non-enforcing to eliminate those, but neither helped.

Version-Release number of selected component (if applicable):

ansible-1.8.4-1.fc21.noarch


How reproducible:

Always for this machine

Steps to Reproduce:
1. Run the playbook
2. Watch it fail for this machine as below
3. Scratch head

Actual results:

[jwboyer@vader kernel]$ ansible-playbook ~/ansible/test-kernel-split.yml -e "kdir=/home/jwboyer/kernel/x86_64 old_kernel=4.0.0-0.rc3.git2.1.fc23 kver=4.0.0-0.rc4.git1.3.fc23.x86_64" -K --limit @/home/jwboyer/test-kernel-split.retry
sudo password: 

PLAY [kernel-test] ************************************************************ 

TASK: [Copy kernel RPM] ******************************************************* 
fatal: [nuc-i7] => failed to transfer file to /home/jwboyer/.ansible/tmp/ansible-tmp-1426799949.57-89315340191452/source:

Couldn't read packet: Connection reset by peer

^C32
ERROR: interrupted
[jwboyer@vader kernel]$ scp x86_64/kernel-4.0.0-0.rc4.git1.3.fc23.x86_64.rpm nuc-i7:~/.
kernel-4.0.0-0.rc4.git1.3.fc23.x86_64.rpm     100%   59KB  59.2KB/s   00:00    
[jwboyer@vader kernel]$ scp x86_64/kernel-core-4.0.0-0.rc4.git1.3.fc23.x86_64.rpm nuc-i7:~/.
kernel-core-4.0.0-0.rc4.git1.3.fc23.x86_64.rp 100%   19MB   9.6MB/s   00:02    
[jwboyer@vader kernel]$

Expected results:

ansible can connect and copy as before

Additional info:

This may be the fault of sshd on the rawhide system, but I have no idea why scp from the f21 machine would work and ansible wouldn't.  Looking in the logs on the rawhide machine I see:

Mar 19 17:19:11 nuc-i7 sshd[1622]: fatal: mm_request_send: write: Broken pipe
Mar 19 17:19:11 nuc-i7 sshd[1622]: fatal: mm_request_send: write: Broken pipe
Mar 19 17:19:10 nuc-i7 sshd[1617]: <audit-1701> auid=1000 uid=0 gid=0 ses=11 subj=system_u:system_r:sshd_t:s0-s0:c0.c1023 pid=1617 co
Mar 19 17:19:10 nuc-i7 sshd[1617]: pam_unix(sshd:session): session closed for user jwboyer
Mar 19 17:19:10 nuc-i7 sshd[1617]: fatal: mm_answer_audit_end_command: invalid handle
Mar 19 17:19:09 nuc-i7 sshd[1617]: pam_unix(sshd:session): session opened for user jwboyer by (uid=0)
Mar 19 17:19:09 nuc-i7 sshd[1617]: Accepted publickey for jwboyer from 192.168.11.63 port 53073 ssh2: RSA SHA256:raEyfyre5HGK2Ck3mDGU

Comment 1 Kevin Fenzi 2015-03-20 03:07:22 UTC
Seems to be ssh: 

openssh-6.7p1-11.fc23.x86_64 - fails
openssh-6.7p1-10.fc23.x86_64 - works fine

when -11 is installed I see in audit.log: 

type=ANOM_ABEND msg=audit(1426806655.424:7732): auid=100037 uid=0 gid=0 ses=54 subj=system_u:system_r:sshd_t:s0-s0:c0.c1023 pid=25840 comm="sshd" exe="/usr/sbin/sshd" sig=11
type=ANOM_ABEND msg=audit(1426806658.465:7768): auid=100037 uid=0 gid=0 ses=55 subj=system_u:system_r:sshd_t:s0-s0:c0.c1023 pid=25871 comm="sshd" exe="/usr/sbin/sshd" sig=11

ie, segfault?

Happy to gather more info, etc...

Comment 2 Jakub Jelen 2015-03-20 07:27:29 UTC
Thank you for reporting this issue. It was introduced by wrong applying a downstream patch related to auditing in recent version.
Sorry for the inconvenience. I prioritize this problem after finishing current rebase, but I wanted to let you know that it was picked up and something is going on.

Comment 3 Josh Boyer 2015-03-20 11:42:55 UTC
Thanks to both of you.  I also downgraded to -10 and ansible is working again.  I'll be happy to test when a fix is available.

Comment 4 Jakub Jelen 2015-03-23 06:20:59 UTC
*** Bug 1204494 has been marked as a duplicate of this bug. ***

Comment 5 Jakub Jelen 2015-03-24 10:32:48 UTC
This issue should be resolved with new openssh-6.8p1-1.1 release in rawhide, which contains this bug fixed:
http://koji.fedoraproject.org/koji/buildinfo?buildID=623115

If it will not solve your problem, please comment.

Comment 6 Fedora Update System 2015-03-26 14:17:03 UTC
openssh-6.8p1-2.fc22 has been submitted as an update for Fedora 22.
https://admin.fedoraproject.org/updates/openssh-6.8p1-2.fc22

Comment 7 Fedora Update System 2015-03-29 04:47:52 UTC
Package openssh-6.8p1-2.fc22:
* should fix your issue,
* was pushed to the Fedora 22 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing openssh-6.8p1-2.fc22'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2015-4898/openssh-6.8p1-2.fc22
then log in and leave karma (feedback).

Comment 8 Fedora Update System 2015-03-31 21:55:20 UTC
openssh-6.8p1-2.fc22 has been pushed to the Fedora 22 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 9 Josh Boyer 2015-04-02 00:17:15 UTC
This is still failing with openssh-6.8p1-2.fc22.  Ansible file copies continue to fail and I can see the following in the journal:

Apr  1 15:02:45 obiwan sshd[1982]: fatal: mm_answer_audit_end_command: invalid handle
Apr  1 15:02:45 obiwan sshd[1982]: pam_unix(sshd:session): session closed for user jwboyer
Apr  1 15:02:46 obiwan sshd[2064]: fatal: mm_request_send: write: Broken pipe
Apr  1 15:02:46 obiwan sshd[2064]: fatal: mm_request_send: write: Broken pipe
Apr  1 20:12:46 obiwan sshd[2643]: Accepted publickey for jwboyer from 192.168.11.63 port 55077 ssh2: RSA SHA256:raEyfyre5HGK2Ck3mDGU8iL8iXbNBtwbhFTep7tEiBE
Apr  1 20:12:46 obiwan systemd: pam_unix(systemd-user:session): session opened for user jwboyer by (uid=0)

Comment 10 Jakub Jelen 2015-04-02 10:35:27 UTC
Hi,
I was playing around Ansible, but I still can't reproduce your issue through Ansible. Using latest rawhide version openssh-6.8p1-3.fc23

Using command from closed duplicate bug I don't get any error
> ansible -i hosts all -m ping -vvvv


I went again through audit and fixed some cases better way. By hand it seems working for me. Can you test it again with this build and report back?
http://koji.fedoraproject.org/koji/taskinfo?taskID=9398779

If I got this right, it is problem of ssh commands with "ControlPersist option", which is commonly used by Ansible and persistent master is failing. Sorry for inconvenience.

Comment 11 Josh Boyer 2015-04-02 11:28:39 UTC
A simple file copy with openssh-6.7p1-10.fc23.x86_64 on the target machine:

[jwboyer@vader kernel]$ ansible nuc-i7 -m copy -a "src=x86_64/kernel-4.0.0-0.rc4.git1.3.fc23.x86_64.rpm dest=~/."
nuc-i7 | success >> {
    "changed": true,
    "checksum": "07016b5a3cb4e156210e31b163098ead3c9c4eab",
    "dest": "/home/jwboyer/./kernel-4.0.0-0.rc4.git1.3.fc23.x86_64.rpm",
    "gid": 1000,
    "group": "jwboyer",
    "md5sum": "aa40d75c6856bdc93c0c501a34e84b93",
    "mode": "0664",
    "owner": "jwboyer",
    "secontext": "unconfined_u:object_r:user_home_t:s0",
    "size": 60664,
    "src": "/home/jwboyer/.ansible/tmp/ansible-tmp-1427973730.47-175909122124082/source",
    "state": "file",
    "uid": 1000
}
[jwboyer@vader kernel]$

The same file copy to the same machine with openssh-6.8p1-3.fc23.x86_64:

[jwboyer@vader kernel]$ ansible nuc-i7 -m copy -a "src=x86_64/kernel-4.0.0-0.rc4.git1.3.fc23.x86_64.rpm dest=~/."
nuc-i7 | FAILED => failed to transfer file to /home/jwboyer/.ansible/tmp/ansible-tmp-1427973906.05-222325072018028/source:

Couldn't read packet: Connection reset by peer

[jwboyer@vader kernel]$ 

The ping command also fails.

[jwboyer@vader kernel]$ ansible nuc-i7 -m ping -vvvv
<nuc-i7> 
<nuc-i7> 
<nuc-i7> ConnectTimeout=10 PasswordAuthentication=no KbdInteractiveAuthentication=no ControlPath=/home/jwboyer/.ansible/cp/ansible-ssh-%h-%p-%r PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey ControlMaster=auto Port=22 ControlPersist=60s
<nuc-i7> 
nuc-i7 | FAILED => failed to transfer file to /home/jwboyer/.ansible/tmp/ansible-tmp-1427973970.96-163092021662184/ping:

Couldn't read packet: Connection reset by peer


I can test with the f22 build if you'd like, but I think the rawhide version of it shows the same issues.

Comment 12 Josh Boyer 2015-04-02 12:12:51 UTC
(In reply to Jakub Jelen from comment #10)
> Hi,
> I was playing around Ansible, but I still can't reproduce your issue through
> Ansible. Using latest rawhide version openssh-6.8p1-3.fc23
> 
> Using command from closed duplicate bug I don't get any error
> > ansible -i hosts all -m ping -vvvv
> 
> 
> I went again through audit and fixed some cases better way. By hand it seems
> working for me. Can you test it again with this build and report back?
> http://koji.fedoraproject.org/koji/taskinfo?taskID=9398779
> 
> If I got this right, it is problem of ssh commands with "ControlPersist
> option", which is commonly used by Ansible and persistent master is failing.
> Sorry for inconvenience.

OK, I tested that build on an f22 machine with successful results:


[jwboyer@vader kernel]$ ansible obiwan -m copy -a "src=x86_64/kernel-4.0.0-0.rc4.git1.3.fc23.x86_64.rpm dest=~/."
obiwan | success >> {
    "changed": true,
    "checksum": "07016b5a3cb4e156210e31b163098ead3c9c4eab",
    "dest": "/home/jwboyer/./kernel-4.0.0-0.rc4.git1.3.fc23.x86_64.rpm",
    "gid": 0,
    "group": "root",
    "md5sum": "aa40d75c6856bdc93c0c501a34e84b93",
    "mode": "0644",
    "owner": "jwboyer",
    "secontext": "unconfined_u:object_r:user_home_t:s0",
    "size": 60664,
    "src": "/home/jwboyer/.ansible/tmp/ansible-tmp-1427976697.45-116497879441197/source",
    "state": "file",
    "uid": 1000
}

[jwboyer@vader kernel]$ ansible obiwan -m ping -vvvv<obiwan> 
<obiwan> 
<obiwan> ConnectTimeout=10 PasswordAuthentication=no KbdInteractiveAuthentication=no ControlPath=/home/jwboyer/.ansible/cp/ansible-ssh-%h-%p-%r PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey ControlMaster=auto Port=22 ControlPersist=60s
<obiwan> 
<obiwan> ConnectTimeout=10 PasswordAuthentication=no 'LANG=C LC_CTYPE=C /usr/bin/python /home/jwboyer/.ansible/tmp/ansible-tmp-1427976707.55-205929320515257/ping; rm -rf /home/jwboyer/.ansible/tmp/ansible-tmp-1427976707.55-205929320515257/ >/dev/null 2>&1' KbdInteractiveAuthentication=no ControlPath=/home/jwboyer/.ansible/cp/ansible-ssh-%h-%p-%r PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey ControlMaster=auto Port=22 ControlPersist=60s
obiwan | success >> {
    "changed": false,
    "ping": "pong"
}

[jwboyer@vader kernel]$ 

[jwboyer@obiwan ~]$ rpm -q openssh
openssh-6.8p1-3.fc22.x86_64
[jwboyer@obiwan ~]$

Comment 13 Josh Boyer 2015-04-02 12:16:42 UTC
Same build tested on the previously tested rawhide machine works too:

[jwboyer@vader kernel]$ ansible nuc-i7 -m copy -a "src=x86_64/kernel-4.0.0-0.rc4.git1.3.fc23.x86_64.rpm dest=~/."
nuc-i7 | success >> {
    "changed": true,
    "checksum": "07016b5a3cb4e156210e31b163098ead3c9c4eab",
    "dest": "/home/jwboyer/./kernel-4.0.0-0.rc4.git1.3.fc23.x86_64.rpm",
    "gid": 1000,
    "group": "jwboyer",
    "md5sum": "aa40d75c6856bdc93c0c501a34e84b93",
    "mode": "0664",
    "owner": "jwboyer",
    "secontext": "unconfined_u:object_r:user_home_t:s0",
    "size": 60664,
    "src": "/home/jwboyer/.ansible/tmp/ansible-tmp-1427976857.54-175393144647249/source",
    "state": "file",
    "uid": 1000
}

[jwboyer@vader kernel]$ ansible nuc-i7 -m ping -vvvv<nuc-i7> 
<nuc-i7> 
<nuc-i7> ConnectTimeout=10 PasswordAuthentication=no KbdInteractiveAuthentication=no ControlPath=/home/jwboyer/.ansible/cp/ansible-ssh-%h-%p-%r PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey ControlMaster=auto Port=22 ControlPersist=60s
<nuc-i7> 
<nuc-i7> ConnectTimeout=10 PasswordAuthentication=no 'LANG=C LC_CTYPE=C /usr/bin/python /home/jwboyer/.ansible/tmp/ansible-tmp-1427976861.58-200531392598203/ping; rm -rf /home/jwboyer/.ansible/tmp/ansible-tmp-1427976861.58-200531392598203/ >/dev/null 2>&1' KbdInteractiveAuthentication=no ControlPath=/home/jwboyer/.ansible/cp/ansible-ssh-%h-%p-%r PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey ControlMaster=auto Port=22 ControlPersist=60s
nuc-i7 | success >> {
    "changed": false,
    "ping": "pong"
}

[jwboyer@vader kernel]$

Comment 14 Jakub Jelen 2015-04-02 15:36:48 UTC
Josh, thanks for the help with investigation. It appeared that I was unable to reproduce with root user, but non-root was failing as stated above.
Update is on its way.

Comment 15 Fedora Update System 2015-04-02 16:48:12 UTC
openssh-6.8p1-4.fc22 has been submitted as an update for Fedora 22.
https://admin.fedoraproject.org/updates/openssh-6.8p1-4.fc22

Comment 16 Fedora Update System 2015-04-04 16:31:58 UTC
Package openssh-6.8p1-4.fc22:
* should fix your issue,
* was pushed to the Fedora 22 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing openssh-6.8p1-4.fc22'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2015-5526/openssh-6.8p1-4.fc22
then log in and leave karma (feedback).

Comment 17 Jakub Jelen 2015-04-10 12:59:59 UTC
Hi Josh,
would you consider this as a blocker for Beta release (currently postponed so we have time for this) or is it fine for you to have it in GA?
I'm not sure if there are some test matrices for this Ansible use case for Beta.

Comment 18 Josh Boyer 2015-04-10 13:21:57 UTC
I'm not aware of any tests that would cause this to be a Beta blocker.  Given that normal SSH sessions work fine and it is limited to a small set of file copies, I think just having it in GA should work.  That's just my opinion though.  If you'd like it to be blocker, then I'm fine with that too.

Comment 19 Fedora Update System 2015-04-21 19:06:14 UTC
openssh-6.8p1-4.fc22 has been pushed to the Fedora 22 stable repository.  If problems still persist, please make note of it in this bug report.


Note You need to log in before you can comment on or make changes to this bug.