Bug 1416584 - ssh-agent cert signing error: "process_sign_request2: RSA-CERT key not found" in 7.4p1 1.fc25
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: openssh
Version: 25
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Assignee: Jakub Jelen
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-01-25 23:01 UTC by Peter Moody
Modified: 2017-02-08 01:51 UTC (History)

Fixed In Version: openssh-7.4p1-2.fc25
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-02-08 01:51:52 UTC
Type: Bug
Embargoed:


Attachments
go repro (2.47 KB, text/plain)
2017-01-27 00:18 UTC, Peter Moody

Description Peter Moody 2017-01-25 23:01:21 UTC
Description of problem:

We have an in-house PAM module, similar to pam-ssh-agent-auth except that it works with SSH certificates.

This works fine with stock openssh-7.4p1, but fails with Red Hat's "7.4p1 release 1.fc25" package. If you run ssh-agent -d -a /tmp/ssh.sock and point SSH_AUTH_SOCK at that socket, the agent prints the following:

debug1: type 11
debug1: type 13
process_sign_request2: RSA-CERT key not found
debug1: XXX shrink: 5 < 6

Type 11 is the request-identities message and type 13 is a sign request.
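For readers unfamiliar with the agent wire format, here is a minimal Go sketch of how those two messages are framed. The message numbers come from the ssh-agent protocol; the function names (frame, requestIdentities, signRequest) and the placeholder key blob are assumptions made for illustration, not code from the reproducer or from OpenSSH.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// Message numbers from the ssh-agent protocol. 11 and 13 are the values
// that appear as "debug1: type N" in the agent's debug output.
const (
	agentFailure            = 5  // SSH_AGENT_FAILURE
	agentcRequestIdentities = 11 // SSH_AGENTC_REQUEST_IDENTITIES
	agentcSignRequest       = 13 // SSH_AGENTC_SIGN_REQUEST
)

// appendUint32 appends v in big-endian (network) byte order.
func appendUint32(buf []byte, v uint32) []byte {
	var b [4]byte
	binary.BigEndian.PutUint32(b[:], v)
	return append(buf, b[:]...)
}

// appendString appends an SSH "string": uint32 length, then raw bytes.
func appendString(buf, s []byte) []byte {
	buf = appendUint32(buf, uint32(len(s)))
	return append(buf, s...)
}

// frame wraps a message in the agent framing:
// uint32 length, one type byte, then the payload.
func frame(msgType byte, payload []byte) []byte {
	out := appendUint32(nil, uint32(1+len(payload)))
	out = append(out, msgType)
	return append(out, payload...)
}

// requestIdentities builds the "type 11" message (no payload).
func requestIdentities() []byte {
	return frame(agentcRequestIdentities, nil)
}

// signRequest builds the "type 13" message:
// string key_blob, string data, uint32 flags.
func signRequest(keyBlob, data []byte, flags uint32) []byte {
	var p []byte
	p = appendString(p, keyBlob)
	p = appendString(p, data)
	p = appendUint32(p, flags)
	return frame(agentcSignRequest, p)
}

func main() {
	fmt.Printf("% x\n", requestIdentities())
	// "blob" is a placeholder; a real request carries the public key
	// or certificate blob returned by the identities answer.
	fmt.Printf("% x\n", signRequest([]byte("blob"), []byte("challenge"), 0))
}
```

When the agent cannot satisfy a request, it answers with the single-byte failure message (type 5), which is what a client sees when the agent logs "key not found".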

Version-Release number of selected component (if applicable):

7.4p1 release 1.fc25

How reproducible:

I've got some Go code that gives a 'minimal' reproducer, but I need to get approval from the muckimucks here because it's pulled straight from this internal PAM module.

The basic concept: on a remote machine with a forwarded ssh-agent, the PAM module connects to SSH_AUTH_SOCK and lists the keys on the agent, looking for one of our SSH certificates. When it finds one, it asks the agent to sign some random data with the associated private key and then verifies the signature. That signing operation is what's failing.

I'm going to try to get permission to post this repro, but hopefully there's enough info here to point at the errant Fedora patch that's causing this. As I said, this works fine in stock 7.4p1.


Comment 1 Peter Moody 2017-01-27 00:18:38 UTC
Created attachment 1244891 [details]
go repro

build and run like so:

$ go get
$ go build -o sign sign.go
$ ./sign

you should see 

$ ./sign 
error signing data: agent: failed to sign challenge
false

if you're running the ssh-agent in the foreground and you've set your SSH_AUTH_SOCK to use that agent, you'll see the "process_sign_request2: RSA-CERT key not found" message in that window.

Comment 2 Jakub Jelen 2017-01-30 07:54:10 UTC
Hello. Thank you for the report.
This is most probably related, again, to the OpenSSL 1.1.0 patch, which is not upstream. I hope to look into it this week.

Comment 3 Jakub Jelen 2017-02-02 14:48:42 UTC
The problem is again the ssh-agent protocol [1], which does not carry the N and E parameters of the public key, while the new OpenSSL requires them.

I have already added code to compute N, but computing E is not so simple. I don't have a definite fix, other than rebuilding with an older OpenSSL.

[1] https://github.com/openssh/openssh-portable/blob/master/PROTOCOL.agent#L245

Comment 4 Tomas Mraz 2017-02-02 16:33:37 UTC
Jakub, have you tried setting E to a zero bignum? It's a hack, but it should work if the value is not used in the computation. (It should not be used unless RSA_FLAG_EXT_PKEY is set.) Unfortunately, the N value is used to obtain the size of the signature, so you have to compute it.

I wonder what the code did with old openssl because the N value is clearly used in the source for the private key signature operation - I suppose that is what openssh later does with the private key. Or what are the operations done with the private key?

Comment 5 Peter Moody 2017-02-02 17:48:10 UTC
how does the agent sign requests that come from a remote sshd?

in our environment, these certs are used to auth to sshd as well as to pam. and sshd definitely asks the agent to sign some random data with the private key associated with this cert. 

eg. this is me using a cert to auth to one of our prod hosts.

debug3: fd 4 is O_NONBLOCK
debug1: type 11               # request identities
debug1: XXX shrink: 3 < 4
debug3: fd 4 is O_NONBLOCK
debug1: type 11               # request identities
debug1: type 13               # sign request
debug3: fd 5 is O_NONBLOCK
debug1: type 11               # request identities
debug1: type 13               # sign request
debug3: fd 6 is O_NONBLOCK
debug1: XXX shrink: 5 < 6

# now I'm authenticated.

also, if this is just about computing N and E (sorry, my openssl library knowledge pretty much begins and ends with how it's spelled), does that mean that it shouldn't affect other key types, like ed25519 or ecdsa keys? I haven't tried those yet, but I can if it would help.

Comment 6 Jakub Jelen 2017-02-02 18:23:13 UTC
The other key types should work fine. They do not have such a weird interface in the agent (I hope).

These values are now set to zero, which is what triggers this bug when the certificate is used for signing.

With a signature request, the agent gets a key blob and fails when it searches for a matching key already stored in the agent (it compares the N and E parts directly, which obviously fails if one of them is zero).

Removing the whole patch in Fedora 25 solves the problem. I haven't yet found the minimal subset that causes it.

In F25 we don't have OpenSSL 1.1.0, so the change is most probably not related to OpenSSL 1.1.0 itself, but only to the new interface or some broken logic in the patch. I will take a further look tomorrow.

Comment 7 Tomas Mraz 2017-02-03 09:00:52 UTC
How could it match N and E before, when they were not set at all?

Comment 8 Jakub Jelen 2017-02-03 10:08:29 UTC
After further inspection, N and E are already set in the RSA structure from the certificate blob (which I somehow missed during the original implementation, and which the original author of the patch also missed).

Skipping the assignment of these values makes it work again. I will send an update soon. For now, there is a scratch build for testing on F25:

https://koji.fedoraproject.org/koji/taskinfo?taskID=17563323

Thanks for the good reproducer and patience :)

Comment 9 Fedora Update System 2017-02-06 11:51:57 UTC
openssh-7.4p1-2.fc25 has been submitted as an update to Fedora 25. https://bodhi.fedoraproject.org/updates/FEDORA-2017-40057be2c5

Comment 10 Peter Moody 2017-02-06 15:16:15 UTC
Thanks, Jakub.

Is there a package that my Fedora user(s) can test?

Comment 11 Jakub Jelen 2017-02-06 15:27:19 UTC
Yes, there is the scratch build in comment #8 or the update for Fedora 25 in comment #9 (it should go into testing soon). For now, you can pick it up from koji:
https://koji.fedoraproject.org/koji/buildinfo?buildID=838740

Comment 12 Fedora Update System 2017-02-07 02:49:25 UTC
openssh-7.4p1-2.fc25 has been pushed to the Fedora 25 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2017-40057be2c5

Comment 13 Evan Klitzke 2017-02-07 03:48:10 UTC
I am the original user affected by this bug (I work with Peter, and run Fedora). I can confirm that the packages you linked to in #11 fix this bug for me.

Thanks so much for your help -- you already fixed bug 1402029 for us, and I appreciate the hard work to fix this issue as well.

Comment 14 Fedora Update System 2017-02-08 01:51:52 UTC
openssh-7.4p1-2.fc25 has been pushed to the Fedora 25 stable repository. If problems still persist, please make note of it in this bug report.

