1472829 – mod_auth_mellon SIGSEGV in am_urldecode

RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "RHEL project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023. Bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September as per pre-agreed dates. Bugs against components "kernel", "kernel-rt", and "kpatch" are only migrated if still in "NEW" or "ASSIGNED". If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "RHEL project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.

Bug 1472829 - mod_auth_mellon SIGSEGV in am_urldecode

Summary: mod_auth_mellon SIGSEGV in am_urldecode

Keywords:
Status:	CLOSED WONTFIX
Alias:	None
Product:	Red Hat Enterprise Linux 6
Classification:	Red Hat
Component:	mod_auth_mellon
Sub Component:
Version:	6.9
Hardware:	Unspecified
OS:	Unspecified
Priority:	urgent
Severity:	urgent
Target Milestone:	rc
Target Release:	---
Assignee:	John Dennis
QA Contact:	ipa-qe
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	1461138 1484119 1504542
TreeView+	depends on / blocked

Reported:	2017-07-19 13:34 UTC by Robert Bost
Modified:	2021-06-10 12:38 UTC (History)
CC List:	6 users (show)
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Clones:	1484119 (view as bug list)
Environment:
Last Closed:	2017-11-21 15:22:20 UTC
Target Upstream Version:
Embargoed:

Attachments	(Terms of Use)

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Knowledge Base (Solution)	3119651	0	None	None	None	2017-07-19 13:54:56 UTC

Description Robert Bost 2017-07-19 13:34:22 UTC

Description of problem:

Customer reports the following:
When handling a SAML2 session or cookie expiry, a long RelayState can cause some or all of the existing httpd child processes to segfault together meaning the user is stuck with an error page (and potentially loses data stuck in a post-preservation state).


Program terminated with signal 11, Segmentation fault.
#0  am_urldecode (data=<value optimized out>) at auth_mellon_util.c:664
664	    *op = '\0';
(gdb) p op
$2 = 0x7f26250e23f1 ""
(gdb) bt
#0  am_urldecode (data=<value optimized out>) at auth_mellon_util.c:664
#1  0x00007f26250de756 in am_post_mkform_urlencoded (r=0x7f2630bc1538, post_data=<value optimized out>) at auth_mellon_handler.c:2325
#2  0x00007f26250de274 in am_handle_repost (r=0x7f2630bc1538) at auth_mellon_handler.c:2454
#3  am_handler (r=0x7f2630bc1538) at auth_mellon_handler.c:3116
#4  0x00007f262fa10fc0 in ap_run_handler (r=0x7f2630bc1538) at /usr/src/debug/httpd-2.2.15/server/config.c:158
#5  0x00007f262fa1487e in ap_invoke_handler (r=0x7f2630bc1538) at /usr/src/debug/httpd-2.2.15/server/config.c:376
#6  0x00007f262fa1ffd0 in ap_process_request (r=0x7f2630bc1538) at /usr/src/debug/httpd-2.2.15/modules/http/http_request.c:282
#7  0x00007f262fa1ce18 in ap_process_http_connection (c=0x7f26309f8348) at /usr/src/debug/httpd-2.2.15/modules/http/http_core.c:190
#8  0x00007f262fa18ae8 in ap_run_process_connection (c=0x7f26309f8348) at /usr/src/debug/httpd-2.2.15/server/connection.c:43
#9  0x00007f262fa24d77 in child_main (child_num_arg=<value optimized out>) at /usr/src/debug/httpd-2.2.15/server/mpm/prefork/prefork.c:670
#10 0x00007f262fa25099 in make_child (s=0x7f26305c5870, slot=2) at /usr/src/debug/httpd-2.2.15/server/mpm/prefork/prefork.c:773
#11 0x00007f262fa25d1c in perform_idle_server_maintenance (_pconf=<value optimized out>, plog=<value optimized out>, s=<value optimized out>)
    at /usr/src/debug/httpd-2.2.15/server/mpm/prefork/prefork.c:908
#12 ap_mpm_run (_pconf=<value optimized out>, plog=<value optimized out>, s=<value optimized out>) at /usr/src/debug/httpd-2.2.15/server/mpm/prefork/prefork.c:1112
#13 0x00007f262f9fcaa0 in main (argc=1, argv=0x7ffea1867e58) at /usr/src/debug/httpd-2.2.15/server/main.c:763

Version-Release number of selected component (if applicable):
mod_auth_mellon-0.8.0-4.el6.x86_64

Additional info:
Upstream report looks similar: https://github.com/UNINETT/mod_auth_mellon/pull/115

Comment 2 John Dennis 2017-07-20 16:58:35 UTC

We can certainly fix the segfault but that may be fixing the symptom rather that the root cause.

The problem is am_post_mkform_urlencoded() does not detect the case where a query parameter does not have a value. More to the point I do not believe it's even an RFC requirement that the query part of a URI be formatted as a sequence of name=value pairs. So certainly am_post_mkform_urlencoded() needs some tightening up.

But what really worries me is statement the problem manifests itself when the RelayState parameter is too long. No where in this bug report does it state what the RelayState parameter was, that is essential information. If the RelayState parameter is being dropped for some reason there are going to be a host of other problems even after fixing the segfault because the RealyState is an essential piece of data for the SAML flow. If it's missing you won't be able to get back to the original resource.

Also there are two things which violate the SAML spec going on here (if indeed the RelayState is the root cause).

1) SAML requires each party that receives a RelayState value to pass it along unmodified.

2) SAML mandates the RelayState not exceed 80 characters.

Mellon (as well as a number of other SAML implementations) have never enforced the 80 character limit on the RelayState because you can usually get away with embedding the resource URL in the RelayState. But there are limits on how long URL's can be as well as their query parameters. Mellon could generate a unique id for the resource URL and store that in a table. The unique id is then encoded in the RelayState as opposed to the resource URL, this has other privacy advantages as well.

I need 2 pieces of information:

I would like to see the post_data passed to am_post_mkform_urlencoded(), this can be captured in gdb, but please make sure it's the entire string and gdb didn't truncate it.

I would also like to know what the RelayState was.

It would also be helpful to know why the customer believes a long RelayState is the root cause (any query parameter with a null value will trigger the segfault, it does not have to be the RelayState parameter, but if the RelayState is being stripped that is essential information, providing the post_data will answer this.

Comment 4 John Dennis 2017-08-03 20:04:25 UTC

1) The RelayState is derived from the return_url, it's also url-encoded which usually increases it sized (depends on how many characters need escaping)

2) You can get the post data from your browser by using the browser's developer tools. With Firefox it's firebug or with Chrome it's the built-in developer tool panel. What you need to do start an authentication flow in the browser and then examine the HTML response from the IdP, it should be mostly form data.

Hint, if you use firebug the data is in the "net" panel but you must enable the "persist" button in the "net" panel, that keeps a running log of all requests, otherwise the "net" panel only show the most recent request. The data should be in the "html" tab. Chrome always seems to keep a persistent log until cleared.

Comment 6 David Mulford 2017-08-07 13:52:10 UTC

I've attached the requested post data containing the RelayState, which appears to be over 80 characters as described.

In addition, the customer offered the following clarification on reproducing the issue: "... this is a problem when a local mellon session times out (or you clear the local cookie so the session is invalid) and the next request is a POST with a large RelayState."

With this additional explanation, I'll test this out to see if we have a reproducer.

Comment 7 David Mulford 2017-08-08 17:09:06 UTC

I think the issue must be elsewhere, i.e. not caused by RelayState with length greater than 80. I was able to test this and see the following message in httpd's error_log:

Lasso-WARNING **: 2017-08-08 12:43:22	Encoded a RelayState of more than 80 bytes, see #3.4.3 of saml-bindings-2.0-os

The request still completed successfully for me.

Comment 8 John Dennis 2017-08-22 14:26:10 UTC

I believe this is the same issue identified in this upstream issue report and pull request:

https://github.com/UNINETT/mod_auth_mellon/pull/115

The upstream commit is:

967f820a087bf2567db4b5f0dbeeff845c1e7f8a

Comment 12 John Dennis 2017-11-20 15:22:23 UTC

The issue occurs when the protected resource is accessed via a POST as opposed to a GET. I do not know whether or not that is sufficient to trigger the segfault. I personally have not encountered this particular failure, mostly have been relying on upstream to diagnose and fix this issue.

Specifically the issue occurs when the enctype query parameter is application/x-www-form-urlencoded and one of the query parameters has an empty value, e.g. foo=
 when the requust is posted to the "repost" mellon endpoint.

The sequence of operations is as follows:

1) HTTP client sends a POST request to a resource protected by Mellon.

2) Mellon saves the post information and begins the authentication flow to the IdP.

3) After successful authentication from the IdP Mellon will redirect to the RelayState parameter, which because this was a saved POST will look like this:

"%srepost?id=%s&ReturnTo=%s&enctype=%s%s" where the %s substitutions are:

MellonEndpointPath
Repost ID
RelayState
ContentType
Charset

After this redirect you'll land back in the am_handle_repost handler where because the enctype is application/x-www-form-urlencoded it will call am_post_mkform_urlencoded which will try to parse the list of query parameters, if one of those query parameters has an empty value it will segfault.

You can either try to reproduce the entire sequence or short circuit the entire flow and just send a request with the above query parameters to $MellonEndpointPath/repost.

Note You need to log in before you can comment on or make changes to this bug.