Bug 1828812 - httpd response contains garbage in Content-Type header
Summary: httpd response contains garbage in Content-Type header
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Software Collections
Classification: Red Hat
Component: httpd
Version: httpd24
Hardware: Unspecified
OS: Linux
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: ---
Assignee: Luboš Uhliarik
QA Contact: Branislav Náter
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-04-28 12:03 UTC by Branislav Náter
Modified: 2020-12-01 12:07 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1649470
Environment:
Last Closed: 2020-12-01 12:06:59 UTC
Target Upstream Version:
Embargoed:


Description Branislav Náter 2020-04-28 12:03:32 UTC
Reproduced on httpd24-httpd-2.4.34-18.el7

+++ This bug was initially created as a clone of Bug #1649470 +++

Description of problem:

When I fetch an audio file that has no file suffix, the httpd response contains garbage in the Content-Type header:

# curl -sv -o /dev/null http://127.0.0.1/path/to/audio
* About to connect() to 127.0.0.1 port 80 (#0)
*   Trying 127.0.0.1...
* Connected to 127.0.0.1 (127.0.0.1) port 80 (#0)
> GET /path/to/audio HTTP/1.1
> User-Agent: curl/7.29.0
> Host: 127.0.0.1
> Accept: */*
>
< HTTP/1.1 200 OK
< Date: Thu, 01 Nov 2018 20:08:53 GMT
< Server: Apache
< Last-Modified: Thu, 01 Nov 2018 19:22:59 GMT
< Accept-Ranges: bytes
< Content-Length: 24378
< X-Content-Type-Options: nosniff
< X-XSS-protection: 1; mode=block
< Content-Type: audio/unknown@",▒
< Content-Encoding: v/x-wav
<
{ [data not shown]
* Connection #0 to host 127.0.0.1 left intact

analysis:

Request with the extension:

~~~
# curl -sv -o /dev/null localhost:8080/audio2.wav
* About to connect() to localhost port 8080 (#0)
*   Trying 127.0.0.1...
* Connected to localhost (127.0.0.1) port 8080 (#0)
> GET /audio2.wav HTTP/1.1
> User-Agent: curl/7.29.0
> Host: localhost:8080
> Accept: */*
>
< HTTP/1.1 200 OK
< Date: Thu, 08 Nov 2018 13:04:07 GMT
< Server: Apache
< Last-Modified: Wed, 07 Nov 2018 19:54:28 GMT
< ETag: "923032-57a187c631027"
< Accept-Ranges: bytes
< Content-Length: 9580594
< Connection: close
< Content-Type: audio/x-wav
<
{ [data not shown]
* Closing connection 0
~~~

The Content-Type is correct. mod_mime_magic is not used here, because mod_mime identifies the file type via TypesConfig /etc/mime.types (which has the entry: audio/x-wav  wav).
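
To illustrate the suffix lookup, here is a minimal Python sketch. It is not how mod_mime is implemented; the table contents and the function names (load_suffix_map, type_from_suffix) are assumptions made up for this example, and only the audio/x-wav entry comes from the setup described above.

```python
import os

# Hypothetical stand-in for /etc/mime.types: a MIME type followed by its suffixes.
MIME_TYPES_TEXT = """\
audio/x-wav  wav
"""

def load_suffix_map(text):
    """Parse mime.types-style lines into a {suffix: mime_type} dict."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        mime_type, *suffixes = line.split()
        for suffix in suffixes:
            table[suffix.lower()] = mime_type
    return table

def type_from_suffix(path, table):
    """Return the MIME type for the path's suffix, or None when nothing matches."""
    suffix = os.path.splitext(path)[1].lstrip(".").lower()
    return table.get(suffix)

table = load_suffix_map(MIME_TYPES_TEXT)
print(type_from_suffix("/audio2.wav", table))  # suffix lookup succeeds
print(type_from_suffix("/audio2", table))      # no suffix, so the lookup finds nothing
```

A request for /audio2 gets no answer from this kind of lookup, which is the case where content-based detection has to decide the type.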

mod_mime_magic is only used when processing mime.types yields no match. This is where the issue appears:

~~~
curl -sv -o /dev/null localhost:8080/audio2
* About to connect() to localhost port 8080 (#0)
*   Trying 127.0.0.1...
* Connected to localhost (127.0.0.1) port 8080 (#0)
> GET /audio2 HTTP/1.1
> User-Agent: curl/7.29.0
> Host: localhost:8080
> Accept: */*
>
< HTTP/1.1 200 OK
< Date: Thu, 08 Nov 2018 13:58:12 GMT
< Server: Apache
< Last-Modified: Wed, 07 Nov 2018 14:15:01 GMT
< ETag: "923032-57a13be6edc57"
< Accept-Ranges: bytes
< Content-Length: 9580594
< Connection: close
< Content-Type: audio/unknown(-audio/x-wav
<
{ [data not shown]
* Closing connection 0
~~~

Depending on the garbage bytes, the second MIME type sometimes ends up in a Content-Encoding header instead:

~~~
curl -sv -o /dev/null localhost:8080/audio2
* About to connect() to localhost port 8080 (#0)
*   Trying 127.0.0.1...
* Connected to localhost (127.0.0.1) port 8080 (#0)
> GET /audio2 HTTP/1.1
> User-Agent: curl/7.29.0
> Host: localhost:8080
> Accept: */*
>
< HTTP/1.1 200 OK
< Date: Thu, 08 Nov 2018 13:58:50 GMT
< Server: Apache
< Last-Modified: Wed, 07 Nov 2018 14:15:01 GMT
< ETag: "923032-57a13be6edc57"
< Accept-Ranges: bytes
< Content-Length: 9580594
< Connection: close
< Content-Type: audio/unknownh

< Content-Encoding: audio/x-wav
<
{ [data not shown]
* Closing connection 0
~~~

mod_mime_magic debug traces showing the matches (with junk):

~~~
[Thu Nov 08 15:15:57.475116 2018] [mime_magic:debug] [pid 7521:tid 139889756055296] mod_mime_magic.c(755): [client 127.0.0.1:34632] AH01508: mod_mime_magic: rsl_strdup() 14 chars: audio/unknownH\x0c
[Thu Nov 08 15:15:57.476340 2018] [mime_magic:debug] [pid 7521:tid 139889756055296] mod_mime_magic.c(755): [client 127.0.0.1:34632] AH01508: mod_mime_magic: rsl_strdup() 10 chars: audio/x-wav

~~~

I think that the junk comes from the function magic_rsl_to_request of mod_mime_magic.c, which processes the RSL and sets the MIME info:

~~~
    magic_req_rec *req_dat = (magic_req_rec *)
                    ap_get_module_config(r->request_config, &mime_magic_module);
~~~

I am not very familiar with the httpd API structs, and the request_config struct attributes are somewhat opaque, but it seems that request_config returns a magic_rsl_s whose chained next magic_rsl_s points to some junk:

Here,
~~~
    for (frag = req_dat->head, cur_frag = 0; frag && frag->next; frag = frag->next, cur_frag++) {
~~~

The loop walks 3 fragments instead of 2 (2 for the Content-Type instead of 1):

~~~
(own debug traces)

cur_frag: 0
frag:
audio/unknown
cur_frag: 1
frag:
H

cur_frag: 2
frag:
audio/x-wav
~~~

The junk is visible in fragment number 1.
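
The effect of the junk fragment on the headers can be modeled with a short Python sketch. This is a simplified model and not the module's code: it assumes that the fragments are concatenated and that whitespace separates the content type from the content encoding, which is consistent with the responses shown above. The function name split_type_and_encoding is hypothetical; the junk values are taken from the curl output and the debug log.

```python
def split_type_and_encoding(fragments):
    """Join RSL-like fragments and split the result at whitespace.

    Simplified model only: the first token is treated as the content type,
    the second token (if any) as the content encoding.
    """
    tokens = "".join(fragments).split()
    content_type = tokens[0] if tokens else None
    content_encoding = tokens[1] if len(tokens) > 1 else None
    return content_type, content_encoding

# Junk without whitespace: both MIME types fuse into a single Content-Type value.
print(split_type_and_encoding(["audio/unknown", "(-", "audio/x-wav"]))
# Junk containing whitespace (a form feed here): the second type lands in the encoding slot.
print(split_type_and_encoding(["audio/unknown", "H\x0c", "audio/x-wav"]))
# A single clean fragment gives a clean type and no encoding.
print(split_type_and_encoding(["audio/x-wav"]))
```

Under this assumption, the content of the junk fragment alone decides whether the response shows one fused Content-Type or a Content-Type plus a bogus Content-Encoding.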

Debugging a bit more, the Content-Type and Content-Encoding are collected in:

~~~
tmp = rsl_strdup(r, type_frag, type_pos, type_len);
...
ap_set_content_type(r, tmp);
...
    if (state == rsl_encoding) {
        tmp = rsl_strdup(r, encoding_frag,
                                         encoding_pos, encoding_len);
~~~

The request_rec argument, r, passed to rsl_strdup contains the junk, which is then also assigned to tmp.

I think that the issue is in the magic file that ships with httpd:

~~~
# Microsoft WAVE format (*.wav)
# [GRR 950115:  probably all of the shorts and longs should be leshort/lelong]
#                    Microsoft RIFF
0    string        RIFF        audio/unknown
#                    - WAVE format
>8    string        WAVE        audio/x-wav
~~~

The file format is:

~~~
# The format is 4-5 columns:
#    Column #1: byte number to begin checking from, ">" indicates continuation
#    Column #2: type of data to match
#    Column #3: contents of data to match
#    Column #4: MIME type of result
#    Column #5: MIME encoding of result (optional)
~~~

Indeed, a WAVE file starts with the 4-byte RIFF magic number, and at offset 8 it has another 4 bytes holding the format magic number (WAVE, 0x57415645):

~~~
xxd /var/www/html/audio2.wav | head
0000000: 5249 4646 2a30 9200 5741 5645 666d 7420  RIFF*0..WAVEfmt
~~~
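
The same header check can be done in a few lines of Python. This is only an illustrative sketch: the function name riff_header is made up for the example, and the 12 input bytes are copied from the xxd dump above. It also decodes the little-endian size field at offset 4, which in RIFF counts the bytes that follow the first 8.

```python
import struct

def riff_header(header):
    """Return (form_type, size_field) for a RIFF header, or None if it is not RIFF."""
    if len(header) < 12 or header[0:4] != b"RIFF":
        return None
    # Bytes 4..7 hold the chunk size as a little-endian unsigned 32-bit integer.
    (size_field,) = struct.unpack("<I", header[4:8])
    # Bytes 8..11 hold the form type, for example b"WAVE".
    return header[8:12], size_field

# First 12 bytes of the dump: 5249 4646 2a30 9200 5741 5645
header = bytes.fromhex("524946462a30920057415645")
form_type, size_field = riff_header(header)
print(form_type)       # b'WAVE'
print(size_field + 8)  # 9580594, the Content-Length of the file served above
```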

However, it seems that the 'match' function, which processes the multi-level continuations, cannot handle multi-level rules in which the first level also defines a MIME type, as in the WAVE format:

~~~
0    string        RIFF        audio/unknown
#                    - WAVE format
>8    string        WAVE        audio/x-wav
~~~

~~~
$ file -r -0 -m ../conf/magic /var/www/html/audio2
/var/www/html/audio2: audio/unknown audio/x-wav <= returning two types
~~~

Maybe because it does not make sense. I think that a multi-level rule exists precisely because we don't want to define the content type without checking the subsequent levels, so the correct form should be:

~~~
0    string        RIFF       
#                    - WAVE format
>8    string        WAVE        audio/x-wav
~~~

With this, the type is correctly identified:

~~~
curl -sv -o /dev/null localhost:8080/audio2
* About to connect() to localhost port 8080 (#0)
*   Trying 127.0.0.1...
* Connected to localhost (127.0.0.1) port 8080 (#0)
> GET /audio2 HTTP/1.1
> User-Agent: curl/7.29.0
> Host: localhost:8080
> Accept: */*
>
< HTTP/1.1 200 OK
< Date: Thu, 08 Nov 2018 15:10:24 GMT
< Server: Apache
< Last-Modified: Wed, 07 Nov 2018 14:15:01 GMT
< ETag: "923032-57a13be6edc57"
< Accept-Ranges: bytes
< Content-Length: 9580594
< Connection: close
< Content-Type: audio/x-wav
<
{ [data not shown]
* Closing connection 0
~~~

~~~
$ file -r -0 -m ../conf/magic /var/www/html/audio2
/var/www/html/audio2: audio/x-wav
~~~


RIFF is a very generic container, and the real format is given by the format field of the RIFF descriptor. It is probably wrong to type every RIFF file as audio/unknown, because many formats use RIFF, AVI for example.

Little test:

~~~
0       string          RIFF           
>8      string          WAVE            audio/x-wav
>8      string          AVI             video/x-msvideo
~~~

result:

~~~
curl -sv -o /dev/null localhost:8080/drop
* About to connect() to localhost port 8080 (#0)
*   Trying 127.0.0.1...
* Connected to localhost (127.0.0.1) port 8080 (#0)
> GET /drop HTTP/1.1
> User-Agent: curl/7.29.0
> Host: localhost:8080
> Accept: */*
>
< HTTP/1.1 200 OK
< Date: Thu, 08 Nov 2018 15:20:16 GMT
< Server: Apache
< Last-Modified: Wed, 07 Nov 2018 22:12:34 GMT
< ETag: "a5000-57a1a6a4ef64c"
< Accept-Ranges: bytes
< Content-Length: 675840
< Connection: close
< Content-Type: video/x-msvideo
<
{ [data not shown]
* Closing connection 0
~~~

~~~
$ file -r -0 -m ../conf/magic /var/www/html/drop
/var/www/html/drop: video/x-msvideo
~~~
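
The difference between the original and the modified rules can be modeled with a small Python evaluator. This is a rough sketch under stated assumptions, not the 'match' function of mod_mime_magic: it only supports "string" comparisons, one continuation level, and it assumes that every matching line that carries a MIME type contributes one result. The names evaluate, ORIGINAL, and FIXED are hypothetical.

```python
def evaluate(rules, data):
    """Collect the MIME type of every matching rule line.

    Each rule is (level, offset, pattern, mime_type). Level 0 is a top-level
    line; level 1 is a ">" continuation, tried only after the preceding
    top-level line has matched.
    """
    results = []
    in_matched_entry = False
    for level, offset, pattern, mime_type in rules:
        if level == 0:
            if in_matched_entry:
                break  # assumption: the first matching top-level entry wins
            in_matched_entry = data.startswith(pattern, offset)
            hit = in_matched_entry
        else:
            hit = in_matched_entry and data.startswith(pattern, offset)
        if hit and mime_type:
            results.append(mime_type)
    return results

# Rules as shipped: the top-level line already defines a type.
ORIGINAL = [(0, 0, b"RIFF", "audio/unknown"), (1, 8, b"WAVE", "audio/x-wav")]
# Rules from the test above: only the continuation lines define a type.
FIXED = [(0, 0, b"RIFF", None),
         (1, 8, b"WAVE", "audio/x-wav"),
         (1, 8, b"AVI", "video/x-msvideo")]

wav = b"RIFF" + bytes(4) + b"WAVEfmt "
avi = b"RIFF" + bytes(4) + b"AVI LIST"

print(evaluate(ORIGINAL, wav))  # two types for one file
print(evaluate(FIXED, wav))     # one type
print(evaluate(FIXED, avi))     # one type
```

With the original rules the model yields two results for a WAVE file, matching the two types printed by file -m; with the top-level type removed, each file yields exactly one.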


Version-Release number of selected component (if applicable):

httpd-2.4.6-67.el7_4.6.x86_64 and also Apache/2.4.29.


How reproducible:


Steps to Reproduce:
1. Make a request to fetch a WAVE audio file that has no file suffix:

curl -sv -o /dev/null http://127.0.0.1/path/to/audio
* About to connect() to 127.0.0.1 port 80 (#0)
*   Trying 127.0.0.1...
* Connected to 127.0.0.1 (127.0.0.1) port 80 (#0)
> GET /path/to/audio HTTP/1.1
> User-Agent: curl/7.29.0
> Host: 127.0.0.1
> Accept: */*
>
< HTTP/1.1 200 OK
< Date: Thu, 01 Nov 2018 20:08:53 GMT
< Server: Apache
< Last-Modified: Thu, 01 Nov 2018 19:22:59 GMT
< Accept-Ranges: bytes
< Content-Length: 24378
< X-Content-Type-Options: nosniff
< X-XSS-protection: 1; mode=block
< Content-Type: audio/unknown@",▒
< Content-Encoding: v/x-wav
<
{ [data not shown]
* Connection #0 to host 127.0.0.1 left intact


Actual results:

The Content-Type header contains garbage and the type is not correctly identified by magic:
< Content-Type: audio/unknown@",▒

Expected results:

The content-type should be audio/x-wav without garbage.

Additional info:

workaround:

Replace the following line of your /etc/httpd/conf/magic file:

~~~
0	string		RIFF		audio/unknown
~~~

with

~~~
0       string          RIFF
~~~

This applies if you have the default magic file.

Final form:

~~~
0    string        RIFF
>8    string        WAVE        audio/x-wav
~~~
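
For anyone who wants to script the workaround, a possible Python sketch follows. It is an assumption-laden helper and not an official tool: the name drop_riff_type is made up, it operates on the text of the magic file (read and write the file yourself, preferably on a copy first), and it only touches a top-level "0 string RIFF <type>" line.

```python
import re

# Matches a top-level "0 string RIFF <mime-type>" line.
# Continuation lines start with ">" and are therefore left alone.
RIFF_LINE = re.compile(r"^(0[ \t]+string[ \t]+RIFF)[ \t]+\S+[ \t]*$", re.MULTILINE)

def drop_riff_type(magic_text):
    """Remove the MIME type from the top-level RIFF rule, keeping all other lines."""
    return RIFF_LINE.sub(r"\1", magic_text)

before = "0\tstring\t\tRIFF\t\taudio/unknown\n>8\tstring\t\tWAVE\t\taudio/x-wav\n"
after = drop_riff_type(before)
print(after)
```

Running the function a second time leaves the text unchanged, so it is safe to apply to a file that already has the corrected rule.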

--- Additional comment from Red Hat Bugzilla Rules Engine on 2018-11-13 16:28:08 UTC ---

Since this bug report was entered in Red Hat Bugzilla, the release flag has been set to ? to ensure that it is properly evaluated for this release.

--- Additional comment from Àngel Ollé Blázquez on 2019-01-30 21:36:52 UTC ---



--- Additional comment from Kyle Walker on 2019-06-26 14:38:05 UTC ---

Just a note, it looks like the associated case was closed with the express note that this bug would continue forwards.

--- Additional comment from Joe Orton on 2019-06-27 09:11:58 UTC ---

Àngel - thanks, very nice analysis!

I agree with your conclusion that mod_mime_magic can't handle both a MIME type defined for the top-level match and the continuation line, and have pushed your suggested change to the magic file upstream:

https://svn.apache.org/viewvc?view=revision&revision=1862200

I am not sure where the memory corruption is coming from and can't reproduce that against upstream, but possibly it is this fix:

https://svn.apache.org/viewvc?view=revision&revision=1491700

--- Additional comment from Joe Orton on 2019-07-05 11:28:52 UTC ---

Merged upstream for 2.4.40 - http://svn.apache.org/r1862604

Comment 8 errata-xmlrpc 2020-12-01 12:06:59 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (httpd24 bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:5280

