Reproduced on httpd24-httpd-2.4.34-18.el7

+++ This bug was initially created as a clone of Bug #1649470 +++

Description of problem:

When I fetch an audio file without a file suffix, the httpd response contains garbage in the Content-Type header:

# curl -sv -o /dev/null http://127.0.0.1/path/to/audio
* About to connect() to 127.0.0.1 port 80 (#0)
* Trying 127.0.0.1...
* Connected to 127.0.0.1 (127.0.0.1) port 80 (#0)
> GET /path/to/audio HTTP/1.1
> User-Agent: curl/7.29.0
> Host: 127.0.0.1
> Accept: */*
>
< HTTP/1.1 200 OK
< Date: Thu, 01 Nov 2018 20:08:53 GMT
< Server: Apache
< Last-Modified: Thu, 01 Nov 2018 19:22:59 GMT
< Accept-Ranges: bytes
< Content-Length: 24378
< X-Content-Type-Options: nosniff
< X-XSS-protection: 1; mode=block
< Content-Type: audio/unknown@",▒
< Content-Encoding: v/x-wav
<
{ [data not shown]
* Connection #0 to host 127.0.0.1 left intact

analysis:

Request with the extension:

~~~
# curl -sv -o /dev/null localhost:8080/audio2.wav
* About to connect() to localhost port 8080 (#0)
* Trying 127.0.0.1...
* Connected to localhost (127.0.0.1) port 8080 (#0)
> GET /audio2.wav HTTP/1.1
> User-Agent: curl/7.29.0
> Host: localhost:8080
> Accept: */*
>
< HTTP/1.1 200 OK
< Date: Thu, 08 Nov 2018 13:04:07 GMT
< Server: Apache
< Last-Modified: Wed, 07 Nov 2018 19:54:28 GMT
< ETag: "923032-57a187c631027"
< Accept-Ranges: bytes
< Content-Length: 9580594
< Connection: close
< Content-Type: audio/x-wav
<
{ [data not shown]
* Closing connection 0
~~~

The content-type is ok. Here, the mod_mime_magic module is not used, because the file type is identified by mod_mime via TypesConfig /etc/mime.types (because I have an entry: audio/x-wav wav). mod_mime_magic is used only when there is no match after processing mime.types.

Here is the issue:

~~~
curl -sv -o /dev/null localhost:8080/audio2
* About to connect() to localhost port 8080 (#0)
* Trying 127.0.0.1...
* Connected to localhost (127.0.0.1) port 8080 (#0)
> GET /audio2 HTTP/1.1
> User-Agent: curl/7.29.0
> Host: localhost:8080
> Accept: */*
>
< HTTP/1.1 200 OK
< Date: Thu, 08 Nov 2018 13:58:12 GMT
< Server: Apache
< Last-Modified: Wed, 07 Nov 2018 14:15:01 GMT
< ETag: "923032-57a13be6edc57"
< Accept-Ranges: bytes
< Content-Length: 9580594
< Connection: close
< Content-Type: audio/unknown(-audio/x-wav
<
{ [data not shown]
* Closing connection 0
~~~

Because of the garbage, sometimes we get:

~~~
curl -sv -o /dev/null localhost:8080/audio2
* About to connect() to localhost port 8080 (#0)
* Trying 127.0.0.1...
* Connected to localhost (127.0.0.1) port 8080 (#0)
> GET /audio2 HTTP/1.1
> User-Agent: curl/7.29.0
> Host: localhost:8080
> Accept: */*
>
< HTTP/1.1 200 OK
< Date: Thu, 08 Nov 2018 13:58:50 GMT
< Server: Apache
< Last-Modified: Wed, 07 Nov 2018 14:15:01 GMT
< ETag: "923032-57a13be6edc57"
< Accept-Ranges: bytes
< Content-Length: 9580594
< Connection: close
< Content-Type: audio/unknownh
< Content-Encoding: audio/x-wav
<
{ [data not shown]
* Closing connection 0
~~~

debug mod_mime_magic traces and matches (with junk):

~~~
[Thu Nov 08 15:15:57.475116 2018] [mime_magic:debug] [pid 7521:tid 139889756055296] mod_mime_magic.c(755): [client 127.0.0.1:34632] AH01508: mod_mime_magic: rsl_strdup() 14 chars: audio/unknownH\x0c
[Thu Nov 08 15:15:57.476340 2018] [mime_magic:debug] [pid 7521:tid 139889756055296] mod_mime_magic.c(755): [client 127.0.0.1:34632] AH01508: mod_mime_magic: rsl_strdup() 10 chars: audio/x-wav
~~~

I think that the junk comes from the function magic_rsl_to_request of mod_mime_magic.c, which processes the RSL and sets the MIME info:

~~~
magic_req_rec *req_dat = (magic_req_rec *) ap_get_module_config(r->request_config, &mime_magic_module);
~~~

I am not very familiar with the httpd API structs, and the request_config struct attributes are a bit opaque, but it seems that request_config is returning a magic_rsl_s which has a chained next struct magic_rsl_s
pointing to some junk:

Here,

~~~
for (frag = req_dat->head, cur_frag = 0; frag && frag->next; frag = frag->next, cur_frag++) {
~~~

It's returning 3 fragments instead of 2 (2 for the content-type instead of 1):

~~~
(own debug traces)
cur_frag: 0 frag: audio/unknown
cur_frag: 1 frag: H
cur_frag: 2 frag: audio/x-wav
~~~

We can see the junk in fragment number 1.

Debugging a bit more, the Content-Type and Content-Encoding are collected in:

~~~
tmp = rsl_strdup(r, type_frag, type_pos, type_len);
...
ap_set_content_type(r, tmp);
...
if (state == rsl_encoding) {
    tmp = rsl_strdup(r, encoding_frag, encoding_pos, encoding_len);
~~~

The request_rec argument, r, passed to rsl_strdup contains the junk, and it is also assigned to tmp.

I think that the issue is in the magic file that ships with httpd:

~~~
# Microsoft WAVE format (*.wav)
# [GRR 950115: probably all of the shorts and longs should be leshort/lelong]
# Microsoft RIFF
0 string RIFF audio/unknown
# - WAVE format
>8 string WAVE audio/x-wav
~~~

The file format is:

~~~
# The format is 4-5 columns:
# Column #1: byte number to begin checking from, ">" indicates continuation
# Column #2: type of data to match
# Column #3: contents of data to match
# Column #4: MIME type of result
# Column #5: MIME encoding of result (optional)
~~~

In fact, WAVE files have the RIFF magic number in the first 4 bytes, and at offset 8 another 4 bytes with the format magic number (WAVE, 0x57415645):

~~~
xxd /var/www/html/audio2.wav | head
0000000: 5249 4646 2a30 9200 5741 5645 666d 7420 RIFF*0..WAVEfmt
~~~

but it seems that the 'match' function that processes the multi-level continuations doesn't like multi-level entries where the first level also defines a MIME type, like the WAVE format:

~~~
0 string RIFF audio/unknown
# - WAVE format
>8 string WAVE audio/x-wav
~~~

~~~
$ file -r -0 -m ../conf/magic /var/www/html/audio2
/var/www/html/audio2: audio/unknown audio/x-wav <= returning two types
~~~

Maybe because it does not make sense.
I think that if we have a multi-level entry, it is because we don't want to define the content-type without checking the subsequent levels, and the correct form should be:

~~~
0 string RIFF
# - WAVE format
>8 string WAVE audio/x-wav
~~~

With this, the type is correctly identified:

~~~
curl -sv -o /dev/null localhost:8080/audio2
* About to connect() to localhost port 8080 (#0)
* Trying 127.0.0.1...
* Connected to localhost (127.0.0.1) port 8080 (#0)
> GET /audio2 HTTP/1.1
> User-Agent: curl/7.29.0
> Host: localhost:8080
> Accept: */*
>
< HTTP/1.1 200 OK
< Date: Thu, 08 Nov 2018 15:10:24 GMT
< Server: Apache
< Last-Modified: Wed, 07 Nov 2018 14:15:01 GMT
< ETag: "923032-57a13be6edc57"
< Accept-Ranges: bytes
< Content-Length: 9580594
< Connection: close
< Content-Type: audio/x-wav
<
{ [data not shown]
* Closing connection 0
~~~

~~~
$ file -r -0 -m ../conf/magic /var/www/html/audio2
/var/www/html/audio2: audio/x-wav
~~~

RIFF is very generic and the real format is defined at the format offset of the RIFF descriptor, so it is maybe off base to type a RIFF file as audio/unknown, because many file formats use RIFF, like AVI.

Little test:

~~~
0 string RIFF
>8 string WAVE audio/x-wav
>8 string AVI video/x-msvideo
~~~

result:

~~~
curl -sv -o /dev/null localhost:8080/drop
* About to connect() to localhost port 8080 (#0)
* Trying 127.0.0.1...
* Connected to localhost (127.0.0.1) port 8080 (#0)
> GET /drop HTTP/1.1
> User-Agent: curl/7.29.0
> Host: localhost:8080
> Accept: */*
>
< HTTP/1.1 200 OK
< Date: Thu, 08 Nov 2018 15:20:16 GMT
< Server: Apache
< Last-Modified: Wed, 07 Nov 2018 22:12:34 GMT
< ETag: "a5000-57a1a6a4ef64c"
< Accept-Ranges: bytes
< Content-Length: 675840
< Connection: close
< Content-Type: video/x-msvideo
<
{ [data not shown]
* Closing connection 0
~~~

~~~
$ file -r -0 -m ../conf/magic /var/www/html/drop
/var/www/html/drop: video/x-msvideo
~~~

Version-Release number of selected component (if applicable):

httpd-2.4.6-67.el7_4.6.x86_64 and also Apache/2.4.29.
How reproducible:

Steps to Reproduce:

1. Make a request to fetch a WAVE audio file without a file suffix:

curl -sv -o /dev/null http://127.0.0.1/path/to/audio
* About to connect() to 127.0.0.1 port 80 (#0)
* Trying 127.0.0.1...
* Connected to 127.0.0.1 (127.0.0.1) port 80 (#0)
> GET /path/to/audio HTTP/1.1
> User-Agent: curl/7.29.0
> Host: 127.0.0.1
> Accept: */*
>
< HTTP/1.1 200 OK
< Date: Thu, 01 Nov 2018 20:08:53 GMT
< Server: Apache
< Last-Modified: Thu, 01 Nov 2018 19:22:59 GMT
< Accept-Ranges: bytes
< Content-Length: 24378
< X-Content-Type-Options: nosniff
< X-XSS-protection: 1; mode=block
< Content-Type: audio/unknown@",▒
< Content-Encoding: v/x-wav
<
{ [data not shown]
* Connection #0 to host 127.0.0.1 left intact

Actual results:

The content-type header contains garbage and the type is not correctly identified by magic:

< Content-Type: audio/unknown@",▒

Expected results:

The content-type should be audio/x-wav without garbage.

Additional info:

workaround: if you have the default magic file, replace the following line in your /etc/httpd/conf/magic file:

~~~
0 string RIFF audio/unknown
~~~

with

~~~
0 string RIFF
~~~

Final form:

~~~
0 string RIFF
>8 string WAVE audio/x-wav
~~~

--- Additional comment from Red Hat Bugzilla Rules Engine on 2018-11-13 16:28:08 UTC ---

Since this bug report was entered in Red Hat Bugzilla, the release flag has been set to ? to ensure that it is properly evaluated for this release.

--- Additional comment from Àngel Ollé Blázquez on 2019-01-30 21:36:52 UTC ---

--- Additional comment from Kyle Walker on 2019-06-26 14:38:05 UTC ---

Just a note, it looks like the associated case was closed with the express note that this bug would continue forwards.

--- Additional comment from Joe Orton on 2019-06-27 09:11:58 UTC ---

Àngel - thanks, very nice analysis!
I agree with your conclusion that mod_mime_magic can't handle both a MIME type defined for the top-level match and the continuation line, and have pushed your suggested change to the magic file upstream:

https://svn.apache.org/viewvc?view=revision&revision=1862200

I am not sure where the memory corruption is coming from and can't reproduce that against upstream, but possibly it is this fix:

https://svn.apache.org/viewvc?view=revision&revision=1491700

--- Additional comment from Joe Orton on 2019-07-05 11:28:52 UTC ---

Merged upstream for 2.4.40 - http://svn.apache.org/r1862604
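
Illustrative note: the two-level matching discussed in the analysis above can be modelled with a small Python sketch. This is a toy model only, not the mod_mime_magic or file(1) algorithm. The names `collect_types`, `ORIGINAL_RULES`, `CORRECTED_RULES` and the tuple layout are invented for this example. It assumes that every matching line contributes its MIME string, which is consistent with the `file` output quoted in the analysis. The header bytes are taken from the xxd output above.

```python
from typing import List, Optional, Tuple

# Hypothetical rule layout: (level, offset, pattern, mime).
# Level 0 is a top-level line, level 1 is a ">" continuation line.
Rule = Tuple[int, int, bytes, Optional[str]]

# The entry as shipped in the default magic file (both levels carry a type).
ORIGINAL_RULES: List[Rule] = [
    (0, 0, b"RIFF", "audio/unknown"),
    (1, 8, b"WAVE", "audio/x-wav"),
]

# The corrected entry from the analysis, including the AVI test line.
CORRECTED_RULES: List[Rule] = [
    (0, 0, b"RIFF", None),
    (1, 8, b"WAVE", "audio/x-wav"),
    (1, 8, b"AVI", "video/x-msvideo"),
]

# First 16 bytes of audio2.wav, copied from the xxd output in the report.
WAV_HEADER = bytes.fromhex("524946462a30920057415645666d7420")


def collect_types(header: bytes, rules: List[Rule]) -> List[str]:
    """Return the MIME strings of all matching lines of the first matching entry."""
    results: List[str] = []
    in_matched_entry = False
    for level, offset, pattern, mime in rules:
        hit = header[offset:offset + len(pattern)] == pattern
        if level == 0:
            if in_matched_entry:
                break  # an earlier top-level entry already matched
            in_matched_entry = hit
        elif not in_matched_entry:
            continue  # continuation of an entry whose top level did not match
        if hit and in_matched_entry and mime is not None:
            results.append(mime)
    return results


if __name__ == "__main__":
    print(" ".join(collect_types(WAV_HEADER, ORIGINAL_RULES)))   # audio/unknown audio/x-wav
    print(" ".join(collect_types(WAV_HEADER, CORRECTED_RULES)))  # audio/x-wav
```

With the original entry the model yields two strings for one file, mirroring the "returning two types" result. With the corrected entry only the continuation line supplies a type. The model says nothing about the memory corruption seen in the headers.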
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (httpd24 bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:5280