Bug 114197

Summary: Uploaded files for content types have wrong extension
Product: [Retired] Red Hat Enterprise CMS
Reporter: Jon Orris <jorris>
Component: content types
Assignee: Randy Graebner <randyg>
Status: CLOSED WONTFIX
QA Contact: Jon Orris <jorris>
Severity: medium
Priority: medium
Version: nightly
CC: richardl, sseago
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2006-09-05 17:43:07 UTC

Description Jon Orris 2004-01-23 20:52:29 UTC
Description of problem:
@39693

Uploaded files for content types have incorrect mime types when
accessed, because the extension is changed.
For example, upload an OpenOffice spreadsheet. The browser will say 'You've
chosen to open bugs-rc0.sxc.bin' instead of treating it as an sxc file,
a type which is registered in my browser.
.txt files claim to be 'Type ASC', though the extension isn't changed.

For JPEG files, the extension & mime type are correct.

Comment 1 Scott Seago 2004-02-04 16:46:46 UTC
For txt files, I'm getting the correct mime type and filename:

  HTTP/1.1 200 OK
  Content-Type: text/plain
  Content-Length: 6506
  Date: Wed, 04 Feb 2004 16:28:26 GMT
  Content-Disposition: attachment; filename="carrefour-install-log.txt"

For the sxc file I'm also seeing the .bin added. Essentially what is
going on is:
1) upon file upload, if extension doesn't match recognized mime types,
it uses a default value, which for attachments is
'application/octet-stream'
2) For file download, the Content-Disposition header is set based on
the uploaded filename. However, if the filename doesn't end in the
extension associated with the mime type, the extension is added.
3) In this case, since 'application/octet-stream' has an associated
'bin' extension, which doesn't match the '.sxc' extension, '.bin' gets
appended.
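
The three steps above can be sketched roughly as follows. This is only an
illustration in Python, not the CMS code: the lookup tables and the function
names (mime_type_for_upload, download_filename) are hypothetical, and the real
system keeps its mime types in the database.

```python
# Hypothetical lookup tables, for illustration only.
EXT_TO_TYPE = {"txt": "text/plain", "jpg": "image/jpeg"}
TYPE_TO_EXT = {
    "text/plain": "txt",
    "image/jpeg": "jpg",
    "application/octet-stream": "bin",
}
DEFAULT_TYPE = "application/octet-stream"


def extension_of(filename):
    """Return the lowercased text after the last '.', or '' if there is none."""
    _, dot, ext = filename.rpartition(".")
    return ext.lower() if dot else ""


def mime_type_for_upload(filename):
    """Step 1: unrecognized extensions fall back to the default type."""
    return EXT_TO_TYPE.get(extension_of(filename), DEFAULT_TYPE)


def download_filename(filename, mime_type):
    """Steps 2 and 3: append the type's extension if the name lacks it."""
    expected = TYPE_TO_EXT.get(mime_type)
    if expected and extension_of(filename) != expected:
        return f"{filename}.{expected}"
    return filename


mime = mime_type_for_upload("bugs-rc0.sxc")
print(mime, download_filename("bugs-rc0.sxc", mime))
```

Under these assumptions the sxc upload gets the default type and so picks up
the extra '.bin', while a .txt upload keeps its name.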

I can think of 3 possible solutions:
1) Don't ever append mime type-generated extension to the filename
2) Don't append mime type-generated extension unless the file has no
extension at all, instead of making sure the extensions match (i.e.
add the extension if there's no '.' in the filename)
3) Keep current behavior but make an exception for
'application/octet-stream', and for this mime type, don't add the
extension even if it's missing.

I'm inclined to go with 1), unless we can think of any use cases where
we would need to add an extension to the filename.
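
To make the difference between the three options concrete, here is a small
Python sketch. The function names and the TYPE_TO_EXT table are made up for
this comparison and are not taken from the CMS source.

```python
# Hypothetical table mapping a mime type to its usual extension.
TYPE_TO_EXT = {"text/plain": "txt", "application/octet-stream": "bin"}
OCTET_STREAM = "application/octet-stream"


def option_1(filename, mime_type):
    """Never append a mime type-generated extension."""
    return filename


def option_2(filename, mime_type):
    """Append the extension only if the filename has no '.' at all."""
    expected = TYPE_TO_EXT.get(mime_type)
    if expected and "." not in filename:
        return f"{filename}.{expected}"
    return filename


def option_3(filename, mime_type):
    """Keep the matching check, but never append for octet-stream."""
    if mime_type == OCTET_STREAM:
        return filename
    expected = TYPE_TO_EXT.get(mime_type)
    if expected and not filename.lower().endswith("." + expected):
        return f"{filename}.{expected}"
    return filename


for fix in (option_1, option_2, option_3):
    print(fix.__name__, fix("bugs-rc0.sxc", OCTET_STREAM), fix("README", "text/plain"))
```

All three leave 'bugs-rc0.sxc' alone. They differ only in whether a name such
as 'README' or 'notes.log' served as text/plain gets '.txt' added.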

Comment 2 Scott Seago 2004-02-05 05:01:22 UTC
Checked in fix @40082 following approach 1) above.

Comment 3 Jon Orris 2004-02-05 16:02:01 UTC
The .bin extension isn't added, but it still sets the type to BIN. So
if I click the uploaded presentation, Firebird will say 'You have
chosen to open "Estimation.xsi" from
http://localhost:9004/ccm/cms-service/stream/asset/ which is a: BIN file'.

Mozilla says it's of type application/octet-stream.  

Is it possible to keep the correct original extension sent somehow?
The mime type may not be known to CMS, but my browser has the
extension registered.


Comment 4 Scott Seago 2004-02-05 16:13:05 UTC
For unknown mime types, CMS uses application/octet-stream. In this
case, it sounds like the original extension is being sent (assuming
xsi is the correct extension). Since the mime type is set, though,
that takes precedence over the extension. We could consider doing one
or more of the following:
1) Add new mime types: compile a list of current missing mime types
that we may conceivably need to support and add those to the database
2) Don't set the mime type for the output stream if the file extension
doesn't match the expected extension for the mime type.

2) would solve the use case of "normal extension for an unknown mime
type." However, it would break for cases where the user explicitly
chose a mime type for a document which may for some reason have a
non-standard extension. So we may be better off leaving the code as it
is and just adding any missing common mime types.
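
A rough Python sketch of the trade-off, for illustration only. The tables,
register_type and headers_option_2 are hypothetical names, not CMS APIs, and
application/vnd.sun.xml.calc is shown merely as an example of a type that
could be added for .sxc files.

```python
# Hypothetical tables; the real CMS stores its mime types in the database.
EXT_TO_TYPE = {"txt": "text/plain"}
TYPE_TO_EXT = {"text/plain": "txt", "application/octet-stream": "bin"}
DEFAULT_TYPE = "application/octet-stream"


def extension_of(filename):
    """Return the lowercased text after the last '.', or '' if there is none."""
    _, dot, ext = filename.rpartition(".")
    return ext.lower() if dot else ""


def register_type(ext, mime_type):
    """Option 1: teach the tables about a previously missing mime type."""
    EXT_TO_TYPE[ext] = mime_type
    TYPE_TO_EXT[mime_type] = ext


def headers_option_2(filename, mime_type):
    """Option 2: send Content-Type only when the extension matches the type."""
    headers = {"Content-Disposition": f'attachment; filename="{filename}"'}
    if TYPE_TO_EXT.get(mime_type) == extension_of(filename):
        headers["Content-Type"] = mime_type
    return headers


# Unknown type: option 2 omits Content-Type, so the browser can use the extension.
print(headers_option_2("Estimation.xsi", DEFAULT_TYPE))
# Explicitly chosen type with a non-standard extension: Content-Type is lost.
print(headers_option_2("notes.log", "text/plain"))
# Option 1 instead: register the type so uploads are recognized.
register_type("sxc", "application/vnd.sun.xml.calc")
print(EXT_TO_TYPE.get(extension_of("bugs-rc0.sxc"), DEFAULT_TYPE))
```

The second call shows the breakage described above: the user's explicit
choice of text/plain is dropped because 'log' is not the expected extension.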

Comment 5 Richard Li 2004-02-12 21:35:46 UTC
I agree we should add any missing mime types to resolve this ticket.

Comment 6 Scott Seago 2004-02-12 23:53:55 UTC
Richard -- are you suggesting that we should add new mime types but
leave the current behavior for unknown mime types (i.e. fall back to
the default)?

Next question: which mime types do we need to support that are
currently missing? 

Comment 7 Richard Li 2004-02-24 19:00:45 UTC
Yes, I think we should add any mime types that we know are missing,
and otherwise leave the current behavior as it is.

For missing media types, I found:

ftp://ftp.isi.edu/in-notes/iana/assignments/media-types/

and

http://www.mhonarc.org/~ehood/MIME/2048/rfc2048.html#2.5

Given the above discussion, though, I think that the issues raised in
comment 3 aren't worth a lot of time to get right. (If it's easy to
fix, then that's fine, but I think we have more critical issues.)

Comment 8 David Lawrence 2006-07-18 03:40:35 UTC
QA_READY has been deprecated in favor of ON_QA. Please use ON_QA in the future.
Moving to ON_QA.

Comment 9 Jon Orris 2006-09-05 17:43:07 UTC
Closing old tickets