Bug 106479 - mkisofs doesn't encode utf-8 filenames correctly for joliet extension
Status: CLOSED RAWHIDE
Product: Fedora
Classification: Fedora
Component: cdrtools
Version: rawhide
Hardware: All Linux
Priority: medium
Severity: medium
Assigned To: Harald Hoyer
Reported: 2003-10-07 12:21 EDT by Jaakko Heinonen
Modified: 2007-11-30 17:10 EST (History)
Last Closed: 2004-01-29 08:54:03 EST
Description Jaakko Heinonen 2003-10-07 12:21:42 EDT
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030703

Description of problem:
On a UTF-8 filesystem it is not possible to create ISO images with correctly
encoded Joliet filenames. mkisofs has an -input-charset option, but UTF-8 is
not supported.

How reproducible:
Always

Steps to Reproduce:
Create an ISO image with mkisofs -J on a filesystem with UTF-8 filenames.
(The filenames must contain non-ASCII characters.)

Actual Results:  Filenames in the Joliet extension are incorrectly encoded.
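
One plausible way this goes wrong can be shown in a few lines of Python. This is a sketch under an assumption the report does not confirm: that each input byte is read as one character of a single-byte charset such as ISO-8859-1. The filename is made up for illustration.

```python
# Hypothetical filename containing non-ASCII characters.
name = "päivä.txt"

# The bytes as they are stored on a UTF-8 filesystem.
raw = name.encode("utf-8")

# Assumed failure mode: every byte is treated as one ISO-8859-1 character,
# so each two-byte UTF-8 sequence turns into two unrelated characters.
misread = raw.decode("iso-8859-1")

# What a UTF-8 aware conversion would recover instead.
correct = raw.decode("utf-8")

print(repr(misread), len(misread))
print(repr(correct), len(correct))
```

Under that assumption the misread name is longer than the real one, because every multibyte character is counted once per byte.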

Additional info:

There is a patch available at
http://joerghaeger.de/webCDwriter/mkisofs+UTF-8.html . However, this patch
causes mkisofs to work incorrectly on non-UTF-8 systems.
Comment 1 Harald Hoyer 2003-10-24 10:28:05 EDT
Please report this upstream to the author of cdrtools. I will consider this
patch for Fedora Core 2.
Comment 2 Jungshik Shin 2003-10-30 13:05:03 EST
See the thread at http://mail.nl.linux.org/linux-utf8/2002-10/msg00050.html

I would love to see this fixed upstream, but the response hasn't been
positive. No response wouldn't be considered positive, would it?

Anyway, the patch by Ilya Konstantinov (built upon my patch) works not only for
UTF-8 but also for other character encodings. 

We can further improve it to make it independent of the iconv(3)
implementation in use. Currently, it assumes that the codeset name for
UTF-16LE is 'UTF-16LE', but different iconv(3) implementations use different
names for it. To avoid this problem, we can convert the input charset to UTF-8
first and then use our own (built-in) UTF-8 -> UTF-16LE conversion routine.
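
The two-step idea above can be sketched as follows. This is an illustration in Python, not code from any of the patches: all function names are hypothetical, Python's own codecs stand in for the iconv(3) call in step 1, and the byte order follows the UTF-16LE named in this comment.

```python
def utf8_to_code_points(data):
    """Decode UTF-8 bytes into a list of Unicode code points."""
    points = []
    i = 0
    while i < len(data):
        b = data[i]
        # The lead byte gives the sequence length and the smallest code
        # point that legitimately needs that length (overlong check).
        if b < 0x80:
            cp, extra, lowest = b, 0, 0
        elif 0xC2 <= b <= 0xDF:
            cp, extra, lowest = b & 0x1F, 1, 0x80
        elif 0xE0 <= b <= 0xEF:
            cp, extra, lowest = b & 0x0F, 2, 0x800
        elif 0xF0 <= b <= 0xF4:
            cp, extra, lowest = b & 0x07, 3, 0x10000
        else:
            raise ValueError("invalid lead byte at offset %d" % i)
        if i + extra >= len(data):
            raise ValueError("truncated sequence at offset %d" % i)
        for k in range(1, extra + 1):
            c = data[i + k]
            if (c & 0xC0) != 0x80:
                raise ValueError("invalid continuation byte at offset %d" % (i + k))
            cp = (cp << 6) | (c & 0x3F)
        if cp < lowest or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
            raise ValueError("invalid code point at offset %d" % i)
        points.append(cp)
        i += extra + 1
    return points


def code_points_to_utf16le(points):
    """Encode code points as UTF-16LE, using surrogate pairs above U+FFFF."""
    out = bytearray()
    for cp in points:
        if cp >= 0x10000:
            cp -= 0x10000
            units = (0xD800 | (cp >> 10), 0xDC00 | (cp & 0x3FF))
        else:
            units = (cp,)
        for unit in units:
            out += unit.to_bytes(2, "little")
    return bytes(out)


def to_utf16le(raw, input_charset):
    """Convert a filename from input_charset to UTF-16LE via UTF-8."""
    # Step 1: input charset -> UTF-8. In the scheme described above this
    # would be iconv(3), which only has to know the name "UTF-8".
    utf8 = raw.decode(input_charset).encode("utf-8")
    # Step 2: built-in UTF-8 -> UTF-16LE, with no codeset name lookup.
    return code_points_to_utf16le(utf8_to_code_points(utf8))
```

For example, `to_utf16le(b"\xe4", "iso-8859-1")` goes through the two UTF-8 bytes for "ä" and ends up as the single 16-bit unit 0x00E4 in little-endian order. The design point is that only the name of the input charset and of UTF-8 ever reach the platform's converter.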

Of course, if we just want to fix this on Linux or wherever Bruno's libiconv
is used, we don't have to worry about portability.

BTW, would anyone make yet another attempt at persuading the maintainer of
mkisofs (cdrtools) to support multibyte character encodings with iconv?
Comment 3 Jaakko Heinonen 2003-11-04 02:48:46 EST
As Jungshik Shin said, the upstream author does not seem to be interested.
I have ported the patch by Shin & Konstantinov to the latest version
of cdrtools and added automatic detection of UTF-8 encoding. The patch
is made for cdrtools 2.01a19 but also applies to version 2.0. It can be
found here:

http://users.utu.fi/jahhein/mkisofs/mkisofs-iconv-2.patch

A known issue is that mkisofs won't print the character encodings available
through iconv with "-input-charset help".
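
How the patch detects UTF-8 is not described here. One common approach, sketched below in Python as an assumption rather than as the patch's method, is to read the codeset from the POSIX locale variables in their precedence order (LC_ALL, then LC_CTYPE, then LANG). A C program would typically get the same answer from setlocale(LC_CTYPE, "") followed by nl_langinfo(CODESET). The function names are hypothetical.

```python
def locale_charset(env):
    """Return the codeset named by the POSIX locale variables, or None."""
    # The first variable that is set and non-empty decides the locale.
    for var in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(var)
        if value:
            # Locale names look like language_TERRITORY.codeset@modifier;
            # names such as "C" or "fi_FI" carry no codeset at all.
            if "." not in value:
                return None
            return value.split(".", 1)[1].split("@", 1)[0]
    return None


def is_utf8(codeset):
    """Accept the usual spellings: UTF-8, utf8, utf-8."""
    return codeset is not None and codeset.replace("-", "").lower() == "utf8"
```

A caller could use `is_utf8(locale_charset(os.environ))` to choose UTF-8 as the default input charset and fall back to the existing behaviour otherwise.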
Comment 4 Harald Hoyer 2003-11-04 05:53:28 EST
cool! many thanx!
Comment 5 Jaakko Heinonen 2003-11-13 10:17:07 EST
The patch was broken. A new version can be found at:

http://users.utu.fi/jahhein/mkisofs/

I may make more changes later. The latest version will be found in the
directory above.
Comment 6 Jaakko Heinonen 2004-02-03 07:27:40 EST
Version 9 of the patch, which was included in Rawhide, had a bug related
to the sorting of Joliet items (details:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=230725). Version 10
fixes this bug.
