Description of problem:
mkisofs cannot handle a UTF-8 local (input) encoding. A transcript of a shell session follows:

[root@vagabond mp3]# ls -la "Ночные снайперы"
итого 12
drwxrwxr-x 3 avn avn 4096 Мар 30 19:30 .
drwxrwxr-x 9 avn avn 4096 Мар 30 21:00 ..
drwxrwxr-x 2 avn avn 4096 Мар 30 19:33 2002 Цунами

// First, try the old way (which worked on Red Hat Linux 7.x)
[root@vagabond mp3]# mkisofs -o q.iso -J -jcharset koi8-r "Ночные снайперы"
[root@vagabond mp3]# mount -t iso9660 -o ro,loop=/dev/loop0 q.iso q
[root@vagabond mp3]# ls -la q
итого 8
dr-xr-xr-x 1 root root 2048 Мар 30 19:30 .
drwxrwxr-x 9 avn avn 4096 Мар 30 21:00 ..
dr-xr-xr-x 1 root root 2048 Мар 30 19:33 2002 ????????????

// Try the 'default' conversion with 1:1 mapping of local file names
[root@vagabond mp3]# mkisofs -o q.iso -J -jcharset default "Ночные снайперы"
[root@vagabond mp3]# mount -t iso9660 -o ro,loop=/dev/loop0 q.iso q
[root@vagabond mp3]# ls -la q
итого 8
dr-xr-xr-x 1 root root 2048 Мар 30 19:30 .
drwxrwxr-x 9 avn avn 4096 Мар 30 21:00 ..
dr-xr-xr-x 1 root root 2048 Мар 30 19:33 2002 Ц??нами
[root@vagabond mp3]# ls -la q
q q.iso
[root@vagabond mp3]# ls -la q/2002\ Ц�?нами/
итого 63650
dr-xr-xr-x 1 root root 2048 Мар 30 19:33 .
dr-xr-xr-x 1 root root 2048 Мар 30 19:30 ..
-r-xr-xr-x 1 root root 4058601 Дек 9 15:10 01.??а??о??од??.mp3
-r-xr-xr-x 1 root root 4872313 Дек 9 15:10 02.??а??а??????о??и??е??ки.mp3
(the rest of the output skipped)

// Try omitting -jcharset altogether:
[root@vagabond mp3]# mkisofs -o q.iso -J "Ночные снайперы"
[root@vagabond mp3]# mount -t iso9660 -o ro,loop=/dev/loop0 q.iso q
[root@vagabond mp3]# ls -la q
итого 8
dr-xr-xr-x 1 root root 2048 Мар 30 19:30 .
drwxrwxr-x 9 avn avn 4096 Мар 30 21:00 ..
dr-xr-xr-x 1 root root 2048 Мар 30 19:33 2002 Ц?_нами
[root@vagabond mp3]# ls -la q/2002\ Ц�нами/
итого 63650
dr-xr-xr-x 1 root root 2048 Мар 30 19:33 .
dr-xr-xr-x 1 root root 2048 Мар 30 19:30 ..
-r-xr-xr-x 1 root root 4058601 Дек 9 15:10 01.?_а?_о?_од?_.mp3
-r-xr-xr-x 1 root root 4872313 Дек 9 15:10 02.?_а?_а?_?_?_о?_и?_е?_ки.mp3

As you can see, the first approach scrambles the file names beyond recognition (the characters shown above as question marks and underscores are in fact Cyrillic characters). The second and third approaches preserve some characters but scramble others.

The thread at http://mail.nl.linux.org/linux-utf8/2002-03/msg00022.html has a long discussion of what is wrong with mkisofs (that message even includes a patch, though I have not tried it yet). In brief, it seems that ISO 9660 with the Joliet extension stores file names in UCS-2, not UTF-8. mkisofs itself cannot convert from UTF-8 to UCS-2; the patch claims to add iconv(3) support to mkisofs.
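To make the UTF-8 versus UCS-2 difference described above concrete, here is a minimal sketch using the iconv(1) command-line tool (not the iconv(3) calls the patch adds, which I have not reproduced here). The sample character "Ц" (U+0426) and the helper name hexdump_bytes are chosen only for illustration, and the encoding name UCS-2BE assumes a glibc or libiconv build of iconv.

```shell
# Hypothetical helper: print stdin as one lowercase hex string.
hexdump_bytes() {
    od -An -tx1 | tr -d ' \n'
}

# "Ц" (U+0426), written as UTF-8 octal escapes so the script does not
# depend on the terminal or editor encoding.
utf8_hex=$(printf '\320\246' | hexdump_bytes)

# The same character converted to big-endian UCS-2 with iconv(1).
ucs2_hex=$(printf '\320\246' | iconv -f UTF-8 -t UCS-2BE | hexdump_bytes)

echo "UTF-8:   $utf8_hex"   # prints "UTF-8:   d0a6"
echo "UCS-2BE: $ucs2_hex"   # prints "UCS-2BE: 0426"
```

The two byte sequences differ, so a tool that copies UTF-8 bytes into a Joliet name without converting them would be expected to produce names that a reader cannot decode, which is consistent with the scrambled listings above.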
Created attachment 90793
The patch is attached for convenience.
Actually, I also tried adding "iocharset=utf8,utf8" to the mount options (which is how the CD-ROM is mounted to enable Cyrillic characters); the result is the same: the Cyrillic characters are scrambled.

I tried rebuilding the source RPM for cdrtools with the patch above applied. It builds fine, and the resulting mkisofs produces ISO images with correct file names:

[root@vagabond mp3]# ~avn/tmp/mkisofs -o q.iso -jcharset utf-8 "Ночные снайперы"
[root@vagabond mp3]# mount -t iso9660 -o ro,loop=/dev/loop0,iocharset=utf8,utf8 q.iso q
[root@vagabond mp3]# ls -la q
итого 8
dr-xr-xr-x 1 root root 2048 Мар 30 19:30 .
drwxrwxr-x 9 avn avn 4096 Мар 30 21:58 ..
dr-xr-xr-x 1 root root 2048 Мар 30 19:33 2002 Цунами
[root@vagabond mp3]# ls -la q/2002\ Цунами
итого 63650
dr-xr-xr-x 1 root root 2048 Мар 30 19:33 .
dr-xr-xr-x 1 root root 2048 Мар 30 19:30 ..
-r-xr-xr-x 1 root root 4058601 Дек 9 15:10 01.Пароходы.mp3
-r-xr-xr-x 1 root root 4872313 Дек 9 15:10 02.Катастрофически.mp3

Right now I'm replacing my mkisofs RPM with this hand-made one ;)
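Rather than comparing the listings by eye, a check along these lines could confirm that the names on the mounted image match the source tree byte for byte. This is only a sketch: the function name same_names is made up for illustration, and it assumes no file name contains a newline.

```shell
# Hypothetical helper: succeed only if both trees contain exactly the
# same relative path names (compared as raw bytes in the C locale).
same_names() {
    src_list=$(cd "$1" && LC_ALL=C find . | LC_ALL=C sort)
    dst_list=$(cd "$2" && LC_ALL=C find . | LC_ALL=C sort)
    [ "$src_list" = "$dst_list" ]
}

# Usage sketch: the first argument is the directory given to mkisofs,
# the second is the mount point of the resulting image.
# same_names "$SRC_DIR" q && echo "names preserved" || echo "names differ"
```

The check only compares names, not file contents, which is enough for the problem reported here.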
thx