Description of problem:
According to the Xiph.org Foundation (http://xiph.org/ogg/vorbis/doc/v-comment.html), a comment vector consists of an ASCII field name, a "=" character, and "8 bit clean UTF-8 encoded field contents to the end of the field." But libvorbis fails completely if the field contents contain 8-bit UTF-8-encoded data.

Version-Release number of selected component (if applicable):
libvorbis-1.0-7

How reproducible:
Create an Ogg Vorbis file with text in a comment field which requires UTF-8 encoding. (You can't use vorbiscomment; it replaces all characters greater than ASCII value 0x7F with "#".) Try to play back the Ogg Vorbis file.

Actual results:
You will get an error message similar to "Error opening [file] using the oggvorbis module. The file may be corrupted.".

Expected results:
libvorbis should correctly parse and decode the file.
Created attachment 92817
A sample UTF-8 encoded Ogg Vorbis file.

Here's an example. libvorbis will complain that this Ogg Vorbis file is corrupted, but as far as I can tell, it is completely valid.
How did you create this file?
Emacs, I think. (It's been a while.) But from using XMMS to test editing comments, I see now that other data in the file changes when the comment changes. (In particular, 4 bytes at offset 80 change.) I'm guessing that the data in question is a checksum of some sort that includes the comments in its calculation, so by hand-editing the comments, I corrupted the file.

Using Emacs to look at the file whose comments I edited with XMMS, I see that the comments are indeed UTF-8 encoded, and libvorbis handles them just fine. Therefore, the only real problem here is that vorbiscomment isn't UTF-8 aware. (Which is still a bug, but not a serious one.)
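The changing bytes are consistent with the Ogg page checksum. Each Ogg page header carries a 32-bit CRC at byte offset 22, stored least significant byte first and computed over the whole page with the CRC field treated as zero (polynomial 0x04c11db7, initial value 0, no bit reflection, no final XOR). If the first page holds only the 30-byte Vorbis identification header, that page is 58 bytes long, so the CRC of the following comment page would land at file offset 80. A minimal sketch in C follows; the function names are hypothetical and are not libogg's API, and the caller is assumed to pass exactly one complete page.

```c
#include <stddef.h>
#include <stdint.h>

/* Feed bytes into the Ogg-style CRC-32 (MSB-first, polynomial 0x04c11db7). */
uint32_t ogg_crc_update_sketch(uint32_t crc, const uint8_t *data, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        crc ^= (uint32_t)data[i] << 24;
        for (int bit = 0; bit < 8; bit++)
            crc = (crc & 0x80000000u) ? (crc << 1) ^ 0x04C11DB7u : (crc << 1);
    }
    return crc;
}

/* CRC of one whole page, with header bytes 22..25 (the CRC field) read as zero.
 * Assumes len >= 27, the fixed part of an Ogg page header. */
uint32_t ogg_page_crc_sketch(const uint8_t *page, size_t len)
{
    static const uint8_t zero[4] = {0, 0, 0, 0};
    uint32_t crc = 0;
    crc = ogg_crc_update_sketch(crc, page, 22);
    crc = ogg_crc_update_sketch(crc, zero, 4);
    crc = ogg_crc_update_sketch(crc, page + 26, len - 26);
    return crc;
}

/* Returns 1 if the stored little-endian CRC matches the page contents. */
int ogg_page_crc_ok_sketch(const uint8_t *page, size_t len)
{
    if (len < 27)
        return 0;
    uint32_t stored = (uint32_t)page[22]
                    | (uint32_t)page[23] << 8
                    | (uint32_t)page[24] << 16
                    | (uint32_t)page[25] << 24;
    return stored == ogg_page_crc_sketch(page, len);
}
```

Editing a comment in place without recomputing this value makes ogg_page_crc_ok_sketch() return 0 for that page, which matches the "file may be corrupted" symptom.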
It looks to me like vorbiscomment has a --raw switch for passing UTF-8 directly, instead of recoding for your locale. Is your locale not UTF-8?
This bug still occurs in FC6. It seems that all of the vorbis-tools "automatic recoding for your locale" is broken. It looks like it assumes that en_US means ISO8859-1 or something. I'll do some more digging.

  $ env | grep LANG
  LANG=en_US.UTF-8
  $ vorbiscomment 01\ -\ 天使の休息.ogg
  TITLE=?????
  ARTIST=????
  TRACKNUMBER=1
  TRACKTOTAL=4
  ALBUM=?????
  MUSICBRAINZ_SORTNAME=????
  $ vorbiscomment -R 01\ -\ 天使の休息.ogg
  TITLE=天使の休息
  ARTIST=奥井雅美
  TRACKNUMBER=1
  TRACKTOTAL=4
  ALBUM=天使の休息
  MUSICBRAINZ_SORTNAME=奥井雅美
Exporting LC_ALL=en_US.UTF-8 makes no difference. However, this works:

  $ export CHARSET=UTF-8
  $ vorbiscomment 01\ -\ 天使の休息.ogg
  TITLE=天使の休息
  ARTIST=奥井雅美
  TRACKNUMBER=1
  TRACKTOTAL=4
  ALBUM=天使の休息
  MUSICBRAINZ_SORTNAME=奥井雅美

There is definitely something broken in the automatic charset conversion. As a result, it breaks for ogg123 too.
Something's wrong with their configure and automagic stuff. If I remove the #ifdef HAVE_LANGINFO_CODSET lines from share/utf8.c, all works fine. It looks as though the configure script is picking it up correctly, but it's not being compiled properly.
Err, HAVE_LANGINFO_CODESET clearly. But there definitely seems to be some sort of mistake in the build script. #define HAVE_LANGINFO_CODESET 1 is getting set in config.h, but somehow not going through to the rest of the build.
Ah, got it. It's fixed upstream. Several files, like utf8.c, need to have a

  #ifdef HAVE_CONFIG_H
  # include <config.h>
  #endif

block in them, but don't. See this link for the changeset:
https://trac.xiph.org/changeset/10080
and this upstream bug:
https://trac.xiph.org/ticket/685

A new version of vorbis-tools hasn't been released for a long time, even though these fixes are a year old.
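To illustrate why the missing include matters: any code guarded by HAVE_LANGINFO_CODESET is silently compiled out when config.h is never included, so the locale's codeset is never consulted and only the CHARSET environment variable can select UTF-8. The following is a simplified sketch of that pattern, not the actual utf8.c source. The function name, the lookup order, and the "US-ASCII" fallback are assumptions made for illustration.

```c
/* config.h must come first so the HAVE_* macros are visible to the code below. */
#ifdef HAVE_CONFIG_H
# include <config.h>
#endif

#include <stdlib.h>

#ifdef HAVE_LANGINFO_CODESET
# include <langinfo.h>
#endif

/* Hypothetical helper: pick the charset used to display comments.
 * env_charset is the caller-supplied value of the CHARSET variable (may be NULL). */
const char *resolve_charset_sketch(const char *env_charset)
{
    const char *charset = env_charset;

#ifdef HAVE_LANGINFO_CODESET
    /* Only compiled in when the macro is defined, i.e. when config.h was seen.
     * Requires a prior setlocale(LC_ALL, "") to reflect LANG / LC_ALL. */
    if (charset == NULL || *charset == '\0')
        charset = nl_langinfo(CODESET);
#endif

    /* Assumed fallback when nothing else is known. */
    if (charset == NULL || *charset == '\0')
        charset = "US-ASCII";

    return charset;
}
```

A caller would typically run setlocale(LC_ALL, "") once and then call resolve_charset_sketch(getenv("CHARSET")). Built without HAVE_LANGINFO_CODESET, the function ignores LANG=en_US.UTF-8 entirely, which is consistent with the "?????" output seen earlier in this report and with CHARSET=UTF-8 acting as a workaround.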
Created attachment 139640
Patch which fixes the charset conversion

This fixes the charset conversion and use of iconv() by properly including config.h in some files that need it. This is a copy of a changeset which has been applied upstream.
*** Bug 187649 has been marked as a duplicate of this bug. ***
I rebuilt vorbis-tools-1.1.1-2.src.rpm from Fedora Development with the patch in comment 10; I can confirm that the patch fixes the problem:

  $ vorbiscomment *01*
  TRACKNUMBER=1
  ARTIST=Queensrÿche
  ALBUM=Operation Mindcrime II
  DATE=2006
  GENRE=rock
  TITLE=Freiheit Ouvertüre
  LABEL=Rhino / WEA
  LICENSE=all rights reserved
  ENCODING=normalize 0.7.7 amplitude=0.30; OggEnc v1.0.2 quality=3

(Thanks for tracking this down, John--I never got back to this issue.)
vorbis-tools-1.1.1-3.fc6 has been pushed for fc6, which should resolve this issue. If these problems are still present in this version, then please make note of it in this bug report.
Works great, thanks for moving so quickly on this after I posted the patch! Now if you could only do the same for bug 141592, where there's also a patch. :)