Description of problem:

/usr/share/man> for x in `find -name *gz`; do zcat $x | if ! test -z "`iconv 2>&1 >/dev/null`"; then echo $x; fi done
./man1/ptx.1.gz
./man1/perlcn.1.gz
./man1/perlebcdic.1.gz
./man1/perlhack.1.gz
./man1/perlhist.1.gz
./man1/perljp.1.gz
./man1/perlko.1.gz
./man1/perlothrtut.1.gz
./man1/perlthrtut.1.gz
./man1/perltw.1.gz
./man1/fc-cache.1.gz
./man1/fc-list.1.gz
./man1/grolbp.1.gz
./man1/gs-pcl3.1.gz
./man1/pcl3opts.1.gz
./man1/ntpdate.1.gz
./man1/ntpq.1.gz
./man1/tickadj.1.gz
./man1/cdrecord.1.gz
./man1/mkromdic.1.gz
./man1/ppmshadow.1.gz
./man1/flipdiff.1.gz
./man1/unwrapdiff.1.gz
./man1/dvdrecord.1.gz
./man3/Unicode::Collate.3pm.gz
./man5/sane-agfafocus.5.gz
./man5/sane-avision.5.gz
./man5/sane-coolscan2.5.gz
./man5/sane-umax_pp.5.gz
./man7/groff_mm.7.gz
./man7/groff_mmse.7.gz
./man8/netdump-server.8.gz

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
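For anyone who wants to repeat the check outside a UTF-8 locale, here is a minimal sketch of the same idea, assuming the goal is to list pages whose decompressed content is not valid UTF-8. The function name `list_non_utf8_manpages` is hypothetical. Unlike the one-liner in the report, it names the encoding explicitly instead of relying on the locale default, and it quotes the glob so the shell does not expand it early.

```shell
#!/bin/sh
# Hypothetical helper (not part of any package): print every
# gzip-compressed file under directory "$1" whose decompressed
# content is not valid UTF-8.
list_non_utf8_manpages() {
    find "$1" -type f -name '*.gz' | while IFS= read -r page; do
        # Converting UTF-8 to UTF-8 acts as a validator: iconv exits
        # non-zero at the first invalid byte sequence. The pipeline's
        # exit status is that of iconv, the last command.
        if ! gzip -dc "$page" | iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1; then
            printf '%s\n' "$page"
        fi
    done
}
```

It could be run as `list_non_utf8_manpages /usr/share/man`. Paths containing newlines are not handled, which should not matter for man page trees.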
Note that recent versions of man should be able to handle non-UTF-8 man pages (for the sake of third-party cross-distro RPMs). However, man assumes that each language is either UTF-8 or exactly one other predetermined encoding (for example, Russian is KOI8-R if it's not UTF-8). The "other" encoding is the one I believed to be the "most popular" encoding on Linux for that language's man pages. This runs into trouble for languages with multiple encodings, e.g. French (ISO-8859-1 and ISO-8859-15) and Japanese (EUC-JP, ISO-2022-JP, and, rarely, Shift-JIS). These two languages deserve special attention with respect to multiple charsets/encodings.
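To make the "UTF-8 or exactly one other encoding" rule concrete, here is a rough sketch in shell. It is not man's actual code. The function names are hypothetical, only the Russian entry comes from this discussion, and the ISO-8859-1 default is an assumption made purely for illustration.

```shell
#!/bin/sh
# Hypothetical table: one legacy encoding per language code.
# Only ru -> KOI8-R is taken from the discussion; the default
# below is an assumption, not what man necessarily uses.
legacy_encoding_for() {
    case "$1" in
        ru*) echo KOI8-R ;;
        *)   echo ISO-8859-1 ;;
    esac
}

# Print the already-decompressed page "$2" as UTF-8, given its
# language code "$1". Valid UTF-8 is passed through unchanged;
# anything else is assumed to be in the language's single legacy
# encoding and is converted, with a warning on stderr.
page_to_utf8() {
    lang=$1
    file=$2
    if iconv -f UTF-8 -t UTF-8 "$file" >/dev/null 2>&1; then
        cat "$file"
    else
        enc=$(legacy_encoding_for "$lang")
        echo "warning: $file is not UTF-8, assuming $enc" >&2
        iconv -f "$enc" -t UTF-8 "$file"
    fi
}
```

The weakness described above shows up directly in the sketch: with one table entry per language, a French ISO-8859-15 page or a Japanese ISO-2022-JP page would be decoded with whichever single encoding the table names.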
Adrian, for the reason you presented, I am converting to UTF-8 all the man pages in packages where I found badly encoded pages. Do you have any objections to that? Florian, we're working together with mitr to convert all the pages to the correct encoding. On my fresh FC3 installation I've found slightly more than 100 man pages in the wrong encoding.
Comment 2: absolutely no objections. Thanks, Jindrich. The non-UTF-8 support is intended for legacy man pages from third parties who are still catching up. (man prints a warning to stderr when it encounters a non-UTF-8 page.)
All packages except the blockers of this bug have been fixed.