man 2 close fails with:

    iconv: illegal input sequence at position 1722

LANG is set to en_US.ISO-8859-1.

The problem comes from this line:

    .\" Modified 2000-07-22 by Nicolás Lichtmaier <nick>

iconv dies on the á, and the man page works fine if I replace it with 'a'. I don't know enough about man pages to say whether this is a bug in the man page itself (i.e. whether 8-bit characters are forbidden in English man pages) or whether the problem lies elsewhere. Can you advise?
The man pages need to be cleaned of latin-1 and converted to utf-8.
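For illustration, a minimal Python sketch of why the conversion trips over this line. It assumes the page stores the á as the single Latin-1 byte 0xE1 while the reading side treats the page as UTF-8; the byte string is a shortened, hypothetical stand-in for the comment line in close.2, not the real file.

```python
# Hypothetical stand-in for the offending comment line in close.2.
# "\u00e1" is the letter a-acute; in Latin-1 it is the single byte 0xE1.
latin1_line = '.\\" Modified 2000-07-22 by Nicol\u00e1s Lichtmaier'.encode("latin-1")


def decodes_as_utf8(data: bytes) -> bool:
    """Return True if data is a valid UTF-8 byte sequence."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def latin1_to_utf8(data: bytes) -> bytes:
    """Re-encode Latin-1 bytes as UTF-8 (every Latin-1 byte is one code point)."""
    return data.decode("latin-1").encode("utf-8")


converted = latin1_to_utf8(latin1_line)
print(decodes_as_utf8(latin1_line))  # the lone 0xE1 byte is not valid UTF-8
print(decodes_as_utf8(converted))    # after conversion the line is valid UTF-8
```

In UTF-8 the byte 0xE1 announces a three-byte sequence, so the plain 's' that follows it is what a strict converter would reject.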
*** Bug 99014 has been marked as a duplicate of this bug. ***
*** Bug 88148 has been marked as a duplicate of this bug. ***
*** Bug 89203 has been marked as a duplicate of this bug. ***
*** Bug 96943 has been marked as a duplicate of this bug. ***
*** Bug 90784 has been marked as a duplicate of this bug. ***
Maybe this UTF-8 problem is gone with the latest man-pages 1.56 :-?
*** Bug 89629 has been marked as a duplicate of this bug. ***
manpages has new version 1.57 with more pages: Differences from version 1.56: The man pages epoll_create.2 epoll_ctl.2 epoll_wait.2 getresuid.2 ioctl_list.2 lookup_dcookie.2 mmap.2 open.2 poll.2 semop.2 semtimedop.2 cabs.3 cabsf.3 cabsl.3 cacos.3 cacosh.3 cacoshf.3 cacoshl.3 carg.3 cargf.3 cargl.3 casin.3 casinf.3 casinh.3 casinhf.3 casinhl.3 casinl.3 catan.3 catanf.3 catanh.3 catanhf.3 catanhl.3 catanl.3 cbrt.3 cbrtf.3 cbrtl.3 ccos.3 ccosf.3 ccosh.3 ccoshf.3 ccoshl.3 ccosl.3 cerf.3 cerfc.3 cerfcf.3 cerfcl.3 cerff.3 cerfl.3 cexp2.3 cexp2f.3 cexp2l.3 cexp.3 cexpf.3 cexpl.3 cimag.3 cimagf.3 cimagl.3 clog10.3 clog10f.3 clog10l.3 clog2.3 clog2f.3 clog2l.3 clog.3 clogf.3 clogl.3 conj.3 conjf.3 conjl.3 cpow.3 cpowf.3 cpowl.3 cproj.3 cprojf.3 cprojl.3 creal.3 crealf.3 creall.3 csin.3 csinf.3 csinh.3 csinhf.3 csinhl.3 csinl.3 csqrt.3 csqrtf.3 csqrtl.3 ctan.3 ctanf.3 ctanh.3 ctanhf.3 ctanhl.3 ctanl.3 dlopen.3 encrypt.3 lockf.3 mtrace.3 rtime.3 epoll.4 complex.5 proc.5 iso_8859-16.7 ip.7 are new or have been updated. Typographical or grammatical errors have been corrected in several other places.
Confirmed that this problem is no longer present with 1.58-1.
It's not fixed. Try:

    export LANG=en_US
    man 2 close

The man page for close was not cleaned in /usr/share/man/en/man2, at least not as of man-pages-1.60-3.noarch.rpm.

Also, why not fix it in shrike? Is the fix not applicable to the plain-jane distribution?
comment 11: use release 4 in rawhide. release 3 wasn't fixed, true.
No, it's still broken in 1.60-4:

    % rpm -q man-pages
    man-pages-1.60-4
    % rpm -V man-pages
    % man 2 close | head -1
    iconv: illegal input sequence at position 1729
    % echo $LANG
    en_US
    % env LANG=C man 2 close | head -1
    CLOSE(2) Linux Programmer's Manual CLOSE(2)
    % ls -oF /usr/share/man{/,/en/}man2/close.2.gz
    -rw-r--r-- 1 root 1811 Dec 13 2001 /usr/share/man/en/man2/close.2.gz
    -rw-r--r-- 1 root 1809 Sep 24 08:40 /usr/share/man/man2/close.2.gz
    % zdiff -u /usr/share/man{/,/en/}man2/close.2.gz
    --- - 2003-10-05 01:04:09.309207000 -0700
    +++ /tmp/close2.gz.XXXXhuhMPP 2003-10-05 01:04:09.000000000 -0700
    @@ -29,7 +29,7 @@
     .\" corrected description of effect on locks (thanks to
     .\" Tigran Aivazian <tigran>).
     .\" Modified Fri Jan 31 16:21:46 1997 by Eric S. Raymond <esr>
    -.\" Modified 2000-07-22 by Nicol?s Lichtmaier <nick>
    +.\" Modified 2000-07-22 by Nicolás Lichtmaier <nick>
     .\" added note about close(2) not guaranteeing that data is safe on close.
     .\"
     .TH CLOSE 2 2001-12-13 "" "Linux Programmer's Manual"

Is the suggested solution to replace 'man' with an alias of 'env LANG=C man', or what?

Also, pretty goofy fix for the C version of close.2. I'd think it'd make more sense to replace 'á' with 'a' rather than '?'.

Finally, claw's question about why this isn't being fixed with an RH9 RPM was not addressed...
> Is the suggested solution to replace 'man' with an alias of 'env LANG=C man', or
> what?

No. The fix and its rationale are described in bug 103214.

> Also, pretty goofy fix for the C version of close.2. I'd think it'd make more
> sense to replace 'á' with 'a' rather than '?'.

No, this does not make more sense. How a non-ASCII letter is transliterated into "non-accent English" is ambiguous and depends on the language and country. For example, certain single letters in German become "ss" or "oe", but they DON'T become this if the same letters are used by another language. There is no context indicating the original language of each non-ASCII word in the man pages.

> Finally, claw's question about why this isn't being fixed with an RH9 RPM was
> not addressed...

The rawhide package will install on RH9 without changing the dependencies.
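To make the ambiguity concrete, here is a small Python sketch contrasting two transliteration policies. The `GERMAN_STYLE` table and both function names are hypothetical, chosen only for illustration; nothing here reflects how the man-pages package was actually filtered.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Language-agnostic policy: decompose (NFD) and drop combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Hypothetical German-style table (a-umlaut, o-umlaut, u-umlaut, sharp s).
# Other languages using the same letters would need different rules.
GERMAN_STYLE = {"\u00e4": "ae", "\u00f6": "oe", "\u00fc": "ue", "\u00df": "ss"}


def german_style(text: str) -> str:
    """Language-specific policy: expand letters per the table above."""
    return "".join(GERMAN_STYLE.get(ch, ch) for ch in text)


print(strip_accents("Nicol\u00e1s"))  # Nicolas
print(strip_accents("M\u00fcller"))   # Muller
print(german_style("M\u00fcller"))    # Mueller
```

The same input name yields "Muller" under one policy and "Mueller" under the other, and without knowing the source language there is no way to pick between them automatically.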
> No. The fix solution and rational are described in bug 103214

Okay, I've now read all of bug 103214, and I still don't know what the solution is supposed to be. Considering this bug is marked "CLOSED" yet people are still getting the failure, could you please be more specific?

One thing that you may not have picked up on is that claw and I, and presumably a whole lot of other users, are using LANG=en_US, not the RH9 default setting of en_US.UTF-8. With the UTF-8 setting, I was getting serious problems in Perl, my shell, my terminal emulator, and other programs, so I backed off to en_US. If I do:

    % env LANG=en_US.UTF-8 man 2 close | head -1
    CLOSE(2) Linux Programmerâs Manual CLOSE(2)

then, again, man works, just like if I do LANG=C. Well, not "just like". My terminal program doesn't handle UTF-8, so the three non-ASCII characters that appear after "Programmer" get shown as an a-circumflex, as above.

The problem is that there's a single /usr/share/man/en directory that assumes everyone uses en_US.UTF-8, not en_US (or the equivalent for the English locales of other countries). Can /usr/bin/nroff be fixed to properly detect whether a non-UTF-8 locale is being used, and call iconv appropriately?

> No this does not make more sense.

Hmm, that conflicts with your bug 103214 comment 4:

> 2) be "transliterated" for the POSIX locale. That is, convert the "acute a"s
> and the umlauts into plain ASCII "a" and "u" respectively. Yes, I know that an
> umlaut and a "u" are entirely different things and this is going to upset some
> people who will get their name mangled, but...

Is the difference that the LANG=C man pages in the man-pages RPM are getting filtered with no user intervention? If there is a human involved, then non-ASCII characters should be mapped to a "best-effort" equivalent, just as you described for u-umlaut. Even if we are talking pure machine translation, using a "most likely" translation like á -> a will still convey the most information in the average case. I don't see why we should all have to suffer with '?' because there might be some obscure language with a different transliteration.

> The rawhide package will install on RH9 without changing the dependencies

It wasn't a question of compatibility. The point is that the man pages are BROKEN in RH9, so why isn't an update RPM being issued? Most users aren't sophisticated enough to find this bug on Bugzilla, understand the vague references to "rawhide", and go find and install the rawhide RPM. (And then, as I've stated, it still doesn't fix the bug.)
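As a rough sketch of the locale detection asked about above, the following Python reads the codeset out of a LANG-style string and converts a UTF-8 page to it. Everything here is an assumption made for illustration: the function names are hypothetical, the fallback to ISO-8859-1 for a locale with no explicit codeset is a guess that suits en_US but not every locale, and this is not how nroff or man actually behave.

```python
def charset_from_lang(lang: str, default: str = "ISO-8859-1") -> str:
    """Extract the codeset from a locale name such as 'en_US.UTF-8'.

    Assumption: a locale with no explicit codeset uses `default`.
    """
    if lang in ("", "C", "POSIX"):
        return "ASCII"
    name = lang.split("@", 1)[0]  # drop any @modifier, e.g. '@euro'
    if "." in name:
        return name.split(".", 1)[1]
    return default


def render_for_locale(page_utf8: bytes, lang: str) -> bytes:
    """Convert a UTF-8 page to the locale's codeset.

    Characters the target codeset cannot represent become '?'.
    """
    text = page_utf8.decode("utf-8")
    return text.encode(charset_from_lang(lang), errors="replace")


page = "Nicol\u00e1s".encode("utf-8")
print(render_for_locale(page, "en_US"))        # Latin-1 bytes, accent kept
print(render_for_locale(page, "C"))            # ASCII only, accent replaced
print(render_for_locale(page, "en_US.UTF-8"))  # unchanged UTF-8 bytes
```

With a single UTF-8 source page, only the C/POSIX case loses the accent in this sketch; an en_US user would still get the á as a Latin-1 byte.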