Bug 485072
| Summary: | gencat(1p) does not document the "$ codeset" option | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 4 | Reporter: | Michael Solberg <msolberg> |
| Component: | man-pages | Assignee: | Ivana Varekova <varekova> |
| Status: | CLOSED NOTABUG | QA Contact: | BaseOS QE <qe-baseos-auto> |
| Severity: | low | Docs Contact: | |
| Priority: | low | | |
| Version: | 4.7 | | |
| Target Milestone: | rc | | |
| Target Release: | --- | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2009-02-12 09:36:43 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description
Michael Solberg
2009-02-11 14:15:32 UTC
ISO-8859-1 isn't valid UTF-8 unless you use only its ASCII subset, and since UTF-8 is the default encoding, it is no wonder that you need to tell gencat you are using a different encoding. The gencat man page isn't part of glibc, though (and it is just the 1p man page, so it really should describe only what POSIX says and nothing else).

The POSIX man pages describe only the POSIX standard, so changes of this kind are not relevant to them. (There is only one change, made in 2.67 and therefore not in RHEL4: a note added to every POSIX man page stating that it covers only the POSIX description. Older versions do not carry it for now.) I'm closing this bug.

Would it be possible to get this documented upstream in the glibc manual? Maybe a knowledgebase article? I understand not wanting to mess with the POSIX pages, but the only way to find out about the option at this point is to read the source for gencat.

(In reply to comment #1)
> ISO-8859-1 isn't valid UTF-8, unless you use only the ASCII subset thereof, and
> as UTF-8 is the default encoding, no wonder you need to tell gencat that you
> are using a different encoding.

I should have been a little more specific. The files were valid ISO-8859-1. When running gencat on them, you get the errors. I converted one to UTF-8 with iconv (-f ISO-8859-1 -t UTF-8) and it worked fine, and the ones I had in ASCII worked fine. It was only files in ISO-8859-1 format that I had a problem with:

# Without "$ codeset=ISO-8859-1", I get invalid characters
[l3p5xms@cpliis13 src]$ file set_real_id.msg.es_AR
set_real_id.msg.es_AR: ISO-8859 English text
[l3p5xms@cpliis13 src]$ file set_real_id.msg.es_AR
set_real_id.msg.es_AR: ISO-8859 English text
[l3p5xms@cpliis13 src]$ gencat set_real_id.cat.es_AR set_real_id.msg.es_AR
set_real_id.msg.es_AR:25: invalid character: message ignored
set_real_id.msg.es_AR:26: invalid character: message ignored
set_real_id.msg.es_AR:27: invalid character: message ignored

# If I convert to UTF-8, it works.
[l3p5xms@cpliis13 src]$ iconv -f ISO-8859-1 -t UTF-8 set_real_id.msg.es_AR > set_real_id.msg.es_AR.utf8
[l3p5xms@cpliis13 src]$ gencat set_real_id.cat.es_AR set_real_id.msg.es_AR.utf8

# If I add the codeset line, it works.
[l3p5xms@cpliis13 src]$ cp set_real_id.msg.es_AR set_real_id.msg.es_AR~
[l3p5xms@cpliis13 src]$ vi set_real_id.msg.es_AR
[l3p5xms@cpliis13 src]$ diff set_real_id.msg.es_AR~ set_real_id.msg.es_AR
0a1
> $ codeset=ISO-8859-1
[l3p5xms@cpliis13 src]$ gencat set_real_id.cat.es_AR set_real_id.msg.es_AR

# Even if LANG is set to ISO-8859-1, I still get errors with the old file.
[l3p5xms@cpliis13 src]$ export LANG=ISO-8859-1
[l3p5xms@cpliis13 src]$ gencat set_real_id.cat.es_AR set_real_id.msg.es_AR~
set_real_id.msg.es_AR~:25: invalid character: message ignored
set_real_id.msg.es_AR~:26: invalid character: message ignored
set_real_id.msg.es_AR~:27: invalid character: message ignored

Do I just not understand how the tool is supposed to work?
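For anyone who lands here with the same problem, the two workarounds from this report can be sketched as a small script. This is only a sketch under stated assumptions: the file names and the message text are hypothetical, it assumes glibc's gencat and iconv are installed (both calls are skipped otherwise), and the `$ codeset=` line is the directive described in this report, not something the gencat(1p) page documents.

```shell
#!/bin/sh
# Sketch only: file names and message text below are made up for illustration.
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

src="$workdir/example.msg"
tagged="$workdir/example.tagged.msg"
utf8="$workdir/example.utf8.msg"

# Build a tiny ISO-8859-1 message source; \361 is the Latin-1 byte for "ñ".
printf '$set 1\n1 Espa\361ol\n' > "$src"

# Workaround 1 (from this report): declare the encoding on the first line.
{ printf '%s\n' '$ codeset=ISO-8859-1'; cat "$src"; } > "$tagged"

# Workaround 2 (from this report): convert the source to UTF-8 instead.
if command -v iconv >/dev/null 2>&1; then
    iconv -f ISO-8859-1 -t UTF-8 "$src" > "$utf8"
fi

# Compile the tagged file if gencat is available on this system.
if command -v gencat >/dev/null 2>&1; then
    gencat "$workdir/example.cat" "$tagged" \
        || echo "gencat rejected $tagged" >&2
fi
```

Tagging the file keeps the original bytes untouched, which may matter if other tools still expect ISO-8859-1; converting with iconv avoids relying on a directive that the POSIX page does not mention.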