Bug 118961
| Summary: | iconv -c doesn't convert whole file when illegal character is met | | |
|---|---|---|---|
| Product: | [Retired] Red Hat Linux | Reporter: | Gilles Serasset <dodecaplex> |
| Component: | glibc | Assignee: | Jakub Jelinek <jakub> |
| Status: | CLOSED DUPLICATE | QA Contact: | Brian Brock <bbrock> |
| Severity: | medium | Priority: | medium |
| Version: | 9 | CC: | fweimer |
| Hardware: | i686 | OS: | Linux |
| Doc Type: | Bug Fix | Last Closed: | 2006-02-21 19:02:06 UTC |
Created attachment 98772
data file to be used for conversion
This text file is encoded in UTF-8 and contains a UTF-8 character that is not
part of WINDOWS-1251 encoding.
*** This bug has been marked as a duplicate of 117021 ***

Changed to 'CLOSED' state since 'RESOLVED' has been deprecated.
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (Macintosh; U; PPC Mac OS X; en-us) AppleWebKit/124 (KHTML, like Gecko) Safari/125

Description of problem:
When I use the '-c' option of iconv on a rather big file containing an illegal character, the illegal character is correctly ignored, but the rest of the file is not entirely processed.

Demonstrating the bug:
Use the attached file "izvorig.xml".

    iconv -f UTF-8 -t WINDOWS-1251 izvorig.xml > izv.xml

CORRECTLY exits with the message:

    iconv: illegal input sequence at position 1

    iconv -c -f UTF-8 -t WINDOWS-1251 izvorig.xml > izv.xml

exits with no error (which is OK), but the output file is not complete.
ll izv.xml shows:

    -rw-r--r-- 1 serasset geta 8159 Mar 23 09:32 izv.xml

but the correct result file should be 92409 bytes long. The output file is truncated, as if iconv did not continue after converting a block that contained an illegal character.

Version-Release number of selected component (if applicable):
glibc-common-2.3.2-27.9.7

How reproducible:
Always

Steps to Reproduce:
1. Get the attached file (a rather big UTF-8 file with ONE illegal character) into izvorig.xml
2. tail -f izvorig.xml
3. iconv -c -f UTF-8 -t WINDOWS-1251 izvorig.xml > izv.xml
4. tail -f izv.xml should finish with </DOC>, but it does not...

Actual Results:
Step 2 shows that the original file finishes with </DOC>.
Step 4 shows that the converted file does not finish with </DOC> and is incomplete.

Expected Results:
At step 4, the converted file should finish with </DOC> and should be 92409 bytes long.

Additional info:
I reproduced this bug on 2 Red Hat 9 Linux systems.
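
For comparison, the behaviour expected from -c (drop the unconvertible character, keep converting everything after it) can be approximated without iconv. The following is only a minimal sketch: it assumes Python's cp1251 codec is an acceptable stand-in for glibc's WINDOWS-1251 table, it uses a synthetic sample rather than the attached izvorig.xml, and the function name convert_dropping_unmappable is hypothetical.

```python
def convert_dropping_unmappable(data: bytes, target: str = "cp1251") -> bytes:
    """Decode UTF-8 input and re-encode it, silently dropping characters
    that the target encoding cannot represent (the intent of iconv -c)."""
    text = data.decode("utf-8")  # raises UnicodeDecodeError on malformed UTF-8
    return text.encode(target, errors="ignore")


# Synthetic sample (an assumption, not the attached file): Cyrillic text with
# one character (U+4E2D) that has no WINDOWS-1251 representation, followed by
# enough content that a truncated conversion would be obvious.
sample = (
    "<DOC>\n"
    + "привет \u4e2d мир\n"
    + "строка текста\n" * 5000
    + "</DOC>\n"
).encode("utf-8")

converted = convert_dropping_unmappable(sample)

# A complete conversion keeps the closing tag and loses exactly one character.
print(len(converted))
print(converted.endswith(b"</DOC>\n"))
```

Running the same function over the bytes of izvorig.xml and comparing the result's length and last line against izv.xml would be one way to confirm the truncation independently of the glibc converter.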