118961 – iconv -c doesn't convert whole file when illegal character is met

Summary: iconv -c doesn't convert whole file when illegal character is met

Keywords:
Status:	CLOSED DUPLICATE of bug 117021
Alias:	None
Product:	Red Hat Linux
Classification:	Retired
Component:	glibc
Sub Component:
Version:	9
Hardware:	i686
OS:	Linux
Priority:	medium
Severity:	medium
Target Milestone:	---
Assignee:	Jakub Jelinek
QA Contact:	Brian Brock
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2004-03-23 08:48 UTC by Gilles Serasset
Modified:	2016-11-24 15:23 UTC
CC List:	1 user
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2006-02-21 19:02:06 UTC
Embargoed:

Attachments
data file to be used for conversion (155.41 KB, text/plain) 2004-03-23 09:32 UTC, Gilles Serasset

Description Gilles Serasset 2004-03-23 08:48:22 UTC

From Bugzilla Helper:
User-Agent: Mozilla/5.0 (Macintosh; U; PPC Mac OS X; en-us) AppleWebKit/124 (KHTML, like Gecko) Safari/125

Description of problem:
When I use the '-c' option in iconv on a rather big file containing an illegal character, the illegal character is correctly ignored, but the rest of the file is not entirely processed.

Demonstrating the bug:

Use the attached file "izvorig.xml"

iconv -f UTF-8 -t WINDOWS-1251 izvorig.xml > izv.xml
This CORRECTLY exits with the message: iconv: illegal input sequence at position 1

iconv -c -f UTF-8 -t WINDOWS-1251 izvorig.xml > izv.xml
This exits with no error (which is OK), but the output file is not complete.

ll izv.xml
shows: -rw-r--r-- 1 serasset geta 8159 Mar 23 09:32 izv.xml

but the correct result file should be 92409 bytes long.

The output file is truncated as if iconv did not continue after converting a block that contained an illegal character.

Version-Release number of selected component (if applicable):
glibc-common-2.3.2-27.9.7

How reproducible:
Always

Steps to Reproduce:
1. Get the attached file (a rather big UTF-8 file with ONE illegal character) into izvorig.xml
2. tail -f izvorig.xml
3. iconv -c -f UTF-8 -t WINDOWS-1251 izvorig.xml > izv.xml
4. tail -f izv.xml should finish with </DOC>, but it does not...

Actual Results: Step 2 shows that the original file finishes with </DOC>.
Step 4 shows that the converted file does not finish with </DOC> and is incomplete.

Expected Results: At step 4, the converted file should finish with </DOC> and should be 92409 bytes long.
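
The expected size can be cross-checked independently of iconv. The following is only a rough sketch, not part of the original report: it assumes that `-c` is meant to drop characters that cannot be represented in the target encoding and keep converting the rest, and that Python's `cp1251` codec maps the file's characters the same way as iconv's WINDOWS-1251. The function names `expected_conversion` and `is_truncated` are made up for illustration.

```python
import os
import tempfile


def expected_conversion(utf8_bytes: bytes) -> bytes:
    """Reference result for `iconv -c -f UTF-8 -t WINDOWS-1251`.

    Assumption: -c drops characters that have no WINDOWS-1251
    representation and continues with the rest of the input.
    """
    text = utf8_bytes.decode("utf-8")
    return text.encode("cp1251", errors="ignore")


def is_truncated(source_path: str, converted_path: str) -> bool:
    """Return True if the converted file is shorter than the reference."""
    with open(source_path, "rb") as src:
        reference = expected_conversion(src.read())
    return os.path.getsize(converted_path) < len(reference)


if __name__ == "__main__":
    # Synthetic stand-in for izvorig.xml: Cyrillic text plus ONE character
    # (U+4E2D) that WINDOWS-1251 cannot represent.
    sample = (
        "<DOC>\n" + "привет \u4e2d\n" + "строка\n" * 5000 + "</DOC>\n"
    ).encode("utf-8")
    reference = expected_conversion(sample)

    with tempfile.TemporaryDirectory() as tmp:
        source = os.path.join(tmp, "izvorig.xml")
        converted = os.path.join(tmp, "izv.xml")
        with open(source, "wb") as f:
            f.write(sample)
        # Simulate the reported symptom: only the first 8159 bytes survive.
        with open(converted, "wb") as f:
            f.write(reference[:8159])
        print("reference size:", len(reference))
        print("truncated:", is_truncated(source, converted))
```

To check a real run, call `is_truncated("izvorig.xml", "izv.xml")` after step 3. A result of `True` would match the truncation described above.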

Additional info:

I reproduced this bug on two Red Hat Linux 9 systems.

Comment 1 Gilles Serasset 2004-03-23 09:32:36 UTC

Created attachment 98772
data file to be used for conversion

This text file is encoded in UTF-8 and contains a UTF-8 character that is not
part of WINDOWS-1251 encoding.

Comment 2 Jakub Jelinek 2004-03-23 09:51:13 UTC


*** This bug has been marked as a duplicate of 117021 ***

Comment 3 Red Hat Bugzilla 2006-02-21 19:02:06 UTC

Changed to 'CLOSED' state since 'RESOLVED' has been deprecated.
