Description of problem:
When the locale isn't UTF-8, the expand and unexpand tools don't correctly handle files that have a UTF-8 BOM header. With expand, some spaces are missing, and with unexpand, there are some extra spaces, so the results aren't correct.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Write a UTF-8 file (file_spaces) having a BOM header with the following contents:
    é   à   ç
    é   à   ç
Two lines with four spaces + 'é' + three spaces + 'à' + three spaces + 'ç' + EOL
It's important to test with accented characters. With ASCII characters, only the first line is incorrect and the second one is OK. With accented characters, all lines are incorrect.
NB: To write files with a BOM header, I'm using the scite editor.
2. LANG=C unexpand -t4 file_spaces > file_unexpand
3. Edit the two files (e.g. with gedit or scite) and see the problem in file_unexpand
4. Write a UTF-8 file (file_tabs) having a BOM header with the following contents:
é à ç
é à ç
Two lines with one tab + 'é' + one tab + 'à' + one tab + 'ç' + EOL
5. LANG=C expand -t4 file_tabs > file_expand
6. Edit the two files (e.g. with gedit or scite) and see the problem in file_expand
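For anyone without an editor that writes a BOM, the two test files described in the steps above can probably be generated from the shell. This is only a sketch: it assumes a `printf` that accepts octal escapes in its format string and a `mktemp -d` command, and the scratch directory is an addition that is not part of the original steps. Note that if `LC_ALL` is set in your environment it overrides `LANG`, so you may need to unset it first.

```shell
# Work in a scratch directory so nothing in the current directory is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# UTF-8 BOM is the byte sequence EF BB BF (octal 357 273 277).
# é = C3 A9 (303 251), à = C3 A0 (303 240), ç = C3 A7 (303 247).
bom='\357\273\277'
e_acute='\303\251'
a_grave='\303\240'
c_cedilla='\303\247'

# file_spaces: two lines of 4 spaces + é + 3 spaces + à + 3 spaces + ç + EOL
line_spaces="    ${e_acute}   ${a_grave}   ${c_cedilla}\n"
printf "${bom}${line_spaces}${line_spaces}" > file_spaces

# file_tabs: two lines of tab + é + tab + à + tab + ç + EOL
line_tabs="\t${e_acute}\t${a_grave}\t${c_cedilla}\n"
printf "${bom}${line_tabs}${line_tabs}" > file_tabs

# Same commands as steps 2 and 5 of the report.
LANG=C unexpand -t4 file_spaces > file_unexpand
LANG=C expand -t4 file_tabs > file_expand

# Dump the results byte by byte instead of opening them in an editor.
od -c file_unexpand
od -c file_expand
```

Comparing the `od -c` dumps against file_tabs and file_spaces should show whether the tab and space counts match what is expected.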
Thanks for the report. Can you please attach the file_tabs file to this bugzilla so it is easier for everyone? E.g. scite is in neither EPEL nor RHEL, so I would have to compile it from source...
From the description you provided, for UTF-8 locales, everything is ok, correct? With C locales, even multibyte characters are handled byte by byte and a different code path is used (Fedora has a downstream i18n patch which is active in multibyte locales).
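If columns are counted per byte in the C locale, as the comment above suggests, that would plausibly account for the reported symptoms: the BOM would shift the first line by three columns, and each accented letter would count as one column more than it displays. This is an interpretation rather than something confirmed in this thread. The byte counts themselves can be checked with a small sketch (the `byte_len` helper is hypothetical, introduced only for illustration):

```shell
# Count the bytes produced by a printf format string.
# byte_len is a hypothetical helper, not part of coreutils.
byte_len() { printf "$1" | wc -c | tr -d ' '; }

bom_bytes=$(byte_len '\357\273\277')   # UTF-8 BOM: EF BB BF
e_bytes=$(byte_len '\303\251')         # é: C3 A9
a_bytes=$(byte_len 'a')                # plain ASCII letter for comparison

echo "BOM: $bom_bytes bytes, é: $e_bytes bytes, a: $a_bytes byte"
# prints "BOM: 3 bytes, é: 2 bytes, a: 1 byte"
```

This would also be consistent with the observation that ASCII-only content goes wrong on the first line only, since the BOM appears just once at the start of the file.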
Created attachment 951805 [details]
UTF-8 file with BOM header for testing unexpand
Created attachment 951807 [details]
UTF-8 file with BOM header for testing expand
From the description you provided, for UTF-8 locales, everything is ok, correct?
This message is a reminder that Fedora 20 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 20. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as EOL if it remains open with a Fedora 'version'
Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version'
to a later Fedora version.
Thank you for reporting this issue and we are sorry that we were not
able to fix it before Fedora 20 is end of life. If you would still like
to see this bug fixed and are able to reproduce it against a later version
of Fedora, you are encouraged to change the 'version' to a later Fedora
version before this bug is closed as described in the policy above.
Although we aim to fix as many bugs as possible during every release's
lifetime, sometimes those efforts are overtaken by events. Often a
more recent Fedora release includes newer upstream software that fixes
bugs or makes them obsolete.
Just as a side note: Ondrej is working on a new implementation of i18n and he expects to fix this issue there. A fix is not planned for the old implementation, though. Reassigning to him for now.
This bug appears to have been reported against 'rawhide' during the Fedora 23 development cycle.
Changing version to '23'.
(As we did not run this process for some time, it could affect also pre-Fedora 23 development
cycle bugs. We are very sorry. It will help us with cleanup during Fedora 23 End Of Life. Thank you.)
More information and reason for this action is here:
coreutils-8.25-7.fc24 has been submitted as an update to Fedora 24. https://bodhi.fedoraproject.org/updates/FEDORA-2016-75adc7da4f
coreutils-8.25-7.fc24 has been pushed to the Fedora 24 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-75adc7da4f
coreutils-8.25-7.fc24 has been pushed to the Fedora 24 stable repository. If problems still persist, please make note of it in this bug report.