Description of problem:
The charset parser uses tolower(), which is not locale safe. With the Turkish locale, charsets that contain 'I' (i.e., all uppercase ISO* charset declarations) do not work.
Version-Release number of selected component (if applicable):
0.5.2-1
Steps to Reproduce:
1. export LANG=tr_TR.UTF-8
2. w3m -I ISO-8859-9 iso8859-9_encoded_document
Created attachment 160014 [details] replace tolower() with bit flip
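For readers without access to the attachment, here is a rough sketch of what a "bit flip" replacement for tolower() could look like. It is not the attached patch, and the name ascii_tolower is hypothetical. It relies on the fact that in ASCII an uppercase letter and its lowercase form differ only in bit 0x20, so the current locale never has to be consulted.

```c
/* Hypothetical helper, NOT the attached patch: lowercase one ASCII
 * character by setting bit 0x20, without consulting the current locale. */
char ascii_tolower(char c)
{
    /* Only 'A'..'Z' are changed; every other byte passes through as-is. */
    if (c >= 'A' && c <= 'Z')
        return (char)(c | 0x20);
    return c;
}
```

With a helper like this, 'I' becomes 'i' regardless of LANG, and characters such as '-' or digits in a charset name are left untouched.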
I am not familiar with the Turkish locale. Can you give me brief steps to reproduce the problem you faced?
Briefly, in the Turkish alphabet the lowercase of 'I' is 'ı' and the uppercase of 'i' is 'İ'. Therefore, assuming tolower('I') to be 'i' does not work in the Turkish locale. When parsing the charset declaration of an HTML page or arguments given to the w3m command (e.g. passing the charset declared in an email header to w3m via mailcap), if the charset is given in uppercase and contains the letter 'I' (as in ISO-xxxx-x, WINDOWS-xxxx, etc.), the parser fails.
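To make the failure mode concrete, here is a minimal sketch of a charset name comparison that folds case using ASCII rules only, so the result does not depend on LANG. The function names fold_ascii and charset_name_equal are hypothetical and are not taken from the w3m source; the actual parser may be structured differently.

```c
/* Hypothetical helper: fold 'A'..'Z' to 'a'..'z', leave other bytes alone. */
int fold_ascii(int c)
{
    return (c >= 'A' && c <= 'Z') ? c + ('a' - 'A') : c;
}

/* Hypothetical helper: returns 1 if the two charset names are equal when
 * ASCII case is ignored, and 0 otherwise. No locale function is called. */
int charset_name_equal(const char *a, const char *b)
{
    while (*a && *b) {
        if (fold_ascii((unsigned char)*a) != fold_ascii((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    /* Equal only if both strings ended at the same position. */
    return *a == *b;
}
```

Under this approach "ISO-8859-9" and "iso-8859-9" compare equal in any locale, because the capital 'I' is never handed to tolower().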
OK, I will build a new package by tomorrow.
Can you provide an ISO-8859-9 encoded document so I can verify that the patch works? Without sample test cases or screenshots I can't proceed further. Alternatively, you can attach screenshots of the problem you faced.
You can paste my name to a text file and save it in iso-8859-9. Or type euro sign and save in iso-8859-15, shouldn't matter. Here's what I get (note the iSO and ISO difference):
$ rpm -q w3m
w3m-0.5.2-1.fc7
$ LANG=en_US.UTF-8 w3m -I ISO-8859-9 -dump iso8859-9.txt
Encoding should be iso-8859-9 to see this properly: Sertaç Ö. Yıldız
$ LANG=tr_TR.UTF-8 w3m -I ISO-8859-9 -dump iso8859-9.txt
Encoding should be iso-8859-9 to see this properly: Serta? ?. Y?ld?z
$ LANG=tr_TR.UTF-8 w3m -I iSO-8859-9 -dump iso8859-9.txt
Encoding should be iso-8859-9 to see this properly: Sertaç Ö. Yıldız
[rebuild the rpm with the patch attached above]
$ rpm -q w3m
w3m-0.5.2-1.sy
$ LANG=en_US.UTF-8 w3m -I ISO-8859-9 -dump iso8859-9.txt
Encoding should be iso-8859-9 to see this properly: Sertaç Ö. Yıldız
$ LANG=tr_TR.UTF-8 w3m -I ISO-8859-9 -dump iso8859-9.txt
Encoding should be iso-8859-9 to see this properly: Sertaç Ö. Yıldız
$ LANG=tr_TR.UTF-8 w3m -I iSO-8859-9 -dump iso8859-9.txt
Encoding should be iso-8859-9 to see this properly: Sertaç Ö. Yıldız
Thanks. The tests above give me the same positive results you got. However, I saw that only iso8859-9 is affected; the other encodings, iso8859-7 and iso8859-8, gave me the same results with and without the patch applied.
Hmm, I copied some Hebrew characters from Wikipedia and it seems to work with iso8859-8 encoding:
$ rpm -q w3m
w3m-0.5.2-1.fc7
$ LANG=en_US.UTF-8 w3m -I ISO-8859-8 -dump iso8859-8.txt
Saved in iso-8859-8: בגדהוזחטיךכל
$ LANG=tr_TR.UTF-8 w3m -I ISO-8859-8 -dump iso8859-8.txt
Saved in iso-8859-8: ????????????
$ rpm -q w3m
w3m-0.5.2-1.sy
$ LANG=en_US.UTF-8 w3m -I ISO-8859-8 -dump iso8859-8.txt
Saved in iso-8859-8: בגדהוזחטיךכל
$ LANG=tr_TR.UTF-8 w3m -I ISO-8859-8 -dump iso8859-8.txt
Saved in iso-8859-8: בגדהוזחטיךכל
So is the problem related to the Turkish locale, or is it a font problem?
Sorry, I meant that I can reproduce the bug with iso8859-8, i.e., it does _not_ work with other encodings either. I get question marks with the Turkish locale and proper glyphs with other locales, because the parser cannot parse encoding names containing a capital I in the Turkish locale.
This message is a reminder that Fedora 7 is nearing the end of life. Approximately 30 (thirty) days from now Fedora will stop maintaining and issuing updates for Fedora 7. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as WONTFIX if it remains open with a Fedora 'version' of '7'. Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version prior to Fedora 7's end of life. Bug Reporter: Thank you for reporting this issue and we are sorry that we may not be able to fix it before Fedora 7 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora please change the 'version' of this bug. If you are unable to change the version, please add a comment here and someone will do it for you. Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete. If possible, it is recommended that you try the newest available Fedora distribution to see if your bug still exists. Please read the Release Notes for the newest Fedora distribution to make sure it will meet your needs: http://docs.fedoraproject.org/release-notes/ The process we are following is described here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Fedora 7 changed to end-of-life (EOL) status on June 13, 2008. Fedora 7 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. Thank you for reporting this bug and we are sorry it could not be fixed.
$ rpm -q w3m
w3m-0.5.2-10.fc9.i386
$ echo uüoösş |iconv -t iso-8859-9 | LANG=tr_TR.UTF-8 w3m -I ISO-8859-9 -dump
u?o?s?
$ echo uüoösş |iconv -t iso-8859-9 | LANG=tr_TR.UTF-8 w3m -I iso-8859-9 -dump
uüoösş
Can you check whether this bug is still present, perhaps in F11? I am revisiting this bug in order to fix it.
I don't have F11 to test the package, but AFAIK w3m-0.5.2-10 was just rebuilt from the same source. So I think this problem isn't solved in F11. Is the explanation in comment #3 clear? For further explanation of why this happens only in the Turkish locale, see <http://www.i18nguy.com/unicode/turkish-i18n.html>
w3m-0.5.2-13.fc11.i586 is still buggy.
(In reply to comment #3)
> Briefly, in Turkish alphabet lowercase of 'I' is 'ı' and uppercase of 'i' is
> 'İ'. Therefore, assuming tolower('I') to be 'i' does not work in turkish locale.
>
> When parsing the charset declaration of a html page or arguments given to w3m
> command (e.g. passing the charset declared in email header to w3m via mailcap),
> if the charset is given in uppercase and contains the letter 'I' (as in
> ISO-xxxx-x and WINDOWS-xxxx etc.) the parser fails.
This is already handled in the tr_TR locale; see the tolower and toupper pairs in the locale definition:
http://sourceware.org/cgi-bin/cvsweb.cgi/~checkout~/libc/localedata/locales/tr_TR?rev=1.18.2.3&content-type=text/plain&cvsroot=glibc
IMO, if the locale is set properly, this conversion should happen correctly.
(In reply to comment #19)
> this thing is already handled in tr_TR locale
Yes, but I haven't filed the bug against glibc locale data.
> IMO if one set locale properly this conversion should happen proper
The problem is that "tolower('I') == 'i'" is locale dependent. It is TRUE for most locales but FALSE for some. Check the output of these two (from comment #13):
echo uüoösş |iconv -t iso-8859-9 | LANG=tr_TR.UTF-8 w3m -I iso-8859-9 -dump
echo uüoösş |iconv -t iso-8859-9 | LANG=tr_TR.UTF-8 w3m -I Iso-8859-9 -dump
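The locale dependence can also be checked outside of w3m. The following is a small sketch of such a probe; the function name upper_i_folds_to_ascii_i is hypothetical, and the result for a given locale depends on the C library and on whether that locale is installed, so no particular output is claimed here.

```c
#include <ctype.h>
#include <locale.h>

/* Hypothetical probe: returns 1 if tolower('I') == 'i' under the named
 * LC_CTYPE locale, 0 if it does not, and -1 if the locale could not be
 * selected (for example because it is not installed). */
int upper_i_folds_to_ascii_i(const char *locale_name)
{
    int result;

    if (setlocale(LC_CTYPE, locale_name) == NULL)
        return -1;
    result = (tolower('I') == 'i');
    /* Reset LC_CTYPE to the "C" locale (not to whatever was set before). */
    setlocale(LC_CTYPE, "C");
    return result;
}
```

Calling it with "C" and then with "tr_TR.UTF-8" from a small main() shows whether the two locales agree on the lowercase of 'I'. The behavior reported in this bug suggests they do not on the affected systems.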
Same with w3m-0.5.2-20.fc15.i686.
There has been no upstream reply to this problem. The bug is being kept open, and the problem is still present in F17.
Fedora 17 changed to end-of-life (EOL) status on 2013-07-30. Fedora 17 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. Thank you for reporting this bug and we are sorry it could not be fixed.