Bug 2084575
| Summary: | Spell checker dictionaries and hyphenation rules are not properly installed on pt_BR locales compared to pt locales | ||
|---|---|---|---|
| Product: | [Fedora] Fedora | Reporter: | Mateus Rodrigues Costa <mateusrodcosta> |
| Component: | langpacks | Assignee: | Parag Nemade <pnemade> |
| Status: | CLOSED RAWHIDE | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
| Severity: | low | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 36 | CC: | mateusrodcosta, pnemade |
| Target Milestone: | --- | Flags: | pnemade: mirror+ |
| Target Release: | --- | ||
| Hardware: | All | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2022-07-01 12:47:33 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
|
Description
Mateus Rodrigues Costa
2022-05-12 13:00:13 UTC
I think we made this distinction so that users of one locale do not get another locale's packages.

I see you are requesting that when langpacks-pt_BR is installed, it should also install the hunspell-pt and hyphen-pt packages.

Does that mean there are no differences between pt_BR and pt_PT when it comes to writing words?

I have seen translators differentiate between pt_PT and pt_BR translations.

(In reply to Parag Nemade from comment #1)
> I think we made this distinction so that users of one locale do not get
> another locale's packages.
> I see you are requesting that when langpacks-pt_BR is installed, it should
> also install the hunspell-pt and hyphen-pt packages.
> Does that mean there are no differences between pt_BR and pt_PT when it
> comes to writing words?
> I have seen translators differentiate between pt_PT and pt_BR translations.

So, here's the thing: they are different because they are dialects of the same language. Brazilian Portuguese differs from European Portuguese in the same way that American English differs from British English.

Also, pt_BR and pt_PT differ from each other far more than the en dialects do, because they have been growing apart for a very long time and pt_BR changed a lot in that time, so the difference is much bigger than between en_US and en_GB. Of course, speakers of both mostly understand each other, with slightly bigger misunderstandings than between en dialects.

In this case specifically, from what I understood, both hunspell-pt and hyphen-pt are made to support both pt_PT and pt_BR.
Here, for example, are their contents right now:

```
⬢[mateusrc@toolbox ~]$ dnf repoquery -l hunspell-pt
/usr/share/doc/hunspell-pt
/usr/share/doc/hunspell-pt/README_pt_BR.txt
/usr/share/doc/hunspell-pt/README_pt_PT.txt
/usr/share/licenses/hunspell-pt
/usr/share/licenses/hunspell-pt/COPYING
/usr/share/myspell/pt_AO.aff
/usr/share/myspell/pt_AO.dic
/usr/share/myspell/pt_BR.aff
/usr/share/myspell/pt_BR.dic
/usr/share/myspell/pt_PT.aff
/usr/share/myspell/pt_PT.dic
```

```
⬢[mateusrc@toolbox ~]$ dnf repoquery -l hyphen-pt
/usr/share/doc/hyphen-pt
/usr/share/doc/hyphen-pt/README_hyph_pt_PT.txt
/usr/share/hyphen/hyph_pt_AO.dic
/usr/share/hyphen/hyph_pt_BR.dic
/usr/share/hyphen/hyph_pt_PT.dic
```

You might also notice the pt_AO locale. That's a third dialect, the Portuguese spoken in Angola, but for the most part it follows European Portuguese, so it's a symlink to pt_PT in both packages. In hunspell-pt, pt_BR and pt_PT are separate dictionaries but, in hyphen-pt, pt_BR is also a symlink to pt_PT. I do not know if that's due to there being no big hyphenation differences (I personally am not aware of any), but I was also going to open an issue about how its upstream is basically dead too, so...

Well, you probably noticed that the three locales are merged in those packages. You could either include hunspell-pt along with langpacks-pt_BR or split the pt_BR dictionary into its own package and use that new package instead (I checked the hunspell-pt spec and the dictionaries come from two different sources anyway). Something similar could be done for hyphen-pt, but everything is a symlink to the pt_PT one, so I'm not sure here.

Also, there might be other situations in which pt_PT and pt_BR support are merged in the same package; most likely `tesseract-langpack-por` has the support for all pt locales merged into a single one.

Also, mythes-pt has code in its spec to symlink the pt_AO and pt_BR files to the pt_PT ones, but it currently doesn't work due to a copy-pasting error.
And I checked the PDF inside nqc-doc-pt and it's in pt_BR, so it would make more sense to install it for pt_BR than for pt_PT (although pt_PT speakers could understand it too).

Do note also that, while I am asking this specifically for pt_BR, this could also happen to a few other locales with regional variants if someone wasn't careful enough and nobody noticed the problem.

Yes, when the individual language support packages do not follow any locale, this becomes an issue for the langpacks package.

For this issue related to the hunspell package, hunspell-pt can be split, or a new package can be added, so that there are individual packages like hunspell-pt, hunspell-pt_BR and, if we really have a pt_AO locale, a hunspell-pt_AO package. The langpacks concept is based on enabling languages where anaconda translations and/or a glibc locale are available.

Also, it depends on whether there is any upstream that can provide hyphen-pt and hyphen-pt_BR. It may be the case that only one of them is available, so we can pull it into the respective langpacks-pt or langpacks-pt_BR installation. The same applies to other language packages like mythes and tesseract.

For tesseract-langpack-por, I think when I first added Supplements: there, I checked which language it belongs to, but it has grown a lot since. I may check with its maintainer or upstream in the next few days about which language they mean, just pt or pt_BR.

I actually maintain only the langpacks package and not the other language packages, so you can report a bug against nqc-doc-pt to fix any issues :)

I think the hunspell* packages provide more symlinks for other languages whose glibc locale is not available, so it makes no sense to generate new langpacks for those other languages.

I believe this issue is fixed now in rawhide, considering that installing langpacks-pt_BR should not bring in any pt packages.
See now:

    [test@fedora ~]$ sudo dnf repoquery --whatsupplements langpacks-pt
    Last metadata expiration check: 0:02:54 ago on Friday 24 June 2022 11:44:52 AM.
    cockatrice-langpack-pt-0:2.8.0-5.fc36.noarch
    guayadeque-langpack-pt-0:0.4.7-0.36.20220618gitd947179.fc37.noarch
    hunspell-pt-0:0.20131030-6.fc37.noarch
    hyphen-pt-0:0.20140727-2.fc37.noarch
    kde-l10n-pt-0:17.08.3-12.fc36.noarch
    mythes-pt-0:0.20060817-27.fc36.noarch
    nqc-doc-pt-0:3.1.7-30.fc36.x86_64
    tesseract-langpack-por-0:4.1.0-3.fc36.noarch

    [test@fedora ~]$ sudo dnf repoquery --whatsupplements langpacks-pt_BR
    Last metadata expiration check: 0:02:59 ago on Friday 24 June 2022 11:44:52 AM.
    cockatrice-langpack-pt_BR-0:2.8.0-5.fc36.noarch
    gimp-help-pt_BR-0:2.10.0-8.fc36.noarch
    guayadeque-langpack-pt_BR-0:0.4.7-0.36.20220618gitd947179.fc37.noarch
    hunspell-pt-BR-0:0.20131030-6.fc37.noarch
    hyphen-pt-BR-0:0.20140727-2.fc37.noarch
    kde-l10n-pt_BR-0:17.08.3-12.fc36.noarch
    libreoffice-langpack-pt-BR-1:7.3.4.2-2.fc37.x86_64
    libreoffice-langpack-pt-BR-1:7.3.4.2-3.fc37.x86_64
    man-pages-pt_BR-0:4.14.0-1.fc37.noarch

I will close this issue now. If it is still not fixed, please reopen this bug.
|
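To re-check this on another release or for another locale, the `--whatsupplements` output can be narrowed down to just the spelling, hyphenation and thesaurus packages. The following is only a sketch: `writing_aids` is a hypothetical helper name, and the `hunspell-`, `hyphen-` and `mythes-` prefixes are assumed from the package names that appear in this report.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of dnf or langpacks): reads package names on
# stdin, one per line, and keeps only those whose name starts with hunspell-,
# hyphen- or mythes-. Other lines, such as the metadata expiration notice,
# are dropped.
writing_aids() {
    # "|| true" keeps the exit status at 0 when nothing matches.
    grep -E '^(hunspell|hyphen|mythes)-' || true
}

# Usage (needs dnf and repository access, so it is left commented out):
#   dnf -q repoquery --whatsupplements langpacks-pt_BR | writing_aids
```

Fed the langpacks-pt_BR listing from the comment above, this would keep only the hunspell-pt-BR and hyphen-pt-BR lines, which makes it easy to see at a glance whether a langpack pulls in its own dictionaries.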