Bug 2084575 - Spell checker dictionaries and hyphenation rules are not properly installed on pt_BR locales compared to pt locales
Summary: Spell checker dictionaries and hyphenation rules are not properly installed ...
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: langpacks
Version: 36
Hardware: All
OS: Linux
Priority: unspecified
Severity: low
Target Milestone: ---
Assignee: Parag Nemade
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-05-12 13:00 UTC by Mateus Rodrigues Costa
Modified: 2022-07-01 12:47 UTC

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2022-07-01 12:47:33 UTC
Type: Bug
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker FC-448 0 None None None 2022-05-13 13:34:17 UTC

Internal Links: 2084587

Description Mateus Rodrigues Costa 2022-05-12 13:00:13 UTC
Description of problem:

This is basically what the title says: if you install langpacks-pt you will most likely get hunspell-pt and hyphen-pt installed as well, but if you install langpacks-pt_BR the same won't happen.

(Do note that I'm using Fedora Silverblue 36)

Version-Release number of selected component (if applicable):

langpacks-pt-3.0-21.fc36.noarch
langpacks-pt_BR-3.0-21.fc36.noarch

How reproducible:

Should happen most of the time. However, you generally won't notice it on Silverblue/Workstation unless you install the pt_BR langpacks manually because, due to a bug, GNOME Software 42.0 and 42.1 (and apparently also 41.x) will always install langpacks-pt on pt_BR locales.

Steps to Reproduce:

1. Check dependencies on langpacks-pt installation
2. Compare them to the ones on langpacks-pt_BR installation

Actual results:

hunspell-pt and hyphen-pt won't be installed when installing langpacks-pt_BR but will when installing langpacks-pt instead.

Expected results:

hunspell-pt and hyphen-pt should be installed when installing langpacks-pt_BR.

Additional info:

It seems to be because of the weak dependencies of both packages, for example:

For langpacks-pt:

```
⬢[mateusrc@toolbox ~]$ dnf repoquery --whatsupplements langpacks-pt
Last metadata expiration check: 0:57:05 ago on Thu May 12 08:38:30 2022.
cockatrice-langpack-pt-0:2.8.0-2.fc35.noarch
guayadeque-langpack-pt-0:0.4.7-0.32.20210917git5eed2ee.fc35.noarch
guayadeque-langpack-pt-0:0.4.7-0.34.20211201git2aa235b.fc35.noarch
hunspell-pt-0:0.20130125-17.fc35.noarch
hyphen-pt-0:0.20021021-24.fc35.noarch
kde-l10n-pt-0:17.08.3-11.fc35.noarch
mythes-pt-0:0.20060817-26.fc35.noarch
nqc-doc-pt-0:3.1.7-29.fc35.x86_64
tesseract-langpack-por-0:4.1.0-2.fc35.noarch
```

For langpacks-pt_BR:

```
⬢[mateusrc@toolbox ~]$ dnf repoquery --whatsupplements langpacks-pt_BR
Last metadata expiration check: 0:58:05 ago on Thu May 12 08:38:30 2022.
cockatrice-langpack-pt_BR-0:2.8.0-2.fc35.noarch
gimp-help-pt_BR-0:2.10.0-7.fc35.noarch
guayadeque-langpack-pt_BR-0:0.4.7-0.32.20210917git5eed2ee.fc35.noarch
guayadeque-langpack-pt_BR-0:0.4.7-0.34.20211201git2aa235b.fc35.noarch
kde-l10n-pt_BR-0:17.08.3-11.fc35.noarch
libreoffice-langpack-pt-BR-1:7.2.1.2-1.fc35.x86_64
libreoffice-langpack-pt-BR-1:7.2.6.2-1.fc35.x86_64
man-pages-pt_BR-0:4.10.0-2.fc35.noarch
shotcut-langpack-pt_BR-0:21.03.21-3.fc35.noarch
```

Looking more closely at the differences, the actual problem seems to be that langpacks-pt_BR only picks up as dependencies the packages that are explicitly marked as pt_BR, while langpacks-pt picks up both the pt packages and, in one apparent instance, a package for Portuguese in general (the tesseract one).

So, if for whatever reason the pt_PT and pt_BR localization, dictionary, or similar data is merged into a single package, langpacks-pt will install it but langpacks-pt_BR won't.
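
The comparison described above can be sketched in a few lines of Python. This is only an illustration, not a tool that ships with langpacks: the two lists are the package names copied from the `dnf repoquery --whatsupplements` outputs quoted in this report (versions dropped, duplicates collapsed), and the helper names `base_name` and `missing_for_pt_br` are hypothetical.

```python
import re

# Package names copied from the two repoquery outputs quoted in this report
# (version strings dropped, duplicate builds collapsed).
PT = [
    "cockatrice-langpack-pt",
    "guayadeque-langpack-pt",
    "hunspell-pt",
    "hyphen-pt",
    "kde-l10n-pt",
    "mythes-pt",
    "nqc-doc-pt",
    "tesseract-langpack-por",
]
PT_BR = [
    "cockatrice-langpack-pt_BR",
    "gimp-help-pt_BR",
    "guayadeque-langpack-pt_BR",
    "kde-l10n-pt_BR",
    "libreoffice-langpack-pt-BR",
    "man-pages-pt_BR",
    "shotcut-langpack-pt_BR",
]


def base_name(package: str) -> str:
    """Strip a trailing Portuguese suffix (-pt, -por, -pt_BR or -pt-BR)."""
    return re.sub(r"-(?:pt[_-]BR|por|pt)$", "", package)


def missing_for_pt_br(pt, pt_br):
    """Base names that supplement langpacks-pt but not langpacks-pt_BR."""
    return sorted({base_name(p) for p in pt} - {base_name(p) for p in pt_br})


if __name__ == "__main__":
    # Assumption: a base name present only on the pt side is a candidate
    # for a package that merges several Portuguese variants.
    for name in missing_for_pt_br(PT, PT_BR):
        print(name)
```

With the lists above, the candidates are hunspell, hyphen, mythes, nqc-doc and tesseract-langpack. Whether each of them really contains pt_BR data still has to be checked by hand.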

Comment 1 Parag Nemade 2022-05-12 16:39:05 UTC
I think we made this distinction to make sure users of one locale do not get another locale's packages.
I see you are requesting that installing langpacks-pt_BR should also install the hunspell-pt and hyphen-pt packages.
Does that mean there are no differences between pt_BR and pt_PT when it comes to writing words?
I have seen translators differentiate between pt_PT and pt_BR translations.

Comment 2 Mateus Rodrigues Costa 2022-05-12 18:32:36 UTC
(In reply to Parag Nemade from comment #1)
> I think we made this distinction to make sure users of one locale do not
> get another locale's packages.
> I see you are requesting that installing langpacks-pt_BR should also
> install the hunspell-pt and hyphen-pt packages.
> Does that mean there are no differences between pt_BR and pt_PT when it
> comes to writing words?
> I have seen translators differentiate between pt_PT and pt_BR
> translations.

So, here's the thing: they are different because they are dialects of the same language. Brazilian Portuguese differs from European Portuguese in the same way that American English differs from British English. Also, pt_BR and pt_PT differ from each other far more than the en dialects do, because they have grown apart for a very long time and pt_BR changed a lot in that time, so the difference is much bigger than between en_US and en_GB.

Of course, speakers of both mostly understand each other, with slightly bigger misunderstandings than between en dialects.

In this case specifically, from what I understood, both hunspell-pt and hyphen-pt are made to support both pt_PT and pt_BR.

Here, for example, are their contents right now:

```
⬢[mateusrc@toolbox ~]$ dnf repoquery -l hunspell-pt

/usr/share/doc/hunspell-pt
/usr/share/doc/hunspell-pt/README_pt_BR.txt
/usr/share/doc/hunspell-pt/README_pt_PT.txt
/usr/share/licenses/hunspell-pt
/usr/share/licenses/hunspell-pt/COPYING
/usr/share/myspell/pt_AO.aff
/usr/share/myspell/pt_AO.dic
/usr/share/myspell/pt_BR.aff
/usr/share/myspell/pt_BR.dic
/usr/share/myspell/pt_PT.aff
/usr/share/myspell/pt_PT.dic
```

```
⬢[mateusrc@toolbox ~]$ dnf repoquery -l hyphen-pt

/usr/share/doc/hyphen-pt
/usr/share/doc/hyphen-pt/README_hyph_pt_PT.txt
/usr/share/hyphen/hyph_pt_AO.dic
/usr/share/hyphen/hyph_pt_BR.dic
/usr/share/hyphen/hyph_pt_PT.dic
```

You might also notice the pt_AO locale. That's a third dialect, the Portuguese spoken in Angola, but for the most part it follows European Portuguese, so it's a symlink to pt_PT in both packages.

On hunspell-pt, pt_BR and pt_PT are separate dictionaries but, on hyphen-pt, pt_BR is also a symlink to pt_PT. I do not know if that's due to having no big hyphenation differences (I personally am not aware of any), but I was also going to open an issue about how its upstream is basically dead too, so...
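
To check which of these files are symlinks on an installed system, a small sketch like the following could be used. It assumes the packages are installed and that the dictionaries live in the directories shown in the file listings above; `symlink_map` is a hypothetical helper name, not part of any package.

```python
import os


def symlink_map(directory: str) -> dict:
    """Map each symlink in `directory` to the target it points at."""
    links = {}
    for entry in sorted(os.listdir(directory)):
        path = os.path.join(directory, entry)
        if os.path.islink(path):
            links[entry] = os.readlink(path)
    return links


if __name__ == "__main__":
    # Assumption: these are the install locations from the listings above.
    for directory in ("/usr/share/myspell", "/usr/share/hyphen"):
        if not os.path.isdir(directory):
            continue  # package not installed here
        for name, target in symlink_map(directory).items():
            if "pt_" in name:
                print(f"{directory}/{name} -> {target}")
```

Regular files are skipped, so only the locales that reuse another locale's data are printed.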

Well, you probably noticed that the three locales are merged on those packages.

You could either include hunspell-pt along with langpacks-pt_BR or split the pt_BR dictionary into its own package and use that new package instead (I checked the hunspell-pt spec and the dictionaries come from two different sources anyway). Something similar could be done for hyphen-pt, but everything there is a symlink to the pt_PT file, so I'm not sure about that one.

Also, there might be other situations in which pt_PT and pt_BR support are merged in the same package, most likely `tesseract-langpack-por` has the support for all pt locales merged into a single one.

Also, mythes-pt has code in its spec to symlink the pt_AO and pt_BR files to the pt_PT ones, but it currently doesn't work due to a copy-paste error. And I checked the PDF inside nqc-doc-pt and it's in pt_BR, so it would make more sense to install it for pt_BR than for pt_PT (although pt_PT speakers could understand it too).

Do note also that, while I am asking this specifically for pt_BR, this could also happen to a few other locales with regional variants if someone wasn't careful enough and nobody noticed the problem.

Comment 3 Parag Nemade 2022-05-13 13:29:09 UTC
Yes, when the individual language support packages do not follow any locale, this becomes an issue for the langpacks package.

For the hunspell part of this issue, hunspell-pt can be split, or new packages can be added, so that there are individual packages such as hunspell-pt, hunspell-pt_BR and, if we really have a pt_AO locale, hunspell-pt_AO.
The langpacks concept is based on enabling languages for which anaconda translations and/or a glibc locale are available.

Also, it depends on whether there is an upstream that can provide hyphen-pt and hyphen-pt_BR. It may be that only one of them is available, in which case we can pull it in for the respective langpacks-pt or langpacks-pt_BR installation.
The same applies to other language packages such as mythes and tesseract.

For tesseract-langpack-por, I think that when I first added Supplements: there, I checked which language it belongs to, but it has grown a lot since then. I may check with its maintainer or upstream in the next few days about which language they mean, just pt or pt_BR.

I actually maintain only the langpacks package, not the other language packages, so you can report a bug against nqc-doc-pt to fix any issues :)

I think the hunspell* packages provide additional symlinks for other languages whose glibc locale is not available, so it makes no sense to generate new langpacks for those other languages.

Comment 4 Parag Nemade 2022-06-24 06:31:38 UTC
I believe this issue is now fixed in rawhide, considering that installing langpacks-pt_BR should not bring in any pt packages. See the current output:

[test@fedora ~]$ sudo dnf repoquery --whatsupplements langpacks-pt
Last metadata expiration check: 0:02:54 ago on Friday 24 June 2022 11:44:52 AM.
cockatrice-langpack-pt-0:2.8.0-5.fc36.noarch
guayadeque-langpack-pt-0:0.4.7-0.36.20220618gitd947179.fc37.noarch
hunspell-pt-0:0.20131030-6.fc37.noarch
hyphen-pt-0:0.20140727-2.fc37.noarch
kde-l10n-pt-0:17.08.3-12.fc36.noarch
mythes-pt-0:0.20060817-27.fc36.noarch
nqc-doc-pt-0:3.1.7-30.fc36.x86_64
tesseract-langpack-por-0:4.1.0-3.fc36.noarch

[test@fedora ~]$ sudo dnf repoquery --whatsupplements langpacks-pt_BR
Last metadata expiration check: 0:02:59 ago on Friday 24 June 2022 11:44:52 AM.
cockatrice-langpack-pt_BR-0:2.8.0-5.fc36.noarch
gimp-help-pt_BR-0:2.10.0-8.fc36.noarch
guayadeque-langpack-pt_BR-0:0.4.7-0.36.20220618gitd947179.fc37.noarch
hunspell-pt-BR-0:0.20131030-6.fc37.noarch
hyphen-pt-BR-0:0.20140727-2.fc37.noarch
kde-l10n-pt_BR-0:17.08.3-12.fc36.noarch
libreoffice-langpack-pt-BR-1:7.3.4.2-2.fc37.x86_64
libreoffice-langpack-pt-BR-1:7.3.4.2-3.fc37.x86_64
man-pages-pt_BR-0:4.14.0-1.fc37.noarch

Comment 5 Parag Nemade 2022-07-01 12:47:33 UTC
I will close this issue now. If it is still not fixed, please reopen this bug.
