Tested on Fedora-Workstation-netinst-x86_64-25-Alpha-1.2.iso

[mfabian@Fedora-Workstation-netinst-x86_6 ~]$ rpm -q hunspell
hunspell-1.4.1-1.fc25.x86_64
[mfabian@Fedora-Workstation-netinst-x86_6 ~]$ rpm -q hunspell-de
hunspell-de-0.20160407-1.fc25.noarch
[mfabian@Fedora-Workstation-netinst-x86_6 ~]$ cat /etc/fedora-release
Fedora release 25 (Twenty Five)
[mfabian@Fedora-Workstation-netinst-x86_6 ~]$

https://github.com/hunspell/hunspell/blob/master/README says:

“unmunch: list all recognized words of a MySpell dictionary”

When using unmunch with the German dictionary, I get stuff like:

[mfabian@Fedora-Workstation-netinst-x86_6 ~]$ unmunch /usr/share/myspell/de_DE.dic /usr/share/myspell/de_DE.aff 2>/dev/null | grep -a '^Agent$' -A 40
Agent
Agentin
Agentinnen
AgentIn
AgentInnen
Agenten
-agentin
-agentinnen
-agentIn
-agentInnen
-agenten
-agent
Agenten
Agenten0/xoc|
Agenten-/zocf|
Agenten-/cz|
Agentinnen/xyoc|
AgentInnen/xyoc|
Agentinnen/xyocf|
AgentInnen/xyocf|
Agentinnen-/cz|
AgentInnen-/cz|
-/coyf|Agenten0/xoc|
-/coyf|Agenten-/zocf|
-/coyf|Agenten-/cz|
-/coyf|Agentinnen/xyoc|
-/coyf|AgentInnen/xyoc|
-/coyf|Agentinnen/xyocf|
-/coyf|AgentInnen/xyocf|
-/coyf|Agentinnen-/cz|
-/coyf|AgentInnen-/cz|
-/coyf|Agenten
Agentur
Agenturen
Agentur0/xoc|
Agentur-/zocf|
Agentur-/cz|
-/coyf|Agenturen
-agenturen
-/coyf|Agentur0/xoc|
-agentur0/xoc|
[mfabian@Fedora-Workstation-netinst-x86_6 ~]$

This looks OK:

Agent
Agentin
Agentinnen
AgentIn
AgentInnen
Agenten

I am not sure about this:

-agentin
-agentinnen
-agentIn
-agentInnen
-agenten
-agent

And what is this?:

Agenten0/xoc|
Agenten-/zocf|
Agenten-/cz|
Agentinnen/xyoc|
AgentInnen/xyoc|
Agentinnen/xyocf|
AgentInnen/xyocf|
Agentinnen-/cz|
AgentInnen-/cz|
-/coyf|Agenten0/xoc|
-/coyf|Agenten-/zocf|
-/coyf|Agenten-/cz|
-/coyf|Agentinnen/xyoc|
-/coyf|AgentInnen/xyoc|
-/coyf|Agentinnen/xyocf|
-/coyf|AgentInnen/xyocf|
-/coyf|Agentinnen-/cz|
-/coyf|AgentInnen-/cz|
-/coyf|Agenten

These do not look like “recognized words of a MySpell dictionary”.
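As a stopgap for anyone who only needs the plain entries from a listing like the one above, the lines that still carry affix-file debris can be filtered out. The following is a minimal Python sketch under a stated assumption: any line containing "/" or "|" is treated as debris, which is a heuristic drawn from the output in this report and not documented unmunch behaviour. The function names are hypothetical. It does not recover the word forms that unmunch failed to generate.

```python
def looks_like_word(line):
    """Heuristic: a usable entry is non-empty and has no '/' or '|' debris."""
    return bool(line) and "/" not in line and "|" not in line


def filter_unmunch_output(lines):
    """Return the plausible words from unmunch output, in order, without duplicates."""
    seen = set()
    result = []
    for raw in lines:
        line = raw.strip()
        if looks_like_word(line) and line not in seen:
            seen.add(line)
            result.append(line)
    return result


# Example with a few lines taken from the listing in this report:
sample = ["Agent", "Agentin", "Agenten0/xoc|", "-/coyf|Agenten", "Agenten", "Agenten"]
filtered = filter_unmunch_output(sample)
```

Entries such as -agentin pass the filter unchanged; whether they are wanted is a separate question.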
The original de_DE.dic contains:

Agenten/ghij

And looking into de_DE.aff for the “i” and “j” flags, I find:

PFX i Y 1
PFX i 0 -/coyf .

SFX j Y 3
SFX j 0 0/xoc .
SFX j 0 -/zocf .
SFX j 0 -/cz .

I don’t understand these prefix rules for “i” and suffix rules for “j”.

So is this a bug in “unmunch” or is it a bug in the de_DE.{dic,aff} files?
https://github.com/hunspell/hunspell/blob/master/README also says:

“wordforms: word generation (Hunspell version of unmunch)”

For comparison, here is what “wordforms” produces for “Agent” using de_DE.aff and de_DE.dic:

[mfabian@Fedora-Workstation-netinst-x86_6 ~]$ cd /usr/share/myspell/
[mfabian@Fedora-Workstation-netinst-x86_6 myspell]$ wordforms de_DE.aff de_DE.dic Agent
Agent
-agent
Agentinnen
Agenten
Agentin
AgentIn
AgentInnen
Agent
Agent
Agenten
Agentinnen
Agenten
Agentin
AgentIn
AgentInnen
Agent
Agent
Agenten
-agentinnen
-agenten
-agentin
-agentIn
-agentInnen
-agent
-agent
-agenten
[mfabian@Fedora-Workstation-netinst-x86_6 myspell]$ pwd
/usr/share/myspell
[mfabian@Fedora-Workstation-netinst-x86_6 myspell]$

This looks better.
By the way, “wordforms” seems to work only when the current directory is the directory where the dictionaries are. In comment#1, “wordforms” was executed in /usr/share/myspell/. Trying to execute it in other directories fails:

[mfabian@Fedora-Workstation-netinst-x86_6 ~]$ pwd
/home/mfabian
[mfabian@Fedora-Workstation-netinst-x86_6 ~]$ wordforms /usr/share/myspell/de_DE.aff /usr/share/myspell/de_DE.dic Agent
awk: fatal: cannot open file `/tmp/wordforms.aff' for reading (No such file or directory)
Can't open affix or dictionary files for dictionary named "/tmp/wordforms".
[mfabian@Fedora-Workstation-netinst-x86_6 ~]$
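Until that is fixed, a caller can sidestep the problem by running wordforms with the dictionary directory as its working directory. The following is a minimal Python sketch of such a wrapper. It relies only on the behaviour observed above (bare file names work from inside the directory, absolute paths from elsewhere do not) and makes no claim about why the script fails. The function names wordforms_call and run_wordforms are hypothetical, and the sketch assumes both files live in the same directory.

```python
import subprocess
from pathlib import Path


def wordforms_call(aff_path, dic_path, word):
    """Build the argv and working directory for a wordforms invocation.

    Assumption: the .aff and .dic files are in the same directory, so the
    command can be run there with bare file names.
    """
    aff = Path(aff_path)
    dic = Path(dic_path)
    if aff.parent != dic.parent:
        raise ValueError("affix and dictionary files must share a directory")
    return ["wordforms", aff.name, dic.name, word], aff.parent


def run_wordforms(aff_path, dic_path, word):
    """Run wordforms from the dictionary directory and return its output words."""
    argv, cwd = wordforms_call(aff_path, dic_path, word)
    result = subprocess.run(argv, cwd=cwd, capture_output=True, text=True, check=True)
    return result.stdout.split()


# Usage (requires wordforms and the German dictionaries to be installed):
# run_wordforms("/usr/share/myspell/de_DE.aff", "/usr/share/myspell/de_DE.dic", "Agent")
```

Splitting the call into a pure helper and a runner keeps the path handling checkable without having hunspell installed.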
If the description of unmunch is right and this is for MySpell, then the broken results must be expected, as the i and j flags in the hunspell dictionary are based on a hunspell-only feature. To avoid confusion, Debian for example installs the hunspell dictionaries not in /usr/share/myspell/ but in /usr/share/hunspell/.

The result of wordforms from comment #1 looks reasonable.
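The mismatch can be illustrated with a minimal Python sketch. In the Hunspell affix-file format, an affix field such as -/coyf is an affix (-) followed by continuation flags (coyf), and 0 stands for an empty affix. A tool that only knows the older MySpell layout would treat the whole field as literal text. The sketch below contrasts the two readings for the rules quoted earlier in this report. All names are hypothetical, it is not the code of unmunch or hunspell, it ignores the condition field and what the continuation flags mean, and it does not reproduce the trailing "|" or the lower-casing seen in the real output.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AffixRule:
    kind: str    # "PFX" or "SFX"
    strip: str   # characters to strip from the word; "0" means none
    append: str  # raw affix field from the .aff file, e.g. "-/coyf"


def parse_rule(line):
    """Parse a rule line of the form: KIND FLAG STRIP APPEND CONDITION."""
    kind, _flag, strip, append, _condition = line.split()
    return AffixRule(kind, strip, append)


def _attach(word, rule, text):
    """Strip rule.strip from the word and attach text as prefix or suffix."""
    strip = "" if rule.strip == "0" else rule.strip
    if rule.kind == "PFX":
        stem = word[len(strip):] if strip and word.startswith(strip) else word
        return text + stem
    stem = word[:-len(strip)] if strip and word.endswith(strip) else word
    return stem + text


def apply_naive(word, rule):
    """MySpell-style reading: the whole affix field is literal text."""
    text = "" if rule.append == "0" else rule.append
    return _attach(word, rule, text)


def apply_flag_aware(word, rule):
    """Hunspell-style reading: split off continuation flags after '/'."""
    text, _, flags = rule.append.partition("/")
    if text == "0":
        text = ""
    return _attach(word, rule, text), flags


suffix_rule = parse_rule("SFX j 0 0/xoc .")
prefix_rule = parse_rule("PFX i 0 -/coyf .")
naive_suffix = apply_naive("Agenten", suffix_rule)
aware_suffix = apply_flag_aware("Agenten", suffix_rule)
naive_prefix = apply_naive("Agenten", prefix_rule)
aware_prefix = apply_flag_aware("Agenten", prefix_rule)
```

The naive reading yields strings like Agenten0/xoc, which resemble the odd unmunch entries, while the flag-aware reading yields a plain form plus a separate set of flags.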
(In reply to Björn Jacke from comment #3)
> if the description of unmunch is right and this is for MySpell, then the
> broken results must be expected as the i and j flags in the hunspell
> dictionary are based on a hunspell-only feature. To avoid confusion Debian
> for example installs the hunspell dictionaries not in /usr/share/myspell/
> but in /usr/share/hunspell/.
>
> The result of wordforms from comment #1 looks reasonable.

But unmunch is distributed with hunspell now and we don’t even have a myspell in Fedora.

If unmunch is now part of hunspell, it should work correctly for hunspell dictionaries, shouldn’t it? Otherwise it is quite useless.
From: Björn Jacke <bjoern>
Subject: Re: [Bug 1373404] unmunch produces weird results for some dictionaries
To: bugzilla, mfabian
Date: Tue, 13 Sep 2016 18:40:58 +0200 (16 hours, 36 minutes, 2 seconds ago)

hi mike,

the i and j flags are for leading and trailing dashes on composable words.
required for example in "Versicherungsberater und -agenten"

cheers
björn
This message is a reminder that Fedora 25 is nearing its end of life. Approximately 4 (four) weeks from now Fedora will stop maintaining and issuing updates for Fedora 25. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a Fedora 'version' of '25'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora 25 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.
Fedora 25 changed to end-of-life (EOL) status on 2017-12-12. Fedora 25 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. If you are unable to reopen this bug, please file a new report against the current release. If you experience problems, please add a comment to this bug. Thank you for reporting this bug and we are sorry it could not be fixed.