Bug 830559

Summary: Migemo crashed with invalid byte sequence in UTF-8
Product: [Fedora] Fedora
Reporter: Taiki Sugawara <buzz.taiki>
Component: migemo
Assignee: Mamoru TASAKA <mtasaka>
Status: CLOSED ERRATA
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Priority: unspecified
Version: 17
CC: mtasaka
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Last Closed: 2012-06-19 15:10:29 UTC
Type: Bug
Attachments:
  Description: patch for fix encoding probrem
  Flags: none

Description Taiki Sugawara 2012-06-10 16:10:34 UTC
Description of problem:

Migemo crashes when given the input 'aiueo' under a non-EUC-JP locale.


Version-Release number of selected component (if applicable):

0.40-15.fc17

Steps to Reproduce:
1. Run migemo as follows

  $ export LANG=ja_JP.utf-8
  $ echo 'aiueo' | migemo -d /usr/share/migemo/migemo-dict

Actual results:

Migemo crashes with the following error:

/usr/share/ruby/vendor_ruby/migemo-dict.rb:42:in `split': invalid byte sequence in UTF-8 (ArgumentError)
        from /usr/share/ruby/vendor_ruby/migemo-dict.rb:42:in `decompose'
        from /usr/share/ruby/vendor_ruby/migemo-dict.rb:72:in `block in lookup'
        from /usr/share/ruby/vendor_ruby/bsearch.rb:53:in `bsearch_lower_boundary'
        from /usr/share/ruby/vendor_ruby/bsearch.rb:115:in `bsearch_range'
        from /usr/share/ruby/vendor_ruby/migemo-dict.rb:71:in `lookup'
        from /usr/share/ruby/vendor_ruby/migemo.rb:163:in `expand_words'
        from /usr/share/ruby/vendor_ruby/migemo.rb:180:in `block in lookup0'
        from /usr/share/ruby/vendor_ruby/migemo.rb:177:in `each'
        from /usr/share/ruby/vendor_ruby/migemo.rb:177:in `lookup0'
        from /usr/share/ruby/vendor_ruby/migemo.rb:213:in `lookup'
        from /usr/share/ruby/vendor_ruby/migemo.rb:228:in `regex'
        from /usr/bin/migemo:138:in `block in main'
        from /usr/bin/migemo:147:in `call'
        from /usr/bin/migemo:147:in `main'
        from /usr/bin/migemo:163:in `<main>'


Expected results:

Migemo prints the pattern for 'aiueo' in EUC-JP encoding.
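
The failure can probably be reproduced outside migemo with a minimal sketch like the one below. It assumes the dictionary is stored as EUC-JP and that Ruby tags the bytes as UTF-8 because of the locale; the sample line is hypothetical and is not taken from migemo-dict.

```ruby
# Hypothetical dictionary line: "a" (hiragana) in EUC-JP (bytes A4 A2), a tab, then "a".
raw = "\xA4\xA2\ta".b

# Assumption: under LANG=ja_JP.utf-8 the dictionary data ends up tagged as UTF-8.
line = raw.dup.force_encoding("UTF-8")

# The EUC-JP bytes are not well-formed UTF-8.
puts line.valid_encoding?

# Splitting the mis-tagged string is where the backtrace above starts.
error = nil
begin
  line.split("\t")
rescue ArgumentError => e
  error = e
end
puts error ? error.message : "no error raised on this Ruby version"
```

The same bytes are well-formed when labeled as EUC-JP, which suggests the data is fine and only the encoding tag is wrong.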

Comment 1 Taiki Sugawara 2012-06-10 16:19:19 UTC
Created attachment 590763
patch for fix encoding probrem

Comment 2 Taiki Sugawara 2012-06-10 16:23:20 UTC
I created a small patch for this issue. Please see attachment 590763.
I also sent a pull request to https://github.com/yshl/migemo-for-Ruby-1.9

Comment 3 Mamoru TASAKA 2012-06-11 08:47:38 UTC
Would just the following be enough?
array = line.chomp.force_encoding("EUC-JP").split("\t").delete_if do |x| x == nil end
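
A minimal sketch of why the one-line change proposed in this comment should work: force_encoding only relabels the bytes as EUC-JP without converting them, so split then operates on a well-formed string. The sample line is hypothetical (not taken from migemo-dict), and the UTF-8 tag on it is an assumption about how the locale affects the data.

```ruby
# Hypothetical dictionary line: "a" (hiragana) in EUC-JP (bytes A4 A2), a tab, "a", newline.
# Assumption: a UTF-8 locale leaves the line tagged as UTF-8.
line = "\xA4\xA2\ta\n".b.force_encoding("UTF-8")

# The expression from this comment: relabel the bytes as EUC-JP, then split on tabs.
array = line.chomp.force_encoding("EUC-JP").split("\t").delete_if do |x| x == nil end

p array.length                   # number of tab-separated fields
p array.map { |s| s.encoding }   # encoding carried by each field
p array[0].bytes                 # bytes are unchanged, only the label differs
```

Because the bytes are not transcoded, the resulting fields stay in EUC-JP, which matches the expected result in the description.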

Comment 4 Taiki Sugawara 2012-06-11 13:40:32 UTC
Yes, that is enough.

Comment 5 Mamoru TASAKA 2012-06-11 14:04:31 UTC
(In reply to comment #4)
> Yes, that is enough.

Thank you.

Comment 6 Fedora Update System 2012-06-11 14:49:57 UTC
migemo-0.40-16.fc17 has been submitted as an update for Fedora 17.
https://admin.fedoraproject.org/updates/migemo-0.40-16.fc17

Comment 7 Fedora Update System 2012-06-13 21:28:37 UTC
Package migemo-0.40-16.fc17:
* should fix your issue,
* was pushed to the Fedora 17 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing migemo-0.40-16.fc17'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2012-9291/migemo-0.40-16.fc17
then log in and leave karma (feedback).

Comment 8 Fedora Update System 2012-06-19 15:10:29 UTC
migemo-0.40-16.fc17 has been pushed to the Fedora 17 stable repository.  If problems still persist, please make note of it in this bug report.