Bug 554481
Summary: | Environment var TESSDATA_PREFIX not set; causes gscan2pdf to ignore tesseract language data. | ||
---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Daniel Berlin <dan.btown> |
Component: | gscan2pdf | Assignee: | Bernard Johnson <bjohnson> |
Status: | CLOSED ERRATA | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
Severity: | medium | Docs Contact: | |
Priority: | low | ||
Version: | 12 | CC: | bjohnson, karlikt, rpandit, vitor.dominor |
Target Milestone: | --- | Keywords: | EasyFix, i18n |
Target Release: | --- | ||
Hardware: | i386 | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | gscan2pdf-0.9.30-2.fc12 | Doc Type: | Bug Fix |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2010-03-06 03:48:28 UTC | Type: | --- |
Description
Daniel Berlin
2010-01-11 20:08:35 UTC
(In reply to comment #0)
Regarding the bugzilla ticket at hand, the author of gscan2pdf reported later today on the gscan2pdf help mailing list: "I have already fixed the development version. As soon as I have fixed a particularly nasty (different) little bug that has been eluding me for the last couple of months, I'll release it as 0.9.30."

(In reply to comment #0)
Pls. also note this feature request: http://code.google.com/p/tesseract-ocr/issues/detail?id=89&can=1

(In reply to comment #0)
The bug reporter reclassified this ticket from the tesseract component to the gscan2pdf component because the unexpected behaviour shows when using gscan2pdf, not when using tesseract and the tesseract language packs alone.

gscan2pdf-0.9.30-1.fc12 has been submitted as an update for Fedora 12. http://admin.fedoraproject.org/updates/gscan2pdf-0.9.30-1.fc12

gscan2pdf-0.9.30-1.fc12 has been pushed to the Fedora 12 testing repository. If problems still persist, please make note of it in this bug report. If you want to test the update, you can install it with su -c 'yum --enablerepo=updates-testing update gscan2pdf'. You can provide feedback for this update here: http://admin.fedoraproject.org/updates/F12/FEDORA-2010-1506

gscan2pdf-0.9.30-2.fc12 has been submitted as an update for Fedora 12. http://admin.fedoraproject.org/updates/gscan2pdf-0.9.30-2.fc12

gscan2pdf-0.9.30-2.fc12 has been pushed to the Fedora 12 testing repository. If problems still persist, please make note of it in this bug report. If you want to test the update, you can install it with su -c 'yum --enablerepo=updates-testing update gscan2pdf'. You can provide feedback for this update here: http://admin.fedoraproject.org/updates/F12/FEDORA-2010-1936

gscan2pdf-0.9.30-2.fc12 has been pushed to the Fedora 12 stable repository. If problems still persist, please make note of it in this bug report.
In addition, it is necessary to change some lines in /usr/bin/gscan2pdf so that it can locate the tesseract 3.00 language files, which now have the extension .traineddata. In lines 11098, 11101 and 11106, the occurrences of unicharset need to be replaced with traineddata. After that, gscan2pdf correctly detects and lists the languages in the OCR dialog.

(In reply to comment #9)
> In addition, it is necessary to change some lines in /usr/bin/gscan2pdf, so
> that it can locate the tesseract 3.00 language files now with the extension of
> .traineddata. In the lines 11098, 11101 and 11106, the occurrences of
> unicharset need to be replaced with traineddata. After that, gscan2pdf correctly
> detects and lists the languages in the OCR dialog.

You should provide this information to the upstream developer, since it's a change in the way gscan2pdf works.

I didn't know this before, but it looks like someone has already done that: http://sourceforge.net/tracker/?func=detail&aid=3246957&group_id=174140&atid=868098
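
For anyone wanting to check which languages a given tessdata directory would yield under either naming scheme, here is a minimal sketch in Python. It is not gscan2pdf's actual code (gscan2pdf is a Perl script); the function names `tessdata_dir` and `list_languages` are hypothetical, and the sketch assumes that TESSDATA_PREFIX names the parent directory of `tessdata`, which should be verified against the installed tesseract version.

```python
import os
import tempfile


def tessdata_dir(prefix=None):
    """Return the tessdata directory for a prefix, or None if no prefix is known.

    Assumption: TESSDATA_PREFIX is the parent directory of "tessdata".
    """
    if prefix is None:
        prefix = os.environ.get("TESSDATA_PREFIX")
    if not prefix:
        return None
    return os.path.join(prefix, "tessdata")


def list_languages(directory, extensions=(".traineddata", ".unicharset")):
    """Return sorted language codes for files ending in one of the extensions.

    ".traineddata" is the extension reported above for tesseract 3.00;
    ".unicharset" is the one the older gscan2pdf code looked for.
    """
    langs = set()
    for name in os.listdir(directory):
        for ext in extensions:
            # Skip files that consist of the extension alone.
            if name.endswith(ext) and len(name) > len(ext):
                langs.add(name[:-len(ext)])
    return sorted(langs)


if __name__ == "__main__":
    # Demo with a throwaway directory instead of a real installation.
    demo = tempfile.mkdtemp()
    for fname in ("eng.traineddata", "deu.traineddata", "fra.unicharset"):
        open(os.path.join(demo, fname), "w").close()
    print(list_languages(demo))
```

Passing `extensions=(".unicharset",)` mimics the old lookup and shows why a directory containing only `.traineddata` files produces an empty language list.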