Bug 1109072
| Field | Value |
|---|---|
| Summary | subpackage for the tika standalone app |
| Product | [Fedora] Fedora |
| Component | tika |
| Version | 20 |
| Status | CLOSED ERRATA |
| Severity | unspecified |
| Priority | unspecified |
| Hardware | Unspecified |
| OS | Unspecified |
| Reporter | Fabrice Bellet <fabrice> |
| Assignee | gil cattaneo <puntogil> |
| QA Contact | Fedora Extras Quality Assurance <extras-qa> |
| CC | java-sig-commits, puntogil |
| Fixed In Version | tika-1.4-4.fc20 |
| Doc Type | Bug Fix |
| Type | Bug |
| Last Closed | 2014-06-27 21:49:25 UTC |

Description
Fabrice Bellet
2014-06-13 08:06:24 UTC
Created attachment 908406 [details]
tika.spec patch
proposed patch for tika app subpackage
Hi, the patch seems to have wrong references to

+BuildRequires: mvn(org.apache.felix:maven-shade-plugin:1.6)

maven-shade-plugin should be removed from the pom file; it is not usable, see http://fedoraproject.org/wiki/Packaging:No_Bundled_Libraries [1]

(installable) tika-1.4/tika-app/target/original-tika-app-1.4.jar
[1] tika-1.4/tika-app/target/tika-app-1.4.jar

And please remove the version from (I used it only as a reminder ... e.g. isoparser)

+BuildRequires: mvn(com.google.code.gson:gson:1.7.1)

If you want to use this module, you can create a launcher script with the following deps, which are merged by the shade plugin (I don't know if all of them are necessary):

[INFO] --- maven-shade-plugin:2.0:shade (default) @ tika-app ---
[INFO] Including org.apache.tika:tika-parsers:jar:1.4 in the shaded jar.
[INFO] Including javax.xml.stream:stax-api:jar:1.0.1 in the shaded jar.
[INFO] Including org.apache.tika:tika-core:jar:1.4 in the shaded jar.
[INFO] Including edu.ucar:netcdf:jar:4.2-min in the shaded jar.
[INFO] Including edu.ucar:udunits:jar:4.4.2 in the shaded jar.
[INFO] Including org.apache.httpcomponents:httpclient:jar:4.2.6 in the shaded jar.
[INFO] Including org.apache.httpcomponents:httpcore:jar:4.2.5 in the shaded jar.
[INFO] Including org.apache.httpcomponents:httpmime:jar:4.2.6 in the shaded jar.
[INFO] Including org.slf4j:jcl-over-slf4j:jar:1.7.5 in the shaded jar.
[INFO] Including joda-time:joda-time:jar:2.2 in the shaded jar.
[INFO] Including org.jdom:jdom2:jar:2.0.4 in the shaded jar.
[INFO] Including net.jcip:jcip-annotations:jar:1.0 in the shaded jar.
[INFO] Including org.quartz-scheduler:quartz:jar:2.2.0 in the shaded jar.
[INFO] Including javax.mail:mail:jar:any in the shaded jar.
[INFO] Including javax.ejb:ejb:jar:any in the shaded jar.
[INFO] Including javax.servlet:servlet-api:jar:any in the shaded jar.
[INFO] Including javax.jms:jms:jar:any in the shaded jar.
[INFO] Including javax.transaction:jta:jar:any in the shaded jar.
[INFO] Including com.mchange:c3p0:jar:0.9.1.1 in the shaded jar.
[INFO] Including com.mchange:mchange-commons-java:jar:0.2.3.4 in the shaded jar.
[INFO] Including com.google.protobuf:protobuf-java:jar:2.5.0 in the shaded jar.
[INFO] Including net.sf.ehcache:ehcache-core:jar:2.6.2 in the shaded jar.
[INFO] Including com.sleepycat:je:jar:4.0.92 in the shaded jar.
[INFO] Including org.ow2.asm:asm:jar:4.1 in the shaded jar.
[INFO] Including org.apache.james:apache-mime4j-core:jar:0.7.2 in the shaded jar.
[INFO] Including org.apache.james:apache-mime4j-dom:jar:0.7.2 in the shaded jar.
[INFO] Including org.apache.commons:commons-compress:jar:1.5 in the shaded jar.
[INFO] Including org.tukaani:xz:jar:1.2 in the shaded jar.
[INFO] Including commons-codec:commons-codec:jar:1.5 in the shaded jar.
[INFO] Including org.apache.pdfbox:pdfbox:jar:1.8.1 in the shaded jar.
[INFO] Including org.apache.pdfbox:fontbox:jar:1.8.1 in the shaded jar.
[INFO] Including org.apache.pdfbox:jempbox:jar:1.8.1 in the shaded jar.
[INFO] Including commons-logging:commons-logging:jar:1.1.1 in the shaded jar.
[INFO] Including avalon-framework:avalon-framework-api:jar:4.3 in the shaded jar.
[INFO] Including avalon-logkit:avalon-logkit:jar:2.1 in the shaded jar.
[INFO] Including org.bouncycastle:bcmail-jdk16:jar:1.45 in the shaded jar.
[INFO] Including org.bouncycastle:bcprov-jdk16:jar:1.45 in the shaded jar.
[INFO] Including org.apache.poi:poi:jar:3.9 in the shaded jar.
[INFO] Including org.apache.poi:poi-scratchpad:jar:3.9 in the shaded jar.
[INFO] Including org.apache.poi:poi-ooxml:jar:3.9 in the shaded jar.
[INFO] Including org.apache.poi:poi-ooxml-schemas:jar:3.9 in the shaded jar.
[INFO] Including org.apache.xmlbeans:xmlbeans:jar:2.3.0 in the shaded jar.
[INFO] Including dom4j:dom4j:jar:1.6.1 in the shaded jar.
[INFO] Including org.ccil.cowan.tagsoup:tagsoup:jar:1.2.1 in the shaded jar.
[INFO] Including org.ow2.asm:asm-all:jar:4.1 in the shaded jar.
[INFO] Including com.drewnoakes:metadata-extractor:jar:2 in the shaded jar.
[INFO] Including xerces:xercesImpl:jar:2.8.1 in the shaded jar.
[INFO] Including xml-apis:xml-apis:jar:1.4.01 in the shaded jar.
[INFO] Including de.l3s.boilerpipe:boilerpipe:jar:1.1.0 in the shaded jar.
[INFO] Including net.sourceforge.nekohtml:nekohtml:jar:1.9.14 in the shaded jar.
[INFO] Including rome:rome:jar:0.9 in the shaded jar.
[INFO] Including jdom:jdom:jar:1.0 in the shaded jar.
[INFO] Including org.gagravarr:vorbis-java-core:jar:0.1 in the shaded jar.
[INFO] Including com.googlecode.juniversalchardet:juniversalchardet:jar:1.0.3 in the shaded jar.
[INFO] Including org.apache.tika:tika-xmp:jar:1.4 in the shaded jar.
[INFO] Including com.adobe.xmp:xmpcore:jar:5.1.2 in the shaded jar.
[INFO] Including org.slf4j:slf4j-log4j12:jar:1.5.6 in the shaded jar.
[INFO] Including org.slf4j:slf4j-api:jar:1.7.4 in the shaded jar.
[INFO] Including log4j:log4j:jar:1.2.17 in the shaded jar.
[INFO] Including com.google.code.gson:gson:jar:1.7.1 in the shaded jar.

regards

e.g.
%jpackage_script org.apache.tika.cli.TikaCLI "" "" %{name}/%{name}:[list of all required libs, separated by ":"] %{name}-app true

%jpackage_script org.apache.tika.cli.TikaCLI "" "" %{name}:[list of all required libs, separated by ":"] %{name}-app true

Created attachment 909877 [details]
updated spec file with maven-shade-plugin disabled
Ah I was not aware that all required jar files were bundled in the single tika-app jar file thanks to maven-shade-plugin: thanks for the feedback!
I updated the spec file following your suggestion. The standalone tika-app can process *almost* all sample documents from the tika-parsers subdir (except some encrypted Microsoft Office files), so I assume I collected all the dependencies required by tika-parsers.
The problem with this wrapper script is that it cannot be used as-is in the Drupal search_api_attachment module, because the module expects the self-contained tika-app.jar file. Getting that jar was my primary motivation for having tika-app shipped in Fedora, but I can live with that and patch the Drupal module accordingly (it is a third-party module, not packaged in Fedora anyway).
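
For readers comparing the two approaches, the difference between the self-contained jar and the wrapper script can be sketched roughly as follows. This is only an illustration, not the script that %jpackage_script generates: the /usr/share/java location, the jar names in TIKA_JARS, and the helper names build_classpath and run_tika are all assumptions made for the example. Only the main class org.apache.tika.cli.TikaCLI comes from the thread above.

```shell
#!/bin/sh
# Hypothetical sketch of a tika-app launcher. It is NOT the packaged
# wrapper; jar names and the default JAVADIR are assumptions.

JAVADIR="${JAVADIR:-/usr/share/java}"

# Join jar base names into a ":"-separated classpath rooted at $JAVADIR.
build_classpath() {
    classpath=""
    for jar in "$@"; do
        # Prefix with ":" only when something is already in the list.
        classpath="${classpath:+$classpath:}$JAVADIR/$jar.jar"
    done
    printf '%s\n' "$classpath"
}

# Illustrative subset only. A working list has to name every dependency
# that the shade plugin would otherwise have merged into the single jar.
TIKA_JARS="tika/tika-app tika/tika-core tika/tika-parsers commons-codec"

# Run the Tika CLI main class against system jars instead of "java -jar".
run_tika() {
    # Word splitting of $TIKA_JARS is intended here.
    exec java -cp "$(build_classpath $TIKA_JARS)" \
        org.apache.tika.cli.TikaCLI "$@"
}
```

With such a script, a caller that used to run `java -jar tika-app.jar --text file.pdf` would instead call something like `run_tika --text file.pdf`, which is the kind of change the Drupal module would need.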
Please, can you attach a git format patch?

tika-1.4-4.fc20 has been submitted as an update for Fedora 20.
https://admin.fedoraproject.org/updates/tika-1.4-4.fc20

Great, it works for me, thanks!

Package tika-1.4-4.fc20:
* should fix your issue,
* was pushed to the Fedora 20 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing tika-1.4-4.fc20'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2014-7507/tika-1.4-4.fc20
then log in and leave karma (feedback).

tika-1.4-4.fc20 has been pushed to the Fedora 20 stable repository. If problems still persist, please make note of it in this bug report.