Bug 1004643 - Tika extractor fails while attempting to extract metadata from images
Summary: Tika extractor fails while attempting to extract metadata from images
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: JBoss Data Virtualization 6
Classification: JBoss
Component: ModeShape
Version: 6.0.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ER2
Target Release: 6.0
Assignee: Horia Chiorean
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2013-09-05 06:43 UTC by Horia Chiorean
Modified: 2016-02-10 08:53 UTC

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2016-02-10 08:53:56 UTC
Type: Bug
Embargoed:


Links
System: Red Hat Issue Tracker
ID: MODE-2022
Private: 0
Priority: Major
Status: Closed
Summary: Tika extractor fails while attempting to extract metadata from images
Last Updated: 2014-07-20 23:26:33 UTC

Description Horia Chiorean 2013-09-05 06:43:55 UTC
If a repository has a tika-text-extractor configured and a binary file that Tika recognizes as an image (e.g. JPEG, GIF, BMP) is uploaded, text extraction fails with:

5:57:05,750 ERROR [org.modeshape.extractor.tika.TikaTextExtractor] (modeshape-text-extractor-5-thread-1) Error while extracting text : com/drew/metadata/MetadataException: java.lang.NoClassDefFoundError: com/drew/metadata/MetadataException
	at org.apache.tika.parser.jpeg.JpegParser.parse(JpegParser.java:56)
	at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
	at org.modeshape.extractor.tika.TikaTextExtractor$1.execute(TikaTextExtractor.java:134)
	at org.modeshape.jcr.api.text.TextExtractor.processStream(TextExtractor.java:82) [modeshape-jcr-api-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
	at org.modeshape.extractor.tika.TikaTextExtractor.extractFrom(TikaTextExtractor.java:124)
	at org.modeshape.jcr.TextExtractors$Worker.run(TextExtractors.java:182) [modeshape-jcr-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895) [rt.jar:1.6.0_45]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918) [rt.jar:1.6.0_45]
	at java.lang.Thread.run(Thread.java:662) [rt.jar:1.6.0_45]
Caused by: java.lang.ClassNotFoundException: com.drew.metadata.MetadataException from [Module "org.apache.tika:1.3" from local module loader @3336a1a1 (finder: local module finder @47ad6b4b (roots: d:\Work\hchiorean.modeshape\integration\modeshape-jbossas-integration-tests\target\jboss-eap-6.1\modules,d:\Work\hchiorean.modeshape\integration\modeshape-jbossas-integration-tests\target\jboss-eap-6.1\modules\system\layers\base))]
	at org.jboss.modules.ModuleClassLoader.findClass(ModuleClassLoader.java:196) [jboss-modules.jar:1.2.0.Final-redhat-1]
	at org.jboss.modules.ConcurrentClassLoader.performLoadClassUnchecked(ConcurrentClassLoader.java:444) [jboss-modules.jar:1.2.0.Final-redhat-1]
	at org.jboss.modules.ConcurrentClassLoader.performLoadClassChecked(ConcurrentClassLoader.java:432) [jboss-modules.jar:1.2.0.Final-redhat-1]
	at org.jboss.modules.ConcurrentClassLoader.performLoadClassChecked(ConcurrentClassLoader.java:399) [jboss-modules.jar:1.2.0.Final-redhat-1]
	at org.jboss.modules.ConcurrentClassLoader.performLoadClass(ConcurrentClassLoader.java:374) [jboss-modules.jar:1.2.0.Final-redhat-1]
	at org.jboss.modules.ConcurrentClassLoader.loadClass(ConcurrentClassLoader.java:119) [jboss-modules.jar:1.2.0.Final-redhat-1]
	... 9 more
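
The trace shows that the "org.apache.tika:1.3" module class loader cannot resolve com.drew.metadata.MetadataException, which JpegParser needs at parse time. A minimal sketch of how that visibility could be checked from code is below. The ClassProbe class and its isLoadable helper are hypothetical names introduced for illustration, not part of ModeShape or Tika, and the sketch probes whichever class loader it is handed rather than the JBoss module loader specifically.

```java
public class ClassProbe {

    /**
     * Hypothetical helper: returns true if the named class can be resolved
     * through the given class loader, false otherwise.
     */
    public static boolean isLoadable(String className, ClassLoader loader) {
        try {
            // initialize=false: resolve the class without running its static initializers
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException e) {
            // The class is not visible to this loader (the situation in the trace above)
            return false;
        } catch (LinkageError e) {
            // The class was found but one of its own dependencies could not be linked
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumption: we probe the loader that loaded this class; in a JBoss
        // deployment this would be the deployment's or module's class loader.
        ClassLoader loader = ClassProbe.class.getClassLoader();
        String name = "com.drew.metadata.MetadataException";
        System.out.println(name + " loadable: " + isLoadable(name, loader));
    }
}
```

Running the probe from code that shares the extractor's class loader would indicate whether the missing class is a visibility problem for that loader, as opposed to a problem with the uploaded file itself.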

Comment 6 JBoss JIRA Server 2013-10-24 09:22:11 UTC
Randall Hauch <rhauch> updated the status of jira MODE-2022 to Closed

Comment 7 belong 2013-11-19 23:18:38 UTC
Fixed before GA, so setting requires_doc_text- accordingly.

