If a repository has a tika-text-extractor configured and a binary file representing a Tika-recognized image is uploaded (e.g. JPEG, GIF, BMP), the text extraction fails with:

5:57:05,750 ERROR [org.modeshape.extractor.tika.TikaTextExtractor] (modeshape-text-extractor-5-thread-1) Error while extracting text : com/drew/metadata/MetadataException: java.lang.NoClassDefFoundError: com/drew/metadata/MetadataException
    at org.apache.tika.parser.jpeg.JpegParser.parse(JpegParser.java:56)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
    at org.modeshape.extractor.tika.TikaTextExtractor$1.execute(TikaTextExtractor.java:134)
    at org.modeshape.jcr.api.text.TextExtractor.processStream(TextExtractor.java:82) [modeshape-jcr-api-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
    at org.modeshape.extractor.tika.TikaTextExtractor.extractFrom(TikaTextExtractor.java:124)
    at org.modeshape.jcr.TextExtractors$Worker.run(TextExtractors.java:182) [modeshape-jcr-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895) [rt.jar:1.6.0_45]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918) [rt.jar:1.6.0_45]
    at java.lang.Thread.run(Thread.java:662) [rt.jar:1.6.0_45]
Caused by: java.lang.ClassNotFoundException: com.drew.metadata.MetadataException from [Module "org.apache.tika:1.3" from local module loader @3336a1a1 (finder: local module finder @47ad6b4b (roots: d:\Work\hchiorean.modeshape\integration\modeshape-jbossas-integration-tests\target\jboss-eap-6.1\modules,d:\Work\hchiorean.modeshape\integration\modeshape-jbossas-integration-tests\target\jboss-eap-6.1\modules\system\layers\base))]
    at org.jboss.modules.ModuleClassLoader.findClass(ModuleClassLoader.java:196) [jboss-modules.jar:1.2.0.Final-redhat-1]
    at org.jboss.modules.ConcurrentClassLoader.performLoadClassUnchecked(ConcurrentClassLoader.java:444) [jboss-modules.jar:1.2.0.Final-redhat-1]
    at org.jboss.modules.ConcurrentClassLoader.performLoadClassChecked(ConcurrentClassLoader.java:432) [jboss-modules.jar:1.2.0.Final-redhat-1]
    at org.jboss.modules.ConcurrentClassLoader.performLoadClassChecked(ConcurrentClassLoader.java:399) [jboss-modules.jar:1.2.0.Final-redhat-1]
    at org.jboss.modules.ConcurrentClassLoader.performLoadClass(ConcurrentClassLoader.java:374) [jboss-modules.jar:1.2.0.Final-redhat-1]
    at org.jboss.modules.ConcurrentClassLoader.loadClass(ConcurrentClassLoader.java:119) [jboss-modules.jar:1.2.0.Final-redhat-1]
    ... 9 more
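
The trace shows that the class loader of the "org.apache.tika:1.3" module cannot resolve com.drew.metadata.MetadataException. A minimal sketch for checking this in isolation is given below. ClassAvailabilityCheck and isLoadable are hypothetical names introduced only for illustration and are not part of ModeShape or Tika. To reproduce the module-level failure, the loader passed in would presumably need to be the one that loaded a Tika class such as org.apache.tika.parser.jpeg.JpegParser; the main method below simply uses its own class loader.

```java
// Hypothetical diagnostic helper; not part of ModeShape or Tika.
final class ClassAvailabilityCheck {

    // Returns true if the given loader can resolve the class.
    // The class is looked up without being initialized.
    static boolean isLoadable(String className, ClassLoader loader) {
        try {
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException e) {
            // The loader (and its parents) could not find the class.
            return false;
        } catch (LinkageError e) {
            // Covers NoClassDefFoundError and related linkage failures.
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumption: in a real check this would be the Tika module's loader.
        ClassLoader loader = ClassAvailabilityCheck.class.getClassLoader();
        String name = "com.drew.metadata.MetadataException";
        System.out.println(name + " loadable: " + isLoadable(name, loader));
    }
}
```

If the check reports the class as not loadable from the Tika module's loader, that matches the ClassNotFoundException above and suggests the module is missing the library that provides the com.drew.metadata package.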
https://github.com/jboss-integration/modeshape/commit/90f87990b845fc3f7034a8bdb9955392a83e7318
Randall Hauch <rhauch> updated the status of jira MODE-2022 to Closed
Fixed before GA; setting requires_doc_text- accordingly.