Description of problem:
Beagle does not include a filter for application/msword.
Version-Release number of selected component (if applicable):
beagle-0.2.10-6.fc6
wv-1.2.4-1.fc6
How reproducible:
Always, for all msword files.
Steps to Reproduce:
1. beagle-extract-content any_msword_file.doc
Actual results:
Filename: file:///u/samba/public/nautilus/f/EmployeeTimeRecord.doc
Debug: Loaded 45 filters from /usr/lib/beagle/Filters/Filters.dll
Debug: No filter for file:///u/samba/public/nautilus/f/EmployeeTimeRecord.doc (/u/samba/public/nautilus/f/EmployeeTimeRecord.doc) [application/msword]
No filter for application/msword
Expected results:
The text of the document.
Additional info:
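For anyone who wants to run this check over many files, the failure can be detected from the output quoted in the report. The following is only a minimal sketch in Python. It assumes the output always ends with a line of the form "No filter for <mime type>" when no filter matches, which is inferred from this report alone, and it assumes the line breaks in the sample. The name `missing_filter_mime` is made up for illustration and is not part of Beagle.

```python
import re
from typing import Optional

# Matches the final summary line, e.g. "No filter for application/msword".
# The "Debug: No filter for file://..." line is skipped because it does not
# start with "No filter for". Pattern inferred from this report only.
_NO_FILTER = re.compile(r"^No filter for (\S+/\S+)$", re.MULTILINE)


def missing_filter_mime(output: str) -> Optional[str]:
    """Return the MIME type that had no filter, or None if none is reported."""
    match = _NO_FILTER.search(output)
    return match.group(1) if match else None


# Output from this report; the line breaks are an assumption.
sample = """Filename: file:///u/samba/public/nautilus/f/EmployeeTimeRecord.doc
Debug: Loaded 45 filters from /usr/lib/beagle/Filters/Filters.dll
Debug: No filter for file:///u/samba/public/nautilus/f/EmployeeTimeRecord.doc (/u/samba/public/nautilus/f/EmployeeTimeRecord.doc) [application/msword]
No filter for application/msword
"""

print(missing_filter_mime(sample))
```

On the output from this report the function returns application/msword; on output without such a line it returns None.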
Created attachment 140569: Sample file that doesn't work.
My original report was based on the behavior of a machine which I had upgraded from FC5. I have now confirmed this problem with a fresh install on another machine. In addition, even if I include my own external filter, I get no snippets for files of type application/msword. Excel files are indexed as expected. BTW, I feel a bit self-conscious about caring whether MS docs get indexed. I'm trying to get Fedora into an environment which, unfortunately, currently uses MS Office exclusively. Moving file services from Windows to Samba and providing a Beagle-based app with a web front end is Fedora's first task. I want Fedora to come out looking good.
If you rebuild beagle with --enable-wv1, is this fixed?
Yes. That fixes it. Thank You, Steve
Please note that there seems to be a new and relevant upstream problem in 0.2.12. http://bugzilla.gnome.org/show_bug.cgi?id=371152#c9
I added this to rawhide.
Name : beagle Relocations: (not relocatable)
Version : 0.2.13 Vendor: Red Hat, Inc.
Release : 1.fc6 Build Date: Tue 28 Nov 2006 13:16:34 GMT
I still have a problem with the above build when attempting to index Word documents.
[dfurlong@localhost ~]$ beagle-extract-content Documents/Purchase\ Orders/djf-20060626.doc
Filename: file:///home/dfurlong/Documents/Purchase Orders/djf-20060626.doc
Debug: Loaded 49 filters from /usr/lib/beagle/Filters/Filters.dll
Debug: No filter for file:///home/dfurlong/Documents/Purchase Orders/djf-20060626.doc (/home/dfurlong/Documents/Purchase Orders/djf-20060626.doc) [application/msword]
No filter for application/msword
Do you have "wv" installed? It's in Extras.
Yup.
Name : wv Relocations: (not relocatable)
Version : 1.2.4 Vendor: Fedora Project
Release : 1.fc6 Build Date: Sat 28 Oct 2006 09:00:53 GMT
Strange. Maybe wv needs to be present when building. If you rebuild the SRPM, does that fix it?
I don't have a spare box to do this on, and building that package would involve installing about 15 RPMs plus all their dependencies. I did notice that there is both wv and wv2. Could it be that you built it against wv2?
Okay, I did it anyway to help expedite the issue. With wv and wv-devel installed, I rebuilt the package, and I now get valid information from beagle-extract-content.
beagle-extract-content Documents/Purchase\ Orders/djf-20060626.doc
Filename: file:///home/dfurlong/Documents/Purchase Orders/djf-20060626.doc
Debug: Loaded 50 filters from /usr/lib/beagle/Filters/Filters.dll
Warn: DocumentSummaryInformationStream not found in /home/dfurlong/Documents/Purchase Orders/djf-20060626.doc
Filter: Beagle.Filters.FilterDOC
MimeType: application/msword
Properties:
Timestamp = 29/06/2006 14:24:44
dc:title = Vendor
fixme:last-saved-by = saes
etc.
I hope this helps.
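The difference between the stock build and the rebuilt one shows up in two places in the output: the number of filters loaded (49 before, 50 after) and the presence of a Filter: line. A rough Python sketch for comparing the two runs follows. It assumes the "Loaded N filters", "Filter:" and "MimeType:" lines look exactly as quoted in this thread and each start a new line. `ExtractSummary` and `summarize` are hypothetical names used only for illustration.

```python
import re
from typing import NamedTuple, Optional


class ExtractSummary(NamedTuple):
    filters_loaded: Optional[int]  # N from "Loaded N filters"
    filter_name: Optional[str]     # e.g. Beagle.Filters.FilterDOC
    mime_type: Optional[str]       # value of the "MimeType:" line


def summarize(output: str) -> ExtractSummary:
    """Pull the filter count, filter name and MIME type out of the output."""
    loaded = re.search(r"Loaded (\d+) filters", output)
    # Anchored at line start so "Debug: No filter for ..." is not picked up.
    name = re.search(r"^Filter:\s*(\S+)", output, re.MULTILINE)
    mime = re.search(r"^MimeType:\s*(\S+)", output, re.MULTILINE)
    return ExtractSummary(
        int(loaded.group(1)) if loaded else None,
        name.group(1) if name else None,
        mime.group(1) if mime else None,
    )


# Abbreviated output from this thread; line breaks are an assumption.
before = """Debug: Loaded 49 filters from /usr/lib/beagle/Filters/Filters.dll
No filter for application/msword
"""
after = """Debug: Loaded 50 filters from /usr/lib/beagle/Filters/Filters.dll
Filter: Beagle.Filters.FilterDOC
MimeType: application/msword
"""

print(summarize(before))
print(summarize(after))
```

A run where `filter_name` is None means no filter handled the file, which matches the behaviour seen with the stock package.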
I am not sure what you will get from this, however: after a restart, running the beagle-extract-content command on a Word doc failed again. Removing and reinstalling the files created by the rpm-build command once again allowed me to query the doc files. Unfortunately I do not know enough to really debug this. Let me know if you need any more info from me.
Things are a bit tricky because wv is in Extras; the coming merge will solve that, though.
Fedora apologizes that these issues have not been resolved yet. We're sorry it's taken so long for your bug to be properly triaged and acted on. We appreciate the time you took to report this issue and want to make sure no important bugs slip through the cracks. If you're currently running a version of Fedora Core between 1 and 6, please note that Fedora no longer maintains these releases. We strongly encourage you to upgrade to a current Fedora release. In order to refocus our efforts as a project we are flagging all of the open bugs for releases which are no longer maintained and closing them. http://fedoraproject.org/wiki/LifeCycle/EOL If this bug is still open against Fedora Core 1 through 6, thirty days from now, it will be closed 'WONTFIX'. If you can reproduce this bug in the latest Fedora version, please change to the respective version. If you are unable to do this, please add a comment to this bug requesting the change. Thanks for your help, and we apologize again that we haven't handled these issues to this point. The process we are following is outlined here: http://fedoraproject.org/wiki/BugZappers/F9CleanUp We will be following the process here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping to ensure this doesn't happen again. And if you'd like to join the bug triage team to help make things better, check out http://fedoraproject.org/wiki/BugZappers
This bug is open for a Fedora version that is no longer maintained and will not be fixed by Fedora. Therefore we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora, please feel free to reopen this bug against that version. Thank you for reporting this bug and we are sorry it could not be fixed.