Bug 879233 - RFE: fast file pattern matching
Summary: RFE: fast file pattern matching
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Fedora
Classification: Fedora
Component: libdnf
Version: rawhide
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: unspecified
Target Milestone: ---
Assignee: rpm-software-management
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Duplicates: 1186604
Depends On:
Blocks: 1156501 1192811
 
Reported: 2012-11-22 12:05 UTC by Daniel Mach
Modified: 2018-05-29 14:46 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2018-05-29 14:46:37 UTC
Type: Bug
Embargoed:



Description Daniel Mach 2012-11-22 12:05:52 UTC
mash includes a script that decides whether a package should be multilib or not:
http://git.fedorahosted.org/cgit/mash/tree/mash/multilib.py

If the number of patterns is high (say 200+) and the repo is large, then matching with fnmatch or regexps is very slow.

It would be nice to have something which works fast and is part of either hawkey or libsolv. It might be used in repoquery too.


Input:
<pattern_with_wildcards_1> [token_1]
<pattern_with_wildcards_2> [token_2]
etc.

+ a repo

Output:
/foo/bar matches pattern_1 -> [token_1] is returned
/foo/bar/baz matches pattern_1 and pattern_2 -> [token_1, token_2] is returned
/xxx doesn't match anything -> None is returned
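
For illustration only, the requested input/output behaviour could be sketched in Python as below. This is a straightforward baseline (one precompiled regex per pattern), not the fast implementation being requested. The function names `compile_patterns` and `match_tokens`, the whitespace-separated "pattern token" line format, and the sample patterns are all assumptions made for this sketch.

```python
import fnmatch
import re


def compile_patterns(lines):
    """Parse '<pattern> <token>' lines into (compiled_regex, token) pairs.

    Assumes exactly one whitespace-separated pattern and token per line.
    """
    compiled = []
    for line in lines:
        pattern, token = line.split()
        # fnmatch.translate turns a shell-style wildcard into an anchored regex.
        compiled.append((re.compile(fnmatch.translate(pattern)), token))
    return compiled


def match_tokens(compiled, path):
    """Return the tokens of all patterns matching path, or None if none match."""
    tokens = [token for regex, token in compiled if regex.match(path)]
    return tokens or None


# Hypothetical pattern set mirroring the example in this report.
rules = compile_patterns(["/foo/* token1", "/foo/bar/* token2"])
print(match_tokens(rules, "/foo/bar"))
print(match_tokens(rules, "/foo/bar/baz"))
print(match_tokens(rules, "/xxx"))
```

Every path is still tested against every pattern here, so the cost grows with (number of files) x (number of patterns), which is the slowness described above.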



It may be possible to use the Aho-Corasick algorithm, as suggested here[1]. Unfortunately, esmre[2] did not work well for me (it does not seem to respect the ^ and $ anchors).

[1] http://stackoverflow.com/questions/12904860/how-to-match-a-string-against-a-set-of-wildcard-strings-efficiently
[2] http://code.google.com/p/esmre/
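
A minimal sketch of how the Aho-Corasick idea might be applied, under stated assumptions: the longest literal fragment of each wildcard pattern is put into an Aho-Corasick automaton, each path is scanned once to find candidate patterns, and only those candidates are verified with a full anchored match. The class name `LiteralPrefilter` and the (pattern, token) rule format are hypothetical, this is not an existing hawkey or libsolv API, and no performance measurements have been made.

```python
from collections import deque
import fnmatch
import re


class LiteralPrefilter:
    """Aho-Corasick prefilter over literal fragments of wildcard patterns."""

    def __init__(self, rules):
        self.rules = list(rules)   # list of (pattern, token)
        self.goto = [{}]           # trie edges per state
        self.fail = [0]            # failure link per state
        self.out = [set()]         # rule indexes whose literal ends here
        self.always = []           # rules without a usable literal
        for idx, (pattern, _token) in enumerate(self.rules):
            literal = self._longest_literal(pattern)
            if literal:
                self._add(literal, idx)
            else:
                self.always.append(idx)
        self._build_failure_links()

    @staticmethod
    def _longest_literal(pattern):
        # Patterns with bracket expressions are not analysed; they are
        # always verified directly (a simplification for this sketch).
        if "[" in pattern:
            return ""
        return max(re.split(r"[*?]", pattern), key=len)

    def _add(self, word, idx):
        state = 0
        for ch in word:
            nxt = self.goto[state].get(ch)
            if nxt is None:
                nxt = len(self.goto)
                self.goto.append({})
                self.fail.append(0)
                self.out.append(set())
                self.goto[state][ch] = nxt
            state = nxt
        self.out[state].add(idx)

    def _build_failure_links(self):
        # Breadth-first, so shallower states are finished first.
        queue = deque(self.goto[0].values())
        while queue:
            state = queue.popleft()
            for ch, nxt in self.goto[state].items():
                queue.append(nxt)
                f = self.fail[state]
                while f and ch not in self.goto[f]:
                    f = self.fail[f]
                self.fail[nxt] = self.goto[f].get(ch, 0)
                self.out[nxt] |= self.out[self.fail[nxt]]

    def match(self, path):
        """Return tokens of all matching patterns, or None if none match."""
        candidates = set(self.always)
        state = 0
        for ch in path:
            while state and ch not in self.goto[state]:
                state = self.fail[state]
            state = self.goto[state].get(ch, 0)
            candidates |= self.out[state]
        # Full anchored verification of the candidates only.
        tokens = [self.rules[i][1] for i in sorted(candidates)
                  if fnmatch.fnmatchcase(path, self.rules[i][0])]
        return tokens or None
```

Because every candidate is confirmed with `fnmatch.fnmatchcase`, which matches the whole string, anchoring at the start and end of the path is preserved. Whether this is actually faster than the plain loop depends on how selective the literal fragments are in a given pattern set.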

Comment 1 Ales Kozumplik 2013-12-17 16:19:14 UTC
It might be a bit surprising, but there have been no other complaints about the speed of filename matching since this bug was filed. Deprioritizing.

Comment 2 Fedora End Of Life 2013-12-21 09:28:24 UTC
This message is a reminder that Fedora 18 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 18. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '18'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 18's end of life.

Thank you for reporting this issue and we are sorry that we may not be 
able to fix it before Fedora 18 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora
version prior to Fedora 18's end of life.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 3 Honza Silhan 2015-07-21 08:53:43 UTC
See bug 1192811 comment 3 for the requested solution.

Comment 4 Honza Silhan 2015-07-21 09:09:23 UTC
*** Bug 1186604 has been marked as a duplicate of this bug. ***

Comment 5 Fedora Admin XMLRPC Client 2016-07-08 09:24:18 UTC
This package has changed ownership in the Fedora Package Database.  Reassigning to the new owner of this component.

Comment 6 Fedora Admin XMLRPC Client 2017-03-10 13:43:05 UTC
This package has changed ownership in the Fedora Package Database.  Reassigning to the new owner of this component.

Comment 7 Daniel Mach 2018-05-29 14:46:37 UTC
I no longer work with mash / pungi and nobody else is interested in this feature. Closing.

