Description of problem:

I noticed that my glusterfind on 3.12.3 ran for hundreds of hours straight without terminating.

A quick strace showed a huge number of pread64() syscalls between each open() of a CHANGELOG.* file. Looking in /proc/$(pidof glusterfind)/fd, I found that the file it was pread64()ing from is the `tmp_output_1` sqlite file. It was clearly reading the entire database via those syscalls for each *line* of each CHANGELOG.* file.

To make it very clear, it was doing:

    for each CHANGELOG file:
      for each line in that file:
        read in the entire SQL database contents (9 MB in my case)

Looking into the code, it became clear that glusterfind implements a simple check for whether a line of a CHANGELOG.* file is already in the DB. That is done by checking whether its `gfid` is already in the `gfid` column. Unfortunately, that column didn't have an SQL index defined, so every such check did a full scan over the database.

If you use sqlite you must really make sure to use indexes, because otherwise a lookup that should be O(1) or O(log n) turns into an O(n) operation, thus giving glusterfind O(n²) complexity.

I will submit a patch. It makes glusterfind 150x faster for me.
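The effect of the missing index can be seen in sqlite's query planner. The following is a minimal sketch using Python's standard `sqlite3` module, not the actual glusterfind code: the table name `gfidpath`, its columns, and the index name `gfid_index` are assumptions made for illustration. Only the shape of the `SELECT COUNT(1) ... WHERE 1=1 AND gfid = ?` lookup follows this report.

```python
import sqlite3


def query_plan(conn, table):
    """Return sqlite's plan for the gfid existence check as one string."""
    sql = f"EXPLAIN QUERY PLAN SELECT COUNT(1) FROM {table} WHERE 1=1 AND gfid = ?"
    rows = conn.execute(sql, ("some-gfid",)).fetchall()
    # The human-readable plan text is the fourth column of each row.
    return " ".join(row[3] for row in rows)


conn = sqlite3.connect(":memory:")
# Hypothetical schema; the real glusterfind tables may differ.
conn.execute("CREATE TABLE gfidpath (gfid VARCHAR, path1 VARCHAR)")

plan_before = query_plan(conn, "gfidpath")

# The kind of fix described above: index the column used in the lookup.
conn.execute("CREATE INDEX gfid_index ON gfidpath(gfid)")

plan_after = query_plan(conn, "gfidpath")

print("without index:", plan_before)
print("with index:   ", plan_after)
```

Without the index the plan reports a scan of the whole table; with it, the plan reports a search using the index. The exact wording of the plan text varies between sqlite versions.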
REVIEW: https://review.gluster.org/19114 (glusterfind: Speed up gfid lookup 100x by using an SQL index) posted (#1) for review on master by Niklas Hambüchen
COMMIT: https://review.gluster.org/19114 committed in master by with a commit message-

glusterfind: Speed up gfid lookup 100x by using an SQL index

Fixes #1529883.

This fixes some bits of `glusterfind`'s horrible performance, making it 100x faster.

Until now, glusterfind was, for each line in each CHANGELOG.* file, linearly reading the entire contents of the sqlite database in 4096-bytes-sized pread64() syscalls when executing the

    SELECT COUNT(1) FROM %s WHERE 1=1 AND gfid = ?

query through the code path:

    get_changes()
      parse_changelog_to_db()
        when_data_meta()
          gfidpath_exists()
            _exists()

In a quick benchmark on my laptop, doing one such `SELECT` query took ~75ms on a 10MB-sized sqlite DB, while doing the same query with an index took < 1ms.

Change-Id: I8e7fe60f1f45a06c102f56b54d2ead9e0377794e
BUG: 1529883
Signed-off-by: Niklas Hambüchen <mail>
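A comparison like the quick benchmark in the commit message can be approximated with the sketch below. It is not the benchmark that was actually run: the table name `gfidpath`, the index name `gfid_index`, the row count, and the use of an in-memory database are all assumptions, so absolute timings will differ from the ~75ms and < 1ms figures quoted above, and no pread64() syscalls are involved. It only illustrates the relative cost of a full scan versus an indexed lookup.

```python
import sqlite3
import time
import uuid


def time_lookup(conn, gfid, repeats=20):
    """Average the wall-clock time of the existence query over several runs."""
    start = time.perf_counter()
    for _ in range(repeats):
        count = conn.execute(
            "SELECT COUNT(1) FROM gfidpath WHERE 1=1 AND gfid = ?", (gfid,)
        ).fetchone()[0]
    return (time.perf_counter() - start) / repeats, count


conn = sqlite3.connect(":memory:")
# Hypothetical schema and data volume, chosen only for illustration.
conn.execute("CREATE TABLE gfidpath (gfid VARCHAR, path1 VARCHAR)")
gfids = [str(uuid.uuid4()) for _ in range(100_000)]
conn.executemany(
    "INSERT INTO gfidpath VALUES (?, ?)",
    ((g, "/some/path/" + g) for g in gfids),
)
conn.commit()

target = gfids[-1]

# Lookup without an index: sqlite has to scan every row.
scan_time, scan_count = time_lookup(conn, target)

# Lookup with an index on the gfid column.
conn.execute("CREATE INDEX gfid_index ON gfidpath(gfid)")
index_time, index_count = time_lookup(conn, target)

print(f"full scan: {scan_time * 1000:.3f} ms per query")
print(f"indexed:   {index_time * 1000:.3f} ms per query")
```

Both variants return the same count; only the time per query changes.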
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-4.0.0, please open a new bug report.

glusterfs-4.0.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/announce/2018-March/000092.html
[2] https://www.gluster.org/pipermail/gluster-users/