Red Hat Bugzilla – Bug 839068
inverted index: subtle corruption in cscope.in.po
Last modified: 2012-07-19 21:52:15 EDT
+++ This bug was initially created as a clone of Bug #837626 +++
Description of problem:
On x86_64 cscope creates a subtly corrupted inverted indices.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. [in kernel git repo] git ls-files "*.[ch]" | cscope -bqkv -i-
Subtle corruption in the inverted index.
Correct inverted index.
0) While researching bug #837296 I wrote a simple index parser (which will attach shortly).
1) After _lots_ of staring at the cscope code and the inverted indices it generates I noticed it subtly corrupts the inverted its indices (on x86_64).
2) Some background:
- the index consist of blocks of 16384 bytes.
- each blocks starts with a 3 longs sized header:
* number of terms in this block
* number of next block
* number of previous block
- this is followed by a list (number of terms long) of entries, of size 2 long. These entries contain (amongst other things) the offset to an entry in the next list.
- this next list (also number of terms long) contains the strings of these terms and their offset in the postings file (ie, cscope.po.out)
The first list grows up, the second list grows down. Due to errors in the handling of x86_64's long size these two lists end up overwriting each other. Net effect: some terms are not properly indexed.
3) I'll also attach two patches that (together) seem to fix this issue.
--- Additional comment from email@example.com on 2012-07-04 08:50:26 EDT ---
Created attachment 596212 [details]
simple parse of inverted index file (cscope.in.out)
Usage: perl index.pl cscope.in.out
--- Additional comment from firstname.lastname@example.org on 2012-07-04 08:55:48 EDT ---
Created attachment 596217 [details]
fix code that is fairly obviously wrond on x86_64
0) This patch fixes code (and some comments) that is fairly obviously wrong. "amtused" (amount used?) starts at 16 (3 * 4 plus 4 for security). I suggest changing this to simply 3 * sizeof(long).
1) So this also drops the security margin: that shouldn't be necessary.
2) Comments welcome.
3) I'd be happy to turn this into a proper patch after review. (Does cscope have a git repo?)
--- Additional comment from email@example.com on 2012-07-04 09:07:28 EDT ---
Created attachment 596219 [details]
fix code that is apparently also incorrect
0) The previous patch in itself doesn't fix all corruption. Some further staring at the code suggested this patch, which simply changes another suspicious magic constant at a strategic spot. (The spot were the code somehow manages to remove the last term from a block, ie drop the term that would be too large to also fit in that block.)
1) This code seems to work. I'm far from sure why. To become sure requires in depth analysis of invmake() and invnewterm(). But I happen to value my sanity. So chances are I will not be a to do that analysis...
2) Review appreciated.
--- Additional comment from firstname.lastname@example.org on 2012-07-04 13:02:45 EDT ---
thanks, I'll take a look at this tomorrow!
--- Additional comment from email@example.com on 2012-07-10 15:57:26 EDT ---
Paul, I've loked at your code and all your changes look reasonable to me. When cscope was written back in Bell labs in the 80's I imagine no one ever thought that longs would be more than 4 bytes.
I've been meaning to convert the cscope tree from cvs to git forever now, but for now you can browse the upstream source at sourceforge (http://cscope.sourceforge.net). i'll integrate these patches as they are upstream and pull them back to fedora shortly. Thanks!
cscope-15.8-2.fc17 has been submitted as an update for Fedora 17.
* should fix your issue,
* was pushed to the Fedora 17 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing cscope-15.8-2.fc17'
as soon as you are able to.
Please go to the following url:
then log in and leave karma (feedback).
cscope-15.8-2.fc17 has been pushed to the Fedora 17 stable repository. If problems still persist, please make note of it in this bug report.