Red Hat Bugzilla – Bug 837626
inverted index: subtle corruption in cscope.in.po
Last modified: 2012-07-19 21:53:35 EDT
Description of problem:
On x86_64 cscope creates a subtly corrupted inverted indices.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. [in kernel git repo] git ls-files "*.[ch]" | cscope -bqkv -i-
Subtle corruption in the inverted index.
Correct inverted index.
0) While researching bug #837296 I wrote a simple index parser (which will attach shortly).
1) After _lots_ of staring at the cscope code and the inverted indices it generates I noticed it subtly corrupts the inverted its indices (on x86_64).
2) Some background:
- the index consist of blocks of 16384 bytes.
- each blocks starts with a 3 longs sized header:
* number of terms in this block
* number of next block
* number of previous block
- this is followed by a list (number of terms long) of entries, of size 2 long. These entries contain (amongst other things) the offset to an entry in the next list.
- this next list (also number of terms long) contains the strings of these terms and their offset in the postings file (ie, cscope.po.out)
The first list grows up, the second list grows down. Due to errors in the handling of x86_64's long size these two lists end up overwriting each other. Net effect: some terms are not properly indexed.
3) I'll also attach two patches that (together) seem to fix this issue.
Created attachment 596212 [details]
simple parse of inverted index file (cscope.in.out)
Usage: perl index.pl cscope.in.out
Created attachment 596217 [details]
fix code that is fairly obviously wrond on x86_64
0) This patch fixes code (and some comments) that is fairly obviously wrong. "amtused" (amount used?) starts at 16 (3 * 4 plus 4 for security). I suggest changing this to simply 3 * sizeof(long).
1) So this also drops the security margin: that shouldn't be necessary.
2) Comments welcome.
3) I'd be happy to turn this into a proper patch after review. (Does cscope have a git repo?)
Created attachment 596219 [details]
fix code that is apparently also incorrect
0) The previous patch in itself doesn't fix all corruption. Some further staring at the code suggested this patch, which simply changes another suspicious magic constant at a strategic spot. (The spot were the code somehow manages to remove the last term from a block, ie drop the term that would be too large to also fit in that block.)
1) This code seems to work. I'm far from sure why. To become sure requires in depth analysis of invmake() and invnewterm(). But I happen to value my sanity. So chances are I will not be a to do that analysis...
2) Review appreciated.
thanks, I'll take a look at this tomorrow!
Paul, I've loked at your code and all your changes look reasonable to me. When cscope was written back in Bell labs in the 80's I imagine no one ever thought that longs would be more than 4 bytes.
I've been meaning to convert the cscope tree from cvs to git forever now, but for now you can browse the upstream source at sourceforge (http://cscope.sourceforge.net). i'll integrate these patches as they are upstream and pull them back to fedora shortly. Thanks!
cscope-15.8-2.fc16 has been submitted as an update for Fedora 16.
(In reply to comment #5)
> I've been meaning to convert the cscope tree from cvs to git forever now,
> but for now you can browse the upstream source at sourceforge
It's probably quite a job to convert from cvs to git. I can't help with that job (having never done it myself). That would make it rather self-serving to just say that a git repository for cscope would be greatly appreciated. So I'll refrain from saying that.
> i'll integrate these patches as they are
> upstream and pull them back to fedora shortly. Thanks!
Would you prefer that future patches, if any, be submitted upstream? Fedora cscope is now patch clean, so Fedora specific patches should be rare.
* should fix your issue,
* was pushed to the Fedora 16 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing cscope-15.8-2.fc16'
as soon as you are able to.
Please go to the following url:
then log in and leave karma (feedback).
cscope-15.8-2.fc16 has been pushed to the Fedora 16 stable repository. If problems still persist, please make note of it in this bug report.