Bug 1648154

Summary: pdfgrep crashes when searching recursively
Product: [Fedora] Fedora
Reporter: A. Nielsen <bugz>
Component: pdfgrep
Assignee: Joachim de Groot <jdegroot>
Status: CLOSED ERRATA
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 29
CC: bugz, jdegroot, redhat-bugzilla
Target Milestone: ---
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version: pdfgrep-2.1.2-1.fc29 pdfgrep-2.1.2-1.fc28 pdfgrep-2.1.2-1.fc27 pdfgrep-2.1.2-1.el7
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-11-22 03:21:01 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description A. Nielsen 2018-11-08 22:53:33 UTC
Description of problem:

The pdfgrep -r and -R switches do not work as expected: pdfgrep crashes in certain scenarios.


Version-Release number of selected component (if applicable):

$ pdfgrep -V
This is pdfgrep version 2.1.1.

Using poppler version 0.67.0
Using libpcre version 8.42 2018-03-20

$ uname -r
4.18.16-300.fc29.x86_64

How reproducible:

Tested on an AMD FX-8120 workstation and an Intel i3-3210 laptop, both running Fedora 29, with the same result. Before the upgrade, on Fedora 28, pdfgrep worked as expected.
As far as I remember, it also worked right after the upgrade to Fedora 29, but some later update broke it.

Steps to Reproduce:
1. Change to a directory that contains multiple subdirectories with PDFs to be searched.

2. Run 'pdfgrep -r "searchstring" *', for example:

$ pdfgrep -r "Test" *

Actual results:

/usr/include/c++/8/bits/stl_vector.h:932: std::vector<_Tp, _Alloc>::reference std::vector<_Tp, _Alloc>::operator[](std::vector<_Tp, _Alloc>::size_type) [with _Tp = char; _Alloc = std::allocator<char>; std::vector<_Tp, _Alloc>::reference = char&; std::vector<_Tp, _Alloc>::size_type = long unsigned int]: Assertion '__builtin_expect(__n < this->size(), true)' failed.
Aborted (core dumped)

Expected results:
A list of PDFs containing "searchstring" (using a self-compiled pdfgrep here), like:

~/Temp/Compile/pdfgrep/src/pdfgrep -r "Test" *
Fakturaer 2017 H1/Bogført/Faktura 254.pdf:  10000           1      Testsæt - Revnesøger                                                     706,50                        20        565,20
Fakturaer 2017 H2/Bogført/Faktura 347.pdf:Fyret prøvekørt på værksted OK. Testet ombord - OK, men bemærkninger - Se mail af dd.                             Side:                   1 af 1
Fakturaer 2017 H2/Bogført/Faktura 357.pdf:Fejlsøgt m. eksternt skærmkort - BIOS opdateret - Testet OK                                                       




Additional info:
As shown above, pdfgrep works as expected when I compile it from the Git source.

$ ~/Temp/Compile/pdfgrep/src/pdfgrep -V
This is pdfgrep version 2.1.1.

Using poppler version 0.67.0
Using libpcre version 8.42 2018-03-20
Built from git-commit v2.1.1-12-g3e7a9ae

Comment 1 Robert Scheck 2018-11-08 23:19:15 UTC
Did you use the same compiler flags for your own build as the rpm build
process does? At https://gitlab.com/pdfgrep/pdfgrep/commits/master, I do not
see the commit g3e7a9ae?!

Comment 2 A. Nielsen 2018-11-08 23:44:38 UTC
I did not do anything to match compiler flags. I don't know which flags the rpm is built with and I haven't checked the ones I used.


The last commit on the GitLab master branch is

3e7a9ae5

- Too close to g3e7a9ae to be a coincidence?

Axel
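
Note on the "g" prefix: the string v2.1.1-12-g3e7a9ae has the shape that git describe prints, <tag>-<commits since tag>-g<abbreviated hash>, so the "g" is a marker and not part of the commit id. The following is a minimal sketch of splitting such a string; the struct and function names are hypothetical and are not part of pdfgrep or git.

```cpp
#include <cstddef>
#include <string>

// Parts of a `git describe` style string: <tag>-<count>-g<hash>.
// Hypothetical helper type for illustration only.
struct Describe {
    std::string tag;    // nearest tag, e.g. "v2.1.1"
    int commits_since;  // commits on top of that tag
    std::string hash;   // abbreviated commit hash without the leading 'g'
    bool ok;            // false if the input did not match the shape
};

Describe parse_describe(const std::string& s) {
    Describe d{"", 0, "", false};

    // The hash part starts at the last "-g".
    std::size_t hash_pos = s.rfind("-g");
    if (hash_pos == std::string::npos || hash_pos == 0) return d;

    // The commit count sits between the previous '-' and "-g".
    std::size_t count_pos = s.rfind('-', hash_pos - 1);
    if (count_pos == std::string::npos) return d;

    std::string count = s.substr(count_pos + 1, hash_pos - count_pos - 1);
    if (count.empty() ||
        count.find_first_not_of("0123456789") != std::string::npos) {
        return d;
    }

    d.tag = s.substr(0, count_pos);
    d.commits_since = std::stoi(count);
    d.hash = s.substr(hash_pos + 2);  // skip "-g"
    d.ok = !d.hash.empty();
    return d;
}
```

Applied to v2.1.1-12-g3e7a9ae, this yields the tag v2.1.1, 12 commits on top of it, and the abbreviated hash 3e7a9ae, which is a prefix of 3e7a9ae5.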

Comment 3 Robert Scheck 2018-11-09 00:09:42 UTC
(In reply to bugz from comment #2)
> I did not do anything to match compiler flags. I don't know which flags the
> rpm is built with and I haven't checked the ones I used.

https://koji.fedoraproject.org/koji/taskinfo?taskID=30746508 - can you please
test this build and let me know? This is 3e7a9ae5 actually (as scratch build).

Comment 4 A. Nielsen 2018-11-09 07:27:33 UTC
Just tried, still crashes:

$ pdfgrep -r "Test" *
/usr/include/c++/8/bits/stl_vector.h:932: std::vector<_Tp, _Alloc>::reference std::vector<_Tp, _Alloc>::operator[](std::vector<_Tp, _Alloc>::size_type) [with _Tp = char; _Alloc = std::allocator<char>; std::vector<_Tp, _Alloc>::reference = char&; std::vector<_Tp, _Alloc>::size_type = long unsigned int]: Assertion '__builtin_expect(__n < this->size(), true)' failed.
Aborted (core dumped)

If I downgrade (rpm) to:
 pdfgrep                       x86_64                       1.3.1-10.fc29

$ pdfgrep -V
This is pdfgrep version 1.3.1

It then works!

I'll check the difference in compiler flags between the rpm and my local build later.

-Axel

Comment 5 A. Nielsen 2018-11-10 22:33:28 UTC
Created a build log:

$ cat Build_101118.log
make  all-recursive
make[1]: Entering directory '/home/axel/data/Temp/Compile/pdfgrep'
Making all in src
make[2]: Entering directory '/home/axel/data/Temp/Compile/pdfgrep/src'
g++ -std=c++14 -DHAVE_CONFIG_H -I. -I..  -I/usr/include/poppler/cpp -I/usr/include/poppler        -g -O2 -MT pdfgrep.o -MD -MP -MF .deps/pdfgrep.Tpo -c -o pdfgrep.o pdfgrep.cc
mv -f .deps/pdfgrep.Tpo .deps/pdfgrep.Po
g++ -std=c++14 -DHAVE_CONFIG_H -I. -I..  -I/usr/include/poppler/cpp -I/usr/include/poppler        -g -O2 -MT output.o -MD -MP -MF .deps/output.Tpo -c -o output.o output.cc
mv -f .deps/output.Tpo .deps/output.Po
g++ -std=c++14 -DHAVE_CONFIG_H -I. -I..  -I/usr/include/poppler/cpp -I/usr/include/poppler        -g -O2 -MT exclude.o -MD -MP -MF .deps/exclude.Tpo -c -o exclude.o exclude.cc
mv -f .deps/exclude.Tpo .deps/exclude.Po
g++ -std=c++14 -DHAVE_CONFIG_H -I. -I..  -I/usr/include/poppler/cpp -I/usr/include/poppler        -g -O2 -MT regengine.o -MD -MP -MF .deps/regengine.Tpo -c -o regengine.o regengine.cc
mv -f .deps/regengine.Tpo .deps/regengine.Po
g++ -std=c++14 -DHAVE_CONFIG_H -I. -I..  -I/usr/include/poppler/cpp -I/usr/include/poppler        -g -O2 -MT search.o -MD -MP -MF .deps/search.Tpo -c -o search.o search.cc
mv -f .deps/search.Tpo .deps/search.Po
g++ -std=c++14 -DHAVE_CONFIG_H -I. -I..  -I/usr/include/poppler/cpp -I/usr/include/poppler        -g -O2 -MT cache.o -MD -MP -MF .deps/cache.Tpo -c -o cache.o cache.cc
mv -f .deps/cache.Tpo .deps/cache.Po
g++ -std=c++14 -DHAVE_CONFIG_H -I. -I..  -I/usr/include/poppler/cpp -I/usr/include/poppler        -g -O2 -MT intervals.o -MD -MP -MF .deps/intervals.Tpo -c -o intervals.o intervals.cc
mv -f .deps/intervals.Tpo .deps/intervals.Po
g++ -std=c++14  -g -O2   -o pdfgrep pdfgrep.o output.o exclude.o regengine.o search.o cache.o intervals.o -lpoppler-cpp   -lpcre   -lgcrypt -ldl -lgpg-error


-Axel

Comment 6 Robert Scheck 2018-11-10 22:59:52 UTC
Yeah, you are building with almost no compiler flags at all. Thus the cause is
likely buggy upstream code; filed: https://gitlab.com/pdfgrep/pdfgrep/issues/31
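
For context on why only the packaged build aborts: the assertion text in the description comes from the bounds check in libstdc++'s std::vector::operator[], which is compiled in only when _GLIBCXX_ASSERTIONS is defined. The build log in comment 5 uses plain -g -O2, so that check is absent there; that the Fedora rpm build defines the macro is an assumption based on Fedora's default build flags and is not shown in this report. Below is a minimal sketch of the class of bug involved. The function names are hypothetical and this is not the actual pdfgrep code.

```cpp
#include <string>
#include <vector>

// Hypothetical helper, not taken from pdfgrep.
// Unsafe: buf[0] on an empty vector is undefined behaviour. Built with
// -D_GLIBCXX_ASSERTIONS, libstdc++ aborts with
// "Assertion '__builtin_expect(__n < this->size(), true)' failed."
// Built without it, the out-of-range access usually goes unnoticed.
std::string to_string_unchecked(std::vector<char>& buf) {
    return std::string(&buf[0], buf.size());
}

// Safe: an iterator range is valid for an empty vector as well,
// so no element is ever indexed.
std::string to_string_checked(const std::vector<char>& buf) {
    return std::string(buf.begin(), buf.end());
}
```

To try this locally, compiling a caller of to_string_unchecked with g++ -D_GLIBCXX_ASSERTIONS and passing an empty vector should reproduce the same kind of abort, while the checked variant returns an empty string.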

Comment 7 A. Nielsen 2018-11-12 14:26:44 UTC
Thanks!

Comment 8 Robert Scheck 2018-11-17 17:55:52 UTC
Could you test https://koji.fedoraproject.org/koji/taskinfo?taskID=30948220,
please? It's from f0513102.

Comment 9 A. Nielsen 2018-11-17 19:34:28 UTC
Hi,
I can confirm it's now working as expected!
Have only tested on x86_64 though.

-Axel

Comment 10 Fedora Update System 2018-11-19 22:21:13 UTC
pdfgrep-2.1.2-1.fc29 has been submitted as an update to Fedora 29. https://bodhi.fedoraproject.org/updates/FEDORA-2018-a5f70d1856

Comment 11 Fedora Update System 2018-11-19 22:22:14 UTC
pdfgrep-2.1.2-1.fc28 has been submitted as an update to Fedora 28. https://bodhi.fedoraproject.org/updates/FEDORA-2018-00c02a772b

Comment 12 Fedora Update System 2018-11-19 22:22:34 UTC
pdfgrep-2.1.2-1.fc27 has been submitted as an update to Fedora 27. https://bodhi.fedoraproject.org/updates/FEDORA-2018-4379fa3f2a

Comment 13 Fedora Update System 2018-11-19 22:22:59 UTC
pdfgrep-2.1.2-1.el7 has been submitted as an update to Fedora EPEL 7. https://bodhi.fedoraproject.org/updates/FEDORA-EPEL-2018-49a1c4161e

Comment 14 Fedora Update System 2018-11-21 04:31:28 UTC
pdfgrep-2.1.2-1.fc28 has been pushed to the Fedora 28 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2018-00c02a772b

Comment 15 Fedora Update System 2018-11-21 05:10:35 UTC
pdfgrep-2.1.2-1.fc29 has been pushed to the Fedora 29 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2018-a5f70d1856

Comment 16 Fedora Update System 2018-11-21 14:30:57 UTC
pdfgrep-2.1.2-1.fc27 has been pushed to the Fedora 27 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2018-4379fa3f2a

Comment 17 Fedora Update System 2018-11-21 18:35:25 UTC
pdfgrep-2.1.2-1.el7 has been pushed to the Fedora EPEL 7 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-EPEL-2018-49a1c4161e

Comment 18 Fedora Update System 2018-11-22 03:21:01 UTC
pdfgrep-2.1.2-1.fc29 has been pushed to the Fedora 29 stable repository. If problems still persist, please make note of it in this bug report.

Comment 19 Fedora Update System 2018-11-29 02:27:00 UTC
pdfgrep-2.1.2-1.fc28 has been pushed to the Fedora 28 stable repository. If problems still persist, please make note of it in this bug report.

Comment 20 Fedora Update System 2018-11-29 04:52:33 UTC
pdfgrep-2.1.2-1.fc27 has been pushed to the Fedora 27 stable repository. If problems still persist, please make note of it in this bug report.

Comment 21 Fedora Update System 2018-12-07 00:43:39 UTC
pdfgrep-2.1.2-1.el7 has been pushed to the Fedora EPEL 7 stable repository. If problems still persist, please make note of it in this bug report.