Bug 1967370 - grep with large --file using all memory
Summary: grep with large --file using all memory
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: grep
Version: 33
Hardware: Unspecified
OS: Unspecified
unspecified
low
Target Milestone: ---
Assignee: Jaroslav Škarvada
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2021-06-03 02:14 UTC by Trevor Cordes
Modified: 2021-07-01 02:17 UTC (History)
3 users (show)

Fixed In Version: grep-3.4-6.fc33
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-07-01 02:17:23 UTC
Type: Bug
Embargoed:


Attachments (Terms of Use)
attempt to backport the commit changes that fix my bug (5.25 KB, patch)
2021-06-04 20:55 UTC, Trevor Cordes
no flags Details | Diff
Updated patch (6.04 KB, patch)
2021-06-15 14:14 UTC, Jaroslav Škarvada
no flags Details | Diff


Links
System ID Private Priority Status Summary Last Updated
Red Hat Bugzilla 1391442 1 None None None 2021-06-03 02:14:45 UTC

Description Trevor Cordes 2021-06-03 02:14:46 UTC
Description of problem:

Similar bug as one from 2016.
https://bugzilla.redhat.com/show_bug.cgi?id=1391442

Do this:
grep --only-matching --ignore-case  --word-regexp --file /usr/share/dict/words /usr/share/doc/dovecot/ChangeLog

Then watch your memory get exhausted (at least up to my 50GB RAM+swap).

The problem seems to be big files passed in with --file.  Strangely, unlike 2016, the haystack file also has to be big-ish.

Upstream had fixed it after that.  Upstream broke it again for Fedora 33's grep-3.4-5.fc33.x86_64.  Upstream has it fixed in current git version 3.6.18-70517.

Can we get F33 updated to the latest upstream?  I didn't check if it's fixed in their last stable release version, but I can do that if required.


Version-Release number of selected component (if applicable):
grep-3.4-5.fc33.x86_64


How reproducible:
always


Steps to Reproduce:
1. grep --only-matching --ignore-case  --word-regexp --file /usr/share/dict/words /usr/share/doc/dovecot/ChangeLog


Actual results:
Your RAM/swap get exhausted, no output, oom killer has to reap


Expected results:
Fixed version takes about 1.1GB RAM to run this and outputs the proper result

Additional info:
Fixed in upstream 3.6.18-70517 or earlier; between current and 3.4

Comment 1 Jaroslav Škarvada 2021-06-03 18:27:55 UTC
I don't want to rebase grep in the stable release. We did it in the past and it introduced problems - grep is heavily used by various tools and also during the build process. It was part of the criticial path and rebase could lead to unexpected troubles. But I could try backporting the fix. Could you point me to the upstream commit? There are only few, but none seems related.

Comment 2 Trevor Cordes 2021-06-04 06:57:19 UTC
Ok.  I did a git bisect to see what I could find.  I had to do it in reverse though, where good was bad and bad was good because bisect doesn't let me say newest is good and oldest is bad.  Weird.

Anyhow, it's commit ae65513edc80a1b65f19264b9bed95d870602967 that fixes my problem.  It's a bit puzzling, as my example is clearly using --only-matching which is the -o case the log mentions.  I tested with and without -o and --color and no matter what, the bug doesn't manifest as long as I'm using ae65513edc80a1b65f19264b9bed95d870602967.

It looks like 3.5 stable release would also work for me if perhaps it's less onerous to rebase to a stable.

As for yanking a surgical patch, I'll try that tomorrow and we'll see how easy that is.  Though looking at the diffs I can't fathom why this commit fixes my problem.  Perhaps someone familiar with the code would understand.

To double check, I recompiled the previous commit 34ada37baac651312b0e30092c086833a7223df6 and it does have my bug.  So my good/bad determination should be correct.

Comment 3 Trevor Cordes 2021-06-04 20:54:19 UTC
I managed to make a patch against grep-3.4-5.fc33.src.rpm and it compiles and it fixes my bug.

However rpmbuild -ba on the testing phase fails with 15 fails, but I have a hunch that's just my environment since I do this rpm stuff in my own quirky way.  Can you run the patch on a Fedora-approved build setup and see if it passes the tests?

If not, then backporting the required code is beyond my level of ability, as that means the fix is dependent on further commits.  That would require someone familiar with the code.

Comment 4 Trevor Cordes 2021-06-04 20:55:39 UTC
Created attachment 1789028 [details]
attempt to backport the commit changes that fix my bug

But fails rpmbuild tests on my system.

Comment 5 Jaroslav Škarvada 2021-06-15 13:17:31 UTC
(In reply to Trevor Cordes from comment #4)
> Created attachment 1789028 [details]
> attempt to backport the commit changes that fix my bug
> 
> But fails rpmbuild tests on my system.

Thanks, LGTM.

Comment 6 Jaroslav Škarvada 2021-06-15 13:48:44 UTC
(In reply to Jaroslav Škarvada from comment #5)
> (In reply to Trevor Cordes from comment #4)
> > Created attachment 1789028 [details]
> > attempt to backport the commit changes that fix my bug
> > 
> > But fails rpmbuild tests on my system.
> 
> Thanks, LGTM.

Nope, it's not good, it's introducing new regression:
FAIL word-delim-multibyte (exit status: 1)

The test starts failing.

Comment 7 Jaroslav Škarvada 2021-06-15 14:14:17 UTC
Created attachment 1791274 [details]
Updated patch

Comment 8 Fedora Update System 2021-06-15 14:45:30 UTC
FEDORA-2021-b70a99a8fb has been submitted as an update to Fedora 33. https://bodhi.fedoraproject.org/updates/FEDORA-2021-b70a99a8fb

Comment 9 Fedora Update System 2021-06-16 01:44:34 UTC
FEDORA-2021-b70a99a8fb has been pushed to the Fedora 33 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --advisory=FEDORA-2021-b70a99a8fb`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2021-b70a99a8fb

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 10 Trevor Cordes 2021-06-19 03:37:13 UTC
Thanks for that, I'll test now and leave karma if it works.  The patches were almost the same; I guessed some of the .h stuff was superfluous.  But the latest git has that switcharoo in the order in GEAcompile... not sure why, or what else had to go with it to make it work.  But it doesn't matter if your patch works!

Comment 11 Fedora Update System 2021-07-01 02:17:23 UTC
FEDORA-2021-b70a99a8fb has been pushed to the Fedora 33 stable repository.
If problem still persists, please make note of it in this bug report.


Note You need to log in before you can comment on or make changes to this bug.