Bug 743343
| Summary: | gawk + byacc = YACC Stack Overflow regression | ||||||
|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 6 | Reporter: | Vojtech Vitek <vvitek> | ||||
| Component: | byacc | Assignee: | Petr Machata <pmachata> | ||||
| Status: | CLOSED ERRATA | QA Contact: | qe-baseos-tools-bugs | ||||
| Severity: | high | Docs Contact: | |||||
| Priority: | high | ||||||
| Version: | 6.1 | CC: | dickey, fche, hripps, konrad, mnewsome, mnowak, mpolacek, vvitek | ||||
| Target Milestone: | rc | Keywords: | Reopened | ||||
| Target Release: | --- | ||||||
| Hardware: | Unspecified | ||||||
| OS: | Unspecified | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | Doc Type: | Bug Fix | |||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | 743242 | Environment: | |||||
| Last Closed: | 2012-06-20 11:57:10 UTC | Type: | --- | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Embargoed: | |||||||
| Bug Depends On: | |||||||
| Bug Blocks: | 743242 | ||||||
| Attachments: |
|
||||||
|
Comment 2
RHEL Program Management
2011-10-04 16:48:48 UTC
Version-Release number of selected component: byacc-1.9.20070509-6.1.el6 The problem arises because of deep structure of else-ifs. The grammar rule is as follows:
if_statement : /* ... */
| LEX_IF '(' exp r_paren opt_nls statement
LEX_ELSE opt_nls statement /* ... */
This is right-recursive, and costs stack space. Maximum stack size can be adjusted by putting the following line into awkgram.y:
#define YYMAXDEPTH 5000 /* default is 500 in yacc */
But since right recursion spends stack, there will always be a hard limit on number of else-ifs that you can parse. That's an inherent shortcoming of the LR algorithm.
If that ever worked, it means that either the rule used left recursion back then (in principle it should be possible to rewrite the rule using left recursion, but in practice one may need to restructure the grammar to resolve lookahead conflicts); or it was compiled against bison, which sets much larger maximum stack depth, 10000.
Oh, but yacc actually used to set YYMAXDEPHT to 10K _too_. Interesting, I'm looking into why would they decrease the value. Petr --
Did RH change the build to link against byacc instead of bison on 2010-06-25 ?
From gawk.spec:
* Fri Jun 25 2010 Jan Zeleny <jzeleny> - 3.1.7-5
- fixed two syntax issues (#607915, #607916), added byacc to BuildRequires
Am I reading the CHANGELOG from the SPEC file correctly ?
gawk defaults to link bison by default. From gawk-3.1.7 configure --help:
Some influential environment variables:
CC C compiler command
CFLAGS C compiler flags
LDFLAGS linker flags, e.g. -L<lib dir> if you have libraries in a
nonstandard directory <lib dir>
LIBS libraries to pass to the linker, e.g. -l<library>
CPPFLAGS (Objective) C/C++ preprocessor flags, e.g. -I<include dir> if
you have headers in a nonstandard directory <include dir>
CPP C preprocessor
YACC The `Yet Another C Compiler' implementation to use. Defaults to
the first program found out of: `bison -y', `byacc', `yacc'.
YFLAGS The list of arguments that will be passed by default to $YACC.
This script will default YFLAGS to the empty string to avoid a
default value of `-d' given by some make applications.
Again, the same machine-generated ProcessParms.awk script has been running
without issue on RH machines all the way back to RedHat 4.2 ...
Thanks.
-- kjh
Created attachment 526494 [details]
Le fix
Can't find anything relevant in the documentation or otherwise. The upstream, naturally, doesn't use a VCS, so we are out of luck here, too. Dunno. I'll just bump it back to 10K, it's a maximum value, yacc still allocates it dynamically.
Note that the above about the recursion still holds. Even for max of 10K, you can still construct a source file that overflows.
(In reply to comment #6) > Did RH change the build to link against byacc instead of bison on 2010-06-25 ? > > From gawk.spec: > > * Fri Jun 25 2010 Jan Zeleny <jzeleny> - 3.1.7-5 > - fixed two syntax issues (#607915, #607916), added byacc to BuildRequires Maybe. It's hard to be sure, this might simply be a formalization of already existing rule. The build roots contain a lot of packages by default, and sometimes you just don't need to spell out all the dependencies. For example, nobody lists gcc as a BuildRequires, because it's simply always there, by default. This might be one of these things. Jan Zeleny would know better, I'll have him chime in. But yacc upstream really changed the max from 10K to 500 for some reason, and this actually is a change that, other things being equal, would result in awk working before and failing now on that test file. Actually, you are right! We used bison for builds in RHEL 5 throughout. Then in RHEL 6, first several builds are with neither bison nor byacc, which means it would presumably use the pre-distributed parser, so the resulting binary would contain whatever limit corresponds to the tool that upstream used to generate that parser. In gawk-3.1.7-5.el6 we started pulling in byacc instead, which has a limit of 500. That's the change documented in the spec file. Petr --
I agree ... I've been lucky up to now with the number of } else if {
constructs.
I've changed my awk script-generator to replace the insanely long series of
if()
{
...
}
else if ()
{
...
}
...
constructs with
if ()
{
... ;
continue ;
}
if ()
{
... ;
continue ;
}
...
This approach works for ProcessParms.awk ( my original attachment ) but now some of my 'main' scripts hit the same wall :(
-- kjh
Yikes YACC ! From 10,000 to 500 on the stack depth. I suppose I am the only RHEL 6 user to misuse gawk so badly :) -- kjh Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHBA-2012-0749.html Note on comment #8: byacc did not "change" the limit; probably what is referred to is a patch by Red Hat which was overlooked when switching to the version which I maintain. Ironically, the answer to that would be in Red Hat's vcs, though I made a to-do item to check for errata of this type (see also http://invisible-island.net/personal/cm-tools.html). |