Created attachment 1677803 [details]
Minimal reproducer

Description of problem:
gcc version 4.8.5-39 runs out of memory, whereas the same code compiles with the older gcc 4.4.7-3.

Version-Release number of selected component (if applicable):
gcc 4.8.5-39

How reproducible:
Untar the attached tar file and use the compile command below. With the old g++ (4.4.7-3), compilation succeeds. With 4.8.5-39, it fails.

Steps to Reproduce:
1. tar zxvf gcc_data.tgz
2. cd gcc_data

Using -fstack-reuse=named_vars or -fstack-reuse=none does not allow the latest code to compile:
==============================================================
$ g++ -march=nocona -mfpmath=sse -include inclall.h -o s1823_1.o -shared -O0 -fno-strict-aliasing -fstack-reuse=named_vars -fexceptions -fsigned-char -Wno-invalid-offsetof -fpic -fpermissive -DNZDEBUG=0 -DGENCODE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -DFOR_SPU -I./include s4_7.cpp
==============================================================
Or
==============================================================
$ g++ -march=nocona -mfpmath=sse -include inclall.h -o s1823_1.o -shared -O0 -fno-strict-aliasing -fstack-reuse=none -fexceptions -fsigned-char -Wno-invalid-offsetof -fpic -fpermissive -DNZDEBUG=0 -DGENCODE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -DFOR_SPU -I./include s4_7.cpp
==============================================================

Actual results:
$ g++ -march=nocona -mfpmath=sse -include inclall.h -o s1823_1.o -shared -O0 -fno-strict-aliasing -fstack-reuse=named_vars -fexceptions -fsigned-char -Wno-invalid-offsetof -fpic -fpermissive -DNZDEBUG=0 -DGENCODE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -DFOR_SPU -I./include s4_7.cpp
g++: internal compiler error: Killed (program cc1plus)
Please submit a full bug report,
with preprocessed source if appropriate.
See <http://bugzilla.redhat.com/bugzilla> for instructions.

Expected results:
$ g++ -march=nocona -mfpmath=sse -include inclall.h -o s1823_1.o -shared -O0 -fno-strict-aliasing -fstack-reuse=named_vars -fexceptions -fsigned-char -Wno-invalid-offsetof -fpic -fpermissive -DNZDEBUG=0 -DGENCODE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -DFOR_SPU -I./include s4_7.cpp
$ echo $?
0

Additional info:
Use of -fstack-reuse=named_vars or -fstack-reuse=none was suggested in the similar bug https://bugzilla.redhat.com/show_bug.cgi?id=1787575

Opening a fresh bug because the customer is looking for a workaround on priority, since the earlier suggestions do not work in this situation.
This compiles on my box with trunk, but it ate almost 60GB RAM. The function GenPlan is just so insanely huge (over 300,000 lines!). Compiling with -fno-exceptions doesn't seem like an option because the code has throw std::bad_alloc(). I don't think this can be (safely) fixed in GCC 4.8 or GCC 8.
Created attachment 1698079 [details]
Latest reproducer

In view of today's upcoming call, posting the customer's comment and reproducer as is.
Created attachment 1698914 [details] preprocessor file 4.4.7
Created attachment 1698915 [details] preprocessed file 4.8
So as I mentioned in the meeting, I've been investigating -O1 plus some additional -fno-<pass> arguments as a mitigation strategy.

In simplest terms -O1 introduces some optimization, but in general it bypasses optimizations which tend to blow up on large codes (such as we find in the testcases from Netezza). The hope was that the light optimization provided by -O1 would dramatically reduce the number of blocks, edges and SSA_NAMEs which in turn would dramatically reduce the compile time memory usage, particularly in the coalescing phase of the out-of-ssa pass.

The good news is that -O1 does provide the benefits we hoped for. The CFG and SSA_NAMEs are dramatically reduced and the peak memory requirements drop from ~50G to around 7G for the testcase in this BZ.

The bad news is -O1 does trigger some significant nonlinear behavior for the original testcase in this BZ as well as in the testcase from BZ1787575. The non-linear behavior is part of the IPA pipeline and unfortunately the knobs to control behavior of this aspect of the IPA pipeline do not exist in gcc-4.8.

--

The other option I looked at is -Og. I hadn't mentioned it earlier because I thought -Og wasn't introduced until gcc-4.9. However it was actually introduced in gcc-4.8.

The advantage -Og has is that it bypasses the IPA pipeline. The downside is it's not quite as good at reducing the size of the CFG and SSA_NAME tables. As a result its peak memory usage on s4_7 is 9G (vs 7G for -O1). That may be enough to avoid the OOM killer for Netezza's customers in the immediate term and give Netezza some time to address their code generator. -Og also seems to perform reasonably well on the other tests I checked (s8_1 and s1823_1).

So the immediate recommendation I have for Netezza is to replace "-O0" with "-Og -fno-move-loop-invariants" and see if that's sufficient to work around this problem in the immediate term.

Also note, we do not have the "s730_7" test. So no evaluation could be done with that test.
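
For reference, here is a minimal sketch of the flag substitution recommended above, applied to the command line from this BZ. The helper name swap_opt_level is hypothetical (it is not part of GCC or any existing tool), and the measurement hint in the trailing comment assumes /usr/bin/time is GNU time. Treat it as an illustration of the suggestion rather than a tested build recipe.

```shell
#!/bin/sh
# Flags copied from the command line in this report; only -O0 is changed.
ORIG_FLAGS='-march=nocona -mfpmath=sse -include inclall.h -o s1823_1.o -shared -O0 -fno-strict-aliasing -fstack-reuse=named_vars -fexceptions -fsigned-char -Wno-invalid-offsetof -fpic -fpermissive -DNZDEBUG=0 -DGENCODE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -DFOR_SPU -I./include'

# Hypothetical helper: replace a standalone -O0 token with the suggested
# "-Og -fno-move-loop-invariants" and leave every other flag untouched.
swap_opt_level() {
    out=
    for flag in $1; do          # intentional word splitting on spaces
        case $flag in
            -O0) out="$out -Og -fno-move-loop-invariants" ;;
            *)   out="$out $flag" ;;
        esac
    done
    printf '%s\n' "${out# }"    # drop the leading space
}

NEW_FLAGS=$(swap_opt_level "$ORIG_FLAGS")

# Print the command instead of running it, so the sketch works without the
# reproducer sources being present.
printf 'g++ %s s4_7.cpp\n' "$NEW_FLAGS"

# To compare peak memory before and after (assumes GNU time is installed):
#   /usr/bin/time -v g++ $NEW_FLAGS s4_7.cpp 2>&1 | grep 'Maximum resident set size'
```

Matching whole tokens rather than substrings keeps a define or path that happens to contain "-O0" from being rewritten by accident.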
Given c#7, I would expect that test to probably still fail, which is another strong signal that the Netezza code generator needs to be fixed.
Created attachment 1702353 [details] new code
As was expressed to Netezza several months ago, this is not something we're going to fix in a gcc-4.8 era compiler. Even the -Og workaround isn't always going to work -- which is a symptom of the underlying issue -- the size of the generated functions being passed to the compiler needs to be fixed. Trying to run GCC on functions of this size on highly memory-constrained systems just isn't going to work out in the long run.