Bug 1515865

Summary: redhat-rpm-config: Enable -fstack-clash-protection (except armhfp)
Product: [Fedora] Fedora Reporter: Florian Weimer <fweimer>
Component: redhat-rpm-configAssignee: Florian Weimer <fweimer>
Status: CLOSED RAWHIDE QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 28CC: ajax, dzickus, ffesti, igor.raits, john.j5live, jonathan, pmatilai, ppisar, praiskup
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: redhat-rpm-config-76-1.fc28 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-08-10 12:37:49 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1512529    
Bug Blocks:    

Description Florian Weimer 2017-11-21 14:02:31 UTC
Jeff is working on adding GCC support for the -fstack-clash-protection flag (bug 1512529).  Once this is in place, we need to enable it for RPM builds.

Comment 1 Florian Weimer 2017-11-29 13:07:39 UTC
I submitted a pull request:

https://src.fedoraproject.org/rpms/redhat-rpm-config/pull-request/5

Comment 2 Panu Matilainen 2017-12-08 11:21:00 UTC
As briefly discussed in irc, this looks like it should have a distro-wide change associated with it. And without an accepted change, I might as well flip a coin whether to accept a change or not, I have no way of knowing (which is why redhat-rpm-config maintainership is so ridiculously impossible position)

Comment 3 Florian Weimer 2017-12-13 13:54:51 UTC
Enabling -fstack-clash-protection on armhfp is blocked on bug 1522678.

I will look into filing a system-wide change proposal for this change and other hardening changes.

Comment 4 Florian Weimer 2018-01-03 19:46:14 UTC
(In reply to Florian Weimer from comment #3)
> Enabling -fstack-clash-protection on armhfp is blocked on bug 1522678.

Due to lack of upstream support, we can't fix bug 1522678.  Consequently, we need to guard the -fstack-clash-protection flag with

%ifnarch %{arm}

or something like that.  aarch64 is not affected by this.

> I will look into filing a system-wide change proposal for this change and
> other hardening changes.

https://fedoraproject.org/wiki/Changes/HardeningFlags28

Comment 5 Panu Matilainen 2018-01-04 11:05:30 UTC
Nothing like %if(n)arch is available in the rpmrc/macro file context, that's a spec-only construct. I guess you could do it similar to how the dwz arch-specific settings are generated (see macros.dwz) or implement the logic in %{lua:...} which obviously can do if/else logic.

Comment 6 Florian Weimer 2018-01-15 13:40:57 UTC
This change has been approved by Fesco:

https://meetbot-raw.fedoraproject.org/fedora-meeting/2018-01-12/fesco.2018-01-12-16.00.log.html#l-201

Panu, do you want me to come up with a patch, or do you want to implement this by yourself?

Note that we still have to exclude %{arm} architectures, but not aarch64.

Comment 7 Petr Pisar 2018-01-30 12:46:23 UTC
This breaks JIT in pcre <https://koji.fedoraproject.org/koji/buildinfo?buildID=1021975> and pcre2 <https://koji.fedoraproject.org/koji/buildinfo?buildID=1021960> on i686 only:

If you unpack pcre2 sources and configure with:

$ ./configure --enable-jit 'CFLAGS=-O2 -m32 -fstack-clash-protection' && make

Then code that utilizes JIT segfaults:

$ printf '/./jit\na\n' | ./pcre2test 
PCRE2 version 10.31-RC1 2018-01-11
/./jit
a
Segmentation fault (core dumped)

This is triggered by adding "-fstack-clash-protection" option to the compiler flags in i686 only. Does the option changes calling convention (different function prolog or epilog required by the called (jitted) functions)?

Comment 8 Petr Pisar 2018-01-30 13:01:32 UTC
Debugging this shows something strange. It crashes in this function:

static SLJIT_NOINLINE int jit_machine_stack_exec(jit_arguments *arguments, jit_function executable_func)
{
sljit_u8 local_space[MACHINE_STACK_SIZE];
struct sljit_stack local_stack;

local_stack.min_start = local_space;
local_stack.start = local_space;
local_stack.end = local_space + MACHINE_STACK_SIZE;
local_stack.top = local_space + MACHINE_STACK_SIZE;
arguments->stack = &local_stack;
return executable_func(arguments);
}

on the last assignment to arguments->stack because arguments is NULL at that time. But when entering this function, the arguments variable is not NULL. It changes its value to NULL right after executing the first "local_stack.min_start = local_space;" statement:

Starting program: /home/test/fedora/pcre2/pcre2-10.31-RC1/.libs/lt-pcre2test 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/libthread_db.so.1".
PCRE2 version 10.31-RC1 2018-01-11
  re> /./jit
data> a

Breakpoint 1, jit_machine_stack_exec (arguments=arguments@entry=0xbfff9afc, 
    executable_func=0xb7d38008) at src/pcre2_jit_match.c:48
48      {
(gdb) print arguments 
$6 = (jit_arguments *) 0xbfff9afc
(gdb) n
52      local_stack.min_start = local_space;
(gdb) print arguments 
$7 = (jit_arguments *) 0x0
(gdb) n
53      local_stack.start = local_space;
(gdb) 
54      local_stack.end = local_space + MACHINE_STACK_SIZE;
(gdb) 
55      local_stack.top = local_space + MACHINE_STACK_SIZE;
(gdb) 
56      arguments->stack = &local_stack;
(gdb) 

Program received signal SIGSEGV, Segmentation fault.
0xb7f31ba6 in jit_machine_stack_exec (arguments=0x0, arguments@entry=0xbfff9afc, 
    executable_func=0xb7d38008) at src/pcre2_jit_match.c:56
56      arguments->stack = &local_stack;

Comment 9 Petr Pisar 2018-01-30 13:04:39 UTC
The mysterious change happens like this:

Breakpoint 1, jit_machine_stack_exec (arguments=arguments@entry=0xbfff9afc, 
    executable_func=0xb7cc3008) at src/pcre2_jit_match.c:48
48      {
(gdb) watch arguments
Watchpoint 2: arguments
(gdb) n

Watchpoint 2: arguments

Old value = (jit_arguments *) 0xbfff9afc
New value = (jit_arguments *) 0x0
0xb7ebcb7e in jit_machine_stack_exec (arguments=0x0, arguments@entry=0xbfff9afc, 
    executable_func=0xb7cc3008) at src/pcre2_jit_match.c:48
48      {
(gdb) disassemble 
Dump of assembler code for function jit_machine_stack_exec:
   0xb7ebcb70 <+0>:     push   %eax
   0xb7ebcb71 <+1>:     lea    -0x8000(%esp),%eax
   0xb7ebcb78 <+8>:     sub    $0x1000,%esp
=> 0xb7ebcb7e <+14>:    orl    $0x0,(%esp)
   0xb7ebcb82 <+18>:    cmp    %eax,%esp
   0xb7ebcb84 <+20>:    jne    0xb7ebcb78 <jit_machine_stack_exec+8>

Comment 10 Florian Weimer 2018-01-30 13:26:11 UTC
orl doesn't change the value, so this looks like a debugging artifact.  The problem is that push/pop is used to save and restore %eax, but the stack frame is adjusted in-between.  This is a GCC bug which we need to fix in GCC.

Comment 11 Florian Weimer 2018-01-30 13:52:12 UTC
Petr, thanks for tracking this down.  I filed bug 1540221 against gcc.

Comment 12 Fedora End Of Life 2018-02-20 15:34:49 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 28 development cycle.
Changing version to '28'.