Bug 1309787 - groupadd segfaults in libaudit
Summary: groupadd segfaults in libaudit
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: gcc
Version: 24
Hardware: s390x
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Jakub Jelinek
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: ZedoraTracker
Reported: 2016-02-18 17:04 UTC by Dan Horák
Modified: 2016-03-03 09:15 UTC (History)

Fixed In Version: gcc-6.0.0-0.14.fc24
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-03-03 09:15:16 UTC
Type: Bug
Embargoed:


Attachments
preprocessed source file (142.23 KB, text/plain)
2016-02-26 11:51 UTC, Dan Horák


Links
System ID Private Priority Status Summary Last Updated
GNU Compiler Collection 70025 0 None None None 2016-03-01 12:03:40 UTC

Description Dan Horák 2016-02-18 17:04:44 UTC
groupadd segfaults in libaudit code, see below for details from F-24 mock chroot.


<mock-chroot>sh-4.3# gdb groupadd
GNU gdb (GDB) Fedora 7.10.50.20160121-46.fc24
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "s390x-redhat-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from groupadd...Reading symbols from /usr/lib/debug/usr/sbin/groupadd.debug...done.
done.

(gdb) set args foo1
(gdb) run
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /usr/sbin/groupadd foo1

Program received signal SIGSEGV, Segmentation fault.
check_ack (seq=<optimized out>, fd=<optimized out>) at netlink.c:287
287			int error = rep.error->error;
(gdb) where
#0  check_ack (seq=<optimized out>, fd=<optimized out>) at netlink.c:287
#1  audit_send_internal (fd=fd@entry=3, type=type@entry=1116, data=data@entry=0x3ffffffcfc6, size=<optimized out>) at netlink.c:244
#2  0x000003fffdfac592 in audit_send_user_message_internal (fd=fd@entry=3, type=type@entry=1116, hide_error=hide_error@entry=REAL_ERR, 
    message=message@entry=0x3ffffffcfc6 "op=add-group id=1001 exe=\"/usr/sbin/groupadd\" hostname=? addr=? terminal=? res=success") at deprecated.c:47
#3  0x000003fffdfab874 in audit_log_acct_message (audit_fd=<optimized out>, type=<optimized out>, pgname=<optimized out>, op=0x2aa0000c3ec "add-group", 
    name=0x3fffffff8ef "foo1", id=1001, host=<optimized out>, addr=0x0, tty=<optimized out>, result=1) at audit_logging.c:457
#4  0x000002aa00004d44 in audit_logger (type=<optimized out>, pgname=<optimized out>, op=<optimized out>, name=0x3fffffff8ef "foo1", id=<optimized out>, 
    result=SHADOW_AUDIT_SUCCESS) at audit_help.c:86
#5  0x000002aa0000417c in close_files () at groupadd.c:275
#6  main (argc=<optimized out>, argv=<optimized out>) at groupadd.c:621



Version-Release number of selected component (if applicable):
audit-2.5-2.fc24


Additional info:
This audit build is from the GCC6 rebuild, when audit is downgraded to audit-2.5-1.fc24 (pre-mass rebuild), everything works.

Comment 1 Dan Horák 2016-02-18 17:18:02 UTC
rebuild with -fno-delete-null-pointer-checks didn't help

Comment 2 Dan Horák 2016-02-18 17:33:38 UTC
Using -fno-strict-aliasing doesn't help either, but switching to -O1 does, so this may be a compiler error.

Comment 3 Steve Grubb 2016-02-18 17:45:16 UTC
I can't see how the error is occurring in the code. The code that sets things up is this:

        switch (rep->type) {
                case NLMSG_ERROR:
                        rep->error   = NLMSG_DATA(rep->nlh);
                        break;

And the dereference is:

        else if (rc > 0 && rep.type == NLMSG_ERROR) {
                int error = rep.error->error;

This part of the code has not changed in probably 8 years.

Comment 4 Dan Horák 2016-02-18 18:51:29 UTC
Yeah, I couldn't see anything obvious either, let's ask the gcc team for opinions.

Comment 5 Dan Horák 2016-02-18 18:54:42 UTC
compiler used was gcc-6.0.0-0.9.fc24.s390x, my test builds were with gcc-6.0.0-0.11.fc24.s390x and from what I can see both s390 and s390x are affected

full logs are available at http://s390.koji.fedoraproject.org/koji/buildinfo?buildID=378112

Comment 6 Jan Kurik 2016-02-24 15:26:20 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 24 development cycle.
Changing version to '24'.

More information and reason for this action is here:
https://fedoraproject.org/wiki/Fedora_Program_Management/HouseKeeping/Fedora24#Rawhide_Rebase

Comment 7 Dan Horák 2016-02-26 11:51:33 UTC
Created attachment 1130824
preprocessed source file

so it's the static int adjust_reply(struct audit_reply *rep, int len) function in lib/netlink.c that requires switching to -O1 to avoid the segfaults

gcc -DHAVE_CONFIG_H -I. -I.. -I. -I.. -I../auparse -fPIC -DPIC -D_GNU_SOURCE -O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -m64 -march=z9-109 -mtune=z10 -c netlink.c -o netlink.o

Comment 8 Dan Horák 2016-03-01 11:07:08 UTC
when __attribute__((noinline, noclone)) is applied to adjust_reply(), the segfault doesn't occur with -O2
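
For readers unfamiliar with the workaround, here is a minimal sketch of how the two GCC attributes are attached to a function. The struct, the macro name KEEP_OUT_OF_LINE and the function adjust() are hypothetical stand-ins, not the real libaudit code; only the attribute spelling comes from the comment above.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct audit_reply; field names and the
 * buffer size are illustrative only. */
struct reply {
    int type;
    char *payload;
    char buf[9000];
};

/* noinline stops GCC from inlining the function into its callers and
 * noclone stops it from emitting specialised copies, so the body is
 * compiled as one standalone unit even at -O2. Both are GCC extensions;
 * compilers that lack noclone get the closest available fallback. */
#if defined(__GNUC__) && !defined(__clang__)
#define KEEP_OUT_OF_LINE __attribute__((noinline, noclone))
#elif defined(__GNUC__)
#define KEEP_OUT_OF_LINE __attribute__((noinline))
#else
#define KEEP_OUT_OF_LINE
#endif

/* Hypothetical function shaped like adjust_reply(): it stores a pointer
 * into the structure's own buffer and returns the length unchanged. */
KEEP_OUT_OF_LINE
static int adjust(struct reply *rep, int len)
{
    if (rep == NULL || len < 0)
        return -1;
    rep->payload = rep->buf + 16;
    return len;
}
```

This only changes which code the optimizer sees together, so it hides the miscompilation rather than fixing it.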

Comment 9 Jakub Jelinek 2016-03-01 12:03:41 UTC
This is a dup of http://gcc.gnu.org/PR70025.  r227382 broke it too:
--- netlink.s.227381	2016-03-01 13:00:48.321371161 +0100
+++ netlink.s.227382	2016-03-01 13:00:50.793337109 +0100
@@ -374,8 +374,8 @@ audit_get_reply_internal:
 .L42:
 	.loc 1 193 0
 	lg	%r1,176(%r15)
-	la	%r2,32(%r1)
-	stg	%r2,9008(%r1)
+	la	%r1,32(%r1)
+	stg	%r1,9008(%r1)
 .LVL43:
 .L35:
 .LBE28:
and the bad
        lg      %r1,176(%r15)
        la      %r1,32(%r1)
        stg     %r1,9008(%r1)
still appears even in r233777.
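
To make the difference between the two sequences above concrete, the following sketch models them in plain C. It assumes a simulated memory array and treats registers as byte offsets into it; the names store_good, store_bad, load_field and SIM_MEM_SIZE are made up for illustration. la puts base plus displacement into the target register, and stg stores a 64-bit register at base plus displacement.

```c
#include <stdint.h>
#include <string.h>

enum {
    PAYLOAD_OFF   = 32,    /* displacement used by la  */
    PTR_FIELD_OFF = 9008,  /* displacement used by stg */
    SIM_MEM_SIZE  = PAYLOAD_OFF + PTR_FIELD_OFF + 8
};

/* Good sequence: the address goes into a second register, so the base
 * in r1 is still intact when the store uses it. */
static void store_good(unsigned char *mem, uint64_t r1)
{
    uint64_t r2 = r1 + PAYLOAD_OFF;                    /* la  %r2,32(%r1)   */
    memcpy(mem + r1 + PTR_FIELD_OFF, &r2, sizeof r2);  /* stg %r2,9008(%r1) */
}

/* Bad sequence: r1 is overwritten first, so the store is addressed from
 * the already adjusted value and lands 32 bytes past the intended field. */
static void store_bad(unsigned char *mem, uint64_t r1)
{
    r1 = r1 + PAYLOAD_OFF;                             /* la  %r1,32(%r1)   */
    memcpy(mem + r1 + PTR_FIELD_OFF, &r1, sizeof r1);  /* stg %r1,9008(%r1) */
}

/* Read back the pointer field that a later dereference would use. */
static uint64_t load_field(const unsigned char *mem, uint64_t base)
{
    uint64_t value;
    memcpy(&value, mem + base + PTR_FIELD_OFF, sizeof value);
    return value;
}
```

In this model the bad sequence leaves the field at offset 9008 untouched, which is consistent with the crash when the pointer stored there is dereferenced later.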

Comment 10 Jakub Jelinek 2016-03-01 12:08:12 UTC
The code also has the same pattern:
 rep->nlh = &rep->msg.nlh;
...
   rep->signal_info = ((void*)(((char*) rep->nlh) + ((0) + ((int) ( ((sizeof(struct nlmsghdr))+4U -1) & ~(4U -1) )))));
thus it is again
  *(ptr + off1) = ptr + off2;
where off1 and off2 are constants and off1 is large enough that it is not a valid address offset for s390x.
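
A minimal sketch of that source pattern, under stated assumptions: the structures below are hypothetical and only mimic the shape (a small header, a large message buffer, then a pointer member), so the offsets do not match the real struct audit_reply, and this is not the testcase from the GCC PR.

```c
#include <stddef.h>

/* Hypothetical 16-byte header, similar in size to struct nlmsghdr. */
struct header {
    unsigned int   len;
    unsigned short type;
    unsigned short flags;
    unsigned int   seq;
    unsigned int   pid;
};

/* Header followed by a large payload area. */
struct msg {
    struct header hdr;
    char          data[8960];
};

/* The pointer member sits after the large buffer, so it lives at a
 * large constant offset (off1) from the start of the structure. */
struct reply_sketch {
    int            type;
    int            len;
    struct header *hdr_ptr;
    struct msg     msg;
    void          *payload;
};

/* *(ptr + off1) = ptr + off2: the address of data inside the structure
 * is stored into a member of the same structure. */
static void point_at_payload(struct reply_sketch *rep)
{
    rep->hdr_ptr = &rep->msg.hdr;
    rep->payload = (char *)rep->hdr_ptr + sizeof(struct header);
}
```

Both the stored value and the store address are derived from the same base pointer, which is the situation where the compiler has to keep that base alive until the store has been issued.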

Comment 11 Dan Horák 2016-03-03 09:14:03 UTC
I can confirm that when audit is built with gcc-6.0.0-0.14.fc24, libaudit no longer segfaults when running e.g. groupadd.

Comment 12 Jakub Jelinek 2016-03-03 09:15:16 UTC
Fixed then.

