| Summary: | groupadd segfaults in libaudit | | |
|---|---|---|---|
| Product: | [Fedora] Fedora | Reporter: | Dan Horák <dan> |
| Component: | gcc | Assignee: | Jakub Jelinek <jakub> |
| Status: | CLOSED RAWHIDE | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 24 | CC: | davejohansen, jakub, jwakely, law, mpolacek, sgrubb |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | s390x | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | gcc-6.0.0-0.14.fc24 | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2016-03-03 09:15:16 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Bug Depends On: | | | |
| Bug Blocks: | 467765 | | |
| Attachments: | | | |

Description
Dan Horák
2016-02-18 17:04:44 UTC
A rebuild with -fno-delete-null-pointer-checks didn't help.

Using -fno-strict-aliasing doesn't help either, but switching to -O1 does, so maybe it's a compiler error then ...

I can't see how the error is occurring in the code. The code that sets things up is this:
    switch (rep->type) {
        case NLMSG_ERROR:
            rep->error = NLMSG_DATA(rep->nlh);
            break;
And the dereference is:
    else if (rc > 0 && rep.type == NLMSG_ERROR) {
        int error = rep.error->error;
This part of the code has not changed in probably 8 years.
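
To make the two snippets above concrete, here is a minimal self-contained sketch of the same shape: a set-up step that aims a typed pointer at the message payload, and a later step that dereferences it. The `toy_*` types, the `TOY_MSG_ERROR` constant and both functions are hypothetical stand-ins invented for illustration. They are not the libaudit or kernel netlink definitions.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the netlink types (not the real definitions). */
enum { TOY_MSG_ERROR = 2 };

struct toy_hdr { unsigned int len; unsigned short type; };
struct toy_err { int error; };

struct toy_msg {
    struct toy_hdr hdr;
    union {                     /* payload that follows the header */
        struct toy_err err;
        char raw[64];
    } payload;
};

struct toy_reply {
    int type;
    struct toy_hdr *hdr;        /* points into msg */
    struct toy_msg msg;
    struct toy_err *error;      /* points into msg, just past the header */
};

/* Set-up step: store a pointer to the payload inside the same struct. */
void toy_adjust_reply(struct toy_reply *rep)
{
    rep->hdr = &rep->msg.hdr;
    rep->type = rep->hdr->type;
    switch (rep->type) {
    case TOY_MSG_ERROR:
        rep->error = &rep->msg.payload.err;
        break;
    default:
        rep->error = NULL;
        break;
    }
}

/* Dereference step: only valid if the store above hit the right field. */
int toy_reply_error(const struct toy_reply *rep, int rc)
{
    if (rc > 0 && rep->type == TOY_MSG_ERROR)
        return rep->error->error;
    return 0;
}
```

In this sketch, if the pointer store in the set-up step did not reach `rep->error`, the field would keep whatever value it had before and the later dereference could fault. That would be consistent with the reported segfault even though the C source itself looks correct.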
Yeah, I couldn't see anything obvious either, let's ask the gcc team for opinions.

The compiler used was gcc-6.0.0-0.9.fc24.s390x, my test builds were with gcc-6.0.0-0.11.fc24.s390x, and from what I can see both s390 and s390x are affected.

Full logs are available at http://s390.koji.fedoraproject.org/koji/buildinfo?buildID=378112

This bug appears to have been reported against 'rawhide' during the Fedora 24 development cycle. Changing version to '24'. More information and reason for this action is here: https://fedoraproject.org/wiki/Fedora_Program_Management/HouseKeeping/Fedora24#Rawhide_Rebase

Created attachment 1130824: preprocessed source file

so it's the static int adjust_reply(struct audit_reply *rep, int len) function in lib/netlink.c that requires switching to -O1 to avoid the segfaults
gcc -DHAVE_CONFIG_H -I. -I.. -I. -I.. -I../auparse -fPIC -DPIC -D_GNU_SOURCE -O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -m64 -march=z9-109 -mtune=z10 -c netlink.c -o netlink.o

When __attribute__((noinline, noclone)) is applied to adjust_reply(), the segfault doesn't occur with -O2.

This is a dup of http://gcc.gnu.org/PR70025. r227382 broke it too:

    --- netlink.s.227381 2016-03-01 13:00:48.321371161 +0100
    +++ netlink.s.227382 2016-03-01 13:00:50.793337109 +0100
    @@ -374,8 +374,8 @@ audit_get_reply_internal:
     .L42:
        .loc 1 193 0
        lg %r1,176(%r15)
    -   la %r2,32(%r1)
    -   stg %r2,9008(%r1)
    +   la %r1,32(%r1)
    +   stg %r1,9008(%r1)
     .LVL43:
     .L35:
     .LBE28:

and the bad

    lg %r1,176(%r15)
    la %r1,32(%r1)
    stg %r1,9008(%r1)

still appears even in r233777. The code also has the same pattern:

    rep->nlh = &rep->msg.nlh;
    ...
    rep->signal_info = ((void*)(((char*) rep->nlh) + ((0) + ((int) ( ((sizeof(struct nlmsghdr))+4U -1) & ~(4U -1) )))));

thus it is again

    *(ptr + off1) = ptr + off2;

where off1 and off2 are constants and off1 is large enough that it is not a valid address offset for s390x.

I can confirm that when gcc-6.0.0-0.14.fc24 is used to build audit, there is no longer a segfault in libaudit when running e.g. groupadd.

Fixed then.
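
The difference between the r227381 and r227382 instruction sequences can be modelled in a few lines of C. This is only an illustrative sketch: `toy_machine`, `toy_stg`, `run_good` and `run_bad` are hypothetical names, registers hold offsets into a toy byte array rather than real addresses, and the initial `lg` load is replaced by passing the base in directly. The displacements 32 and 9008 are taken from the listing.

```c
#include <stdint.h>
#include <string.h>

#define VALUE_DISP 32u    /* displacement used by the la instruction */
#define STORE_DISP 9008u  /* displacement used by the stg instruction */

/* Toy machine: two registers and a flat memory indexed by offset. */
struct toy_machine {
    uint64_t r1, r2;
    unsigned char mem[16384];
};

/* Model of stg: store an 8-byte value at the given offset. */
static void toy_stg(struct toy_machine *m, uint64_t value, uint64_t addr)
{
    memcpy(&m->mem[addr], &value, sizeof value);
}

/* r227381 sequence: the value goes to %r2, so %r1 still holds the base. */
uint64_t run_good(struct toy_machine *m, uint64_t base)
{
    m->r1 = base;                        /* lg  %r1,176(%r15)  */
    m->r2 = m->r1 + VALUE_DISP;          /* la  %r2,32(%r1)    */
    uint64_t addr = m->r1 + STORE_DISP;  /* address = base + 9008 */
    toy_stg(m, m->r2, addr);             /* stg %r2,9008(%r1)  */
    return addr;
}

/* r227382 sequence: la overwrites %r1, so the store uses the wrong base. */
uint64_t run_bad(struct toy_machine *m, uint64_t base)
{
    m->r1 = base;                        /* lg  %r1,176(%r15)  */
    m->r1 = m->r1 + VALUE_DISP;          /* la  %r1,32(%r1)    */
    uint64_t addr = m->r1 + STORE_DISP;  /* address = base + 32 + 9008 */
    toy_stg(m, m->r1, addr);             /* stg %r1,9008(%r1)  */
    return addr;
}
```

Under this model both sequences store the same value, base + 32, but the second one writes it 32 bytes past the intended slot. The field at base + 9008 is then never updated, which matches the `*(ptr + off1) = ptr + off2` pattern described above.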