Bug 1608824

Summary: valgrind: arch_prctl aborts process for unknown flags (seen with ARCH_CET_STATUS)
Product: [Fedora] Fedora
Reporter: Florian Weimer <fweimer>
Component: valgrind
Assignee: Mark Wielaard <mjw>
Status: CLOSED RAWHIDE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Priority: unspecified    
Version: rawhide
CC: dodji, jakub, mjw
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: valgrind-3.13.0-22.fc29
Clone Of:
: 1608826 (view as bug list) Environment:
Last Closed: 2018-07-27 15:35:38 UTC
Type: Bug
Bug Depends On:    
Bug Blocks: 1608826, 1654309    

Description Florian Weimer 2018-07-26 10:49:15 UTC
Building glibc in rawhide with --enable-cet fails with:

==19934== Memcheck, a memory error detector
==19934== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==19934== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==19934== Command: elf/ld.so --library-path .:elf:nptl:dlfcn /usr/bin/true
==19934== 

valgrind: the 'impossible' happened:
   Unsupported arch_prctl option

host stacktrace:
==19934==    at 0x5803B102: ??? (in /usr/lib64/valgrind/memcheck-amd64-linux)
==19934==    by 0x5803B214: ??? (in /usr/lib64/valgrind/memcheck-amd64-linux)
==19934==    by 0x5803B459: ??? (in /usr/lib64/valgrind/memcheck-amd64-linux)
==19934==    by 0x5803B480: ??? (in /usr/lib64/valgrind/memcheck-amd64-linux)
==19934==    by 0x580CFD08: ??? (in /usr/lib64/valgrind/memcheck-amd64-linux)
==19934==    by 0x58096FFA: ??? (in /usr/lib64/valgrind/memcheck-amd64-linux)
==19934==    by 0x58093A72: ??? (in /usr/lib64/valgrind/memcheck-amd64-linux)
==19934==    by 0x58095206: ??? (in /usr/lib64/valgrind/memcheck-amd64-linux)
==19934==    by 0x580A4ACA: ??? (in /usr/lib64/valgrind/memcheck-amd64-linux)

sched status:
  running_tid=1

Thread 1: status = VgTs_Runnable (lwpid 19934)
==19934==    at 0x121A15: get_cet_status (cpu-features.c:28)
==19934==    by 0x121A15: init_cpu_features (cpu-features.c:474)
==19934==    by 0x121A15: dl_platform_init (dl-machine.h:228)
==19934==    by 0x121A15: _dl_sysdep_start (dl-sysdep.c:231)
==19934==    by 0x10A1D7: _dl_start_final (rtld.c:413)
==19934==    by 0x10A1D7: _dl_start (rtld.c:520)
==19934==    by 0x109117: ??? (in /builddir/build/BUILD/glibc-2.27.9000-645-gcfba5dbb10/build-x86_64-redhat-linux/elf/ld.so)
==19934==    by 0x3: ???
==19934==    by 0x1FFF0009E6: ???
==19934==    by 0x1FFF0009F0: ???
==19934==    by 0x1FFF0009FF: ???
==19934==    by 0x1FFF000A10: ???

The offending function is:

static inline int __attribute__ ((always_inline))
get_cet_status (void)
{
  unsigned long long cet_status[3];
  INTERNAL_SYSCALL_DECL (err);
  if (INTERNAL_SYSCALL (arch_prctl, err, 2, ARCH_CET_STATUS,
			cet_status) == 0)
    return cet_status[0];
  return 0;
}

ARCH_CET_STATUS is unfortunately not in the upstream kernel.

Comment 1 Florian Weimer 2018-07-26 10:50:38 UTC
Is there a reason why arch_prctl cannot be passed to the kernel, or at least made to fail with ENOSYS?

Comment 2 Mark Wielaard 2018-07-26 10:53:28 UTC
(In reply to Florian Weimer from comment #1)
> Is there a reason why arch_prctl cannot be passed to the kernel, or at least
> made to fail with ENOSYS?

The correct way seems to be to return EINVAL (code is not a valid subcommand). Would that be helpful?

Comment 3 Mark Wielaard 2018-07-26 10:54:53 UTC
I mean, of course, return -1 and set errno to EINVAL.
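
For illustration only, here is a minimal sketch of the behaviour proposed above: a syscall wrapper that handles the subcommands it knows and rejects everything else with EINVAL instead of aborting. This is not the actual valgrind patch. The names emulate_arch_prctl and struct guest_state are hypothetical; only the four ARCH_* values are taken from the Linux x86-64 <asm/prctl.h>. At the raw syscall level the failure is reported as -EINVAL, which a libc wrapper would turn into -1 with errno set to EINVAL.

```c
#include <errno.h>
#include <stdint.h>

/* Subcommand values as defined in the Linux x86-64 <asm/prctl.h>. */
#define ARCH_SET_GS 0x1001
#define ARCH_SET_FS 0x1002
#define ARCH_GET_FS 0x1003
#define ARCH_GET_GS 0x1004

/* Hypothetical stand-in for the per-thread state a tool would track. */
struct guest_state {
    uint64_t fs_base;
    uint64_t gs_base;
};

/*
 * Hypothetical wrapper: emulate the subcommands it understands and report
 * -EINVAL for anything else, so the caller sees an ordinary syscall failure.
 * For the GET subcommands, arg is treated as a pointer to a uint64_t.
 */
long emulate_arch_prctl(struct guest_state *st, int code, uintptr_t arg)
{
    switch (code) {
    case ARCH_SET_FS:
        st->fs_base = arg;
        return 0;
    case ARCH_SET_GS:
        st->gs_base = arg;
        return 0;
    case ARCH_GET_FS:
        *(uint64_t *) arg = st->fs_base;
        return 0;
    case ARCH_GET_GS:
        *(uint64_t *) arg = st->gs_base;
        return 0;
    default:
        /* Unknown subcommand (for example 0x3001): fail, do not abort. */
        return -EINVAL;
    }
}
```

With this shape, an unrecognised code such as the one glibc passes here would simply come back as a failed call, and the program under test keeps running.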

Comment 4 Florian Weimer 2018-07-26 10:56:14 UTC
(In reply to Mark Wielaard from comment #2)
> (In reply to Florian Weimer from comment #1)
> > Is there a reason why arch_prctl cannot be passed to the kernel, or at least
> > made to fail with ENOSYS?
> 
> The correct way seems to be to return EINVAL (code is not a valid
> subcommand). Would that be helpful?

Any result that is not zero will do.  An EINVAL error is okay as well.
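
To spell out why any non-zero result is enough: get_cet_status only reads the buffer when the call returns 0 and otherwise falls back to 0. A small sketch of that decision, with the syscall result passed in as a parameter so it can be tried without a CET-enabled kernel (cet_status_or_zero is a hypothetical helper, not a glibc function):

```c
#include <errno.h>

/*
 * Hypothetical helper mirroring the shape of get_cet_status: ret is the
 * raw arch_prctl result, cet_status the buffer the kernel would have filled.
 */
int cet_status_or_zero(long ret, const unsigned long long cet_status[3])
{
    if (ret == 0)
        return (int) cet_status[0];
    /* Any failure (EINVAL, ENOSYS, ...) is treated the same way. */
    return 0;
}
```

So cet_status_or_zero(-EINVAL, buf) and cet_status_or_zero(-ENOSYS, buf) both yield 0, and only a successful call yields the first word of the buffer.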

Comment 5 Florian Weimer 2018-07-26 10:57:15 UTC
FWIW, glibc says that 0x3001 is the value of ARCH_CET_STATUS.

Comment 6 Mark Wielaard 2018-07-26 18:49:10 UTC
valgrind-3.13.0-22.fc29 should have fixed this.

* Thu Jul 26 2018 Mark Wielaard <mjw> - 3.13.0-22
- Add valgrind-3.13.0-arch_prctl.patch (#1608824)

But the /bin/true check fails on ppc64, ppc64le, s390x and armv7hl.
aarch64, x86_64 and i686 build fine.
https://koji.fedoraproject.org/koji/taskinfo?taskID=28624474

There isn't enough information in the build.log to understand what is going wrong on those architectures.

Comment 7 Mark Wielaard 2018-07-27 15:35:38 UTC
It turned out that the issue from comment #6 was caused by a buggy binutils, which has since been fixed. valgrind has been rebuilt.