Bug 1790052

Summary: [RHEL 7] strace doesn't print stack trace for early syscalls
Product: Red Hat Enterprise Linux 7 Reporter: Eugene Syromiatnikov <esyr>
Component: strace Assignee: Eugene Syromiatnikov <esyr>
Status: CLOSED ERRATA QA Contact: Edjunior Barbosa Machado <emachado>
Severity: high Docs Contact:
Priority: unspecified    
Version: 7.8 CC: bugproxy, esyr, extras-qa, hannsj_uhl, law, ldv, ohudlick, pzhukov, roland
Target Milestone: rc Keywords: Patch
Target Release: 7.9   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: strace-4.24-6.el7 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1788636 Environment:
Last Closed: 2020-09-29 20:20:01 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1783584, 1788636, 1790054, 1790057, 1790058    
Bug Blocks: 1729246, 1788588    

Description Eugene Syromiatnikov 2020-01-11 15:00:43 UTC
+++ This bug was initially created as a clone of Bug #1788636 +++

Description of problem:
strace -k does not produce a stack trace when it is attached to an already running process; it prints "No DWARF information found" instead.

Version-Release number of selected component (if applicable):
strace-5.3-1.fc31.x86_64

# rpm -qa '*-debuginfo'
dhcp-debuginfo-4.4.1-19.fc31.x86_64
dhcp-server-debuginfo-4.4.1-19.fc31.x86_64
glibc-debuginfo-2.30-8.fc31.x86_64

How reproducible:
100%

Steps to Reproduce:
1. dhcpd -f 
2. strace -k -p `pidof dhcpd`


Actual results:
select(9, [5<RAW:[0.0.0.0:1]> 6<socket:[41801]> 8<UDP:[0.0.0.0:67]>], [], NULL, NULL) = ? ERESTARTNOHAND (To be restarted if no handler)
 > No DWARF information found


Expected results:
select(9, [5<RAW:[0.0.0.0:1]> 6<socket:[43189]> 8<UDP:[0.0.0.0:67]>], [], NULL, NULL) = ? ERESTARTNOHAND (To be restarted if no handler)
 > /usr/lib64/libc-2.30.so(__select+0x1a) [0xf8f1a]
 > /usr/sbin/dhcpd(isc__socketmgr_waitevents+0x76) [0x257e06]
 > /usr/sbin/dhcpd(evloop+0x12f) [0x24725f]
 > /usr/sbin/dhcpd(isc__app_ctxrun+0x138) [0x247778]
 > /usr/sbin/dhcpd(dispatch+0x1b) [0x6cf7b]
 > /usr/sbin/dhcpd(main+0xf66) [0x18d16]
 > /usr/lib64/libc-2.30.so(__libc_start_main+0xf2) [0x271a2]
 > /usr/sbin/dhcpd(_start+0x2d) [0x1949d]
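
To tell the two outcomes apart mechanically (for example in a regression check), the frame lines can be matched against the layout shown in the expected output above. The following is only a minimal sketch: the regular expression is derived from the sample lines in this report, not from any documented strace output format, and the names FRAME_RE, parse_stack_lines and has_usable_trace are hypothetical helpers introduced here for illustration.

```python
import re

# Assumption: a frame line looks like the samples in this report, i.e.
#   " > <module>(<symbol>+0x<offset>) [0x<address>]"
# This layout is inferred from the report and is not a stable interface.
FRAME_RE = re.compile(
    r"^\s*> (?P<module>\S+)"
    r"\((?P<symbol>[^+()]*)(?:\+0x(?P<offset>[0-9a-f]+))?\) "
    r"\[0x(?P<address>[0-9a-f]+)\]$"
)


def parse_stack_lines(lines):
    """Return (module, symbol, address) tuples for lines that look like frames."""
    frames = []
    for line in lines:
        match = FRAME_RE.match(line)
        if match:
            frames.append(
                (match["module"], match["symbol"], int(match["address"], 16))
            )
    return frames


def has_usable_trace(lines):
    """True if at least one real frame was printed for the syscall."""
    return bool(parse_stack_lines(lines))


# Sample lines copied from the "Actual results" and "Expected results" above.
actual = [" > No DWARF information found"]
expected = [
    " > /usr/lib64/libc-2.30.so(__select+0x1a) [0xf8f1a]",
    " > /usr/sbin/dhcpd(main+0xf66) [0x18d16]",
]

if __name__ == "__main__":
    print(has_usable_trace(actual))
    print(parse_stack_lines(expected))
```

Module paths containing spaces are not handled by this sketch.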


Additional info:

If the application is started under strace, the option works fine (as shown under Expected results).

--- Additional comment from Dmitry V. Levin on 2020-01-07 19:40:11 UTC ---

This happens due to a cache initialization bug that causes strace to fail to print the stack trace for early syscalls.
Early syscalls in this case are those that precede the first syscall from the memory mapping or execve families.
When a process is started by strace itself, the first syscall is usually execve.
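
The failure mode described above can be illustrated with a toy model. This is not strace's code: it assumes, purely for illustration, that the module cache is refreshed only when a counter bumped by memory mapping and execve syscalls differs from the value the cache last saw. All names (Tracer, TraceeCache, attach) are hypothetical. If a freshly attached tracee's cache starts out marked as up to date while still empty, nothing triggers the first refresh until such a syscall occurs.

```python
from dataclasses import dataclass, field


@dataclass
class Tracer:
    # Hypothetical counter, bumped on every mmap/munmap/execve-like syscall.
    mapping_generation: int = 0

    def on_mapping_syscall(self):
        self.mapping_generation += 1


@dataclass
class TraceeCache:
    last_seen_generation: int
    modules: list = field(default_factory=list)

    def unwind(self, tracer, current_maps):
        # Refresh the cached module list only if mappings changed since last time.
        if self.last_seen_generation != tracer.mapping_generation:
            self.modules = list(current_maps)
            self.last_seen_generation = tracer.mapping_generation
        if not self.modules:
            return "No DWARF information found"
        return f"unwound using {len(self.modules)} modules"


def attach(tracer, fixed):
    if fixed:
        # Start out of date so that the very first unwind forces a refresh.
        return TraceeCache(last_seen_generation=tracer.mapping_generation - 1)
    # Buggy variant: an empty cache is marked as already up to date.
    return TraceeCache(last_seen_generation=tracer.mapping_generation)


if __name__ == "__main__":
    tracer = Tracer()
    maps = ["/usr/sbin/dhcpd", "/usr/lib64/libc-2.30.so"]
    print(attach(tracer, fixed=False).unwind(tracer, maps))
    print(attach(tracer, fixed=True).unwind(tracer, maps))
```

In this model a process started by the tracer never hits the problem, because its first syscall (execve) bumps the counter before any unwinding takes place.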

Fixed by upstream commit v5.4-18-g69b2c33a77fa687feb41fafdbe187013aa812384, available at
https://github.com/strace/strace/commit/69b2c33a77fa687feb41fafdbe187013aa812384
https://gitlab.com/strace/strace/commit/69b2c33a77fa687feb41fafdbe187013aa812384

Thanks for reporting!

--- Additional comment from Dmitry V. Levin on 2020-01-07 19:44:40 UTC ---

FWIW, I have no idea when the fix will reach Fedora because updates of the strace package are blocked by https://bugzilla.redhat.com/show_bug.cgi?id=1783584

--- Additional comment from Dmitry V. Levin on 2020-01-07 20:12:11 UTC ---

Meanwhile, you can install a fixed strace package from the scratch build for f31 at
https://koji.fedoraproject.org/koji/taskinfo?taskID=40246259

Comment 1 Eugene Syromiatnikov 2020-01-22 23:53:18 UTC
Upstream commits:
69b2c33a77fa687feb41fafdbe187013aa812384 "unwind-libdw: fix initialization of libdwfl cache"
35e080ae319d25c1df82855cda3a1bb014e90ba6 "syscall: do not capture stack trace while the tracee executes strace code"
8e515c744935fe67e6a1b941f4c5414472c163b7 "tests: add strace-k-p test"

Comment 2 Eugene Syromiatnikov 2020-01-23 14:12:55 UTC
Correction: 35e080ae319d25c1df82855cda3a1bb014e90ba6 "syscall: do not capture stack trace while the tracee executes strace code" is not needed, as it is based on top of v4.26~72 "Enhance error diagnostics when the first exec fails".

Comment 7 errata-xmlrpc 2020-09-29 20:20:01 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (strace bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:3983