Bug 1056828

Summary: strace -cf java causes Segmentation fault
Product: Red Hat Enterprise Linux 6
Reporter: Tetsuo Handa <penguin-kernel>
Component: strace
Assignee: Jeff Law <law>
Status: CLOSED ERRATA
QA Contact: Michael Petlan <mpetlan>
Severity: medium
Docs Contact:
Priority: medium
Version: 6.5
CC: mnewsome, mpetlan, ohudlick, penguin-kernel
Target Milestone: rc
Keywords: Rebase
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: strace-4.8-9.el6
Doc Type: Rebase: Bug Fixes and Enhancements
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-07-22 06:26:56 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1153397
Attachments:
Description     Flags
A patch file.   none

Description Tetsuo Handa 2014-01-23 01:58:42 UTC
Created attachment 854158
A patch file.

Description of problem:
"strace -cf java" crashes with a segmentation fault before printing the summary.

Version-Release number of selected component (if applicable):
strace-4.5.19-1.17.el6

How reproducible:
100%

Steps to Reproduce:
1. Install the strace and openjdk packages.
2. Run "strace -cf java".

Actual results:
Segmentation fault

Expected results:
Summary is printed correctly.

Additional info:

The crash is triggered when cleanup() calls call_summary() while outf == NULL.

  static void
  cleanup()
  {
  (...snipped...)
          if (cflag)
                  call_summary(outf);
  }
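
A minimal sketch of why a NULL stream is fatal at this point, and of a guard that would only mask the symptom. The names print_summary() and summary_stream() are hypothetical stand-ins and are not part of strace; the fallback to stderr is an assumption that would be wrong for the "-o outfile" case.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not in strace): return a usable stream,
 * falling back to stderr when the candidate has been reset to NULL. */
static FILE *summary_stream(FILE *candidate)
{
    return candidate != NULL ? candidate : stderr;
}

/* Stand-in for call_summary(): writes one line to the given stream.
 * Passing NULL to fprintf() is undefined behavior, which is what
 * shows up as the segmentation fault, so reject it explicitly. */
static int print_summary(FILE *out, int calls)
{
    assert(out != NULL);
    return fprintf(out, "summary: %d calls\n", calls);
}
```

Calling print_summary(summary_stream(outf), n) instead of passing outf directly would avoid the crash, but it does not explain why outf was NULL in the first place.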

outf becomes NULL through the assignment in handle_stopped_tcbs(), which runs
for an entry with tcp->outf == NULL, tcp->wait_status == 256 and tcp->pid == 0.

  static int
  handle_stopped_tcbs(struct tcb *tcp)
  {
          for (; tcp; tcp = tcp->next_need_service) {
                  int pid;
                  int status;
  
                  outf = tcp->outf;
                  status = tcp->wait_status;
                  pid = tcp->pid;
  (...snipped...)
  }

tcp->outf becomes NULL through the assignment in droptcb(), which is called
while tcp->outf == stderr.

  void
  droptcb(tcp)
  struct tcb *tcp;
  {
          if (tcp->pid == 0)
                  return;
  (...snipped...)
          if (outfname && followfork > 1 && tcp->outf)
                  fclose(tcp->outf);
  
          tcp->outf = 0;
  }
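
The chain described above can be modeled in a few lines. This is a simplified sketch, not strace code: struct tcb is reduced to the fields quoted in this report, drop_entry() and service_entry() are hypothetical names, and the assumption that dropping an entry also sets its pid to 0 is inferred from the observed tcp->pid == 0.

```c
#include <assert.h>
#include <stdio.h>

/* Reduced stand-in for strace's struct tcb. */
struct tcb {
    int pid;
    FILE *outf;
    struct tcb *next_need_service;
};

static FILE *outf; /* models the global output stream */

/* Models the tail of droptcb(): the entry is marked free
 * (assumed: pid = 0) and its stream pointer is cleared. */
static void drop_entry(struct tcb *tcp)
{
    if (tcp->pid == 0)
        return;
    tcp->pid = 0;
    tcp->outf = NULL;
}

/* Models the top of the loop body in handle_stopped_tcbs():
 * the global is overwritten without checking the entry. */
static void service_entry(struct tcb *tcp)
{
    outf = tcp->outf;
}
```

Servicing an entry after it has been dropped leaves the global outf set to NULL, which is the state cleanup() later passes to call_summary().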

Oh, why are we closing stderr when call_summary() will use stderr later?
The attached patch fixes the "strace -cf java" case but not the
"strace -cf -o outfile java" case. I don't know how to fix this bug.

Regards.

Comment 2 Jeff Law 2014-01-24 20:13:06 UTC
I'm not an expert in this code, but it looks like droptcb can drop not only the incoming tcp argument but also its parent, via this call from within droptcb:

#ifdef LINUX
                /* Update `tcp->parent->parent->nchildren' and the other fields
                   like NCLONE_DETACHED, only for zombie group leader that has
                   already reported and been short-circuited at the top of this
                   function.  The same condition as at the top of DETACH.  */
                if ((tcp->flags & TCB_CLONE_THREAD) &&
                    tcp->parent->nclone_threads == 0 &&
                    (tcp->parent->flags & TCB_EXITING))
                        droptcb(tcp->parent);
#endif

We then continue the main loop in handle_stopped_tcbs which walks down a list of tcbs.  If the parent appears later in that list, it will have already been dropped by the call to droptcb shown above.  This ultimately results in the problems with tcp->outf that you're seeing.
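
A rough model of that sequence, under loud assumptions: struct tcb is cut down to four fields, drop_entry() and walk() are hypothetical names, every serviced entry is dropped together with its parent (the real conditions are narrower), and the skip_dropped guard is only one conceivable mitigation, not the change shipped in the erratum.

```c
#include <assert.h>
#include <stdio.h>

/* Reduced stand-in for strace's struct tcb. */
struct tcb {
    int pid;
    FILE *outf;
    struct tcb *parent;
    struct tcb *next_need_service;
};

/* Models droptcb(): mark the entry free and clear its stream. */
static void drop_entry(struct tcb *tcp)
{
    if (tcp->pid == 0)
        return;
    tcp->pid = 0;
    tcp->outf = NULL;
}

/* Walks the service list and returns how many entries were serviced.
 * *last_outf models the global outf after the walk. */
static int walk(struct tcb *tcp, int skip_dropped, FILE **last_outf)
{
    int serviced = 0;

    for (; tcp; tcp = tcp->next_need_service) {
        if (skip_dropped && tcp->pid == 0)
            continue; /* hypothetical guard: entry already dropped */
        *last_outf = tcp->outf;
        serviced++;
        if (tcp->parent)
            drop_entry(tcp->parent); /* parent may be later in the list */
        drop_entry(tcp);
    }
    return serviced;
}
```

With a child followed by its parent on the list, the unguarded walk services the already-dropped parent and leaves the stream NULL; with the guard, the parent is skipped and the stream stays valid.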

My immediate recommendation as a workaround would be to use strace from the developer toolset.  It does not exhibit this problem.

Comment 4 Jeff Law 2014-08-06 16:41:10 UTC
*** Bug 1125279 has been marked as a duplicate of this bug. ***

Comment 13 errata-xmlrpc 2015-07-22 06:26:56 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-1308.html