Bug 1212472 - docker using gcc-go crashes in stacktrace
Summary: docker using gcc-go crashes in stacktrace
Alias: None
Product: Fedora
Classification: Fedora
Component: gcc
Version: 22
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Assignee: Jakub Jelinek
QA Contact: Fedora Extras Quality Assurance
Depends On:
Blocks: ZedoraTracker, PPCTracker, F-ExcludeArch-ppc64le, PPC64LETracker
Reported: 2015-04-16 13:33 UTC by Jakub Čajka
Modified: 2015-06-18 13:29 UTC

Clone Of:
Last Closed: 2015-06-18 13:29:10 UTC

Attachments:
go-caller patch (531 bytes, patch)
2015-04-16 13:33 UTC, Jakub Čajka
patch (566 bytes, patch)
2015-04-23 15:33 UTC, Jakub Čajka

Description Jakub Čajka 2015-04-16 13:33:24 UTC
Created attachment 1015215 [details]
go-caller patch

Description of problem:
docker using gcc-go crashes

Version-Release number of selected component (if applicable):

How reproducible:
Always (reproduced on x86_64, ppc64le, and s390x; ppc64 is expected to be affected too)

Steps to Reproduce:
1. Install docker from https://repos.fedorapeople.org/repos/jcajka/docker-gccgo/ (ppc64(le), s390x) or https://copr.fedoraproject.org/coprs/jcajka/docker-gccgo/ (x86_64)
2. systemctl start docker
3. mkdir /root/dir
4. chcon -Rt svirt_sandbox_file_t /root/dir/ (just in case)
5.1 For x86:
  docker run fedora -itv /root/dir/:/root/dir/ /bin/bash
5.2 For ppc64(le), s390x (replace the arch in docker.io/jcajka/fedora22-...):
  docker run docker.io/jcajka/fedora22-ppc64 -itv /root/dir/:/root/dir/ /bin/bash

Actual results:
On ppc64le/x86:

docker run docker.io/jcajka/fedora22-ppc64le -itv /root/dir/:/root/dir/ /bin/bash
panic: runtime error: invalid memory address or nil pointer dereference
on ppc64le: [signal 0xb code=0x1 addr=0x8]
on x86_64:  [signal 0xb code=0x1 addr=0x0]

goroutine 16 [running]:

goroutine 18 [finalizer wait]:
created by runtime_createfing

goroutine 19 [syscall]:
	goroutine in C code; stack unavailable
created by os_signal..import

On s390x (from the log; timestamp and hostname omitted):
kernel: SELinux: initialized (dev sysfs, type sysfs), uses genfs_contexts
kernel: SELinux: initialized (dev tmpfs, type tmpfs), uses mountpoint labeling
kernel: SELinux: initialized (dev tmpfs, type tmpfs), uses mountpoint labeling
kernel: User process fault: interruption code 003b ilc:2in systemd-coredump[2aacf268000+1a000]
kernel: failing address: 0000000000000000 TEID: 0000000000000400
kernel: Fault in primary space mode while using user ASCE.
kernel: AS:000000004f8e01c7 R3:0000000000000024 
kernel: CPU: 0 PID: 34376 Comm: systemd-coredum Not tainted 3.19.0-0.rc5.git2.1.fc22.s390x #1
kernel: task: 0000000060c557c0 ti: 00000000037cc000 task.ti: 00000000037cc000
kernel: User PSW : 0705000180000000 000002aacf274ce6
kernel:            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:1 AS:0 CC:0 PM:0 EA:3
                                                    User GPRS: ffffffff00000000 0000000000000000 0000000000000000 0000000000000001
kernel:            000002aacf274c6c 000003fffd5249f8 000002aacf27d928 000003ffffa7cd28
kernel:            000003ffffa7cd30 ffffffff00000000 000002aafe081250 000003ffffa7cc78
kernel:            000002aacf282ac8 000002aacf27d8f8 000002aacf274c6c 000003ffffa7cc50
kernel: User Code: 000002aacf274cd8: e320b0a80004        lg        %r2,168(%r11)
                                                               000002aacf274cde: a7980000                lhi        %r9,0
                                                              #000002aacf274ce2: 41121000                la        %r1,0(%r2,%r1)
                                                              >000002aacf274ce6: 92001000                mvi        0(%r1),0
                                                               000002aacf274cea: e310b0a00004        lg        %r1,160(%r11)
                                                               000002aacf274cf0: e320b0b00004        lg        %r2,176(%r11)
                                                               000002aacf274cf6: e32010000024        stg        %r2,0(%r1)
                                                               000002aacf274cfc: a7190000                lghi        %r1,0
kernel: Last Breaking-Event-Address:
kernel:  [<000002aacf274c74>] 0x2aacf274c74
kernel: docker0: port 1(veth260f764) entered disabled state
kernel: device veth260f764 left promiscuous mode
kernel: docker0: port 1(veth260f764) entered disabled state

Expected results:
docker run docker.io/jcajka/fedora22-s390x -itv /root/dir/:/root/dir/ /bin/bash
FATA[0000] Error response from daemon: Cannot start container fb6ed2f3efd57fa07ddf5c4d8576408d0f4cb95fe645a335063f7bed014a0654: [8] System error: exec: "-itv": executable file not found in $PATH

Additional info:
The crash seems to be triggered by creating a new Frame from an invalid pc/filename/line in libcontainer (I have seen this crash from time to time, but had no reliable reproducer until now). libgo's runtime.Caller (capture.go:14) returns seemingly invalid data (0, "", 0) with ok set, and that data is later used in runtime.FuncForPC/parseFunctionName (frame.go). This could be worked around by extending/fixing the check in capture.go:15, but I think runtime.Caller shouldn't report ok when the returned data is invalid. The attached patch prevents this crash.
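
For illustration, the kind of extended check mentioned above could look roughly like the sketch below. This is not the code from libcontainer's capture.go nor the attached patch; callerInfo is a hypothetical helper name, and the rule "treat a zero pc or an empty file name as not ok" is an assumption about what a stricter check might be.

```go
package main

import (
	"fmt"
	"runtime"
)

// callerInfo is a hypothetical helper (not part of libcontainer or libgo).
// It wraps runtime.Caller and rejects results that report ok == true but
// carry no usable data, such as a zero pc or an empty file name.
func callerInfo(skip int) (pc uintptr, file string, line int, ok bool) {
	// skip+1 so that skip == 0 refers to the caller of callerInfo.
	pc, file, line, ok = runtime.Caller(skip + 1)
	if !ok || pc == 0 || file == "" {
		return 0, "", 0, false
	}
	return pc, file, line, true
}

func main() {
	// Walk up the stack and report which frames carry usable data.
	for i := 0; i <= 5; i++ {
		pc, file, line, ok := callerInfo(i)
		if !ok {
			fmt.Println(i, "no usable frame")
			continue
		}
		fmt.Printf("%d %x %s:%d\n", i, pc, file, line)
	}
}
```

With a check like this the caller never passes an empty frame on to runtime.FuncForPC, whichever runtime produced it.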

Comment 1 laboger 2015-04-17 20:31:15 UTC
I will submit a gccgo bug on this.

Comment 2 laboger 2015-04-17 21:17:29 UTC
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65798 has been opened for this.

Comment 3 laboger 2015-04-20 12:24:18 UTC
A fix was committed for this gcc bugzilla.

It is not exactly what was suggested. Please verify that this resolves the problem; if it does not, we need to understand what was on the stack at the point runtime.Caller was called, to see why it returned invalid results.

Comment 4 Jakub Čajka 2015-04-23 15:33:43 UTC
Created attachment 1018046 [details]

Hm, it doesn't fix it. I messed this up a bit: the problem might not be in runtime.Caller (the rejected function-name part of my patch was only working around the problem). I have prepared a small program that triggers the crash outside docker.

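package main

import (
    "fmt"
    "runtime"
)

func main() {
    for i := 1; i <= 5; i++ {
        pc, fname, line, ok := runtime.Caller(i)
        // pc is printed in hex, as in the output below
        fmt.Printf("%x", pc)
        fmt.Print(" ", fname, " ", line, " ", ok, "\n")
        fn := runtime.FuncForPC(pc)
        // fn is nil for the invalid frame; this call crashes with gcc-go
        fmt.Print(fn.Name(), "\n")
    }
}
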
Crashes with gcc-go:

7fefdc22b2c6 ../../../libgo/runtime/proc.c 550 true

7fefdc228f8b ../../../libgo/runtime/proc.c 235 true

7fefdb20ffff  0 true
panic: runtime error: invalid memory address or nil pointer dereference
[signal 0xb code=0x1 addr=0x0]

goroutine 16 [running]:

goroutine 18 [finalizer wait]:
created by runtime_createfing

but works fine with golang(built with --ldflags "-linkmode external"):

4125e3 /usr/lib/golang/src/runtime/proc.go 63 true

4377e1 /usr/lib/golang/src/runtime/asm_amd64.s 2232 true

0  0 false

0  0 false

0  0 false

There is a difference in the number of callers (probably expected? this confused me). But libgo also seems to behave differently from golang when accessing the Name of a nil Func: omitting fn.Name(), or replacing it with just fn, yields with gcc-go:

7feb8a48e2c6 ../../../libgo/runtime/proc.c 550 true

7feb8a48bf8b ../../../libgo/runtime/proc.c 235 true

7feb89472fff  0 true

0  0 false

0  0 false

(Also tried just fn := runtime.FuncForPC(0)
 which works with golang, but fails with gcc-go)

I have checked golang's code and indeed it does check for nil.

func cfuncname(f *_func) *byte {
	if f == nil || f.nameoff == 0 {
		return nil
	}
	datap := findmoduledatap(f.entry) // inefficient
	return (*byte)(unsafe.Pointer(&datap.pclntable[f.nameoff]))
}

but libgo doesn't:
String
runtime_funcname_go (Func *f)
{
  return f->name;
}

changing it:

@@ -231,7 +231,13 @@ String runtime_funcname_go (Func *f)
 runtime_funcname_go (Func *f)
 {
-  return f->name;
+  String str;
+  if (!f)
+  {
+    runtime_memclr (&str, sizeof str);
+    return str;
+  }
+  else return f->name;
 }

seems to fix the crash (tested on x86, will test on ppc). It also seems that Entry() is not protected against nil in either golang or libgo; I would expect the same behavior as in Name(). I will open a golang ticket. I hope this is finally the correct fix (or at least in the right spot). Patch in attachment.
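
Until the runtimes guard this themselves, calling code can protect itself on the Go side. The sketch below is only an illustration of that idea, not part of the attached patch; funcName, funcEntry and currentPC are hypothetical helper names, and returning "" or 0 for a nil Func is an assumed convention that mirrors what the patched runtime_funcname_go does for Name().

```go
package main

import (
	"fmt"
	"runtime"
)

// funcName is a hypothetical wrapper: it returns "" instead of calling
// Name() on a nil *runtime.Func.
func funcName(pc uintptr) string {
	fn := runtime.FuncForPC(pc)
	if fn == nil {
		return ""
	}
	return fn.Name()
}

// funcEntry applies the same nil guard to Entry() and returns 0
// when no function is known for pc.
func funcEntry(pc uintptr) uintptr {
	fn := runtime.FuncForPC(pc)
	if fn == nil {
		return 0
	}
	return fn.Entry()
}

// currentPC returns a program counter inside the function that calls it.
func currentPC() uintptr {
	pc, _, _, _ := runtime.Caller(1)
	return pc
}

func main() {
	pc := currentPC()
	// A valid pc resolves to a name and a non-zero entry address.
	fmt.Println(funcName(pc), funcEntry(pc) != 0)
	// pc == 0 has no function; the wrappers return zero values.
	fmt.Printf("%q %d\n", funcName(0), funcEntry(0))
}
```

The wrappers never dereference a nil Func, so they behave the same way regardless of whether the underlying runtime checks for nil.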

Comment 5 Jakub Čajka 2015-05-05 10:46:47 UTC
Golang upstream ticket:


Comment 6 Jakub Čajka 2015-05-05 10:49:51 UTC
Opened gcc bug: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66016

Comment 7 Fedora Update System 2015-06-13 20:18:19 UTC
gcc-5.1.1-3.fc22, gcc-python-plugin-0.14-2.fc22 has been submitted as an update for Fedora 22.

Comment 8 Fedora Update System 2015-06-14 17:30:10 UTC
Package gcc-5.1.1-3.fc22, gcc-python-plugin-0.14-2.fc22:
* should fix your issue,
* was pushed to the Fedora 22 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing gcc-5.1.1-3.fc22 gcc-python-plugin-0.14-2.fc22'
as soon as you are able to.
Please go to the following url:
then log in and leave karma (feedback).

Comment 9 Fedora Update System 2015-06-18 13:29:10 UTC
gcc-5.1.1-3.fc22, gcc-python-plugin-0.14-2.fc22 has been pushed to the Fedora 22 stable repository.  If problems still persist, please make note of it in this bug report.
