Bug 1462296

Summary: oci-register-machine fails on s390x, because godbus sends only little endian messages
Product: Red Hat Enterprise Linux 7 Reporter: Dan Horák <dhorak>
Component: oci-register-machine Assignee: Frantisek Kluknavsky <fkluknav>
Status: CLOSED ERRATA QA Contact: atomic-bugs <atomic-bugs>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 7.3 CC: dwalsh, jjarvis, lsm5, qcai, yselkowi
Target Milestone: rc Keywords: Extras
Target Release: ---   
Hardware: s390x   
OS: Unspecified   
Whiteboard:
Fixed In Version: oci-register-machine-0-3.12.gitcbf1b8f.el7 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-10-19 15:18:51 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1425142, 1477926, 1478982, 1480422    
Attachments:
Description Flags
temporary fix none

Description Dan Horák 2017-06-16 16:04:05 UTC
Description of problem:
Docker can't start containers on s390x: oci-register-machine fails to register the container because godbus sends only little-endian messages.

from my reply in an email thread on the powerle list (https://www.redhat.com/mailman/listinfo/powerle)
---
> 
> As discussed in today's meeting here are some Docker messages on LoZ.
> 
> As noted, Docker on Z is able to pull images from DockerHub but fails
> to run them with following exception:
> 
> Jun  8 10:07:38 csz25062 systemd: Started Virtual Machine and
> Container Registration Service.
> Jun  8 10:07:38 csz25062 oci-register-machine[27519]: 2017/06/08
> 10:07:38 Register machine failed: Operation not supported

^^^ the most important message

> Jun  8 10:07:38 csz25062 systemd: Stopped docker container
> 5882c014f5022c06d89962905108c10bb88c39516eb83a224545531bd65fee55.
> Jun  8 10:07:38 csz25062 oci-register-machine[27526]: 2017/06/08

I dug into it and figured out
- oci-register-machine fails because the systemd-machined D-Bus server
returns an error on the container registration

- with systemd debugging enabled (systemd.log_level=debug on kernel cmd
line) one can see

systemd-machined[3172]: Assertion '!BUS_MESSAGE_NEED_BSWAP(m)' failed at
src/libsystemd/sd-bus/bus-message.c:4799, function
sd_bus_message_read_array(). Ignoring.

in the journal

- the libsystemd/sd-bus/bus-message.c code checks, when reading an array,
that the message has the same endianness as the host

- because registering a machine worked from python, the cause should be
in the oci-register-machine code or in the libraries it uses

- the https://github.com/godbus/dbus library hard-codes little endian
into all messages it sends
(https://github.com/godbus/dbus/blob/master/transport_unix.go#L178)

- https://github.com/godbus/dbus/pull/86 is my proposed solution,
please review/comment/improve as my Go knowledge is limited

After building a new oci-register-machine binary, that used the fixed
godbus library, I was able to start a container on RHEL-7 s390x host.
---

Version-Release number of selected component (if applicable):
oci-register-machine-0-3.11.1.gitdd0daef.el7exarch.s390x

How reproducible:
100%

Steps to Reproduce:
1. docker run --rm -it docker.io/s390x/hello-world

Actual results:
container not started

Expected results:
container started and registered

Additional info:
Because getting the fix through all the upstreams will take a long time, I'm thinking about providing a patched oci-register-machine with a fix for the bundled godbus. I'll attach my proposed solution here on Monday.

Comment 1 Dan Horák 2017-06-19 11:47:54 UTC
Created attachment 1289068 [details]
temporary fix

a scratch build with the patch applied https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=13465803

Container registration with systemd/machined now works OK.

Comment 2 Dan Horák 2017-06-19 12:37:46 UTC
My https://github.com/godbus/dbus/pull/86 PR got upstream attention, so we should get an accepted fix soon.

Comment 3 Dan Horák 2017-06-21 07:35:18 UTC
the godbus fix is now https://github.com/godbus/dbus/commit/37252881b3a87eaa2eb04b0ff2211f54f45199ab

I've asked about updating standalone golang-github-godbus-dbus package in Fedora in bug 1463511, but we also need the godbus bundled with oci-register-machine to be updated.

Comment 4 Daniel Walsh 2017-06-26 18:24:19 UTC
Dan, could you verify that
https://github.com/projectatomic/oci-register-machine/pull/31
fixes your issues?

Comment 5 Daniel Walsh 2017-06-28 19:39:39 UTC
Dan, I vendored the new go-dbus bindings into oci-register-machine.

oci-register-machine-0-3.9.git76fc0b3.fc27 is building in Rawhide

Comment 6 Daniel Walsh 2017-06-28 19:40:14 UTC
Lokesh, can you build a new version for the RHEL 7.4 release?

Comment 7 Dan Horák 2017-07-14 16:28:53 UTC
I've tried a build with the updated source archive and it fails - https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=13659994

from build.log
...
Executing(%build): /bin/sh -e /var/tmp/rpm-tmp.4xgrte
+ umask 022
+ cd /builddir/build/BUILD
+ cd oci-register-machine-76fc0b3e8cb7f838fe852b8a5b61920db2c05920
+ mkdir -p src/github.com/projectatomic
+ ln -s ../../../ src/github.com/projectatomic/oci-register-machine
++ pwd
++ pwd
+ export GOPATH=/builddir/build/BUILD/oci-register-machine-76fc0b3e8cb7f838fe852b8a5b61920db2c05920:/builddir/build/BUILD/oci-register-machine-76fc0b3e8cb7f838fe852b8a5b61920db2c05920/Godeps/_workspace:/usr/share/gocode
+ GOPATH=/builddir/build/BUILD/oci-register-machine-76fc0b3e8cb7f838fe852b8a5b61920db2c05920:/builddir/build/BUILD/oci-register-machine-76fc0b3e8cb7f838fe852b8a5b61920db2c05920/Godeps/_workspace:/usr/share/gocode
+ make -j16 build docs
GOPATH=$GOPATH:/usr/share/gocode go build -a -ldflags " -B 0x03268ff1213dbfc1372d7f5ec22c6812f08adead" -o oci-register-machine
go-md2man -in "oci-register-machine.1.md" -out "oci-register-machine.1"
sed -i 's|$HOOKSDIR|/usr/libexec/oci/hooks.d|' oci-register-machine.1
# github.com/godbus/dbus
Godeps/_workspace/src/github.com/godbus/dbus/conn.go:48: undefined: Handler
Godeps/_workspace/src/github.com/godbus/dbus/conn.go:54: undefined: SignalHandler
make: *** [oci-register-machine] Error 2
RPM build errors:
error: Bad exit status from /var/tmp/rpm-tmp.4xgrte (%build)
    Bad exit status from /var/tmp/rpm-tmp.4xgrte (%build)
Child returncode was: 1


Building with unbundled libs in Fedora is OK - https://koji.fedoraproject.org/koji/buildinfo?buildID=912966

Comment 8 Dan Horák 2017-07-14 16:43:08 UTC
And I see it: during the import of the fixed godbus you missed the 3 new files - default_handler.go, server_interfaces.go and transport_unixcred_freebsd.go - I'll file an upstream ticket for it.
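
A check like the following could catch this kind of incomplete import before a build is attempted. It is only a sketch: missingFiles is a hypothetical helper, and the two listings are abbreviated examples, not the real directory contents. Only the three missing file names, conn.go and transport_unix.go come from this report.

```go
package main

import (
	"fmt"
	"sort"
)

// missingFiles returns, in sorted order, the names that appear in upstream
// but not in vendored. (Hypothetical helper, for illustration only.)
func missingFiles(upstream, vendored []string) []string {
	have := make(map[string]bool, len(vendored))
	for _, name := range vendored {
		have[name] = true
	}
	var missing []string
	for _, name := range upstream {
		if !have[name] {
			missing = append(missing, name)
		}
	}
	sort.Strings(missing)
	return missing
}

func main() {
	// Abbreviated, assumed listings of the upstream and vendored godbus trees.
	upstream := []string{
		"conn.go",
		"default_handler.go",
		"server_interfaces.go",
		"transport_unix.go",
		"transport_unixcred_freebsd.go",
	}
	vendored := []string{"conn.go", "transport_unix.go"}

	for _, name := range missingFiles(upstream, vendored) {
		fmt.Println("missing from vendored copy:", name)
	}
}
```

In practice the two slices would be filled from directory listings of the upstream checkout and of Godeps/_workspace/src/github.com/godbus/dbus.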

Comment 9 Dan Horák 2017-07-14 17:25:49 UTC
and with freshly updated upstream it builds again - https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=13660326

Comment 10 Daniel Walsh 2017-07-17 20:38:24 UTC
Lokesh, we have a fix for this in Fedora,
oci-register-machine-0-3.10.gitcbf1b8f.fc27. Can you build it for RHEL?

Comment 11 Lokesh Mandvekar 2017-07-17 20:57:25 UTC
ack, i'll check with frantisek.

Comment 12 Josh Boyer 2017-07-25 20:19:58 UTC
(In reply to Lokesh Mandvekar from comment #11)
> ack, i'll check with frantisek.

I don't think this was ever built.  Would it be possible to get a build soon?

Comment 16 errata-xmlrpc 2017-10-19 15:18:51 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:2962