Bug 975395

Summary: On Origin on Fedora 19 installation, OpenShift::NodeException: Could not connect to ActiveMQ Server: SIGTERM
Product: OKD
Reporter: Jan Pazdziora <jpazdziora>
Component: Pod
Assignee: Krishna Raman <kraman>
Status: CLOSED CURRENTRELEASE
QA Contact: libra bugs <libra-bugs>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 2.x
CC: dmcphers, jpazdziora, mfisher
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2014-03-12 20:33:06 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Jan Pazdziora 2013-06-18 11:17:10 UTC
Description of problem:

With a fresh installation of OpenShift Origin on Fedora 19 (nightly as of today), running oo-diagnostics appears to get stuck in the

   test_node_profiles_districts_from_broker

step. The oo-diagnostics process blocks after

INFO: broker application cache permissions appear fine
INFO: running: test_node_profiles_districts_from_broker
INFO: checking node profiles via MCollective

at

recvfrom(14, "=)\201\203\0\1\0\0\0\1\0\0\6broker\7example\3com\3"..., 2048, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("10.16.96.113")}, [16]) = 116
poll([{fd=14, events=POLLIN}], 1, 4999) = 1 ([{fd=14, revents=POLLIN}])
ioctl(14, FIONREAD, [116])              = 0
recvfrom(14, "\263\261\201\203\0\1\0\0\0\1\0\0\6broker\7example\3com\3"..., 1932, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("10.16.96.113")}, [16]) = 116
close(14)                               = 0
uname({sys="Linux", node="cloud-qe-2.idm.lab.bos.redhat.com", ...}) = 0
futex(0x63746d4, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x63746d0, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1
futex(0x6374750, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x1a5bf64, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x1a5bf60, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1
futex(0x1a5bf30, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x1a5c634, FUTEX_WAIT_BITSET_PRIVATE, 47, {6941, 514163964}, ffffffff

for multiple seconds (possibly minutes), and the sequence then repeats. When I kill the process, it says

FAIL: rescue in block in run_tests
error running test_node_profiles_districts_from_broker: #<OpenShift::NodeException: Could not connect to ActiveMQ Server: SIGTERM>
INFO: running: test_broker_accept_scripts
INFO: running oo-accept-broker
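
For reference, the leading bytes of the DNS replies captured in the trace above can be decoded by hand. The sketch below (Python, not part of the original report) unpacks the fixed 12-byte header of the first reply. The byte string is transcribed from the strace octal escapes, and the function name decode_dns_header is made up for illustration. strace truncates the payload right after the "com" label, so the full name being queried is not visible; the result code may belong to a search-domain lookup and does not by itself explain the hang.

```python
import struct

# First 12 bytes of the first recvfrom() payload in the trace, with the
# strace octal escapes (\201\203\0\1 ...) rewritten as Python hex escapes.
header = b"=)\x81\x83\x00\x01\x00\x00\x00\x01\x00\x00"


def decode_dns_header(data):
    """Unpack the fixed 12-byte DNS message header (RFC 1035, section 4.1.1)."""
    ident, flags, qd, an, ns, ar = struct.unpack("!6H", data[:12])
    return {
        "id": ident,
        "flags": flags,
        "is_response": bool(flags & 0x8000),  # QR bit
        "rcode": flags & 0x000F,              # 0 = no error, 3 = name error
        "questions": qd,
        "answers": an,
        "authority": ns,
        "additional": ar,
    }


info = decode_dns_header(header)
print(info)
```

Decoded this way, the header carries flags 0x8183: a response with RCODE 3 (name error) and no answer records.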

I tried running service activemq restart, but it did not help. Nothing in /var/log/activemq/* points to an error.
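
A quick way to tell "ActiveMQ is not reachable at all" apart from "the connection opens but the client then hangs" is a plain TCP connect with a timeout. This is only a sketch: the helper name can_connect is hypothetical, and the host and port in the usage comment (localhost, 61613, ActiveMQ's usual STOMP port) are assumptions, since the report does not say which host and port MCollective is configured to use.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


# Example (host and port are assumptions, adjust to the local configuration):
#   print(can_connect("localhost", 61613))
```

A True result only shows that something accepts connections on that port; it says nothing about whether the STOMP login succeeds.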

Version-Release number of selected component (if applicable):

OpenShift Origin nightly on Fedora 19 as of today.

How reproducible:

Deterministic; I have seen this on multiple installations. I did not see it yesterday.

Steps to Reproduce:
1. Install the OpenShift Origin nightly on Fedora 19.
2. Run oo-diagnostics -v -w 1
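
When reproducing this unattended, the command from step 2 can be wrapped so that a hang turns into a reported timeout instead of blocking forever. The following is a minimal sketch: the helper name run_with_timeout and the 300-second limit in the usage comment are arbitrary choices for illustration, not something the report specifies.

```python
import subprocess


def run_with_timeout(cmd, timeout_s):
    """Run cmd and return (exit_code, output); exit_code is None on timeout."""
    try:
        done = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # merge stderr into the captured output
            text=True,
            timeout=timeout_s,
        )
        return done.returncode, done.stdout
    except subprocess.TimeoutExpired as exc:
        # subprocess.run() has already killed the child at this point.
        out = exc.output or ""
        if isinstance(out, bytes):
            out = out.decode(errors="replace")
        return None, out


# Example (the time limit is an arbitrary assumption):
#   code, out = run_with_timeout(["oo-diagnostics", "-v", "-w", "1"], 300)
#   print("timed out" if code is None else "exit code %d" % code)
```

The last lines of the captured output show which test was running when the limit was hit.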

Actual results:

The process gets "stuck" at or around the test_node_profiles_districts_from_broker step.

Expected results:

oo-diagnostics goes through all the steps without getting stuck anywhere, as shown in bug 974497.

Additional info:

Comment 1 Jan Pazdziora 2013-06-18 11:18:14 UTC
Actually, I can see the same behaviour on an installation with a separate node machine and a second machine hosting all the other services.