Bug 1042938 - node-web-proxy crashes on send call to a closed socket
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: OpenShift Online
Classification: Red Hat
Component: Containers
Version: 2.x
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Assignee: Vincent Batts
QA Contact: libra bugs
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2013-12-13 16:36 UTC by Andy Grimm
Modified: 2016-11-08 03:47 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-01-30 00:53:22 UTC
Target Upstream Version:



Description Andy Grimm 2013-12-13 16:36:26 UTC
Description of problem:

We are seeing frequent restarts of node-web-proxy on some nodes, with the following error in the supervisor log:

1386949409546:ERROR:[13/Dec/2013:10:43:29 -0500] - io_timeout: 300
1386949409547:ERROR:[13/Dec/2013:10:43:29 -0500] - Error listener on proxied request: Error: socket hang up
    at createHangUpError (http.js:1424:15)
    at Socket.socketCloseListener (http.js:1474:23)
    at Socket.EventEmitter.emit (events.js:117:20)
    at TCP.close (net.js:451:12)

Version-Release number of selected component (if applicable):

openshift-origin-node-proxy-1.17.1-1.el6oso.noarch

Additional info:

It appears that we set the socket timeout on the incoming request in the _requestHandler function in openshift-node-web-proxy/lib/proxy/ProxyServer.js, but we don't set a timeout handler on it.  We only set a timeout handler on the proxy request to the backend.  If the two timeouts are equal, the incoming request socket will always time out before the proxy request socket does, so we either need a handler there or a longer timeout on the incoming side (and that timeout may need to be much longer to account for app unidling).
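A minimal sketch of the missing handler, assuming the incoming request socket is reachable as req.socket inside the request handler (the function name, IO_TIMEOUT constant, and cleanup choice below are illustrative, not the actual ProxyServer.js code):

```javascript
// Hypothetical sketch: attach a 'timeout' handler to the incoming
// request socket so an idle connection gets torn down deliberately
// instead of lingering with no listener.
// IO_TIMEOUT is assumed to mirror the io_timeout: 300 config value.
const IO_TIMEOUT = 300 * 1000; // 300 seconds, in milliseconds

function handleRequest(req, res) {
  req.socket.setTimeout(IO_TIMEOUT);
  req.socket.on('timeout', function () {
    // Destroy the idle connection rather than leaving it half-open.
    req.socket.destroy();
  });
  // ... proxy the request to the backend as before ...
}
```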

Comment 1 Andy Grimm 2013-12-13 16:45:22 UTC
I completely misinterpreted the log.  The error above is not what causes the crash.  The relevant error is:

/opt/rh/nodejs010/root/usr/lib/node_modules/ws/lib/WebSocket.js:175
    else throw new Error('not opened');
         ^
Error: not opened
    at WebSocket.send (/opt/rh/nodejs010/root/usr/lib/node_modules/ws/lib/WebSocket.js:175:16)
    at WebSocket.<anonymous> (/opt/rh/nodejs010/root/usr/lib/node_modules/openshift-node-web-proxy/lib/proxy/ProxyServer.js:494:8)
    at WebSocket.EventEmitter.emit (events.js:98:17)
    at Receiver.self._receiver.ontext (/opt/rh/nodejs010/root/usr/lib/node_modules/ws/lib/WebSocket.js:555:10)
    at Receiver.opcodes.1.finish (/opt/rh/nodejs010/root/usr/lib/node_modules/ws/lib/Receiver.js:397:14)
    at Receiver.expectHandler (/opt/rh/nodejs010/root/usr/lib/node_modules/ws/lib/Receiver.js:384:31)
    at Receiver.add (/opt/rh/nodejs010/root/usr/lib/node_modules/ws/lib/Receiver.js:93:24)
    at Socket.firstHandler (/opt/rh/nodejs010/root/usr/lib/node_modules/ws/lib/WebSocket.js:535:22)
    at Socket.EventEmitter.emit (events.js:95:17)
    at Socket.<anonymous> (_stream_readable.js:720:14)
DEBUG: Program node /opt/rh/nodejs010/root/usr/lib/node_modules/openshift-node-web-proxy/bin/web-proxy.js --config /etc/openshift/web-proxy-config.json exited with code 8

Comment 2 Andy Grimm 2013-12-16 19:50:32 UTC
I'm collecting more data about this. I have tcpdump data from one crash, and I've put the proxy on that node in debug mode to capture additional log data.  The process dies about twice an hour, so I should have more details soon.

Comment 3 Andy Grimm 2013-12-18 21:35:00 UTC
Hmm, this appears to duplicate https://bugzilla.redhat.com/show_bug.cgi?id=1034077

but that fix was released into production and we are still seeing some issues here.  I need to follow up to confirm exactly when that fix landed on our systems.

Comment 4 Andy Grimm 2013-12-19 14:32:25 UTC
Ahh, looking more closely at the traceback, this is not the same issue.  It's the other side of the connection.  It looks like proxy_ws.on('message', ...) in ProxyServer.js needs a fix similar to the one applied to ws.on('message', ...) in the earlier issue.
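A hedged sketch of the kind of guard the earlier fix applied, adapted to the proxy-side message handler (the function and variable names are illustrative; the crashing send is at ProxyServer.js line 494 per the traceback above):

```javascript
// Hypothetical guard: only forward a frame if the destination
// WebSocket is still open; the ws module's send() throws
// "not opened" once the peer has closed the socket.
// OPEN mirrors the ws module's WebSocket.OPEN readyState constant.
const OPEN = 1;

function forwardMessage(destWs, data) {
  if (destWs.readyState === OPEN) {
    destWs.send(data);
    return true;
  }
  // Peer already closed; drop the frame instead of crashing.
  return false;
}
```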

Comment 6 Vincent Batts 2014-01-07 18:53:24 UTC
I think this primarily needs an error handler, so we can log errors without crashing the whole proxy.  But I am quite curious why the connections are being reset.
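The suggestion above amounts to registering an 'error' listener, since an unhandled 'error' event on a Node EventEmitter is thrown and takes down the process; a minimal sketch (the helper and logger names are illustrative):

```javascript
// Hypothetical sketch: with no 'error' listener registered, Node
// throws the error and the process exits (as seen with exit code 8);
// with a listener, the error is simply logged and the proxy lives on.
function attachErrorLogger(ws, log) {
  ws.on('error', function (err) {
    // Log and keep the proxy alive instead of crashing.
    log('Error on proxied WebSocket: ' + err.message);
  });
}
```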

Comment 9 Meng Bo 2014-01-15 09:58:02 UTC
Since the bug is hard to reproduce, QE ran regression testing against the existing websocket cases; no regression issues were found.

Moving the bug to verified.

