Bug 1020946

Summary: websockets.log format breaks oo-last-access
Product: OpenShift Online Reporter: Andy Grimm <agrimm>
Component: ContainersAssignee: Jhon Honce <jhonce>
Status: CLOSED NOTABUG QA Contact: libra bugs <libra-bugs>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 2.xCC: jgoulding
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2013-10-18 16:00:05 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Andy Grimm 2013-10-18 14:48:24 UTC
A line like this breaks oo-last-access, which expects a number instead of "NaN":

1.2.3.4 foo-bar.rhcloud.com [23/Aug/2013:03:09:46 -0400] "GET / HTTP/1.1" "Websocket RFC-6455/13" "NaN" "msgs:0,0 bytes:0,0" (3ms)

The result is this traceback:

/usr/sbin/oo-last-access:111:in `strptime': invalid date (ArgumentError)
	from /usr/sbin/oo-last-access:111:in `block (3 levels) in parse_log'
	from /usr/sbin/oo-last-access:99:in `each'
	from /usr/sbin/oo-last-access:99:in `block (2 levels) in parse_log'
	from /usr/sbin/oo-last-access:98:in `popen'
	from /usr/sbin/oo-last-access:98:in `block in parse_log'
	from /usr/sbin/oo-last-access:97:in `each'
	from /usr/sbin/oo-last-access:97:in `parse_log'
	from /usr/sbin/oo-last-access:166:in `block in <main>'
	from /usr/sbin/oo-last-access:148:in `call'
	from /usr/sbin/oo-last-access:148:in `single_instance'
	from /usr/sbin/oo-last-access:154:in `<main>'

The implication is that the auto-idler will not have up-to-date information and will mistakenly idle apps.

Comment 1 Andy Grimm 2013-10-18 15:14:00 UTC
I may have misdiagnosed this.  The parsing of websockets.log might just be entirely broken.  The results from the awk command seem to give gibberish for all files I've tried.

Comment 2 Andy Grimm 2013-10-18 15:17:34 UTC
access.log format:

1.2.3.4 foo-bar.rhcloud.com - - [19/Dec/2012:05:47:32 -0500] "GET / HTTP/1.1" 500 48 "-" "Mozilla/5.0 (Windows NT 5.1; rv:17.0) Gecko/20100101 Firefox/17.0" (7ms) -

websockets.log format:

1.2.3.4 foo-bar.rhcloud.com [23/Aug/2013:03:09:46 -0400] "GET / HTTP/1.1" "Websocket RFC-6455/13" "NaN" "msgs:0,0 bytes:0,0" (3ms)

The latter is lacking two fields prior to the date, which messes everything up.  However, we are only seeing this on 2/3 of our nodes, so it may be a config issue specific to Online.