Bug 1023576 - [rhc-watchman] watchman is dying when parsing invalid byte sequence in UTF-8
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: OpenShift Online
Classification: Red Hat
Component: Containers
Version: 2.x
Hardware: x86_64
OS: Linux
Priority: medium
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: mfisher
QA Contact: libra bugs
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2013-10-25 19:40 UTC by Kenny Woodson
Modified: 2015-05-14 23:31 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-01-24 03:27:04 UTC
Target Upstream Version:
Embargoed:



Description Kenny Woodson 2013-10-25 19:40:40 UTC
Description of problem:
rhc-watchman is struggling to stay alive on a few of our production nodes. We have automation that restarts the process, but the crashes keep recurring. On further investigation, I traced them to this error:

invalid byte sequence in UTF-8 (ArgumentError)

This happens on line 109 of rhc-watchman when it runs the following statement:

File.open(@message_file).grep(/ killed as a result of limit of /).each {|msg|

The .grep call raises the exception, so the script exhausts its exception threshold rather quickly.
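
The failure can be sketched outside of rhc-watchman as follows. This is a minimal illustration, not the script's actual code: the helper name grep_as_utf8 and the temporary file standing in for /var/log/messages are assumptions, and the external encoding is forced to UTF-8 so the result does not depend on the local locale. On Ruby 1.9 and later the grep is expected to raise the same ArgumentError as in the report.

```ruby
require 'tempfile'

# Pattern quoted in the report (rhc-watchman, line 109).
OOM_PATTERN = / killed as a result of limit of /

# Hypothetical helper: read the file as UTF-8 text and grep it,
# mirroring the File.open(...).grep(...) call quoted above.
def grep_as_utf8(path)
  File.open(path, 'r:UTF-8') { |f| f.grep(OOM_PATTERN) }
end

# A throwaway file stands in for /var/log/messages (assumption).
Tempfile.create('messages') do |tmp|
  tmp.binmode
  tmp.write("kernel: Task in /a killed as a result of limit of /a\n")
  # 0xFF and 0xFE never occur in well-formed UTF-8.
  tmp.write("app: garbage \xFF\xFE bytes\n".b)
  tmp.flush

  begin
    grep_as_utf8(tmp.path)
    puts 'no error raised'
  rescue ArgumentError => e
    puts "#{e.class}: #{e.message}"
  end
end
```

Note that the first line is valid and matches; the exception comes from the regexp match against the second line, so a single bad line is enough to abort the whole scan.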

Written to syslog:
Oct 25 15:25:49 ex-std-node264 rhc-watchman[6498]: watchman caught #<ArgumentError: invalid byte sequence in UTF-8>: invalid byte sequence in UTF-8. Retries left: 0
Oct 25 15:28:09 ex-std-node264 rhc-watchman[21731]: Starting rhc-watchman => delay: 20s, exception threshold: 10
Oct 25 15:28:09 ex-std-node264 rhc-watchman[21736]: Starting throttler => throttle at: 30.00%, restore at: 70.00%, period: 120, check_interval: 5.00

Version-Release number of selected component (if applicable):
Current release.

How reproducible:
Always reproducible. Placing an invalid UTF-8 byte sequence in /var/log/messages causes watchman to die.

Steps to Reproduce:
1. Write an invalid UTF-8 byte sequence to syslog.
2. Run rhc-watchman and watch syslog for the ArgumentError noted above.

Actual results:
rhc-watchman dies after 10 consecutive failed attempts to read the file.

Expected results:
rhc-watchman should skip over the bad lines in the file.
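
One way the expected behavior could look, as a rough sketch only: check each line with String#valid_encoding? before matching, and ignore lines that fail the check. The helper name grep_skipping_invalid and the temporary file are hypothetical, and this is not necessarily how rhc-watchman handles it.

```ruby
require 'tempfile'

# Pattern quoted in the report.
OOM_PATTERN = / killed as a result of limit of /

# Hypothetical helper (not taken from rhc-watchman): return the matching
# lines, ignoring any line whose bytes are not well-formed UTF-8.
def grep_skipping_invalid(path, pattern = OOM_PATTERN)
  File.open(path, 'r:UTF-8') do |f|
    # valid_encoding? is checked first, so the regexp is only ever
    # applied to lines that are safe to match.
    f.each_line.select { |line| line.valid_encoding? && line =~ pattern }
  end
end

# Usage with a throwaway file standing in for /var/log/messages.
Tempfile.create('messages') do |tmp|
  tmp.binmode
  tmp.write("kernel: Task in /a killed as a result of limit of /a\n")
  tmp.write("app: garbage \xFF\xFE bytes\n".b)
  tmp.flush
  puts grep_skipping_invalid(tmp.path).length
end
```

The trade-off is that a kill message sharing a line with corrupt bytes would be dropped along with the rest of that line.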

Additional info:
Will provide rmillner with the corresponding logs.

Comment 1 Rob Millner 2013-10-25 21:35:53 UTC
Ruby has known issues matching Unicode strings read in from files against a regexp.
Switching the file I/O to binary mode makes the problem go away.
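
For illustration, a binary-mode read might look like the sketch below. The actual change in the pull request is not reproduced here; the helper name grep_binary and the temporary file are assumptions. Opening with 'rb' tags each line as ASCII-8BIT, so no UTF-8 validity check applies, and the match works because the pattern itself is ASCII-only.

```ruby
require 'tempfile'

# Pattern quoted in the report.
OOM_PATTERN = / killed as a result of limit of /

# Hypothetical helper: 'rb' opens the file in binary mode, so each line
# is tagged ASCII-8BIT and is never rejected as invalid UTF-8.
def grep_binary(path, pattern = OOM_PATTERN)
  File.open(path, 'rb') { |f| f.grep(pattern) }
end

# Usage with a throwaway file standing in for /var/log/messages.
Tempfile.create('messages') do |tmp|
  tmp.binmode
  tmp.write("kernel: Task in /a killed as a result of limit of /a\n")
  tmp.write("app: garbage \xFF\xFE bytes\n".b)
  tmp.flush
  matches = grep_binary(tmp.path)
  puts matches.length
  puts matches.first.encoding == Encoding::BINARY
end
```

Unlike skipping invalid lines, this approach still reports a kill message that shares a line with corrupt bytes. The returned strings are binary, so any later code that needs UTF-8 text would have to re-tag or scrub them.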

Pull request:
https://github.com/openshift/li/pull/2047

Comment 2 openshift-github-bot 2013-10-26 01:48:54 UTC
Commit pushed to master at https://github.com/openshift/li

https://github.com/openshift/li/commit/242d8453e244838294622d04e394bbbd84d7bb80
Bug 1023576 - ruby 1.9 has trouble dealing with unicode strings comparing file input to a regexp.

Comment 3 Meng Bo 2013-10-29 12:11:55 UTC
Tested on devenv_3960. After echoing an invalid string to /var/log/messages, rhc-watchman is still running.

Moving the bug to VERIFIED.

