Description of problem:
Watchman's jboss plugin fails with "invalid byte sequence in UTF-8" if a jboss log contains ISO-8859-1 bytes that are not valid UTF-8 (such as \xe9).

Version-Release number of selected component (if applicable):
openshift-origin-node-util-1.33.4-1.el6oso.noarch

How reproducible:
Always

Steps to Reproduce:
1. Create a jbossews app
2. echo -e '\xe9' >> ~/app-root/logs/jbossews.log

Actual results:
watchman will eventually print something like this to /var/log/messages:

Feb 9 17:41:38 ex-std-nodeXXX watchman[217144]: Unhandled exception (invalid byte sequence in UTF-8) from Watchman plugin #<JbossPlugin:0x00000002c277a0>: invalid byte sequence in UTF-8

Expected results:
watchman should ignore invalid characters in the log file.

Additional info:
This has been dealt with in two different ways in the past. It was first fixed in https://bugzilla.redhat.com/show_bug.cgi?id=1023576 by treating the file as binary (opening it with 'rb'). When the code was refactored and moved into origin-server, the 'b' was lost, and the bug came up again as https://bugzilla.redhat.com/show_bug.cgi?id=1059804

That time, it was fixed by opening the file with "r:utf-8", but that only works as long as all of the byte sequences are valid UTF-8, which we cannot control.

I'd suggest either going back to "rb", or doing something like:

    File.open(log, 'r:utf-8').each_line do |event|
      next unless event.valid_encoding? && event =~ / java.lang.OutOfMemoryError/
      ...

In my tests, this is about 15-20% more expensive than grep, but it works.
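To make the two suggestions above concrete, here is a minimal standalone sketch comparing them. It is not the actual Watchman plugin code: the helper names (count_oom_skipping_invalid, count_oom_binary), the OOM_PATTERN constant, and the sample log lines are all hypothetical and exist only for illustration.

```ruby
require 'tempfile'

# Hypothetical pattern modelled on the one quoted in the report.
OOM_PATTERN = / java\.lang\.OutOfMemoryError/

# Option 1: read as UTF-8 and skip any line that is not valid UTF-8.
def count_oom_skipping_invalid(log_path)
  count = 0
  File.open(log_path, 'r:utf-8') do |f|
    f.each_line do |event|
      # Checking valid_encoding? first avoids matching against a broken string.
      next unless event.valid_encoding? && event =~ OOM_PATTERN
      count += 1
    end
  end
  count
end

# Option 2: read as binary, so bytes are never interpreted as UTF-8.
def count_oom_binary(log_path)
  count = 0
  File.open(log_path, 'rb') do |f|
    f.each_line do |event|
      count += 1 if event =~ OOM_PATTERN
    end
  end
  count
end

# Build a sample log containing a stray ISO-8859-1 byte (\xE9).
log = Tempfile.new('jbossews-sample')
begin
  log.binmode
  log.write("INFO started\n".b)
  log.write("caf\xE9 java.lang.OutOfMemoryError\n".b)
  log.write("ERROR java.lang.OutOfMemoryError: Java heap space\n".b)
  log.flush

  puts count_oom_skipping_invalid(log.path)
  puts count_oom_binary(log.path)
ensure
  log.close
  log.unlink
end
```

One difference worth noting: the valid_encoding? variant silently drops any line that contains an invalid byte, so an OutOfMemoryError reported on such a line would be missed, while the 'rb' variant still matches it. Which behaviour is preferable for watchman is a judgment call that this sketch does not settle.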
https://github.com/openshift/origin-server/pull/6073
Checked on devenv_5430. After inserting the invalid ISO-8859-1 byte "\xe9" into the jboss log, watchman no longer reports an unhandled exception, and its existing features still work. Moving the bug to verified.