Bug 1254429

Summary: vdsm logs unicode literals explictly
Product: Red Hat Enterprise Virtualization Manager Reporter: Pavel Zhukov <pzhukov>
Component: vdsmAssignee: Nobody <nobody>
Status: CLOSED WONTFIX QA Contact: Aharon Canan <acanan>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 3.5.4CC: ahoness, amureini, bazulay, danken, ecohen, gklein, lpeer, lsurette, oourfali, pkliczew, pzhukov, ycui, yeylon
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard: infra
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2015-10-07 08:55:47 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Infra RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Pavel Zhukov 2015-08-18 06:58:17 UTC
Description of problem:
Before we had logs in format  which could be parsed by almost any json parsers like this:
2015-08-04 10:04:08,001::logUtils::47::dispatcher::(wrapper) Run and protect: repoStats, Return response: {'37f6b4bf-85e5-4098-899c-a83be5fe1667': {'code': 0, 'version': 3, 'acquired': True, 'delay': '0.000351651', 'lastCheck': '9.9', 'valid': True}, 'fec4c8a1-78aa-4a18-8b9e-703efaeb7606': {'code': 0, 'version': 3, 'acquired': True, 'delay': '0.000428059', 'lastCheck': '4.7', 'valid': True}, '61812dbb-db41-4a1c-9dae-7c17ee546453': {'code': 0, 'version': 3, 'acquired': True, 'delay': '0.00030616', 'lastCheck': '4.5', 'valid': True}, '4e276476-f71a-4c7e-a280-c2514cf079e0': {'code': 0, 'version': 3, 'acquired': True, 'delay': '0.000420733', 'lastCheck': '1.5', 'valid': True}, '3da5f396-6a2c-493b-bf8e-c3bc37544837': {'code': 0, 'version': 3, 'acquired': True, 'delay': '0.000250344', 'lastCheck': '3.5', 'valid': True}, '1685fca2-551b-48bb-a4dd-1666711a0fd7': {'code': 0, 'version': 0, 'acquired': True, 'delay': '0.000207157', 'lastCheck': '9.9', 'valid': True}, '03d33acc-1612-42f5-883a-843fb671e466': {'code': 0, 'version': 3, 'acquired': True, 'delay': '0.000353145', 'lastCheck': '8.4', 'valid': True}, 'cbf69ee9-2f61-41b6-9ae1-60262bceed72': {'code': 0, 'version': 3, 'acquired': True, 'delay': '0.000479004', 'lastCheck': '1.6', 'valid': True}, '7afa9422-9949-49fa-b531-2f926865f093': {'code': 0, 'version': 3, 'acquired': True, 'delay': '0.00126919', 'lastCheck': '4.5', 'valid': True}, '5844229a-1667-4c76-afdb-23c256219fde': {'code': 0, 'version': 3, 'acquired': True, 'delay': '0.000393182', 'lastCheck': '0.0', 'valid': True}, '193915ef-a870-4d12-b206-7db94a2d12c3': {'code': 0, 'version': 3, 'acquired': True, 'delay': '0.00043152', 'lastCheck': '1.6', 'valid': True}, '31efc1df-32d7-404a-bac8-25aadf598517': {'code': 0, 'version': 3, 'acquired': True, 'delay': '0.000420744', 'lastCheck': '0.6', 'valid': True}}

Now vdsm writes logs with Unicode prefix:
2015-08-17 08:00:09,300::logUtils::47::dispatcher::(wrapper) Run and protect: repoStats, Return response: {u'2964c048-3787-4be8-8de4-a06604491190': {'code': 0, 'actual': True, 'version': 3, 'acquired': True, 'delay': '0.048734', 'lastCheck': '7.5', 'valid': True}, u'd6a71de5-d51e-4693-bded-6d049b0eb1cf': {'code': 0, 'actual': True, 'version': 3, 'acquired': True, 'delay': '0.00920437', 'lastCheck': '0.0', 'valid': True}, u'6c0ccdb5-d853-4c8b-aeea-70d9630a2d26': {'code': 0, 'actual': True, 'version': 3, 'acquired': True, 'delay': '0.0759361', 'lastCheck': '3.0', 'valid': True}, u'd0569de2-c8dc-4d70-a966-e765a2d26f6e': {'code': 0, 'actual': True, 'version': 3, 'acquired': True, 'delay': '0.0672134', 'lastCheck': '7.2', 'valid': True}, u'dbeedcac-3cc2-48fe-bd0a-a8c525e24ab3': {'code': 0, 'actual': True, 'version': 3, 'acquired': True, 'delay': '0.0174146', 'lastCheck': '7.5', 'valid': True}}

This broke all scripts and made debug even worst than before.

Version-Release number of selected component (if applicable):
vdsm-4.16.24-2.el7ev.x86_64

How reproducible:
100%

Actual results:
Traceback (most recent call last):
  File "lastdelayavg.py", line 17, in <module>
    json_raw = json.loads(json_string)
  File "/usr/lib64/python2.7/json/__init__.py", line 338, in loads
    return _default_decoder.decode(s)
  File "/usr/lib64/python2.7/json/decoder.py", line 366, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib64/python2.7/json/decoder.py", line 382, in raw_decode
    obj, end = self.scan_once(s, idx)
ValueError: Expecting property name: line 1 column 2 (char 1)


Expected results:


Additional info:

Comment 1 Dan Kenigsberg 2015-08-24 06:37:17 UTC
I believe that this change is due to the move to jsonrpc; there has been no explicit change in vdsm logging to have caused u''.

Pavel, please list the scripts broken by this. I think that we'd better off fixing them than hacking vdsm.log not to mark unicode strings as such.

Comment 2 Pavel Zhukov 2015-08-25 16:50:40 UTC
(In reply to Dan Kenigsberg from comment #1)
> I believe that this change is due to the move to jsonrpc; there has been no
> explicit change in vdsm logging to have caused u''.
> 
> Pavel, please list the scripts broken by this. I think that we'd better off
> fixing them than hacking vdsm.log not to mark unicode strings as such.

It's set of my own script to parse vdsm.log (for performance measures). They have been working since 3.2 and broken now. And I don't think adding u'' improves readability of the logs or size of huge vdsm.log.

This piece of script doesn't work anymore:
        date = ":".join(line.split(":")[0:3]).split(",")[0]
        json_string = line[line.index('{'):]
        json_raw = json.loads(json_string)

Comment 3 Dan Kenigsberg 2015-08-26 11:20:07 UTC
You can post your scripts to vdsm's contrib subdirectory. Would you share a complete function here, so the context is clearer?

Comment 4 Pavel Zhukov 2015-09-01 15:09:26 UTC
(In reply to Dan Kenigsberg from comment #3)
> You can post your scripts to vdsm's contrib subdirectory. Would you share a
> complete function here, so the context is clearer?

I don't have time for review of this Q&D scripts (the contain hardcoded paths, ticket numbers etc). I'm sorry. It's mix of bash/python scripts/snippets.
The only thing I'd like to say by creating this bug that we (GSS) have to parse huge amount of data (vdsm.log.*.xz) to find the reason of problem and the format of the logs was radically changed from (almost) valid JSON to something hardly usable without any notification (and without any reasons).