Bug 1004072

Summary: The Node parent generates a file called units.json.gz, but the file does not contain valid json
Product: [Retired] Pulp
Reporter: Randy Barlow <rbarlow>
Component: nodes
Assignee: Jeff Ortel <jortel>
Status: CLOSED NOTABUG
QA Contact: Preethi Thomas <pthomas>
Severity: medium
Priority: high
Version: 2.2 Beta
CC: skarmark
Keywords: Triaged
Target Release: 2.3.0
Hardware: x86_64
OS: Linux
Doc Type: Bug Fix
Last Closed: 2013-09-12 16:26:10 UTC
Type: Bug

Description Randy Barlow 2013-09-03 20:43:37 UTC
The units.json.gz file contains one JSON document per line, rather than a single JSON document for the whole file. The .json extension is misleading in this case.

It would perhaps be an improvement if the units.json file contained a single JSON object with an attribute called 'units' holding a list of units. Currently, json.load() cannot parse the file:

$ python
Python 2.6.6 (r266:84292, May 27 2013, 05:35:12) 
[GCC 4.4.7 20120313 (Red Hat 4.4.7-3)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import json
>>> with open('units.json') as f:
...     a = json.load(f)
... 
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "/usr/lib64/python2.6/json/__init__.py", line 267, in load
    parse_constant=parse_constant, **kw)
  File "/usr/lib64/python2.6/json/__init__.py", line 307, in loads
    return _default_decoder.decode(s)
  File "/usr/lib64/python2.6/json/decoder.py", line 322, in decode
    raise ValueError(errmsg("Extra data", s, end, len(s)))
ValueError: Extra data: line 2 column 1 - line 82 column 1 (char 858 - 2527719)
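
To make the difference between the two layouts concrete, here is a minimal sketch. The unit contents and the "unit_id" field are made up for illustration and are not taken from a real units.json file; wrap_units() is a hypothetical helper, not part of Pulp.

```python
import json

# Two hypothetical units, one JSON document per line (the layout described above).
# The "unit_id" field is an assumption made for this example only.
line_delimited = '{"unit_id": "a"}\n{"unit_id": "b"}\n'


def parses_as_single_document(text):
    """Return True if text is exactly one JSON document."""
    try:
        json.loads(text)
    except ValueError:  # on Python 3, json.JSONDecodeError subclasses ValueError
        return False
    return True


def wrap_units(text):
    """Re-encode line-delimited units as one object with a 'units' list."""
    units = [json.loads(line) for line in text.splitlines() if line.strip()]
    return json.dumps({"units": units})


wrapped = wrap_units(line_delimited)
```

With these definitions, the line-delimited text is rejected as a single document (the "Extra data" case above), while the wrapped form parses. Note that wrap_units() holds every unit in memory at once.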

Comment 1 Jeff Ortel 2013-09-09 16:57:37 UTC
This is by design.  Each content unit (document) is written on its own line so that the file can be read and processed one unit at a time, without loading the entire file into memory.  As for the extension ... the document does contain JSON.  Do you have a better suggestion?
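
A rough sketch of the line-at-a-time reading described in this comment, assuming the file is gzip-compressed UTF-8 text with one JSON document per line. iter_units() is a hypothetical helper written for this illustration, not the actual Pulp nodes code, and the sample units are invented.

```python
import gzip
import json
import os
import tempfile


def iter_units(path):
    """Yield one decoded unit per non-empty line of a gzipped file."""
    with gzip.open(path, "rb") as fp:
        for raw in fp:  # only one line is held in memory at a time
            line = raw.decode("utf-8").strip()
            if line:
                yield json.loads(line)


# Build a small sample file; the "unit_id" field is made up for this example.
path = os.path.join(tempfile.mkdtemp(), "units.json.gz")
with gzip.open(path, "wb") as fp:
    fp.write(b'{"unit_id": "a"}\n{"unit_id": "b"}\n')

unit_ids = [unit["unit_id"] for unit in iter_units(path)]
```

Because iter_units() is a generator, a caller can process each unit and discard it before the next line is read, which is the memory behaviour the comment is after.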