Bug 1294543

Summary: Drop/demote dependency on simplejson
Product: [Fedora] Fedora
Reporter: Ville Skyttä <ville.skytta>
Component: python-fedora
Assignee: Fedora Infrastructure SIG <infra-sig>
Status: CLOSED RAWHIDE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: rawhide
CC: cheese, infra-sig, jonstanley, lmacken, pingou, rbean, relrod, ricky
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-01-04 15:03:51 UTC Type: Bug

Description Ville Skyttä 2015-12-28 22:23:48 UTC
python-fedora's dependency on simplejson is not a hard dependency; in its absence it'll fall back to json.
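
For illustration, a soft dependency of this kind is usually written as a guarded import. The following is a minimal sketch of that pattern, not python-fedora's actual code; the `roundtrip` helper is hypothetical and exists only to show that either module serves the same calls.

```python
# Hypothetical sketch of an optional-dependency import;
# this is NOT copied from python-fedora.
try:
    import simplejson as json  # used only when simplejson is installed
except ImportError:
    import json  # standard library fallback


def roundtrip(obj):
    """Serialize obj to JSON text and parse it back (illustrative helper)."""
    return json.loads(json.dumps(obj))


# Either module exposes the same dumps()/loads() entry points.
print(json.__name__, roundtrip({"a": [1, 2, 3]}))
```

Because both modules expose `dumps()` and `loads()`, the calling code does not need to know which one was imported.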

I'm not sure if the use of simplejson is warranted at all nowadays, perhaps it should be removed upstream and plain json used instead?

If not, the dependency in the package should be demoted from Requires to at least Recommends if not Suggests.

Comment 1 Pierre-YvesChibon 2015-12-29 08:44:49 UTC
AFAIK simplejson is still the fastest JSON parser out there for python.

Is there an issue with Requiring it? Not that I am opposed to Recommends but merely curious of the issue.

Comment 2 Ville Skyttä 2015-12-29 09:54:21 UTC
(In reply to Pierre-YvesChibon from comment #1)
> AFAIK simplejson is still the fastest JSON parser out there for python.

Pretty much all benchmarks that I've found show that the builtin json is on par with or beats simplejson on Python >= 2.7; only on Python 2.6 is simplejson clearly faster. There are also reports of simplejson having, or having had, issues that the builtin one does not, for example with bytes vs. str on Python 3.

Here are a few sillyish microbenchmarks adapted from https://code.djangoproject.com/ticket/18023, run on my box:

$ python -m timeit -s "from json import loads, dumps" "loads(dumps(list(range(32))))"
100000 loops, best of 3: 14.7 usec per loop
$ python3 -m timeit -s "from json import loads, dumps" "loads(dumps(list(range(32))))"
10000 loops, best of 3: 20.9 usec per loop

$ python -m timeit -s "from simplejson import loads, dumps" "loads(dumps(list(range(32))))"
10000 loops, best of 3: 18.7 usec per loop
$ python3 -m timeit -s "from simplejson import loads, dumps" "loads(dumps(list(range(32))))"
10000 loops, best of 3: 31.2 usec per loop

$ python -m timeit -s "from json import loads, dumps" "loads(dumps(dict(enumerate('abcdefghijklmno'))))"
10000 loops, best of 3: 24.8 usec per loop
$ python3 -m timeit -s "from json import loads, dumps" "loads(dumps(dict(enumerate('abcdefghijklmno'))))"
10000 loops, best of 3: 22.9 usec per loop

$ python -m timeit -s "from simplejson import loads, dumps" "loads(dumps(dict(enumerate('abcdefghijklmno'))))"
10000 loops, best of 3: 21.2 usec per loop
$ python3 -m timeit -s "from simplejson import loads, dumps" "loads(dumps(dict(enumerate('abcdefghijklmno'))))"
10000 loops, best of 3: 28.1 usec per loop

json is faster in all of the above except the Python 2 dict case, where simplejson is slightly ahead (21.2 vs. 24.8 usec). I also modified https://gist.github.com/lightcatcher/1136415 so that it works on Python 3, runs 100000 iterations instead of 10000, and compares only json and simplejson. Results on my F-23 box:

    $ python benchmark.py 
    JSON Benchmark
    2.7.10 (default, Sep  8 2015, 17:20:17) 
    [GCC 5.1.1 20150618 (Red Hat 5.1.1-4)]
    -----------------------------
    ENCODING
    json: 6.305129s
    simplejson: 2.511114s

    DECODING
    json: 2.86301s
    simplejson: 4.539282s

    $ python3 benchmark.py 
    JSON Benchmark
    3.4.3 (default, Jun 29 2015, 12:16:01) 
    [GCC 5.1.1 20150618 (Red Hat 5.1.1-4)]
    -----------------------------
    ENCODING
    json: 3.667106s
    simplejson: 3.939518s

    DECODING
    json: 3.440348s
    simplejson: 5.340811999999998s

So with 2.7, encoding was faster with simplejson; in all other cases json beat it. This was with simplejson 3.5.3; 3.8.1 produces similar results.
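
A comparison along these lines can be approximated with the stdlib timeit module. The following is only a rough sketch and not the modified gist used for the numbers above: the `bench` function, its payload, and the iteration count are assumptions chosen for illustration, and simplejson is skipped if it is not installed.

```python
import importlib
import timeit


def bench(module_name, number=10000):
    """Time dumps() and loads() for the named module.

    Returns (encode_seconds, decode_seconds), or None if the module
    cannot be imported. Payload and iteration count are illustrative.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    payload = {
        "numbers": list(range(32)),
        "letters": dict(enumerate("abcdefghijklmno")),
    }
    text = module.dumps(payload)
    encode = timeit.timeit(lambda: module.dumps(payload), number=number)
    decode = timeit.timeit(lambda: module.loads(text), number=number)
    return encode, decode


for name in ("json", "simplejson"):
    result = bench(name)
    if result is None:
        print("%s: not installed" % name)
    else:
        print("%s: encode %.6fs, decode %.6fs" % ((name,) + result))
```

Absolute timings will vary by machine and Python version, so only the relative ordering on a given box is meaningful.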

> Is there an issue with Requiring it? Not that I am opposed to Recommends but
> merely curious of the issue.

I'd guess JSON processing performance in python-fedora is pretty much a non-issue to begin with, especially given the minuscule performance differences between json and simplejson. Bugs that the builtin one doesn't have matter more. Also, simplejson appears to be pretty much unmaintained in Fedora; at the moment rawhide is at 3.5.3 and upstream at 3.8.1, and the maintainer hasn't acted on the update notifications in almost 1.5 years now.

https://bugzilla.redhat.com/show_bug.cgi?id=1124246
https://github.com/simplejson/simplejson/blob/master/CHANGES.txt

Comment 3 Pierre-YvesChibon 2015-12-29 10:23:23 UTC
On Python 3 I agree that the stdlib json module is at least as good; I should have specified that I was thinking of Python 2.

That being said, we may want to drop simplejson altogether based on these arguments.

Comment 4 Ralph Bean 2016-01-04 15:03:51 UTC
Dropped from the rawhide branch here:  http://pkgs.fedoraproject.org/cgit/python-fedora.git/commit/?id=896e0b815fd0cddcafd0ce5d23258d1b02abf564

It'll go out to the other branches once the next upstream release is ready.  :)  Thanks all!