Bug 1294543 - Drop/demote dependency on simplejson
Summary: Drop/demote dependency on simplejson
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: python-fedora
Version: rawhide
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Assignee: Fedora Infrastructure SIG
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2015-12-28 22:23 UTC by Ville Skyttä
Modified: 2016-01-04 15:03 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-01-04 15:03:51 UTC
Type: Bug
Embargoed:



Description Ville Skyttä 2015-12-28 22:23:48 UTC
python-fedora's dependency on simplejson is not a hard dependency; in its absence it'll fall back to json.
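
For illustration, a soft dependency of this kind is usually written as a guarded import. The sketch below is a generic pattern, not the actual python-fedora code, and the sample payload is made up:

```python
# Generic optional-dependency import; an assumption about the pattern,
# not copied from python-fedora.
try:
    import simplejson as json  # used when it is installed
except ImportError:
    import json  # standard library fallback

# Both modules expose the same dumps/loads entry points,
# so the calling code does not need to know which one was imported.
payload = json.dumps({"package": "python-fedora", "ids": [1, 2, 3]})
data = json.loads(payload)
print(json.__name__, data["ids"])
```

Because the rest of the code only sees the name `json`, removing simplejson from the system changes nothing functionally.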

I'm not sure if the use of simplejson is warranted at all nowadays. Perhaps it should be removed upstream and plain json used instead?

If not, the dependency in the package should be demoted from Requires to at least Recommends if not Suggests.

Comment 1 Pierre-YvesChibon 2015-12-29 08:44:49 UTC
AFAIK simplejson is still the fastest JSON parser out there for python.

Is there an issue with Requiring it? Not that I am opposed to Recommends but merely curious of the issue.

Comment 2 Ville Skyttä 2015-12-29 09:54:21 UTC
(In reply to Pierre-YvesChibon from comment #1)
> AFAIK simplejson is still the fastest JSON parser out there for python.

Pretty much all benchmarks out there that I've found show that the builtin json is on par with or beats simplejson with Python >= 2.7; only for Python 2.6 is simplejson clearly faster. There are also reports of simplejson having (or having had) issues that the builtin one does not, for example with bytes vs str on Python 3.

Here are a few sillyish microbenchmarks adapted from https://code.djangoproject.com/ticket/18023 run on my box:

$ python -m timeit -s "from json import loads, dumps" "loads(dumps(list(range(32))))"
100000 loops, best of 3: 14.7 usec per loop
$ python3 -m timeit -s "from json import loads, dumps" "loads(dumps(list(range(32))))"
10000 loops, best of 3: 20.9 usec per loop

$ python -m timeit -s "from simplejson import loads, dumps" "loads(dumps(list(range(32))))"
10000 loops, best of 3: 18.7 usec per loop
$ python3 -m timeit -s "from simplejson import loads, dumps" "loads(dumps(list(range(32))))"
10000 loops, best of 3: 31.2 usec per loop

$ python -m timeit -s "from json import loads, dumps" "loads(dumps(dict(enumerate('abcdefghijklmno'))))"
10000 loops, best of 3: 24.8 usec per loop
$ python3 -m timeit -s "from json import loads, dumps" "loads(dumps(dict(enumerate('abcdefghijklmno'))))"
10000 loops, best of 3: 22.9 usec per loop

$ python -m timeit -s "from simplejson import loads, dumps" "loads(dumps(dict(enumerate('abcdefghijklmno'))))"
10000 loops, best of 3: 21.2 usec per loop
$ python3 -m timeit -s "from simplejson import loads, dumps" "loads(dumps(dict(enumerate('abcdefghijklmno'))))"
10000 loops, best of 3: 28.1 usec per loop
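
The command lines above can also be driven from one script, which makes it easier to rerun them on another box. This is only a sketch: the function names and the result layout are made up here, and simplejson is reported as missing rather than required.

```python
import timeit

# The two statements timed in the command lines above.
STATEMENTS = {
    "list": "loads(dumps(list(range(32))))",
    "dict": "loads(dumps(dict(enumerate('abcdefghijklmno'))))",
}


def best_usec(module, stmt, number=10000, repeat=3):
    """Best-of-`repeat` time per loop in microseconds, or None if the module is missing."""
    setup = "from %s import loads, dumps" % module
    try:
        timings = timeit.repeat(stmt, setup=setup, number=number, repeat=repeat)
    except ImportError:
        # Raised by the setup statement when the module is not installed.
        return None
    return min(timings) / number * 1e6


def run(number=10000):
    """Time every statement with both modules; keys are (module, label) pairs."""
    results = {}
    for module in ("json", "simplejson"):
        for label, stmt in STATEMENTS.items():
            results[(module, label)] = best_usec(module, stmt, number=number)
    return results


if __name__ == "__main__":
    for (module, label), usec in sorted(run().items()):
        shown = "not installed" if usec is None else "%.1f usec per loop" % usec
        print("%-10s %-4s %s" % (module, label, shown))
```

Run it once with python and once with python3 to get the same four comparisons.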

json is faster in all of the above except the dict case on Python 2, where simplejson is slightly ahead (21.2 vs 24.8 usec). I also modified https://gist.github.com/lightcatcher/1136415 so that it works on Python 3, runs 100000 iterations instead of 10000, and compares only json and simplejson. Results on my F-23 box:

    $ python benchmark.py 
    JSON Benchmark
    2.7.10 (default, Sep  8 2015, 17:20:17) 
    [GCC 5.1.1 20150618 (Red Hat 5.1.1-4)]
    -----------------------------
    ENCODING
    json: 6.305129s
    simplejson: 2.511114s

    DECODING
    json: 2.86301s
    simplejson: 4.539282s

    $ python3 benchmark.py 
    JSON Benchmark
    3.4.3 (default, Jun 29 2015, 12:16:01) 
    [GCC 5.1.1 20150618 (Red Hat 5.1.1-4)]
    -----------------------------
    ENCODING
    json: 3.667106s
    simplejson: 3.939518s

    DECODING
    json: 3.440348s
    simplejson: 5.340811999999998s

So with 2.7, encoding was faster with simplejson; in all other cases json beat it. This was with simplejson 3.5.3; 3.8.1 produces similar results.
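
The modified gist itself is not attached here. As a rough idea of what such a comparison looks like, the following independent sketch times encoding and decoding separately. The payload and the helper names are assumptions for illustration, not the gist's actual data or code.

```python
import json
import time

try:
    import simplejson
except ImportError:
    simplejson = None  # only the standard library is benchmarked in that case

# Hypothetical payload; the gist's real test data is not reproduced here.
PAYLOAD = {
    "name": "python-fedora",
    "ids": list(range(100)),
    "nested": {"ok": True, "ratio": 0.5},
}


def time_calls(func, arg, iterations):
    """Return the elapsed seconds for calling func(arg) `iterations` times."""
    start = time.time()  # time.time() is available on both Python 2 and 3
    for _ in range(iterations):
        func(arg)
    return time.time() - start


def benchmark(iterations=100000):
    """Return {module name: {"encoding": seconds, "decoding": seconds}}."""
    modules = [("json", json)]
    if simplejson is not None:
        modules.append(("simplejson", simplejson))
    results = {}
    for name, module in modules:
        encoded = module.dumps(PAYLOAD)
        results[name] = {
            "encoding": time_calls(module.dumps, PAYLOAD, iterations),
            "decoding": time_calls(module.loads, encoded, iterations),
        }
    return results


if __name__ == "__main__":
    results = benchmark()
    for phase in ("encoding", "decoding"):
        print(phase.upper())
        for name in sorted(results):
            print("%s: %fs" % (name, results[name][phase]))
        print("")
```

Absolute numbers depend heavily on the payload, so only the relative ordering between the two modules is worth comparing.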

> Is there an issue with Requiring it? Not that I am opposed to Recommends but
> merely curious of the issue.

I'd guess JSON processing performance in python-fedora is pretty much a non-issue to begin with, especially given the minuscule performance differences between json and simplejson. Bugs that the builtin one doesn't have are a more important thing. Also, simplejson appears to be pretty much unmaintained in Fedora; at the moment rawhide is at 3.5.3 and upstream at 3.8.1, and the maintainer hasn't acted on the update notifications in almost 1.5 years now.

https://bugzilla.redhat.com/show_bug.cgi?id=1124246
https://github.com/simplejson/simplejson/blob/master/CHANGES.txt

Comment 3 Pierre-YvesChibon 2015-12-29 10:23:23 UTC
On python3 I agree that the stdlib json module is at least as good; I should have specified that I was thinking of py2.

That being said, we may want to drop simplejson altogether based on these arguments.

Comment 4 Ralph Bean 2016-01-04 15:03:51 UTC
Dropped from the rawhide branch here:  http://pkgs.fedoraproject.org/cgit/python-fedora.git/commit/?id=896e0b815fd0cddcafd0ce5d23258d1b02abf564

It'll go out to the other branches after we have a next upstream release ready.  :)  Thanks all!

