Bug 1294543 - Drop/demote dependency on simplejson
Status: CLOSED RAWHIDE
Product: Fedora
Classification: Fedora
Component: python-fedora
Version: rawhide
Assigned To: Fedora Infrastructure SIG
QA Contact: Fedora Extras Quality Assurance
Reported: 2015-12-28 17:23 EST by Ville Skyttä
Modified: 2016-01-04 10:03 EST
Doc Type: Bug Fix
Last Closed: 2016-01-04 10:03:51 EST
Type: Bug
Description Ville Skyttä 2015-12-28 17:23:48 EST
python-fedora's dependency on simplejson is not a hard dependency; in its absence it'll fall back to json.
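
A minimal sketch of the usual optional-import fallback described above. It illustrates the general pattern only and is an assumption, not a copy of python-fedora's actual code; the `payload` value is made up for the example.

```python
# Prefer simplejson when it is installed; otherwise use the standard library.
# Both modules expose compatible dumps()/loads() functions for basic use.
try:
    import simplejson as json
except ImportError:
    import json

# Hypothetical payload, used only to show that either module round-trips it.
payload = {"name": "python-fedora", "optional": ["simplejson"]}
text = json.dumps(payload, sort_keys=True)
restored = json.loads(text)

print(json.__name__, text)
```

Because the calling code only touches `dumps` and `loads`, it behaves the same whether or not simplejson is present, which is why the package dependency can be weak rather than hard.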

I'm not sure the use of simplejson is warranted at all nowadays; perhaps it should be removed upstream and plain json used instead?

If not, the dependency in the package should be demoted from Requires to at least Recommends if not Suggests.
Comment 1 Pierre-YvesChibon 2015-12-29 03:44:49 EST
AFAIK simplejson is still the fastest JSON parser out there for python.

Is there an issue with Requiring it? Not that I am opposed to Recommends but merely curious of the issue.
Comment 2 Ville Skyttä 2015-12-29 04:54:21 EST
(In reply to Pierre-YvesChibon from comment #1)
> AFAIK simplejson is still the fastest JSON parser out there for python.

Pretty much all the benchmarks I've found show that the builtin json is on par with or beats simplejson on Python >= 2.7; only on Python 2.6 is simplejson clearly faster. There are also reports of simplejson having, or having had, issues that the builtin module does not, for example with bytes vs. str on Python 3.

Here are a few sillyish microbenchmarks adapted from https://code.djangoproject.com/ticket/18023 run on my box:

$ python -m timeit -s "from json import loads, dumps" "loads(dumps(list(range(32))))"
100000 loops, best of 3: 14.7 usec per loop
$ python3 -m timeit -s "from json import loads, dumps" "loads(dumps(list(range(32))))"
10000 loops, best of 3: 20.9 usec per loop

$ python -m timeit -s "from simplejson import loads, dumps" "loads(dumps(list(range(32))))"
10000 loops, best of 3: 18.7 usec per loop
$ python3 -m timeit -s "from simplejson import loads, dumps" "loads(dumps(list(range(32))))"
10000 loops, best of 3: 31.2 usec per loop

$ python -m timeit -s "from json import loads, dumps" "loads(dumps(dict(enumerate('abcdefghijklmno'))))"
10000 loops, best of 3: 24.8 usec per loop
$ python3 -m timeit -s "from json import loads, dumps" "loads(dumps(dict(enumerate('abcdefghijklmno'))))"
10000 loops, best of 3: 22.9 usec per loop

$ python -m timeit -s "from simplejson import loads, dumps" "loads(dumps(dict(enumerate('abcdefghijklmno'))))"
10000 loops, best of 3: 21.2 usec per loop
$ python3 -m timeit -s "from simplejson import loads, dumps" "loads(dumps(dict(enumerate('abcdefghijklmno'))))"
10000 loops, best of 3: 28.1 usec per loop

json is faster in all of the above except the dict round trip on Python 2, where simplejson was slightly ahead (21.2 vs. 24.8 usec). I also modified https://gist.github.com/lightcatcher/1136415 so that it works on Python 3, runs 100000 iterations instead of 10000, and compares only json and simplejson. Results on my F-23 box:

    $ python benchmark.py 
    JSON Benchmark
    2.7.10 (default, Sep  8 2015, 17:20:17) 
    [GCC 5.1.1 20150618 (Red Hat 5.1.1-4)]
    -----------------------------
    ENCODING
    json: 6.305129s
    simplejson: 2.511114s

    DECODING
    json: 2.86301s
    simplejson: 4.539282s

    $ python3 benchmark.py 
    JSON Benchmark
    3.4.3 (default, Jun 29 2015, 12:16:01) 
    [GCC 5.1.1 20150618 (Red Hat 5.1.1-4)]
    -----------------------------
    ENCODING
    json: 3.667106s
    simplejson: 3.939518s

    DECODING
    json: 3.440348s
    simplejson: 5.340811999999998s

So with 2.7, encoding was faster with simplejson; in all other cases json beat it. This was with simplejson 3.5.3; 3.8.1 produces similar results.
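
For anyone wanting to repeat this kind of comparison, here is a rough sketch that times encoding and decoding separately with the standard `timeit` module. It is not the modified gist; the `sample` payload, the helper names `load_codecs` and `bench`, and the iteration count are all assumptions made for illustration, and simplejson is simply skipped when it is not installed.

```python
import timeit


def load_codecs():
    """Return {name: module} for each JSON module that can be imported."""
    codecs = {}
    for name in ("json", "simplejson"):
        try:
            codecs[name] = __import__(name)
        except ImportError:
            pass  # simplejson is optional; skip it when absent
    return codecs


def bench(module, data, number=1000):
    """Time encoding and decoding separately; return (encode_s, decode_s)."""
    encoded = module.dumps(data)
    encode_s = timeit.timeit(lambda: module.dumps(data), number=number)
    decode_s = timeit.timeit(lambda: module.loads(encoded), number=number)
    return encode_s, decode_s


# Hypothetical sample payload, not the data used by the gist.
sample = {"ints": list(range(32)), "letters": dict(enumerate("abcdefghijklmno"))}

results = {name: bench(mod, sample) for name, mod in load_codecs().items()}
for name, (enc, dec) in sorted(results.items()):
    print("%s: encode %.6fs, decode %.6fs" % (name, enc, dec))
```

Absolute numbers will vary with the machine, the Python version, and the payload, so only the relative ordering on one box is meaningful.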

> Is there an issue with Requiring it? Not that I am opposed to Recommends but
> merely curious of the issue.

I'd guess JSON processing performance in python-fedora is pretty much a non-issue to begin with, especially given the minuscule performance differences between json and simplejson. Bugs that the builtin one doesn't have matter more. Also, simplejson appears to be pretty much unmaintained in Fedora; at the moment rawhide is at 3.5.3 and upstream at 3.8.1, and the maintainer hasn't acted on the update notifications in almost 1.5 years now.

https://bugzilla.redhat.com/show_bug.cgi?id=1124246
https://github.com/simplejson/simplejson/blob/master/CHANGES.txt
Comment 3 Pierre-YvesChibon 2015-12-29 05:23:23 EST
On Python 3 I agree that the stdlib json module is at least as good; I should have specified that I was thinking of py2.

That being said, we may want to drop simplejson altogether based on these arguments.
Comment 4 Ralph Bean 2016-01-04 10:03:51 EST
Dropped from the rawhide branch here:  http://pkgs.fedoraproject.org/cgit/python-fedora.git/commit/?id=896e0b815fd0cddcafd0ce5d23258d1b02abf564

It'll go out to the other branches after we have a next upstream release ready.  :)  Thanks all!
