Bug 2096621 - Review Request: simdjson - Parsing gigabytes of JSON per second
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: Package Review
Version: rawhide
Hardware: Unspecified
OS: Linux
Target Milestone: ---
Assignee: Jonathan Wright
QA Contact: Jonathan Wright
Reported: 2022-06-14 08:26 UTC by Ali Erdinc Koroglu
Modified: 2022-08-09 15:17 UTC (History)
Last Closed: 2022-08-09 15:17:53 UTC
Type: Bug
jonathan: fedora-review+


Description Ali Erdinc Koroglu 2022-06-14 08:26:22 UTC
SPEC Url: https://download.copr.fedorainfracloud.org/results/aekoroglu/fedora/fedora-rawhide-x86_64/04531970-simdjson/simdjson.spec
SRPM Url: https://download.copr.fedorainfracloud.org/results/aekoroglu/fedora/fedora-rawhide-x86_64/04531970-simdjson/simdjson-2.0.3-1.fc37.src.rpm

Description:
Parsing gigabytes of JSON per second

JSON is everywhere on the Internet. Servers spend a *lot* of time parsing it. We need a fresh approach. The simdjson library uses commonly available SIMD instructions and microparallel algorithms to parse JSON 4x faster than RapidJSON and 25x faster than JSON for Modern C++.

Comment 1 Benson Muite 2022-06-14 17:50:57 UTC
Unofficial review:

Package Review
==============

Legend:
[x] = Pass, [!] = Fail, [-] = Not applicable, [?] = Not evaluated
[ ] = Manual review needed



===== MUST items =====

C/C++:
[x]: Package does not contain kernel modules.
[x]: Package contains no static executables.
[x]: If your application is a C or C++ application you must list a
     BuildRequires against gcc, gcc-c++ or clang.
[x]: Header files in -devel subpackage, if present.
[x]: ldconfig not called in %post and %postun for Fedora 28 and later.
[x]: Package does not contain any libtool archives (.la)
[x]: Rpath absent or only used for internal libs.
[x]: Development (unversioned) .so files in -devel subpackage, if present.

Generic:
[?]: Package is licensed with an open-source compatible license and meets
     other legal requirements as defined in the legal section of Packaging
     Guidelines.
[x]: If (and only if) the source package includes the text of the
     license(s) in its own file, then that file, containing the text of the
     license(s) for the package is included in %license.
[?]: License field in the package spec file matches the actual license.
     Note: Checking patched sources after %prep for licenses. Licenses
     found: "Unknown or generated", "*No copyright* Apache License 2.0",
     "*No copyright* MIT License BSD 3-Clause License Boost Software
     License Apache License 2.0", "*No copyright* JSON License", "MIT
     License", "ISC License BSD 2-clause NetBSD License BSD 2-Clause
     License", "*No copyright* ISC License", "BSD 3-Clause License", "Boost
     Software License 1.0". 494 files have unknown license. Detailed output
     of licensecheck in
     /home/simdjson/2096621-simdjson/licensecheck.txt
[x]: License file installed when any subpackage combination is installed.
[x]: %build honors applicable compiler flags or justifies otherwise.
[!]: Package contains no bundled libraries without FPC exception.
[x]: Changelog in prescribed format.
[?]: Sources contain only permissible code or content.
[-]: Package contains desktop file if it is a GUI application.
[x]: Development files must be in a -devel package
[x]: Package uses nothing in %doc for runtime.
[x]: Package consistently uses macros (instead of hard-coded directory
     names).
[x]: Package is named according to the Package Naming Guidelines.
[x]: Package does not generate any conflict.
[x]: Package obeys FHS, except libexecdir and /usr/target.
[-]: If the package is a rename of another package, proper Obsoletes and
     Provides are present.
[?]: Requires correct, justified where necessary.
[x]: Spec file is legible and written in American English.
[-]: Package contains systemd file(s) if in need.
[x]: Useful -debuginfo package or justification otherwise.
[?]: Package is not known to require an ExcludeArch tag.
[?]: Large documentation must go in a -doc subpackage. Large could be size
     (~1MB) or number of files.
     Note: Documentation size is 20480 bytes in 1 files.
[?]: Package complies to the Packaging Guidelines
[x]: Package successfully compiles and builds into binary rpms on at least
     one supported primary architecture.
[x]: Package installs properly.
[x]: Rpmlint is run on all rpms the build produces.
     Note: There are rpmlint messages (see attachment).
[x]: Package requires other packages for directories it uses.
[x]: Package must own all directories that it creates.
[x]: Package does not own files or directories owned by other packages.
[x]: Package uses either %{buildroot} or $RPM_BUILD_ROOT
[x]: Package does not run rm -rf %{buildroot} (or $RPM_BUILD_ROOT) at the
     beginning of %install.
[x]: Macros in Summary, %description expandable at SRPM build time.
[x]: Dist tag is present.
[x]: Package does not contain duplicates in %files.
[x]: Permissions on files are set properly.
[x]: Package must not depend on deprecated() packages.
[x]: Package use %makeinstall only when make install DESTDIR=... doesn't
     work.
[x]: Package is named using only allowed ASCII characters.
[x]: Package does not use a name that already exists.
[x]: Package is not relocatable.
[x]: Sources used to build the package match the upstream source, as
     provided in the spec URL.
[x]: Spec file name must match the spec package %{name}, in the format
     %{name}.spec.
[x]: File names are valid UTF-8.
[x]: Packages must not store files under /srv, /opt or /usr/local

===== SHOULD items =====

Generic:
[x]: If the source package does not include license text(s) as a separate
     file from upstream, the packager SHOULD query upstream to include it.
[x]: Final provides and requires are sane (see attachments).
[?]: Fully versioned dependency in subpackages if applicable.
     Note: No Requires: %{name}%{?_isa} = %{version}-%{release} in
     simdjson-devel
[?]: Package functions as described.
[x]: Latest version is packaged.
[x]: Package does not include license text files separate from upstream.
[-]: Sources are verified with gpgverify first in %prep if upstream
     publishes signatures.
     Note: gpgverify is not used.
[?]: Package should compile and build into binary rpms on all supported
     architectures.
[x]: %check is present and all tests pass.
[x]: Packages should try to preserve timestamps of original installed
     files.
[x]: Reviewer should test that the package builds in mock.
[x]: Buildroot is not present
[x]: Package has no %clean section with rm -rf %{buildroot} (or
     $RPM_BUILD_ROOT)
[x]: No file requires outside of /etc, /bin, /sbin, /usr/bin, /usr/sbin.
[x]: Packager, Vendor, PreReq, Copyright tags should not be in spec file
[x]: Sources can be downloaded from URI in Source: tag
[x]: SourceX is a working URL.
[x]: Spec use %global instead of %define unless justified.
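
The "Fully versioned dependency" note in the SHOULD items above can also be checked by hand or with a small script. The following is a minimal sketch, assuming the spec text is available as a string; the function name has_versioned_requires and the list of section markers are illustrative choices and are not part of fedora-review.

```python
import re

# Lines that start a new spec section; a %package preamble ends at the next one.
SECTION_START = re.compile(
    r"^%(package|description|prep|build|install|check|files|changelog)\b"
)

# The fully versioned dependency quoted in the review note.
VERSIONED = re.compile(
    r"^Requires:\s*%\{name\}(%\{\?_isa\})?\s*=\s*%\{version\}-%\{release\}$"
)


def has_versioned_requires(spec_text, subpackage="devel"):
    """Return True if the subpackage preamble requires the exact base package."""
    in_preamble = False
    for raw_line in spec_text.splitlines():
        line = raw_line.strip()
        section = SECTION_START.match(line)
        if section:
            # Only the "%package <subpackage>" preamble is of interest here.
            in_preamble = (
                section.group(1) == "package"
                and line.split()[1:] == [subpackage]
            )
            continue
        if in_preamble and VERSIONED.match(line):
            return True
    return False
```

This sketch only handles simple "%package devel" headers (no "-n" option) and does not expand macros.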

===== EXTRA items =====

Generic:
[x]: Rpmlint is run on debuginfo package(s).
     Note: There are rpmlint messages (see attachment).
[x]: Rpmlint is run on all installed packages.
     Note: There are rpmlint messages (see attachment).
[x]: Large data in /usr/share should live in a noarch subpackage if package
     is arched.
[x]: Spec file according to URL is the same as in SRPM.


Rpmlint
-------
Cannot parse rpmlint output:


Rpmlint (debuginfo)
-------------------
Cannot parse rpmlint output:



Rpmlint (installed packages)
----------------------------
Cannot parse rpmlint output:


Source checksums
----------------
https://github.com/simdjson/simdjson/archive/v2.0.3/simdjson-2.0.3.tar.gz :
  CHECKSUM(SHA256) this package     : c1bcf65b3bd830bf8f747b8dd7126edd4bb7562bebb92698c1750acf4c979df6
  CHECKSUM(SHA256) upstream package : c1bcf65b3bd830bf8f747b8dd7126edd4bb7562bebb92698c1750acf4c979df6
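
The comparison above can be reproduced independently of fedora-review. The following is a minimal sketch, assuming both tarballs have already been downloaded to local paths; the helper names sha256_of and sources_match are hypothetical and do not reflect how fedora-review implements the check.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large tarballs are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def sources_match(packaged_path, upstream_path):
    """Return True when the packaged and upstream sources are byte-identical."""
    return sha256_of(packaged_path) == sha256_of(upstream_path)
```

The digest returned by sha256_of for the tarball in the SRPM can then be compared with the value listed in the report.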


Requires
--------
simdjson (rpmlib, GLIBC filtered):
    libc.so.6()(64bit)
    libgcc_s.so.1()(64bit)
    libgcc_s.so.1(GCC_3.0)(64bit)
    libgcc_s.so.1(GCC_3.3.1)(64bit)
    libstdc++.so.6()(64bit)
    libstdc++.so.6(CXXABI_1.3)(64bit)
    libstdc++.so.6(CXXABI_1.3.8)(64bit)
    libstdc++.so.6(CXXABI_1.3.9)(64bit)
    rtld(GNU_HASH)

simdjson-devel (rpmlib, GLIBC filtered):
    cmake-filesystem(x86-64)
    libsimdjson.so.11()(64bit)

simdjson-debuginfo (rpmlib, GLIBC filtered):

simdjson-debugsource (rpmlib, GLIBC filtered):



Provides
--------
simdjson:
    libsimdjson.so.11()(64bit)
    simdjson
    simdjson(x86-64)

simdjson-devel:
    cmake(simdjson)
    simdjson-devel
    simdjson-devel(x86-64)

simdjson-debuginfo:
    debuginfo(build-id)
    libsimdjson.so.11.0.0-2.0.3-1.fc37.x86_64.debug()(64bit)
    simdjson-debuginfo
    simdjson-debuginfo(x86-64)

simdjson-debugsource:
    simdjson-debugsource
    simdjson-debugsource(x86-64)



Generated by fedora-review 0.8.0 (e988316) last change: 2022-04-07
Command line :/usr/bin/fedora-review -b 2096621
Buildroot used: fedora-rawhide-x86_64
Active plugins: Shell-api, C/C++, Generic
Disabled plugins: Ocaml, fonts, Perl, SugarActivity, PHP, R, Python, Java, Haskell
Disabled flags: EPEL6, EPEL7, DISTTAG, BATCH, EXARCH

Comments:
1) The doc directory contains markdown files. It may be helpful for users to have these packaged.
2) Is it possible to make a build on Copr to check that it works on the supported architectures (x86_64, AArch64 and ARM-hfp)? See https://docs.fedoraproject.org/en-US/packaging-guidelines/#_architecture_support
3) jsoncpp is already packaged (https://packages.fedoraproject.org/pkgs/jsoncpp/jsoncpp/). Can the packaged version be used instead of the bundled one? Note that jsoncpp has a different license.
4) Why does the build require git?
5) To run the full test suite, you may wish to compile in Developer Mode (https://github.com/simdjson/simdjson/blob/master/HACKING.md). This seems to require google_benchmark, which is packaged: https://packages.fedoraproject.org/pkgs/google-benchmark/google-benchmark/

Comment 2 Ali Erdinc Koroglu 2022-08-01 17:13:48 UTC
Hello, I updated the spec file.
simdjson is different from JsonCpp because it uses SIMD instructions and microparallel algorithms to parse JSON files.
By the way, we can't build in developer mode for now because of a GCC (12.1.1) bug [1] on Fedora 36 and Rawhide.

SPEC Url: https://download.copr.fedorainfracloud.org/results/aekoroglu/fedora/fedora-rawhide-x86_64/04695423-simdjson/simdjson.spec
SRPM Url: https://download.copr.fedorainfracloud.org/results/aekoroglu/fedora/fedora-rawhide-x86_64/04695423-simdjson/simdjson-2.2.2-1.fc37.src.rpm

1: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105329

Comment 3 Jonathan Wright 2022-08-01 22:45:21 UTC
The spec at the URL above differs from the spec in the SRPM. I'm assuming the latter is the latest.
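
For anyone wanting to see exactly where two spec files diverge, a minimal sketch follows. It assumes both specs have already been obtained as text (one downloaded from the SPEC URL, one extracted from the SRPM); the function name spec_diff and the file labels are illustrative only.

```python
import difflib


def spec_diff(url_spec_text, srpm_spec_text):
    """Return unified-diff lines between the two specs (empty list if identical)."""
    return list(
        difflib.unified_diff(
            url_spec_text.splitlines(),
            srpm_spec_text.splitlines(),
            fromfile="spec-from-url",
            tofile="spec-from-srpm",
            lineterm="",
        )
    )
```

An empty result means the two files are identical line by line.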

Package Review
==============
Legend:
[x] = Pass, [!] = Fail, [-] = Not applicable, [?] = Not evaluated

===== MUST items =====
C/C++:
[x]: Package does not contain kernel modules.
[x]: Package contains no static executables.
[x]: If your application is a C or C++ application you must list a
     BuildRequires against gcc, gcc-c++ or clang.
[x]: Header files in -devel subpackage, if present.
[x]: ldconfig not called in %post and %postun for Fedora 28 and later.
[x]: Package does not contain any libtool archives (.la)
[x]: Rpath absent or only used for internal libs.
[x]: Development (unversioned) .so files in -devel subpackage, if present.

Generic:
[x]: Package is licensed with an open-source compatible license and meets
     other legal requirements as defined in the legal section of Packaging
     Guidelines.
[x]: License field in the package spec file matches the actual license.
     Note: Checking patched sources after %prep for licenses. Licenses
     found: "Unknown or generated", "*No copyright* Apache License 2.0",
     "*No copyright* MIT License BSD 3-Clause License Boost Software
     License Apache License 2.0", "*No copyright* JSON License", "MIT
     License", "ISC License BSD 2-clause NetBSD License BSD 2-Clause
     License", "*No copyright* ISC License", "BSD 3-Clause License", "Boost
     Software License 1.0". 504 files have unknown license. Detailed output
     of licensecheck in
     /home/jonathan/fedora-review/2096621-simdjson/licensecheck.txt
[!]: License file installed when any subpackage combination is installed.
[!]: If the package is under multiple licenses, the licensing breakdown
     must be documented in the spec.
[x]: %build honors applicable compiler flags or justifies otherwise.
[x]: Package contains no bundled libraries without FPC exception.
[x]: Changelog in prescribed format.
[x]: Sources contain only permissible code or content.
[-]: Package contains desktop file if it is a GUI application.
[x]: Development files must be in a -devel package
[x]: Package uses nothing in %doc for runtime.
[x]: Package consistently uses macros (instead of hard-coded directory
     names).
[x]: Package is named according to the Package Naming Guidelines.
[x]: Package does not generate any conflict.
[x]: Package obeys FHS, except libexecdir and /usr/target.
[-]: If the package is a rename of another package, proper Obsoletes and
     Provides are present.
[x]: Requires correct, justified where necessary.
[x]: Spec file is legible and written in American English.
[-]: Package contains systemd file(s) if in need.
[x]: Useful -debuginfo package or justification otherwise.
[x]: Package is not known to require an ExcludeArch tag.
[x]: Large documentation must go in a -doc subpackage. Large could be size
     (~1MB) or number of files.
     Note: Documentation size is 30720 bytes in 3 files.
[x]: Package complies to the Packaging Guidelines
[x]: Package successfully compiles and builds into binary rpms on at least
     one supported primary architecture.
[x]: Package installs properly.
[x]: Rpmlint is run on all rpms the build produces.
     Note: There are rpmlint messages (see attachment).
[x]: If (and only if) the source package includes the text of the
     license(s) in its own file, then that file, containing the text of the
     license(s) for the package is included in %license.
[x]: Package requires other packages for directories it uses.
[x]: Package must own all directories that it creates.
[x]: Package does not own files or directories owned by other packages.
[x]: Package uses either %{buildroot} or $RPM_BUILD_ROOT
[x]: Package does not run rm -rf %{buildroot} (or $RPM_BUILD_ROOT) at the
     beginning of %install.
[x]: Macros in Summary, %description expandable at SRPM build time.
[x]: Dist tag is present.
[x]: Package does not contain duplicates in %files.
[x]: Permissions on files are set properly.
[x]: Package must not depend on deprecated() packages.
[x]: Package use %makeinstall only when make install DESTDIR=... doesn't
     work.
[x]: Package is named using only allowed ASCII characters.
[x]: Package does not use a name that already exists.
[x]: Package is not relocatable.
[x]: Sources used to build the package match the upstream source, as
     provided in the spec URL.
[x]: Spec file name must match the spec package %{name}, in the format
     %{name}.spec.
[x]: File names are valid UTF-8.
[x]: Packages must not store files under /srv, /opt or /usr/local

===== SHOULD items =====

Generic:
[x]: If the source package does not include license text(s) as a separate
     file from upstream, the packager SHOULD query upstream to include it.
[x]: Final provides and requires are sane (see attachments).
[x]: Fully versioned dependency in subpackages if applicable.
     Note: No Requires: %{name}%{?_isa} = %{version}-%{release} in
     simdjson-devel
[x]: Package functions as described.
[x]: Latest version is packaged.
[x]: Package does not include license text files separate from upstream.
[-]: Sources are verified with gpgverify first in %prep if upstream
     publishes signatures.
     Note: gpgverify is not used.
[x]: Package should compile and build into binary rpms on all supported
     architectures.
[x]: %check is present and all tests pass.
[x]: Packages should try to preserve timestamps of original installed
     files.
[x]: Reviewer should test that the package builds in mock.
[x]: Buildroot is not present
[x]: Package has no %clean section with rm -rf %{buildroot} (or
     $RPM_BUILD_ROOT)
[x]: No file requires outside of /etc, /bin, /sbin, /usr/bin, /usr/sbin.
[x]: Packager, Vendor, PreReq, Copyright tags should not be in spec file
[x]: Sources can be downloaded from URI in Source: tag
[x]: SourceX is a working URL.
[x]: Spec use %global instead of %define unless justified.

===== EXTRA items =====

Generic:
[x]: Rpmlint is run on debuginfo package(s).
     Note: There are rpmlint messages (see attachment).
[x]: Rpmlint is run on all installed packages.
     Note: There are rpmlint messages (see attachment).
[x]: Large data in /usr/share should live in a noarch subpackage if package
     is arched.
[x]: Spec file according to URL is the same as in SRPM.


Rpmlint
-------
Cannot parse rpmlint output:


Rpmlint (debuginfo)
-------------------
Cannot parse rpmlint output:



Rpmlint (installed packages)
----------------------------
Cannot parse rpmlint output:


Source checksums
----------------
https://github.com/simdjson/simdjson/archive/v2.2.2/simdjson-2.2.2.tar.gz :
  CHECKSUM(SHA256) this package     : b0e36beab240bd827c1103b4c66672491595930067871e20946d67b07758c010
  CHECKSUM(SHA256) upstream package : b0e36beab240bd827c1103b4c66672491595930067871e20946d67b07758c010


Requires
--------
simdjson (rpmlib, GLIBC filtered):
    libc.so.6()(64bit)
    libgcc_s.so.1()(64bit)
    libgcc_s.so.1(GCC_3.0)(64bit)
    libgcc_s.so.1(GCC_3.3.1)(64bit)
    libstdc++.so.6()(64bit)
    libstdc++.so.6(CXXABI_1.3)(64bit)
    libstdc++.so.6(CXXABI_1.3.8)(64bit)
    libstdc++.so.6(CXXABI_1.3.9)(64bit)
    rtld(GNU_HASH)

simdjson-devel (rpmlib, GLIBC filtered):
    cmake-filesystem(x86-64)
    libsimdjson.so.13()(64bit)

simdjson-doc (rpmlib, GLIBC filtered):

simdjson-debuginfo (rpmlib, GLIBC filtered):

simdjson-debugsource (rpmlib, GLIBC filtered):



Provides
--------
simdjson:
    libsimdjson.so.13()(64bit)
    simdjson
    simdjson(x86-64)

simdjson-devel:
    cmake(simdjson)
    simdjson-devel
    simdjson-devel(x86-64)

simdjson-doc:
    simdjson-doc
    simdjson-doc(x86-64)

simdjson-debuginfo:
    debuginfo(build-id)
    libsimdjson.so.13.0.0-2.2.2-1.fc37.x86_64.debug()(64bit)
    simdjson-debuginfo
    simdjson-debuginfo(x86-64)

simdjson-debugsource:
    simdjson-debugsource
    simdjson-debugsource(x86-64)



Generated by fedora-review 0.8.0 (e988316) last change: 2022-04-07
Command line :/usr/bin/fedora-review -b 2096621
Buildroot used: fedora-rawhide-x86_64
Active plugins: Generic, C/C++, Shell-api
Disabled plugins: R, Haskell, SugarActivity, PHP, Java, Python, Perl, Ocaml, fonts
Disabled flags: EPEL6, EPEL7, DISTTAG, BATCH, EXARCH

Comments:
1) The "License:" value should be in SPDX format per https://docs.fedoraproject.org/en-US/packaging-guidelines/LicensingGuidelines/#_valid_license_short_names

- License:        ASL 2.0 AND MIT
+ License:        Apache-2.0 AND MIT

2) You need to add the license file to all subpackages and document the license breakdown in the spec file.
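
The License change requested in comment 1) can be illustrated with a small conversion sketch. The mapping table contains only the two identifiers that appear in this review and is not a complete or authoritative list; the names LEGACY_TO_SPDX and to_spdx are hypothetical.

```python
import re

# Illustrative mapping only: just the identifiers that occur in this review.
LEGACY_TO_SPDX = {
    "ASL 2.0": "Apache-2.0",
    "MIT": "MIT",
}


def to_spdx(license_field):
    """Convert a simple 'A AND B' style License value to SPDX identifiers."""
    # The capturing group keeps the operators in the result of re.split.
    parts = re.split(r"\s+(and|or)\s+", license_field.strip(), flags=re.IGNORECASE)
    converted = []
    for index, part in enumerate(parts):
        if index % 2 == 1:
            # Odd positions hold the operators; SPDX writes them in upper case.
            converted.append(part.upper())
        elif part in LEGACY_TO_SPDX:
            converted.append(LEGACY_TO_SPDX[part])
        else:
            raise KeyError(f"no SPDX mapping known for {part!r}")
    return " ".join(converted)
```

Parenthesized expressions are not handled by this sketch, and an unknown short name raises KeyError rather than being guessed.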

Comment 6 Gwyn Ciesla 2022-08-09 14:22:29 UTC
(fedscm-admin):  The Pagure repository was created at https://src.fedoraproject.org/rpms/simdjson

Comment 7 Ali Erdinc Koroglu 2022-08-09 15:17:53 UTC
Thank you all.

