Bug 843997 - Review Request: mlpack - scalable C++ machine learning library
Summary: Review Request: mlpack - scalable C++ machine learning library
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: Package Review
Version: 18
Hardware: All
OS: Linux
unspecified
medium
Target Milestone: ---
Assignee: José Matos
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2012-07-27 19:14 UTC by Ryan Curtin
Modified: 2013-01-22 03:32 UTC (History)
5 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-01-22 03:32:51 UTC
Type: ---
Embargoed:
jamatos: fedora-review+
gwync: fedora-cvs+


Attachments (Terms of Use)
mlpack 1.0.1-4 spec file (7.03 KB, application/octet-stream)
2012-09-25 21:42 UTC, Ryan Curtin
no flags Details
mlpack 1.0.1-4 srpm (424.72 KB, application/x-redhat-package-manager)
2012-09-25 21:42 UTC, Ryan Curtin
no flags Details

Description Ryan Curtin 2012-07-27 19:14:33 UTC
Spec URL: http://www.mlpack.org/files/mlpack.spec
SRPM URL: http://www.mlpack.org/files/mlpack-1.0.1-2.fc17.src.rpm
Description: A scalable C++ machine learning library
Fedora Account System Username: rcurtin

mlpack is a C++ machine learning library  with emphasis on scalability, speed,
and ease-of-use. Its aim is to make machine learning possible for novice users
by means of a simple, consistent API, while simultaneously exploiting C++
language features to provide maximum performance and maximum flexibility for
expert users. mlpack outperforms competing machine learning libraries by large
margins.

This package provides mlpack (the .so), mlpack-devel (header files), and mlpack-bin (standalone mlpack executables and man pages for them).

For more information on the library, see http://www.mlpack.org/

I successfully built this package on koji:
http://koji.fedoraproject.org/koji/taskinfo?taskID=4336824

Comment 1 José Matos 2012-07-27 23:21:05 UTC
Hi Ryan,
  I am glad that you ask. :-) I am a sponsor: :-)

As a standard disclaimer I would expect you to read
https://fedoraproject.org/wiki/How_to_get_sponsored_into_the_packager_group

as well as naturally
https://fedoraproject.org/wiki/Package_Review_Process

So other than submitting new packages it will a sign of your understanding of the rules to review other packages.

Please refer them here for me to follow, please.

I will be glad to answer any question you have.

Comment 2 Michael Schwendt 2012-07-28 10:49:03 UTC
An honest attempt at trying to review your own package could be enlightening:

  https://fedoraproject.org/wiki/Packaging:ReviewGuidelines

If you're unsure about the possibly overwhelming list of MUST/SHOULD items on that list, feel free to ask.

Just a few comments after a brief look at the spec file:


> Summary:        A scalable, fast C++ machine learning library

Not covered by any guidelines, but in many cases the leading article is superfluous and doesn't increase conciseness.

  Summary: Scalable, fast C++ machine learning library


> Group:          Development/Libraries

"Group: System Environment/Libraries" for run-time libs.


> # More recent Armadillo versions require specific calls to arma::as_scalar().
> Patch3:         fully_qualified_as_scalar.patch

It's good that there are comments on these patch files, but even better would be a comment on the upstream status. That's a "SHOULD" item, however.

https://fedoraproject.org/wiki/Packaging:Guidelines#All_patches_should_have_an_upstream_bug_link_or_comment


> # Don't perform tests on sparse matrices.
> Patch4:         no_sparse_tests.patch

A redundant comment IMO. Better tell _why_ those tests get disabled. ;o)


> Requires:       armadillo >= 2.4.0
> Requires:       libxml2
> Requires:       boost, boost-program-options, boost-math
> Requires:       lapack

https://fedoraproject.org/wiki/Packaging:Guidelines#Explicit_Requires

Which may read as if it is specific to library packages, but basically can be understood as "there should be a spec file comment justifying" the explicit dependency. For example, highlight whether the required packages are strictly required or why they are needed.


> %package bin
> Summary:        Command-line executables for mlpack (machine learning library)
> Group:          Development/Libraries

Questionable group for executables.

> Requires:       %{name} = %{version}-%{release}

https://fedoraproject.org/wiki/Packaging:Guidelines#Requiring_Base_Package


> %files
> %defattr(-,root,root,-)
> %{_libdir}/libmlpack.so

https://fedoraproject.org/wiki/Packaging:Guidelines#Devel_Packages



> %files bin
> %doc %{_mandir}/man1/allknn.1.gz

%_mandir implicitly is marked as documentation. Also, the gzip compression may change or be disabled within a buildsystem, so using wildcards can be beneficial.

  %{_mandir}/man1/allknn.1*


> %files devel
> %{_includedir}/mlpack/
> %{_includedir}/mlpack/core.hpp

The first entry already includes the directory *and* everything in it. The second line is superfluous.
https://fedoraproject.org/wiki/Packaging:Guidelines#File_and_Directory_Ownership

Comment 3 Susi Lehtola 2012-07-28 20:48:41 UTC
The binary filenames look potentially dangerous. For instance, "radical" might be in use by other packages as well.

Since you're upstream, I'd recommend considering a renamal of the binaries to have, say, a ml_ prefix.

Comment 4 Ryan Curtin 2012-07-31 04:19:16 UTC
Hello there,

Thank you for the numerous comments.  It appears I have many things to learn before becoming a sponsored packager!

I have unofficially reviewed two packages and in doing so have learned much about the Fedora packaging guidelines which I thought I knew... but apparently did not.

https://bugzilla.redhat.com/show_bug.cgi?id=820115
https://bugzilla.redhat.com/show_bug.cgi?id=843678

> Not covered by any guidelines, but in many cases the leading article is superfluous and doesn't increase conciseness.
>
>  Summary: Scalable, fast C++ machine learning library

You are right; fixed.

I also fixed the group issues which were mentioned; the libraries and headers now reside in 'System Environment/Libraries', and the executables live in 'Applications/Engineering'.  If those choices are wrong, let me know and I will have it fixed quickly.

I commented the entire spec file more heavily, including better reasons for each of the patches and justification for versioned dependencies.

> %_mandir implicitly is marked as documentation. Also, the gzip compression may change or be disabled within a buildsystem, so using wildcards can be beneficial.
>
>  %{_mandir}/man1/allknn.1*

Neat, I did not know that.  I've fixed that also.

The other issues you've mentioned have also been fixed.

> Since you're upstream, I'd recommend considering a renamal of the binaries to have, say, a ml_ prefix.

I am not sure what we want to do upstream yet, but for now in this package I rename all the binaries to mlpack_$name (i.e. mlpack_radical) and rename the man pages correspondingly using a little bash for loop.  I think that should resolve the ambiguity for now, and that may be the long-term upstream solution too.

> An honest attempt at trying to review your own package could be enlightening.

You were right; I went through all the MUST/SHOULDs and did not find any others which are a problem (other than the ones you pointed out).

I then rebuilt on koji successfully (and tested the executables locally):
http://koji.fedoraproject.org/koji/taskinfo?taskID=4344100

The updated spec: http://www.mlpack.org/files/mlpack.spec
The updated srpm: http://www.mlpack.org/files/mlpack-1.0.1-3.fc17.src.rpm

If there is anything else I've missed, let me know and I will fix it quickly.

Thanks!

Comment 5 José Matos 2012-09-13 23:02:14 UTC
I will start with small issues as it is easier to follow.

1) Why is not the LICENSE.txt in the main package?
https://fedoraproject.org/wiki/Packaging:LicensingGuidelines

2) Another issue that is not an error but I am curious, why is not the documentation (doc directory content) packaged?

3) In patch 4 there is a reference to the eminent release of 3.6.
I suspect that you referring to armadillo 3.4. :-)

4) Do you intend to release this package for EPEL5? If not the package can be simplified in some parts.

Comment 6 Ryan Curtin 2012-09-17 00:05:53 UTC
> 1) Why is not the LICENSE.txt in the main package?

Oops, I did not realize I should be distributing that.  I modified the spec to install LICENSE.txt.

> 2) Another issue that is not an error but I am curious, why is not the documentation (doc directory content) packaged?

The documentation is Doxygen-generated HTML which a user could more easily access at mlpack.org (it is prettier there, too).  I don't think it's necessary to include that amount of stuff with the distribution, especially when a user could generate it themselves using Doxygen painlessly.

> 3) In patch 4 there is a reference to the eminent release of 3.6.
I suspect that you referring to armadillo 3.4. :-)

At the time I wrote the comment, sparse matrix support was expected in 3.6, but fortunately that was not how it turned out.

> 4) Do you intend to release this package for EPEL5? If not the package can be simplified in some parts.

It may be useful to release for EPEL5 because some academic settings may still be using RHEL5 (in fact where I am, RHEL5 is still the main choice and there are even RHEL4 boxes floating around in some particularly unfortunate cases).

I have again updated the spec file and srpm, built on koji, and made it all available:

spec: http://www.mlpack.org/files/mlpack.spec
srpm: http://www.mlpack.org/files/mlpack-1.0.1-4.fc17.src.rpm
koji: http://koji.fedoraproject.org/koji/taskinfo?taskID=4490465

Let me know if there are more issues I should address.

Comment 7 José Matos 2012-09-25 16:08:35 UTC
I am sorry Ryan for taking so long.
I took a long time to understand what was going on until I found out the problem, (shortening a long story) the problem is that the url's above for spec and srpm do not work (for me).

Comment 8 Ryan Curtin 2012-09-25 21:42:06 UTC
Created attachment 617274 [details]
mlpack 1.0.1-4 spec file

Comment 9 Ryan Curtin 2012-09-25 21:42:25 UTC
Created attachment 617275 [details]
mlpack 1.0.1-4 srpm

Comment 10 Ryan Curtin 2012-09-25 21:43:02 UTC
I have verified the URLs (turns out the SRPM was missing) but also added the files as attachments in case.

Comment 11 José Matos 2012-09-26 15:19:37 UTC
There is no need to add the files as attachments. :-)

Some more low hanging fruit.

1) There is no need to put the license file in place in the %install section.

It is enough to do this in the %files section:

%files
...
%doc LICENSE.txt

and voila... this replaces all the logic you have now and it does precisely the same. :-)

2) In reference to point 2 from #6 above:
> The documentation is Doxygen-generated HTML which a user could more easily access at mlpack.org (it is prettier there, too).  I don't think it's necessary to include that amount of stuff with the distribution, especially when a user could generate it themselves using Doxygen painlessly.

I have a different opinion regarding this. If the documentation is provided by upstream it is my opinion that we should provide it, even if we place it on a -doc sub-package that it is not required by the main package.
This subject is discussed here:
https://fedoraproject.org/wiki/Packaging:Guidelines#Documentation

As a rationale for the above sometimes I am in places with small or no connectivity at all and having an update documentation is a boon. I am perfectly able to run doxygen but if this is a new package that I am exploring pre-available off-line documentation is a bonus in my book. :-)

3) Running rpmlint on the generated rpms shows that libxml is a superfluous requirement and it should be dropped since that dependency is automatically taken from BuildRequires.
The same point applies to the boost* libraries, to armadillo and to lapack.

Comment 12 Ryan Curtin 2012-09-27 22:09:11 UTC
I didn't know about the '%doc LICENSE.txt' trick; I have updated that.

> I have a different opinion regarding this. If the documentation is provided by upstream it is my opinion that we should provide it, even if we place it on a -doc sub-package that it is not required by the main package.

Okay; the spec file also now builds mlpack-doc which contains all of the Doxygen documentation.  It seems as though an EPEL5 package would not support a noarch subpackage so I did not force the mlpack-doc package to be noarch.  Should I include a conditional there (i.e. if not EPEL5, build noarch)?

> 3) Running rpmlint on the generated rpms shows that libxml is a superfluous requirement and it should be dropped since that dependency is automatically taken from BuildRequires.

Oops, I did not run rpmlint on the final rpms.  I removed all the superfluous dependencies.

There is one issue I think you will bring up which I'd like to pre-empt:

> mlpack.i686: W: shared-lib-calls-exit /usr/lib/libmlpack.so.1.0.1 exit

MLPACK provides a logging infrastructure including a 'fatal' log, which terminates the application when an EOL is received.  This is intentional behavior and MLPACK functionality depends on it.

New spec and SRPM:

Spec: http://www.mlpack.org/files/mlpack.spec
SRPM: http://www.mlpack.org/files/mlpack-1.0.1-5.fc17.src.rpm

Comment 13 José Matos 2012-09-27 23:09:22 UTC
I am running out of excuses to approve this package. ;-)

One small issue, no need to generate a new srpm for this. There still a leftover from the previous license installation:

# Put the license file in place.
mkdir -p $RPM_BUILD_ROOT/%{_docdir}/%{name}-%{version}/


Regarding the rpmlint output note that it issues a warning and not an error, and as you stated in this case _exit_ should really be there so we can ignore rpmlint in this case.

I will complete the review tomorrow since in my time zone I am past midnight and tired. :-)

Comment 14 Ryan Curtin 2012-09-28 00:06:14 UTC
Ah, I removed the leftover license installation and the updated spec is at the usual place:

http://www.mlpack.org/files/mlpack.spec

I did not bump the version since this was such a trivial change and it has no effect on the output.  If I should have bumped it anyway, let me know and I'll do so.

Thanks!

Ryan

Comment 15 José Matos 2012-11-15 15:47:52 UTC
The package has been APPROVED.

I have sponsored you into the packagers group, welcome to Fedora packagers. :-)

Comment 16 Ryan Curtin 2012-11-15 17:25:45 UTC
New Package SCM Request
=======================
Package Name: mlpack
Short Description: Scalable, fast C++ machine learning library
Owners: rcurtin
Branches: f16 f17 f18 el6
InitialCC:

Comment 17 Gwyn Ciesla 2012-11-15 18:23:37 UTC
Git done (by process-git-requests).

Jose, please take ownership of revire BZs, thanks!

Comment 18 José Matos 2012-11-15 19:05:00 UTC
Oops, I thought I already had.

Done now.

Comment 19 Fedora Update System 2012-11-20 20:14:44 UTC
mlpack-1.0.1-5.fc18 has been submitted as an update for Fedora 18.
https://admin.fedoraproject.org/updates/mlpack-1.0.1-5.fc18

Comment 20 Fedora Update System 2012-11-20 20:17:08 UTC
mlpack-1.0.1-5.fc17 has been submitted as an update for Fedora 17.
https://admin.fedoraproject.org/updates/mlpack-1.0.1-5.fc17

Comment 21 Fedora Update System 2012-11-27 05:09:18 UTC
mlpack-1.0.1-5.fc18 has been pushed to the Fedora 18 stable repository.

Comment 22 Fedora Update System 2012-12-01 08:40:17 UTC
mlpack-1.0.1-5.fc17 has been pushed to the Fedora 17 stable repository.

Comment 23 Fedora Update System 2013-01-03 01:01:33 UTC
mlpack-1.0.3-4.el6 has been submitted as an update for Fedora EPEL 6.
https://admin.fedoraproject.org/updates/mlpack-1.0.3-4.el6

Comment 24 Fedora Update System 2013-01-04 19:43:49 UTC
mlpack-1.0.3-4.el6 has been pushed to the Fedora EPEL 6 testing repository.

Comment 25 Fedora Update System 2013-01-22 03:32:53 UTC
mlpack-1.0.3-4.el6 has been pushed to the Fedora EPEL 6 stable repository.


Note You need to log in before you can comment on or make changes to this bug.