Bug 611873 - Review Request: R-Rsolid - Quantile normalization and base calling for second generation sequencing data
Summary: Review Request: R-Rsolid - Quantile normalization and base calling for second...
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: Package Review (Show other bugs)
(Show other bugs)
Version: rawhide
Hardware: All Linux
medium
medium
Target Milestone: ---
Assignee: Mattias Ellert
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Keywords:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2010-07-06 18:43 UTC by Adam Huffman
Modified: 2010-11-11 14:20 UTC (History)
2 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2010-11-11 14:20:55 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
mattias.ellert: fedora-review+
kevin: fedora-cvs+


Attachments (Terms of Use)

Description Adam Huffman 2010-07-06 18:43:52 UTC
Spec URL: http://verdurin.fedorapeople.org/reviews/R-Rsolid/R-Rsolid.spec
SRPM URL: http://verdurin.fedorapeople.org/reviews/R-Rsolid/R-Rsolid-0.9.2-1.fc12.src.rpm
Description: 

Rsolid is an R package for normalizing fluorescent intensity data from
ABI/SOLiD second generation sequencing platform. It has been observed
that the color-calls provided by factory software contain technical
artifacts, where the proportions of colors called are extremely
variable across sequencing cycles. Under the random DNA fragmentation
assumption, these proportions should be equal across sequencing cycles
and proportional to the dinucleotide frequencies of the sample.

Rsolid implements a version of the quantile normalization algorithm
that transforms the intensity values before calling colors. Results
show that after normalization, the total number of mappable reads
increases by around 5%, and number of perfectly mapped reads increases
by 10%. Moreover a 2-5% reduction in overall error rates is observed,
with a 2-6% reduction in the rate of valid adjacent color
mis-matches. The latter is important, since it leads to a decrease in
false-positive SNP calls.

The normalization algorithm is computationally efficient. In a test we
are able to process 300 million reads in 2 hours using 10 computer
cluster nodes. The engine functions of the package are written in C
for better performance.

Comment 1 Mattias Ellert 2010-07-14 08:58:51 UTC
Fedora Review R-Rsolid 2010-07-14

rpmlint output:

R-Rsolid.x86_64: W: spelling-error Summary(en_US) Quantile -> Quintile, Quartile, Quantize
R-Rsolid.x86_64: W: spelling-error %description -l en_US dinucleotide -> di nucleotide, di-nucleotide, nucleotide
R-Rsolid.x86_64: W: spelling-error %description -l en_US quantile -> quintile, quartile, quantize
R-Rsolid.x86_64: W: spelling-error %description -l en_US mis -> mus, mos, mid
R-Rsolid-devel.x86_64: W: only-non-binary-in-usr-lib
R-Rsolid-devel.x86_64: W: no-documentation
R-Rsolid.src: W: spelling-error Summary(en_US) Quantile -> Quintile, Quartile, Quantize
R-Rsolid.src: W: spelling-error %description -l en_US dinucleotide -> di nucleotide, di-nucleotide, nucleotide
R-Rsolid.src: W: spelling-error %description -l en_US quantile -> quintile, quartile, quantize
R-Rsolid.src: W: spelling-error %description -l en_US mis -> mus, mos, mid
R-Rsolid.src:4: W: mixed-use-of-spaces-and-tabs (spaces: line 4, tab: line 3)
4 packages and 0 specfiles checked; 0 errors, 11 warnings.

You should fix the "mixed-use-of-spaces-and-tabs" - the others looks
like noise.



+ Package named according to R packaging guidelines
+ Specfile named after package
+ Package license "Artistic 2.0" is a Fedora approved license

- I can not find any licensing information for this package, neither
  in the package source, nor on the upstream website. Could you please
  indicate your source for stating that this package is licensed under
  the "Artistic 2.0" license.

+ Specfile is written in legible English
+ Source matches upstream:

$ md5sum Rsolid_0.9-2.tar.gz srpm/Rsolid_0.9-2.tar.gz 
a86608e6c0599e3af9b8082c4e1d31bf  Rsolid_0.9-2.tar.gz
a86608e6c0599e3af9b8082c4e1d31bf  srpm/Rsolid_0.9-2.tar.gz

+ Package compiles

- BuildRequires tetex-latex is deprecated - tex(latex) should be used instead

- The package installs the /usr/lib64/R/library/Rsolid directory but
  does not own it

+ No duplicate files, and %files has %defattr
+ Specfile uses macros consistently
  (Though you could say --configure-args="--with-hdf5=%{_prefix}")

+ Package contains code
+ %doc is not runtime essential
+ Headers are in -devel
+ -devel depends on main with fully qualified version

- Package does not own others directories, but since it installs files
  in /usr/lib64/R/library it must have a Requires: R-core (see the
  template in the Fedora R packaging guidelines).

+ Installed files have valid UTF8 filenames.

- R modules packages no longer run scriptlets, so the Requires(post)
  and Requires(postun) should be removed.

Comment 2 Adam Huffman 2010-07-14 09:12:12 UTC
Thanks for taking a look.  I already have a version that fixes the spec formatting issue - I'll add fixes for the other problems.

I too couldn't find licence information, so I contacted the upstream authors, who said it was Artistic.  To clear this up, I'll ask if they can include explicit information with the release.

Comment 3 Mattias Ellert 2010-07-14 09:20:50 UTC
If upstream could fix the license in the DESCRIPTION file the warning during the build would go away:

* checking DESCRIPTION meta-information ... WARNING
Non-standard license specification:

Some other minor comments - which are not really packaging guidelines violations:
 - You mix listing directories in %files with trailing slash and without.
 - You have different format on different changelog entries (with and without a -)

Comment 4 Adam Huffman 2010-07-16 12:13:21 UTC
New version at:

http://verdurin.fedorapeople.org/reviews/R-Rsolid/R-Rsolid.spec

http://verdurin.fedorapeople.org/reviews/R-Rsolid/R-Rsolid-0.9.2-3.fc13.src.rpm

Upstream added a licensing clarification to the DESCRIPTION file, without changing the release number.  The rest of your comments should have been addresses - some of the R problems were because I relied on the r2spec package, which is clearly a little out of date.

I've also noted that it doesn't build on EPEL5, though I don't think that should hold out acceptance into Fedora.  I'll pursue that separately with the authors.

There is a new warning in rpmlint:

R-Rsolid-devel.x86_64: W: only-non-binary-in-usr-lib
There are only non binary files in /usr/lib so they should be in /usr/share.

on which I'd welcome guidance.

Comment 5 Mattias Ellert 2010-07-16 15:12:36 UTC
New rpmlint output:

R-Rsolid.src: W: spelling-error Summary(en_US) Quantile -> Quintile, Quartile, Quantize
R-Rsolid.src: W: spelling-error %description -l en_US dinucleotide -> di nucleotide, di-nucleotide, nucleotide
R-Rsolid.src: W: spelling-error %description -l en_US quantile -> quintile, quartile, quantize
R-Rsolid.src: W: spelling-error %description -l en_US mis -> mus, mos, mid
R-Rsolid.x86_64: W: spelling-error Summary(en_US) Quantile -> Quintile, Quartile, Quantize
R-Rsolid.x86_64: W: spelling-error %description -l en_US dinucleotide -> di nucleotide, di-nucleotide, nucleotide
R-Rsolid.x86_64: W: spelling-error %description -l en_US quantile -> quintile, quartile, quantize
R-Rsolid.x86_64: W: spelling-error %description -l en_US mis -> mus, mos, mid
R-Rsolid-devel.x86_64: W: only-non-binary-in-usr-lib
R-Rsolid-devel.x86_64: W: no-documentation
4 packages and 0 specfiles checked; 0 errors, 10 warnings.

Same as before except that the "mixed-use-of-spaces-and-tabs" is no longer there. The remaining warnings can be ignored. (The "only-non-binary-in-usr-lib" is not new - it was there before too.)


+ New source states the license in the DESCRIPTION file
+ New source matches upstream:

$ md5sum srpm/Rsolid_0.9-2.tar.gz Rsolid_0.9-2.tar.gz 
4a619392ad87865bc4f4c1588199f75a  srpm/Rsolid_0.9-2.tar.gz
4a619392ad87865bc4f4c1588199f75a  Rsolid_0.9-2.tar.gz

+ BuildRequires are now good
+ Package now owns directories it creates

- New specfile lists some files more than once. Warnings during rpmbuild:

warning: File listed twice: /usr/lib64/R/library/Rsolid/DESCRIPTION
warning: File listed twice: /usr/lib64/R/library/Rsolid/html
warning: File listed twice: /usr/lib64/R/library/Rsolid/html/00Index.html
warning: File listed twice: /usr/lib64/R/library/Rsolid/html/Rsolid.html

This is a better %files section:

%files
%defattr(-, root, root, -)
%dir %{_libdir}/R/library/%{packname}
%doc %{_libdir}/R/library/%{packname}/DESCRIPTION
%doc %{_libdir}/R/library/%{packname}/html
%{_libdir}/R/library/%{packname}/INDEX
%{_libdir}/R/library/%{packname}/Meta
%{_libdir}/R/library/%{packname}/R
%{_libdir}/R/library/%{packname}/data
%{_libdir}/R/library/%{packname}/help
%{_libdir}/R/library/%{packname}/libs

+ Package now properly depend on packages owning directories where files are
  installed
+ The relic Requires for the no longer existing scriptlets have been removed

Comment 6 Adam Huffman 2010-07-16 15:36:59 UTC
That was roughly how I had the %files section before but you pointed out that it didn't then own the directory /usr/lib64/R/library/Rsolid.

How about adding extra %exclude entries for those duplicate files instead?

Comment 7 Mattias Ellert 2010-07-16 16:31:47 UTC
I added the

%dir %{_libdir}/R/library/%{packname}

which adds the directory without its contents.

Using %exclude in the way you propose won't work. An %excluded file will not be made part of the package no matter how many times you list it.

Comment 8 Adam Huffman 2010-07-16 17:04:00 UTC
Okay, new version at:

http://verdurin.fedorapeople.org/reviews/R-Rsolid/R-Rsolid.spec

http://verdurin.fedorapeople.org/reviews/R-Rsolid/R-Rsolid-0.9.2-4.fc13.src.rpm

The documentation on %exclude that I could find was confusing, and I suppose it shows.

Comment 9 Mattias Ellert 2010-07-16 17:09:47 UTC
Package approved.

Comment 10 Adam Huffman 2010-07-16 19:28:25 UTC
Package Change Request
======================
Package Name: R-Rsolid
Short Description: Quantile normalization and base calling for second generation sequencing data
New Branches:  F-12 F-13 EL-5 EL-6
Owners: verdurin
InitialCC:

Comment 11 Kevin Fenzi 2010-07-17 05:50:24 UTC
You seem to have added a "Package Change Request" when you meant to do a "New Package CVS Request" ? :) This confuses the tool we use to process these. ;) 

Processed manually, cvs done.

Comment 12 Mattias Ellert 2010-11-11 14:20:55 UTC
The package has been built long ago - someone forgot to close the review request.


Note You need to log in before you can comment on or make changes to this bug.