Bug 1244678 - Review Request: duperemove - Tools for deduping file systems
Summary: Review Request: duperemove - Tools for deduping file systems
Keywords:
Status: CLOSED DUPLICATE of bug 1638987
Alias: None
Product: Fedora
Classification: Fedora
Component: Package Review
Version: rawhide
Hardware: All
OS: Linux
medium
medium
Target Milestone: ---
Assignee: Eduardo Mayorga
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2015-07-20 09:24 UTC by Francesco Frassinelli (frafra)
Modified: 2018-10-13 19:51 UTC (History)
11 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-10-13 19:51:33 UTC
Type: ---
e: fedora-review?


Attachments (Terms of Use)

Description Francesco Frassinelli (frafra) 2015-07-20 09:24:38 UTC
Spec URL: https://about.frafra.eu/duperemove.spec
SRPM URL: https://about.frafra.eu/duperemove-0.09.5-1.fc22.src.rpm
Description: Duperemove is a simple tool for finding duplicated extents and submitting them for deduplication
Fedora Account System Username: frafra

Comment 1 Eduardo Mayorga 2015-07-30 21:39:27 UTC
Now you must use the %license macro to include the license text instead of %docs.
See: https://fedoraproject.org/wiki/Packaging:LicensingGuidelines?rd=Packaging/LicensingGuidelines#License_Text

Comment 2 Francesco Frassinelli (frafra) 2015-07-30 22:41:21 UTC
(In reply to Eduardo Mayorga from comment #1)
> Now you must use the %license macro to include the license text instead of
> %docs.
> See:
> https://fedoraproject.org/wiki/Packaging:LicensingGuidelines?rd=Packaging/
> LicensingGuidelines#License_Text

Fixed, thank you.

Spec URL: https://about.frafra.eu/duperemove.spec
SRPM URL: https://about.frafra.eu/duperemove-0.09.5-2.fc22.src.rpm

Comment 3 Eduardo Mayorga 2015-08-01 20:50:33 UTC
Package Review
==============

Legend:
[x] = Pass, [!] = Fail, [-] = Not applicable, [?] = Not evaluated


Issues:
=======
- Bundled libraries:
  list.h; provided by kernel-devel
  rbtree.c, rbtree.h; provided by kcbench-data
- Correct license is GPLv2. GPLv2+ corresponds to bundled rbtree* files.
- Package must not own /usr/share/man/man8.
  Suggestion: %{_mandir}/man8/*.8*
- install commands in Makefile do not preserve timestamps with the -p parameter.
- CFLAGS and LDFLAGS not set. 

===== MUST items =====

C/Cou++:
[x]: Package does not contain kernel modules.
[x]: Package contains no static executables.
[x]: Package does not contain any libtool archives (.la)
[x]: Rpath absent or only used for internal libs.

Generic:
[x]: Package is licensed with an open-source compatible license and meets
     other legal requirements as defined in the legal section of Packaging
     Guidelines.
[!]: License field in the package spec file matches the actual license.
[!]: Package does not own files or directories owned by other packages.
     Note: This package is owning /usr/share/man/man8(filesystem).
[!]: %build honors applicable compiler flags or justifies otherwise.
     Note: CFLAGS and LDFLAGS not set.
[!]: Package contains no bundled libraries without FPC exception.
[x]: Changelog in prescribed format.
[x]: Sources contain only permissible code or content.
[-]: Package contains desktop file if it is a GUI application.
[-]: Development files must be in a -devel package
[x]: Package uses nothing in %doc for runtime.
[x]: Package consistently uses macros (instead of hard-coded directory
     names).
[x]: Package is named according to the Package Naming Guidelines.
[x]: Package does not generate any conflict.
[x]: Package obeys FHS, except libexecdir and /usr/target.
[-]: If the package is a rename of another package, proper Obsoletes and
     Provides are present.
[x]: Requires correct, justified where necessary.
[x]: Spec file is legible and written in American English.
[-]: Package contains systemd file(s) if in need.
[x]: Useful -debuginfo package or justification otherwise.
[x]: Package is not known to require an ExcludeArch tag.
[-]: Large documentation must go in a -doc subpackage. Large could be size
     (~1MB) or number of files.
     Note: Documentation size is 20480 bytes in 3 files.
[!]: Package complies to the Packaging Guidelines
     See Issues above.
[x]: Package successfully compiles and builds into binary rpms on at least
     one supported primary architecture.
[x]: Package installs properly.
[x]: Rpmlint is run on all rpms the build produces.
     Note: There are rpmlint messages (see attachment).
[x]: If (and only if) the source package includes the text of the
     license(s) in its own file, then that file, containing the text of the
     license(s) for the package is included in %license.
[x]: Package requires other packages for directories it uses.
[x]: Package must own all directories that it creates.
[x]: All build dependencies are listed in BuildRequires, except for any
     that are listed in the exceptions section of Packaging Guidelines.
[x]: Package uses either %{buildroot} or $RPM_BUILD_ROOT
[x]: Package does not run rm -rf %{buildroot} (or $RPM_BUILD_ROOT) at the
     beginning of %install.
[x]: Macros in Summary, %description expandable at SRPM build time.
[x]: Dist tag is present.
[x]: Package does not contain duplicates in %files.
[x]: Permissions on files are set properly.
[x]: Package use %makeinstall only when make install DESTDIR=... doesn't
     work.
[x]: Package is named using only allowed ASCII characters.
[x]: Package does not use a name that already exists.
[x]: Package is not relocatable.
[x]: Sources used to build the package match the upstream source, as
     provided in the spec URL.
[x]: Spec file name must match the spec package %{name}, in the format
     %{name}.spec.
[x]: File names are valid UTF-8.
[x]: Packages must not store files under /srv, /opt or /usr/local

===== SHOULD items =====

Generic:
[-]: If the source package does not include license text(s) as a separate
     file from upstream, the packager SHOULD query upstream to include it.
[x]: Final provides and requires are sane (see attachments).
[x]: Package functions as described.
[x]: Latest version is packaged.
[x]: Package does not include license text files separate from upstream.
[-]: Description and summary sections in the package spec file contains
     translations for supported Non-English languages, if available.
[x]: Package should compile and build into binary rpms on all supported
     architectures.
[-]: %check is present and all tests pass.
[!]: Packages should try to preserve timestamps of original installed
     files.
     Note: install commands in Makefile do not preserve timestamps with the -p parameter.
[x]: Packager, Vendor, PreReq, Copyright tags should not be in spec file
[x]: Sources can be downloaded from URI in Source: tag
[x]: Reviewer should test that the package builds in mock.
[x]: Buildroot is not present
[x]: Package has no %clean section with rm -rf %{buildroot} (or
     $RPM_BUILD_ROOT)
[x]: No file requires outside of /etc, /bin, /sbin, /usr/bin, /usr/sbin.
[x]: Uses parallel make %{?_smp_mflags} macro.
[x]: SourceX is a working URL.
[x]: Spec use %global instead of %define unless justified.

===== EXTRA items =====

Generic:
[x]: Rpmlint is run on debuginfo package(s).
     Note: There are rpmlint messages (see attachment).
[x]: Rpmlint is run on all installed packages.
     Note: There are rpmlint messages (see attachment).
[x]: Large data in /usr/share should live in a noarch subpackage if package
     is arched.
[x]: Spec file according to URL is the same as in SRPM.


Rpmlint
-------
Checking: duperemove-0.09.5-2.fc24.x86_64.rpm
          duperemove-0.09.5-2.fc24.src.rpm
duperemove.x86_64: W: spelling-error Summary(en_US) deduping -> deducing, deducting
duperemove.x86_64: W: spelling-error %description -l en_US deduplication -> reduplication, duplication, quadruplication
duperemove.x86_64: W: spelling-error %description -l en_US btrfs -> barfs
duperemove.x86_64: W: invalid-url URL: https://github.com/markfasheh/duperemove <urlopen error ('_ssl.c:574: The handshake operation timed out',)>
duperemove.x86_64: E: standard-dir-owned-by-package /usr/share/man/man8
duperemove.src: W: spelling-error Summary(en_US) deduping -> deducing, deducting
duperemove.src: W: spelling-error %description -l en_US deduplication -> reduplication, duplication, quadruplication
duperemove.src: W: spelling-error %description -l en_US btrfs -> barfs
duperemove.src: W: invalid-url URL: https://github.com/markfasheh/duperemove <urlopen error [Errno -5] No address associated with hostname>
duperemove.src:39: W: macro-in-%changelog %license
2 packages and 0 specfiles checked; 1 errors, 9 warnings.




Rpmlint (debuginfo)
-------------------
Checking: duperemove-debuginfo-0.09.5-2.fc24.x86_64.rpm
duperemove-debuginfo.x86_64: E: incorrect-fsf-address /usr/src/debug/duperemove-0.09.5/rbtree.c
duperemove-debuginfo.x86_64: E: incorrect-fsf-address /usr/src/debug/duperemove-0.09.5/rbtree.h
1 packages and 0 specfiles checked; 2 errors, 0 warnings.





Rpmlint (installed packages)
----------------------------
duperemove-debuginfo.x86_64: W: invalid-url URL: https://github.com/markfasheh/duperemove <urlopen error [Errno -5] No address associated with hostname>
duperemove-debuginfo.x86_64: E: incorrect-fsf-address /usr/src/debug/duperemove-0.09.5/rbtree.h
duperemove-debuginfo.x86_64: E: incorrect-fsf-address /usr/src/debug/duperemove-0.09.5/rbtree.c
duperemove.x86_64: W: invalid-url URL: https://github.com/markfasheh/duperemove <urlopen error [Errno -5] No address associated with hostname>
duperemove.x86_64: E: standard-dir-owned-by-package /usr/share/man/man8
2 packages and 0 specfiles checked; 3 errors, 2 warnings.



Requires
--------
duperemove (rpmlib, GLIBC filtered):
    libc.so.6()(64bit)
    libdl.so.2()(64bit)
    libgcrypt.so.20()(64bit)
    libgcrypt.so.20(GCRYPT_1.6)(64bit)
    libglib-2.0.so.0()(64bit)
    libgpg-error.so.0()(64bit)
    libgthread-2.0.so.0()(64bit)
    libpthread.so.0()(64bit)
    rtld(GNU_HASH)



Provides
--------
duperemove:
    duperemove
    duperemove(x86-64)



Source checksums
----------------
https://github.com/markfasheh/duperemove/archive/v0.09.5.tar.gz#/duperemove-0.09.5.tar.gz :
  CHECKSUM(SHA384) this package     : fdd7f79f1bc368e89bd60eb0382a07b2d86a47ecee3604c79172fe5d9235041feb1d0d2f96ed0588c9933d7f909954f5
  CHECKSUM(SHA384) upstream package : fdd7f79f1bc368e89bd60eb0382a07b2d86a47ecee3604c79172fe5d9235041feb1d0d2f96ed0588c9933d7f909954f5

Comment 4 Francesco Frassinelli (frafra) 2015-11-13 18:33:25 UTC
Thanks for your review. I fixed every issue except the problem with those bundled libraries.

rbtree.c should be related with an older kernel, but it's not there, and I don't think requiring kcbench-data is a good idea (it contains a copy of Linux 4.0 and it requires more than 100 MB of dependencies).

I can remove the remaining files with those commands:
sed -i 's/rbtree\.h //' Makefile
sed -i 's/list\.h\ //' Makefile
rm rbtree.h list.h

My problem is how to compile the program. I can include a bunch of kernel-headers directories, but I get a huge list of errors. I'm executing something like this:
make LIBRARY_FLAGS="-I/lib/modules/`uname -r`/build/include -I/lib/modules/`uname -r`/build/include/linux"

There's another problem related to bundled libraries: the program now includes (in the latest version - 0.10) libxxhash and libbloom. None of them are not included in Fedora. Should I package them?

Latest version:
https://frafra.fedorapeople.org/copr/duperemove.spec
https://frafra.fedorapeople.org/copr/duperemove-0.10-1.fc23.src.rpm

Comment 5 James Hogarth 2016-01-14 14:34:18 UTC
I'd say libbloom and libxxhash realistically should be packaged separately and then this blocked against those...

Note that btrfs-progs-devel includes rbtree.h ... you might try building against that (though I've not had a chance to check the exposed functions yet) ...

It might be better to have pkgconfig(glib-2.0) and pkgconfig(sqlite3) in the BR instead of naming the packages specifically.

Rather than naming SBINDIR and MANDIR specifically you can just pass PREFIX=%{_prefix} in to the %make_install 

Especially given how few files there are (only 8 including the man pages) I'd specifically list them rather than doing general globs of %{_mandir} and %{_sbindir} to avoid accidents in capturing too much during packaging.

Note that there is a csum-test which can be used to check the algorithm is being used correctly ... adding a %check to the spec file with something  akin to this appears to work building against an F23 mock of the git master:

%check
dd if=/dev/urandom of=4kfilerand count=1 bs=4096
tgt_hash="$(sha256sum ./4kfilerand)"
[[ "$(./csum-test -b 1024 --hash=sha256 ./4kfilerand | grep 4kfilerand)" == "${tgt_hash}" ]]
[[ "$(./csum-test -b 4096 --hash=sha256 ./4kfilerand | grep 4kfilerand)" == "${tgt_hash}" ]]

Note there are dedup patches being proposed for 4.5 on the btrfs mailing list at the moment so it's not actually clear if this particular tool has a future anyway if btrfs-progs gets an actual dedup subcommand...

I've sent a mail into the list to get clarification on this.

Comment 6 James Hogarth 2016-01-21 15:00:57 UTC
The dedupe stuff happening is solely around in-band so this tool still has a use.

Francesco are you still interested in packing this (and now the two dependencies identified of course, I'd defend the use of the rbtree stuff though realistically since it's not linked in via those headers).

Comment 7 Francesco Frassinelli (frafra) 2016-01-21 15:09:51 UTC
(In reply to James Hogarth from comment #6)
> Francesco are you still interested in packing this

Sure I am! I'm waiting for https://bugzilla.redhat.com/show_bug.cgi?id=1282063

Comment 8 James Hogarth 2016-01-21 15:24:32 UTC
(In reply to Francesco Frassinelli (frafra) from comment #7)
> (In reply to James Hogarth from comment #6)
> > Francesco are you still interested in packing this
> 
> Sure I am! I'm waiting for
> https://bugzilla.redhat.com/show_bug.cgi?id=1282063

I've marked it as blocking to make it clear :)

Do you have the other library in review to block as well?

Comment 9 Francesco Frassinelli (frafra) 2016-01-21 15:30:05 UTC
(In reply to James Hogarth from comment #8)
> I've marked it as blocking to make it clear :)

Thank you :)

> Do you have the other library in review to block as well?

No, I'm sorry.

Comment 10 James Hogarth 2016-09-21 09:09:43 UTC
Hi Francesco

With the changes to policy in the past 6 months or so I'm removing the blocker ... especially since the included xxhash is so far behind the xxhash upstream it's just painful trying to unbundle (see https://github.com/markfasheh/duperemove/issues/146).

I have some time to finalise this now - do you have time to proceed with it or would you prefer I carried out a fresh review request to push through, and then when you have further time I can add you as a co-maintainer?

Comment 11 Francesco Frassinelli (frafra) 2016-12-31 17:59:58 UTC
Hi James,
I tried to include rbtree.h and list.h from btrfs-progs-devel, with no success (warnings and errors). I think it would be quicker to let you do the job, because I am not sure how to proceed (I think we should use kernel-devel as building dependency, but I don't know how to include the sources correctly) :)

Comment 12 gabe 2017-12-05 05:48:08 UTC
Has there been any progress on this?

Comment 13 Eric Sandeen 2018-03-07 15:58:11 UTC
XFS has removed the "experimental" tag from reflink, so for filesystems created with the reflink feature, duperemove can now be used on XFS as well.  So I'd really like to see this move forward; anything I can do to help?

Comment 14 Eric Sandeen 2018-03-07 16:21:11 UTC
xxhash has been packaged for fedora:

https://src.fedoraproject.org/rpms/xxhash

and libbloom has been removed from duperemove:

646f547 Remove bloom filter code as we no longer use it

Comment 15 Jonathan Dieter 2018-09-10 11:30:43 UTC
Not sure if anyone is still interested in packaging this, but I have updated packages that unbundle xxhash.

https://www.jdieter.net/downloads/duperemove.spec
https://www.jdieter.net/downloads/duperemove-0.11-1.fc28.src.rpm

If Francesco isn't interested, but someone is interested in reviewing, I would be happy to take this.

Comment 16 Jonathan Dieter 2018-10-12 09:40:21 UTC
This review appears to be stalled.  Francesco, if you're still interested in packaging this, I'd be happy to help, but, if not, I would like to take it over.  Please respond one way or another within a week.

https://fedoraproject.org/wiki/Policy_for_stalled_package_reviews

Comment 17 Francesco Frassinelli (frafra) 2018-10-13 11:09:06 UTC
(In reply to Jonathan Dieter from comment #16)
> This review appears to be stalled.  Francesco, if you're still interested in
> packaging this, I'd be happy to help, but, if not, I would like to take it
> over.  Please respond one way or another within a week.
> 
> https://fedoraproject.org/wiki/Policy_for_stalled_package_reviews

Sorry for the late reply, I just moved to another country and I am pretty busy. Feel free to continue the process.

Comment 18 Jonathan Dieter 2018-10-13 19:51:33 UTC
Francesco, thanks so much for responding so quickly.  I've opened up a new review request and would appreciate a review from anyone still interested in getting this into Fedora.

*** This bug has been marked as a duplicate of bug 1638987 ***


Note You need to log in before you can comment on or make changes to this bug.