Bug 1065446 - Review Request: hive - Hadoop-compatible data warehouse
Summary: Review Request: hive - Hadoop-compatible data warehouse
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: Package Review
Version: rawhide
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Will Benton
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: bigdata-review 1071456
Reported: 2014-02-14 16:39 UTC by Pete MacKinnon
Modified: 2014-03-17 17:55 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2014-03-17 17:55:04 UTC
Type: ---
Embargoed:
willb: fedora-review+
gwync: fedora-cvs+


Description Pete MacKinnon 2014-02-14 16:39:41 UTC
Spec URL: http://pmackinn.fedorapeople.org/hive/hive.spec
SRPM URL: http://pmackinn.fedorapeople.org/hive/hive-0.12.0-1.fc21.src.rpm
Description: The Apache Hive data warehouse software facilitates querying and 
managing large datasets residing in distributed storage. Apache Hive 
provides a mechanism to project structure onto this data and query 
the data using a SQL-like language called HiveQL.
Fedora Account System Username: pmackinn

Comment 1 Pete MacKinnon 2014-02-19 21:21:35 UTC
Spec URL: http://pmackinn.fedorapeople.org/hive/hive.spec
SRPM URL: http://pmackinn.fedorapeople.org/hive/hive-0.12.0-2.fc21.src.rpm

Updated to add hive executable, shell scripts and conf files.

Comment 2 Pete MacKinnon 2014-02-27 13:25:07 UTC
Spec URL: http://pmackinn.fedorapeople.org/hive/hive.spec
SRPM URL: http://pmackinn.fedorapeople.org/hive/hive-0.12.0-1.fc21.src.rpm

Rolled the release back to 1. Sorry for the confusion. Going forward, changes from the review will be line items in the first changelog entry.

This version removes the Ivy download step.

Comment 3 Will Benton 2014-02-27 16:37:45 UTC
Pete, thanks for making that change to enable Hive to build without network access.  Unfortunately, this package also fails to build in mock or koji:

http://kojipkgs.fedoraproject.org//work/tasks/6854/6576854/build.log

Comment 5 Will Benton 2014-03-05 22:05:06 UTC
Thanks for your hard work getting this packaged, Pete!  Comments and fedora-review output are below.

Issues and notes
================

* I see what you did there, but suspect there are several possible summary 
  texts for this package that would be more useful than "Apache Hive."

* There is an upstream tarball available from a more-or-less canonical 
  location:
     https://github.com/apache/hive/archive/release-0.12.0.tar.gz
  Switching to this would enable you to use a URL for the Source: tag
  and automate version updates.

* There are several rpmlint errors:

hive.noarch: E: explicit-lib-dependency json-lib
hive.noarch: E: non-executable-script /usr/share/hive/bin/metatool 0644L /usr/bin/env
hive.noarch: E: non-executable-script /usr/share/hive/bin/init-hive-dfs.sh 0644L /usr/bin/env
hive.noarch: E: non-executable-script /usr/share/hive/bin/schematool 0644L /usr/bin/env

  The first of these appears spurious (since json-lib is the name of a
  package and not a library).  I believe you should be able to remove
  the explicit dependencies on Java libraries in any case, since
  packages with Maven fragments should have automatically-generated Requires:
  clauses for dependencies.  (This hasn't always worked how I expect
  it to, though.)  The latter errors should be fixed.

* rpmlint also warns about dangling symlinks, but these warnings are spurious 
  since the links are to libraries that hive requires.

* .mfiles doesn't include %{_javadir}/hive ; please add this to your
  %files with a %dir directive since this package needs to own that dir.

* This package depends on /usr/share/hadoop but does not explicitly
  depend on hadoop-common (which provides that dir).  It does have a
  transitively-carried dependency on hadoop-common (via hadoop-tests),
  though.  If you are going to have explicit Requires: for artifact
  dependencies, consider adding one for hadoop-common.

* Please justify patches, the ARM exclusion, and the absence of %check
  with spec file comments.
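
A minimal sketch of how several of the items above might look in the spec. It assumes the spec already uses `%files -f .mfiles` and that the flagged scripts are installed under %{_datadir}/hive/bin; the surrounding sections are elided and the actual layout of hive.spec may differ.

```spec
# Upstream release tarball from GitHub, so Source0 is a fetchable URL
# that tracks the Version: tag.
Source0:  https://github.com/apache/hive/archive/release-%{version}.tar.gz

# Explicit dependency on the package that provides /usr/share/hadoop.
Requires: hadoop-common

%install
# ... existing install steps ...
# Make the scripts that rpmlint reported as non-executable-script executable.
chmod 0755 %{buildroot}%{_datadir}/hive/bin/metatool \
           %{buildroot}%{_datadir}/hive/bin/init-hive-dfs.sh \
           %{buildroot}%{_datadir}/hive/bin/schematool

%files -f .mfiles
# .mfiles does not list this directory, so own it explicitly.
%dir %{_javadir}/hive
```

With Version: 0.12.0, the Source0 line expands to the tarball URL given above.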

Package Review
==============

Legend:
[x] = Pass, [!] = Fail, [-] = Not applicable, [?] = Not evaluated

===== MUST items =====

Generic:
[x]: Package is licensed with an open-source compatible license and meets
     other legal requirements as defined in the legal section of Packaging
     Guidelines.
[x]: License field in the package spec file matches the actual license.
[x]: License file installed when any subpackage combination is installed.
[x]: Package requires other packages for directories it uses.
[!]: Package must own all directories that it creates.

- see above

[x]: Package contains no bundled libraries without FPC exception.
[x]: Changelog in prescribed format.
[x]: Sources contain only permissible code or content.
[-]: Package contains desktop file if it is a GUI application.
[-]: Development files must be in a -devel package
[x]: Package uses nothing in %doc for runtime.
[x]: Package consistently uses macros (instead of hard-coded directory names).
[x]: Package is named according to the Package Naming Guidelines.
[x]: Package does not generate any conflict.
[x]: Package obeys FHS, except libexecdir and /usr/target.
[-]: If the package is a rename of another package, proper Obsoletes and
     Provides are present.
[!]: Requires correct, justified where necessary.

- see above

[x]: Spec file is legible and written in American English.
[-]: Package contains systemd file(s) if in need.
[!]: Package is not known to require an ExcludeArch tag.

- please justify the ExcludeArch with a comment

[x]: Large documentation must go in a -doc subpackage. Large could be size
     (~1MB) or number of files.
[x]: Package complies with the Packaging Guidelines
[x]: Package successfully compiles and builds into binary rpms on at least one
     supported primary architecture.
[x]: Package installs properly.
[x]: Rpmlint is run on all rpms the build produces.
[x]: If (and only if) the source package includes the text of the license(s)
     in its own file, then that file, containing the text of the license(s)
     for the package is included in %doc.
[x]: Package does not own files or directories owned by other packages.
[x]: All build dependencies are listed in BuildRequires, except for any that
     are listed in the exceptions section of Packaging Guidelines.
[x]: Package uses either %{buildroot} or $RPM_BUILD_ROOT
[x]: Package does not run rm -rf %{buildroot} (or $RPM_BUILD_ROOT) at the
     beginning of %install.
[x]: Macros in Summary, %description expandable at SRPM build time.
[x]: Package does not contain duplicates in %files.
[x]: Permissions on files are set properly.
[x]: Package uses %makeinstall only when make install DESTDIR=... doesn't
     work.
[x]: Package is named using only allowed ASCII characters.
[x]: Package does not use a name that already exists
[x]: Package is not relocatable.
[x]: Sources used to build the package match the upstream source, as provided
     in the spec URL.
[x]: Spec file name must match the spec package %{name}, in the format
     %{name}.spec.
[x]: File names are valid UTF-8.
[x]: Packages must not store files under /srv, /opt or /usr/local

Java:
[x]: Bundled jar/class files should be removed before build

===== SHOULD items =====

Generic:
[-]: If the source package does not include license text(s) as a separate file
     from upstream, the packager SHOULD query upstream to include it.
[x]: Final provides and requires are sane (see attachments).
[-]: Fully versioned dependency in subpackages if applicable.
     Note: No Requires: %{name}%{?_isa} = %{version}-%{release} in
     hive-hcatalog, hive-javadoc
[x]: Package functions as described.
[x]: Latest version is packaged.
[x]: Package does not include license text files separate from upstream.
[!]: Patches link to upstream bugs/comments/lists or are otherwise justified.
[x]: SourceX tarball generation or download is documented.
[-]: Description and summary sections in the package spec file contain
     translations for supported Non-English languages, if available.
[x]: Package should compile and build into binary rpms on all supported
     architectures.
[!]: %check is present and all tests pass.
[x]: Packages should try to preserve timestamps of original installed files.
[x]: Packager, Vendor, PreReq, Copyright tags should not be in spec file
[x]: Reviewer should test that the package builds in mock.
[x]: Buildroot is not present
[x]: Package has no %clean section with rm -rf %{buildroot} (or
     $RPM_BUILD_ROOT)
[x]: Dist tag is present (not strictly required in GL).
[x]: No file requires outside of /etc, /bin, /sbin, /usr/bin, /usr/sbin.
[x]: SourceX is a working URL.
[x]: Spec uses %global instead of %define unless justified.

===== EXTRA items =====

Generic:
[x]: Rpmlint is run on all installed packages.
     Note: There are rpmlint messages (see attachment).
[x]: Spec file according to URL is the same as in SRPM.

Comment 6 Will Benton 2014-03-06 15:39:56 UTC
Pete, Stanislav tells me that you can use %mvn_artifact to get automatically generated Requires: tags.
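
A rough sketch of that approach, assuming the javapackages macros (%mvn_artifact, %mvn_install) are available in the build root and that the build leaves a POM and jar on disk. Both paths below are hypothetical placeholders, not the real Hive build layout.

```spec
%build
# ... existing build steps that produce the jar ...
# Register the built artifact together with its POM so that %mvn_install
# can install it and record its metadata. Both paths are placeholders.
%mvn_artifact path/to/pom.xml path/to/hive-module.jar

%install
# Installs every artifact registered above and writes the .mfiles list.
%mvn_install

%files -f .mfiles
```

The dependency generators then derive Requires: from the installed Maven metadata, which would make hand-written Requires: on Java libraries unnecessary where this works.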

Comment 7 Pete MacKinnon 2014-03-12 16:36:23 UTC
Spec URL: http://pmackinn.fedorapeople.org/hive/hive.spec
SRPM URL: http://pmackinn.fedorapeople.org/hive/hive-0.12.0-1.fc21.src.rpm

Couldn't get the auto-requires working due to various issues including bug 1075626.

* improved summary
* switched to the GitHub tarball
* as you noted, json-lib is a false positive simply because of its name; it needs to stay in as a BuildRequires
* the package now owns %{_javadir}/hive
* hadoop-common and hadoop-mapreduce explicitly added to Requires
* ARM no longer excluded
* %check with comment added
* comment added for patches

Comment 8 Will Benton 2014-03-17 15:01:32 UTC
Thanks for making these changes, Pete.  I also appreciate the other cleanups you've done in the spec.  The main concern I have remaining is that some of the libraries that hive has symlinked are only required transitively, but I'm just noting it here in case it presents a problem in the future.

Review granted.

Comment 9 Pete MacKinnon 2014-03-17 15:28:16 UTC
New Package SCM Request
=======================
Package Name: hive
Short Description: The Apache Hadoop data warehouse
Owners: pmackinn
Branches:
InitialCC: java-sig

Comment 10 Gwyn Ciesla 2014-03-17 16:11:52 UTC
Git done (by process-git-requests).
