Bug 859246 - Review Request: cmap - Adobe pdf character mapping data
Summary: Review Request: cmap - Adobe pdf character mapping data
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Fedora
Classification: Fedora
Component: Package Review
Version: rawhide
Hardware: All
OS: Linux
unspecified
medium
Target Milestone: ---
Assignee: Nobody's working on this, feel free to take it
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On: 975266
Blocks: 823679
TreeView+ depends on / blocked
 
Reported: 2012-09-20 21:36 UTC by Ben Rosser
Modified: 2014-08-16 21:46 UTC (History)
2 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-08-16 21:46:11 UTC
Type: ---
Embargoed:


Attachments (Terms of Use)

Description Ben Rosser 2012-09-20 21:36:10 UTC
Spec URL: http://venus.arosser.com/fedora/cmap/cmap.spec
SRPM URL: http://venus.arosser.com/fedora/cmap/cmap-1.6-0.fc17.src.rpm

Description: CMap (Character Map) resources are used to unidirectionally
map character codes, such as a Unicode encoding form, to CIDs
(Characters IDs, meaning glyphs) of a CIDFont resource.

This is an Adobe project that's been bundled in a lot of Linux software: off the top of my head, poppler (poppler-data) and python-pdfminer (which I have an open review request for).

Fedora Account System Username: tc01

--

The package as it is works, but there are a few things about this one that I'm not quite sure about:

-Each cmap resources file has a different version attached to it, so I used the newest version number (1.6). Perhaps the package should be split up, or a better versioning system used? (Snapshot style)?

-Currently, the only package that's been updated to work with it is my own pdfminer: https://bugzilla.redhat.com/show_bug.cgi?id=823679. There are other packages, such as poppler-data, that require changing to support this one. So maybe there are %build steps I'll need to add to make it work well with other packages?

Comment 1 Michael Schwendt 2013-01-23 00:17:06 UTC
I'm afraid I'm not familiar with this or poppler-data to help.

It could be an idea to talk to the poppler-data maintainers (it's a separate poppler-data src.rpm) and also figure out whether there are any requirements related to the names of the directories where the files are stored. For poppler-data the files are stored in a private path /usr/share/poppler/cMap/. The paths of this package don't have much in common.

  $ rpmls poppler-data |grep Adobe-CNS1-0
  lrwxrwxrwx  /usr/share/ghostscript/9.06/Resource/CMap/Adobe-CNS1-0
  -rw-r--r--  /usr/share/poppler/cMap/Adobe-CNS1/Adobe-CNS1-0

  $ rpmls -p cmap-1.6-0.fc19.noarch.rpm |grep Adobe-CNS1-0
  -rw-r--r--  /usr/share/cmap/ac16/CMap/Adobe-CNS1-0

It's also unlikely that anything knows where to find the files due to those extra topdir below /usr/share/cmap/, such as "ac16". Likely some symlinking magic will be necessary.

The texlive-adobemapping package (subpackage of texlive) also contains cmap files.

[...]

The packaging is somewhat strange. If one examines the rpmbuild output:

Executing(%prep): ...
+ umask 022
+ cd /home/misc/tmp/rpm/BUILD
+ cd /home/misc/tmp/rpm/BUILD
+ mkdir -p cmap-1.6
+ cd /home/misc/tmp/rpm/BUILD
+ rm -rf cmap-1.6
+ /usr/bin/mkdir -p cmap-1.6
+ cd cmap-1.6
...
and so on. There are some redundancies. Let's see:

> %prep
> cd %{_builddir}

Not necessary. At the beginning of %prep, you are automatically inside %_builddir.

> mkdir -p %{name}-%{version}

Not necessary. The following line creates this directory, too:

> %setup -n cmap-1.6 -q -c

"-n cmap-1.6" is the default for your package, because "-n %{name}-%{version}" is the default.

> %setup -n cmap-1.6 -q -T -D -a 1
> %setup -n cmap-1.6 -q -T -D -a 2
> %setup -n cmap-1.6 -q -T -D -a 3
> %setup -n cmap-1.6 -q -T -D -a 4

There's also an inconsistency in that you mix %{name}-%{version} and the hardcoded cmap-1.6. It looks a bit like a result of trial-and-error.

The following %prep section could replace all your individual lines:

  %prep
  %setup -q -c -a 1 -a 2 -a 3 -a 4

[...]

> BuildRequires:  tar

Not needed.  https://fedoraproject.org/wiki/Packaging:Guidelines#Exceptions_2


> %install
> ...
> cp -r %{_builddir}/%{name}-%{version}/* %{buildroot}%{_datadir}/%{name}

At the beginning of %install, you're not just in %_builddir already but in the directory as specified during %setup. You could reduce the cp line to:

   cp -pr * %{buildroot}%{_datadir}/%{name}

Option -p is typically used to preserve timestamps of pre-existing files.


> %doc %{_docdir}/%{name}/

Files below %_docdir are automatically marked as %doc. One could argue whether to install documentation in a non-versioned /usr/share/cmap directory and not a versioned dir like thousands of other packages do it. There are packaging tricks to achieve that even for separate doc %SOURCE files. For example:

  %install
  ...
  rm -rf _tmpdoc ; mkdir _tmpdoc
  install -p -m0644 %SOURCE5 %SOURCE6 _tmpdoc
  ...

  %files
  ...
  %doc _tmpdoc/*


> a better versioning system used? (Snapshot style)?

Certainly sounds like a good idea, IMO. "Version: 0" and the download/checkout date as part of "Release".

Comment 2 Ben Rosser 2013-01-29 21:56:05 UTC
I've removed the redundancies and other inconsistencies. I also changed the version and release tags to reflect the date I originally checked the package out; I'm not entirely sure I've done this properly... is it supposed to be:

Version: 0
Release: 1.%{snapshot}%{?dist}

Where %{snapshot} is the snapshot date (and 1 is the release number)?

Anyway, here's the updated spec and RPM:

Spec File: http://venus.arosser.com/fedora/cmap/cmap.spec
Source RPM: http://venus.arosser.com/fedora/cmap/cmap-0-1.20120920snap.fc18.src.rpm

The texlive package seems to use the "ac16", etc. subdirectories, so it probably wouldn't be too drastic a change to pull in this package.

I'll email both the texlive and poppler maintainers about how to integrate this package.

Comment 3 Jason Tibbitts 2013-05-23 11:19:23 UTC
I am triaging old review tickets.  I can't promise a review if you reply, but by closing out the stale tickets we can devote extra attention to the ones which aren't stale.

This is another one of those packages which shouldn't have sat around for so long.  The only real question, I guess, is the issue of versioning, and it's kind of tricky.  The individual components of the package have their own versions, but those are grouped into separate tarballs and those tarballs (which also appear to be versioned) are grouped into this package.

So, the first question would be whether there is any point at all in separately packaging the tarballs?  Usually we say that you should have one package per upstream tarball, especially if they are not released all on the same schedule.  Judging from the list of files on the sourceforge page, that does seem to be the case, but then I have no idea how useful just one of those files would be without all of the other stuff in the package.

If the best course of action is to have everything together in one package (which I'm not really sold on, honestly), then the question arises as to what version to use.  I think what you've done is fine, though.  You can never guarantee that the project won't release with an overall version, so you need to stick to '0' and use the releasse to order/date things. This isn't any kind of prerelease, so there's no point in using release numbers that all start with '0'.  So 0-1.DATEsnap is OK.

Comment 4 Ben Rosser 2013-05-23 21:24:22 UTC
I think it probably could be split into separate packages- I know that python-pdfminer, the package I was working on that uses cmap, would have dependencies on only four of the tarballs.

If we do that, would it make sense to have one cmap package that installs all five cmap-foo packages? (Perhaps have the others be subpackages? I'm not entirely familiar with what the right way to do something like that would be).

Also, there isn't really any documentation (that I could find) other than the text and markdown file in the downloads directory, so what sort of documentation would the subpackages have?

There's also the question of integrating with other packages that bundle cmap, of course.

Comment 5 Jason Tibbitts 2013-05-30 19:51:20 UTC
I thought I sent this ages ago, but I saw it still sitting in the browser.  Sorry for that.

I honestly don't have any idea which way would be best, but I wouldn't object to a metapackage for this kind of thing even though they are generally frowned upon.  This just seems to be one of the cases where it makes the most sense.  Note that I wasn't suggesting they be subpackages, but simply N different packages (one per independent, separately released and versioned tarball).  Which does mean a number of reviews, but I'd take care of them for you.

As for documentation, whatever is appropriate makes sense.  Since they would be independent, you'd need licensing information if it's included anywhere, and I say that the document indicating current versions of each of the tables within the tarballs would be appropriate to go in each of the packages, since it applies separately.

Comment 6 Ben Rosser 2013-06-18 01:12:51 UTC
Well, I've submitted the first cmap package- I have written the other specs but I figure it makes sense to work through any issues with the first one, fix any issues, and then submit the others with them all fixed.

https://bugzilla.redhat.com/show_bug.cgi?id=975266

I haven't updated the pdfminer spec yet.

Comment 7 Till Maas 2013-10-21 19:57:43 UTC
Can this review be closed, because of bug 975266
 and others to follow?

Comment 8 Ben Rosser 2014-08-16 21:46:11 UTC
I think this can be closed now. 

cmap is now packaged in five different packages, installed into /usr/share/cmap.

cmap-japan1-6
cmap-korean1-2
cmap-gb1-5
cmap-cns1-6
cmap-identity0

I'm going to close the ticket.


Note You need to log in before you can comment on or make changes to this bug.