Bug 966143

Summary: Many duplicate files lead to unreasonable increase of size of a publican installation, Publican should share common files
Product: [Community] Publican
Component: publican
Version: 3.1
Hardware: All
OS: All
Status: CLOSED CURRENTRELEASE
Severity: medium
Priority: unspecified
Target Milestone: 3.2
Fixed In Version: 3.2.0
Doc Type: Bug Fix
Type: Bug
Reporter: Raphaël Hertzog <raphael>
Assignee: Jeff Fearn 🐞 <jfearn>
QA Contact: Petr Bokoc <pbokoc>
CC: aigao, jfearn, pbokoc, rlandman
Last Closed: 2013-08-09 04:47:12 UTC

Description Raphaël Hertzog 2013-05-22 15:08:19 UTC
With the official Debian package, an installation of publican 2.8 weighs 6 MB. An installation of publican 3.1 weighs 50 MB.

It looks like most of the size increase is due to files duplicated across the various translations, such as /usr/share/publican/Common_Content/common-db5/<lang>/images/title_logo.png

This was initially reported to Debian at http://bugs.debian.org/708705
http://dedup.debian.net/compare/publican/publican provides a list of files which are duplicated once or more in the binary package.
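
A rough way to check for this kind of duplication on a local installation is to group files by a hash of their content. The sketch below is only an illustration: the function names `find_duplicates` and `wasted_bytes` are hypothetical, and it is not a description of how dedup.debian.net works.

```python
import hashlib
import os
from collections import defaultdict


def find_duplicates(root):
    """Group regular files under root by the SHA-256 of their content.

    Returns a dict mapping digest -> sorted list of paths, keeping only
    digests that occur more than once.
    """
    by_digest = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            with open(path, "rb") as handle:
                digest = hashlib.sha256(handle.read()).hexdigest()
            by_digest[digest].append(path)
    return {d: sorted(p) for d, p in by_digest.items() if len(p) > 1}


def wasted_bytes(groups):
    """Bytes that would be saved by keeping a single copy per group."""
    return sum(
        os.path.getsize(paths[0]) * (len(paths) - 1)
        for paths in groups.values()
    )
```

Calling `find_duplicates("/usr/share/publican/Common_Content")` and passing the result to `wasted_bytes` gives an estimate of how much space the repeated copies take up.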

Some of them are attributable to (bad) choices of mine (like installing all versions of the publican user's guide), but many of them concern files in Common_Content that are duplicated dozens of times.

It would be nice if Publican could avoid this duplication, for example by supporting Common_Content/common/default/*, which would be copied first and then overwritten by Common_Content/common/<lang>/*. That way the basic files could live in that one directory, avoiding the duplication across all translations.
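
The proposed layering can be sketched as follows. This is a minimal illustration under stated assumptions, not Publican's code (Publican itself is written in Perl): the `default` directory is the one proposed above, and the function name `overlay_copy` is hypothetical. It needs Python 3.8 or later for `dirs_exist_ok`.

```python
import os
import shutil


def overlay_copy(common_root, lang, dest):
    """Copy <common_root>/default into dest, then overlay <common_root>/<lang>.

    Files in the language directory overwrite files of the same relative
    path that came from the default directory. Missing layers are skipped.
    """
    for layer in ("default", lang):
        src = os.path.join(common_root, layer)
        if os.path.isdir(src):
            # dirs_exist_ok lets the second layer merge into the first.
            shutil.copytree(src, dest, dirs_exist_ok=True)
    return dest
```

With this scheme a translation only has to ship the files that actually differ from the defaults, such as a localized title_logo.png.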

Comment 2 Jeff Fearn 🐞 2013-07-04 06:17:58 UTC
dev note: there is no need for the content in /usr/share to be duplicated per language. The build process already copies the source language content and then copies the translation language content over the top.

This is just a flat out bug.

Comment 3 Jeff Fearn 🐞 2013-07-11 02:41:57 UTC
(In reply to Jeff Fearn from comment #2)
> dev note: there is no need for the content in /usr/share to be duplicated
> per language. The build process already copies the source language content
> and then copies the translation language content over the top.
> 
> This is just a flat out bug.

Actually I am wrong. I was thinking this was in /usr/share/publican but it's in /usr/share/doc/publican. 

/usr/share/doc/publican contains static HTML. You have no guarantee at this point that any other language, or any other output format, is installed, so you have to assume you need all the content for each format in each language.

You will note that /usr/share/publican/Common_Content does not have this duplication, because that is source and the logic in #1 is used during the build phase.

This would be a very intrusive change to make on the normal output and would probably make more sense to do as a specific output format for this use case.

Comment 4 Raphaël Hertzog 2013-07-11 06:21:04 UTC
No, no, you were right. /usr/share/publican contains most of the problematic duplication. The duplication in /usr/share/doc/ is expected, because the documentation must be stand-alone.

$ dpkg -L publican | sed -ne 's|/usr/share/publican/Common_Content/.*/||p' | sort | uniq -c | sort -nr | head
    107 40.svg
    107 40.png
    107 39.svg
    107 39.png
    107 38.svg
    107 38.png
    107 37.svg
    107 37.png
    107 36.svg
    107 36.png
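
For readers less used to sed, the pipeline above strips everything up to the last slash from each path under Common_Content and counts how often each remaining name occurs. A rough Python equivalent is sketched below; the function name `count_basenames` is hypothetical, and the order of names with equal counts may differ from the `sort -nr` output.

```python
import re
from collections import Counter

# Same pattern as the sed expression: the greedy ".*/" consumes every
# directory component below Common_Content, leaving only the last name.
PREFIX = re.compile(r"/usr/share/publican/Common_Content/.*/")


def count_basenames(paths):
    """Count trailing names of paths that lie below Common_Content.

    Like "sed -n ... p", paths that do not match the pattern are dropped.
    """
    counts = Counter()
    for path in paths:
        stripped, replaced = PREFIX.subn("", path.strip(), count=1)
        if replaced:
            counts[stripped] += 1
    return counts
```

Feeding it the lines printed by `dpkg -L publican` and calling `.most_common(10)` on the result corresponds to the `sort | uniq -c | sort -nr | head` part of the pipeline.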

Comment 5 Jeff Fearn 🐞 2013-07-11 06:28:31 UTC
Ah, then it's a bug in the brand publishing code. Not so hard to fix.

Comment 6 HSS Product Manager 2013-07-11 06:37:34 UTC
HSS-QE has reviewed and declined this request. QE for this bug will be handled by IED.

Comment 7 Jeff Fearn 🐞 2013-07-12 00:06:47 UTC
Stopped the brand install from duplicating CSS, image, and script sources.

To ssh://git.fedorahosted.org/git/publican.git
   b85fec3..4bd6cc7  HEAD -> devel

Comment 8 Petr Bokoc 2013-07-22 11:34:21 UTC
$ rpm -qi publican | grep Size
Size        : 3093902

Publican now seems to be vastly reduced in size. Verified in publican-3.1.5-0.fc17.t62.noarch.

Comment 9 Jeff Fearn 🐞 2013-08-09 04:47:12 UTC
The fix for this bug has been shipped in publican 3.2.0