Bug 1629642 - Module version generator should evaluate $VERSION assignment
Summary: Module version generator should evaluate $VERSION assignment
Keywords:
Status: NEW
Alias: None
Product: Fedora
Classification: Fedora
Component: perl-generators
Version: rawhide
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Jitka Plesnikova
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2018-09-17 08:49 UTC by Petr Pisar
Modified: 2023-03-09 17:35 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed:
Type: Bug
Embargoed:




Links
System: Red Hat Bugzilla
ID: 1629345
Private: 0
Priority: unspecified
Status: CLOSED
Summary: Incorrect "Provides" Versions
Last Updated: 2021-02-22 00:41:40 UTC

Internal Links: 1629345

Description Petr Pisar 2018-09-17 08:49:41 UTC
Many Perl modules use a very indirect way of declaring module versions. E.g. Encode-2.98's Encode::Byte uses:

our $VERSION = do { my @r = ( q$Revision: 2.4 $ =~ /\d+/g ); sprintf "%d." . "%02d" x $#r, @r };

Thus the intended module version is "2.04", while the current perl-generators sees "2.4". These two Perl versions have different meanings (2.040.000 versus 2.400.000).
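
To make the difference concrete, here is a minimal sketch that mimics the Perl expression above in Ruby and expands a decimal version into three-digit groups. The helper names (version_from_revision, dotted_triple) are made up for illustration and are not part of perl-generators or any Perl tooling.

```ruby
# Ruby imitation of the Perl one-liner from Encode::Byte (illustration only).
def version_from_revision(keyword)
  parts = keyword.scan(/\d+/).map(&:to_i)            # "2.4" -> [2, 4]
  template = "%d." + "%02d" * (parts.length - 1)     # Perl: "%d." . "%02d" x $#r
  template % parts
end

# Expand a decimal Perl version into major.minor.patch form, reading the
# fraction in groups of three digits (right-padded with zeros).
def dotted_triple(decimal_version)
  major, fraction = decimal_version.split(".", 2)
  padded = (fraction || "").ljust(6, "0")
  groups = padded.scan(/\d{1,3}/).map { |group| group.ljust(3, "0") }
  ([major] + groups).join(".")
end

puts version_from_revision('$Revision: 2.4 $')   # prints 2.04
puts dotted_triple("2.04")                       # prints 2.040.000
puts dotted_triple("2.4")                        # prints 2.400.000
```

The second helper only models the decimal-to-dotted reading described above; it ignores underscores, "v" prefixes and other forms that real Perl version strings can take.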

It would be great if perl-generators evaluated the "our $VERSION =" lines with perl and used the resulting value instead of parsing the lines. This is how CPAN extracts the versions.

Beware that this can lead to executing arbitrary code (e.g. running external commands). Countermeasures such as the "Safe" module or running the eval in a forked process can be used, but they cannot prevent all attack vectors.

On the other hand, the generator is usually executed by rpmbuild after Makefile.PL and other code that is scanned later have already been executed, so the use case of building RPM packages does not pose any new security issues.

Comment 1 Will Braswell 2018-09-17 19:10:21 UTC
Actually, you do not need to evaluate $VERSION at all, you can use the (much safer) MetaCPAN API instead!

Background: I caused this bug report to be opened by ppisar when I discovered the incorrect Perl module version numbers embedded in the perl-Encode RPM package.  I am the author of the RPerl compiler, as well as the CPANtoFPM packaging system which automatically packages dozens (or hundreds) of CPAN distributions into the corresponding RPMs by use of the official MetaCPAN API.

CPANtoFPM is based on the cross-platform FPM packaging system, which is used for generating RPM packages, DEB packages, and many other packaging formats.  Because of the cross-platform nature of FPM, it does not use the automatic capabilities of the perl-generators software, instead relying on its own Ruby implementation to generate the RPM specfile, etc.  However, even though FPM does not use perl-generators itself, our generated RPMs must still be 100% compatible with the already-existing RPMs in the base repo which do use perl-generators, and this is how I discovered the incorrect version numbers in the ubiquitous perl-Encode RPM.  (I assume the same problem of incorrect version numbers may be present in many other perl-* RPM packages; we just haven't found them yet.)

Simply put, the version numbers currently created by perl-generators are incorrect, due to the differences between RPM versioning and Perl versioning.  Thankfully, it is not very difficult to fix this problem, and I have already done so in my own CPANtoFPM software!  :-)

The perceived security risk of executing arbitrary Perl code can be completely bypassed by using the MetaCPAN API to make a simple JSON query.

For example, to find all of the Perl modules (.pm files) provided by the Encode distribution on CPAN, simply fetch this URL:

https://fastapi.metacpan.org/v1/release/Encode

In the JSON data returned, you can see the "provides" data element is a string array of all the Perl modules provided by the Encode distribution, including "Encode" itself (in the file "Encode.pm"), along with "Encode::Byte" (in the file "Encode/Byte.pm") and many others.

Then, for each of these modules listed in the above API request, you can formulate another URL to fetch the correct module versions, for example with the Encode::Byte module in question:

https://fastapi.metacpan.org/v1/module/Encode::Byte

You can see the following data returned among the JSON output:

[[[ BEGIN JSON CODE ]]]
   "module" : [
      {
         "version" : "2.04",
         "authorized" : true,
         "indexed" : true,
         "version_numified" : 2.04,
         "associated_pod" : "DANKOGAI/Encode-2.98/Byte/Byte.pm",
         "name" : "Encode::Byte"
      }
   ],
[[[ END JSON CODE ]]]

As you can see, the "module->version" data element contains the string with the REAL version you want, in this case "2.04", and you do not have to execute any potentially-hazardous Perl code whatsoever!  In other words, you let MetaCPAN evaluate $VERSION for you. :-)

(Please note, do NOT use the "version" data element, because this is actually the Encode distribution's version, not the Encode::Byte module's version.  Also, do NOT use the "module->version_numified" data element, because it is numeric data instead of string data and may have already undergone incorrect loss of significant digits which are preserved only in the "module->version" string.)
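
As a rough sketch of the extraction described above, the following Ruby reads the per-module "version" string from a response body. The SAMPLE constant is a trimmed copy of the fragment quoted earlier, wrapped in braces; the top-level "version" of "2.98" is an assumption taken from the Encode-2.98 distribution name, and the helper name module_version is hypothetical.

```ruby
require "json"

# Trimmed stand-in for a /v1/module/Encode::Byte response (assumed shape).
SAMPLE = <<~JSON
  {
    "version" : "2.98",
    "module" : [
      {
        "version" : "2.04",
        "version_numified" : 2.04,
        "name" : "Encode::Byte"
      }
    ]
  }
JSON

# Return the per-module version string, or nil if the module is not listed.
def module_version(response_json, module_name)
  data = JSON.parse(response_json)
  entry = data.fetch("module", []).find { |mod| mod["name"] == module_name }
  entry && entry["version"]
end

puts module_version(SAMPLE, "Encode::Byte")   # prints 2.04

# Why the numified field is risky: a trailing zero does not survive as a number.
puts JSON.parse('{"version_numified": 2.10}')["version_numified"]   # prints 2.1
```

Reading the string field keeps "2.10" and "2.1" distinct, which matters because they are different Perl versions.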

You can also save time and bandwidth by combining all of the module version API requests into one single HTTP request using the POST method, as shown in the following Ruby code snippet:

[[[ BEGIN RUBY CODE ]]]
# call metacpan API to find all modules belonging to distribution
metacpan_search_url = "https://fastapi.metacpan.org/v1/module/_search"
metacpan_search_query = <<-EOL
{
    "query" : {
        "constant_score" : {
            "filter" : {
                "exists" : { "field" : "module" }
            }
        }
    },
    "size": 5000,
    "_source": [ "name", "module.name", "module.version" ],
    "filter": {
        "and": [
            { "term": { "distribution": "#{VARIABLE_CONTAINING_DISTRIBUTION_NAME}" } },
            { "term": { "maturity": "released" } },
            { "term": { "status": "latest" } }
        ]
    }
}
EOL
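# httppost is not defined in the original snippet; this stand-in uses Ruby's standard library
require "net/http"
def httppost(url, body)
  Net::HTTP.post(URI(url), body, "Content-Type" => "application/json").body
end
search_response = httppost(metacpan_search_url, metacpan_search_query)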
[[[ END RUBY CODE ]]]

The output for this POST request is a bit more complex, but all of the needed version data is there.  As mentioned above, I have already written all the Ruby code for parsing the JSON returned by all 3 of the above MetaCPAN API requests, and will be happy to share as needed.

Unfortunately, at this time the RPM versioning for Perl packages is truly "broken", but as you can see we can fix it with just a few extra lines of code!  Please let me know how I can be of assistance in fixing this bug, let's work together to make the RPM-and-Perl world a better place for everyone!  Thank you for your time and attention.

Comment 2 Dan Book 2018-09-17 19:53:24 UTC
The correct way to interpret a Perl module version is to follow what the PAUSE indexer does. This is to first check META.json (falling back to META.yml) for a 'provides' entry, in which case it will be trusted to provide the versions of all included modules. Failing that (many dists don't include provides metadata), the first line containing $VERSION after each package declaration is executed on its own and the resulting value of $VERSION is used for the version of that package. This logic is usefully encapsulated by https://metacpan.org/pod/Dist::Metadata and similarly used by MetaCPAN to provide versions in its API. As such there are many distributions like this one which build their version assignment line with the expectation that it will be executed as Perl code.
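
The lookup order described in this comment can be sketched roughly as follows. This is a simplified Ruby illustration, not the actual PAUSE indexer or Dist::Metadata code: it only selects the line that would be evaluated and does not execute any Perl. The function names are hypothetical, and the "provides" shape (module name mapped to a hash with "file" and "version") is assumed to follow the CPAN META spec.

```ruby
# Step 1: if META.json / META.yml carries "provides", trust it outright.
def provided_versions(meta)
  provides = meta["provides"]
  return nil if provides.nil? || provides.empty?
  provides.transform_values { |info| info["version"] }
end

# Step 2 (fallback): for each package declaration, remember the first line
# at or after it that mentions $VERSION. That line is what would be
# evaluated by perl to obtain the package's version.
def version_lines(source)
  result = {}
  current = nil
  source.each_line do |line|
    if (match = line.match(/^\s*package\s+([\w:]+)/))
      current = match[1]
    end
    if current && !result.key?(current) && line.include?('$VERSION')
      result[current] = line.strip
    end
  end
  result
end

def versions_or_candidates(meta, source)
  provided_versions(meta) || version_lines(source)
end
```

For the Encode::Byte source quoted in the description, version_lines would return the whole "our $VERSION = do { ... };" line for the package, which is why a parser that only scans for digits sees "2.4" while evaluation yields "2.04".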

Comment 3 Will Braswell 2018-09-18 17:59:21 UTC
I will defer to Dan Book, he is an accepted Perl expert in this area.  We have discussed this issue in depth and others have confirmed his assessment as accurate.

Also, my proposed MetaCPAN API solution will only work if the package-building machines have access to the Internet, which may or may not be a limiting factor.

