Bug 657636

Summary: cgi-bin/check takes too long to complete (due to network requests to w3.org)
Product: [Fedora] Fedora
Reporter: David Dick <ddick>
Component: w3c-markup-validator
Assignee: Ville Skyttä <ville.skytta>
Status: CLOSED ERRATA
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Priority: low
Version: rawhide
CC: ville.skytta
Hardware: All
OS: All
Fixed In Version: w3c-markup-validator-1.1-2.fc14
Doc Type: Bug Fix
Last Closed: 2010-12-06 19:58:20 UTC
Attachments:
Extract from cgi-bin/check that shows the issue (flags: none)

Description David Dick 2010-11-26 22:09:10 UTC
Created attachment 463153
Extract from cgi-bin/check that shows the issue

Description of problem:


Version-Release number of selected component (if applicable):

w3c-markup-validator-1.1-1.fc14.noarch
w3c-markup-validator-libs-1.1-1.fc14.noarch
xhtml1-dtds-1.0-20020801.5.noarch
libxml2-2.7.7-2.fc14.i686
perl-XML-LibXML-1.70-5.fc14.i686

How reproducible:


Steps to Reproduce:
1. Submit the following document to cgi-bin/check and observe the time required to validate it:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html 
     PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en-au" lang="en-au">
<head><title></title>
</head>
<body>
</body>
</html>
  
Actual results:

Over 20 seconds

Expected results:

Less than a second

Additional info:

Using 'netstat -t | grep w3.org', network requests to w3.org can clearly be observed, despite the fact that the local /usr/share/sgml/w3c-markup-validator/REC-xhtml1-20020801/xhtml1-transitional.dtd file exists.

The attached Perl script (extracted from cgi-bin/check) shows where the network requests occur in cgi-bin/check.  Although most of the code consists of calls to the XML::LibXML library, the network requests are caused by including '/usr/share/sgml/w3c-markup-validator/catalog.xml' in the XML_CATALOG_FILES environment variable.
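
One way to narrow this down outside the CGI is to run the same document through xmllint with network access disabled. The following is a minimal sketch, not part of the validator: it assumes xmllint (from libxml2) is installed and that it resolves catalogs the same way XML::LibXML does, and validate_offline is a hypothetical helper name. With --nonet, a DTD that the catalogs cannot resolve locally should produce an error immediately instead of a slow fetch from w3.org.

```shell
#!/bin/sh
# validate_offline is a hypothetical helper, not part of the validator.
# $1: value to use for XML_CATALOG_FILES
# $2: document to validate
validate_offline() {
    # --nonet  : do not fetch DTDs or entities over the network
    # --valid  : validate the document against its DTD
    # --noout  : suppress the serialized document, report errors only
    XML_CATALOG_FILES="$1" xmllint --nonet --valid --noout "$2"
}
```

For example, `time validate_offline "/etc/xml/catalog /usr/share/sgml/w3c-markup-validator/catalog.xml" test.xhtml` (test.xhtml being a file holding the document above) can be compared against the same call with a different catalog value.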

Comment 1 Ville Skyttä 2010-11-27 21:28:24 UTC
Including the validator's catalog in XML_CATALOG_FILES is quite essential, but I believe we're using the wrong syntax for it: the entries in it should be separated by spaces, not colons.

Could you check if changing the colon to a space in the "check" script fixes it for you?  It does seem to do the right thing for me.

-    local $ENV{XML_CATALOG_FILES} = "/etc/xml/catalog:" .
+    local $ENV{XML_CATALOG_FILES} = "/etc/xml/catalog " .
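
To illustrate why the separator matters, here is a small sketch that mimics splitting the variable on whitespace, which is how the libxml2 documentation describes XML_CATALOG_FILES (a space-separated list). count_catalog_entries is a hypothetical helper, and the second path is the catalog named in the description; the exact expression used in "check" is not shown in this report, so treat both values as assumptions.

```shell
#!/bin/sh
# count_catalog_entries is a hypothetical helper, not part of libxml2.
# It prints how many entries a whitespace-splitting reader would see.
count_catalog_entries() {
    # Unquoted expansion splits on the default IFS (space, tab, newline).
    # The paths used here contain no glob characters.
    set -- $1
    echo "$#"
}

old="/etc/xml/catalog:/usr/share/sgml/w3c-markup-validator/catalog.xml"
new="/etc/xml/catalog /usr/share/sgml/w3c-markup-validator/catalog.xml"

count_catalog_entries "$old"   # prints 1: the colon is not treated as a separator
count_catalog_entries "$new"   # prints 2: both catalogs are seen
```

With the colon, the whole string would be read as a single, nonexistent catalog file, so neither catalog would be consulted and the DTD would be fetched from its public URL.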

Comment 2 David Dick 2010-11-27 21:48:12 UTC
Excellent! Problem solved! Thanks a lot

Comment 3 Fedora Update System 2010-11-28 10:29:23 UTC
w3c-markup-validator-1.1-2.fc14 has been submitted as an update for Fedora 14.
https://admin.fedoraproject.org/updates/w3c-markup-validator-1.1-2.fc14

Comment 4 Fedora Update System 2010-11-28 20:41:02 UTC
w3c-markup-validator-1.1-2.fc14 has been pushed to the Fedora 14 testing repository.  If problems still persist, please make note of it in this bug report.
If you want to test the update, you can install it with
su -c 'yum --enablerepo=updates-testing update w3c-markup-validator'
You can provide feedback for this update here: https://admin.fedoraproject.org/updates/w3c-markup-validator-1.1-2.fc14

Comment 5 Fedora Update System 2010-12-06 19:58:15 UTC
w3c-markup-validator-1.1-2.fc14 has been pushed to the Fedora 14 stable repository.  If problems still persist, please make note of it in this bug report.