Bug 1537853 - libtidy-5.6.0-2.fc27.x86_64 warnings and breakage/output-changes
Keywords:
Status: CLOSED EOL
Alias: None
Product: Fedora
Classification: Fedora
Component: tidy
Version: 29
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ---
Assignee: Rex Dieter
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-01-24 01:39 UTC by Harald Reindl
Modified: 2019-11-27 22:23 UTC

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2019-11-27 22:23:46 UTC
Type: Bug
Embargoed:


Attachments
test.php to see the difference between 5.4 and 5.6 (7.81 KB, application/octet-stream)
2018-12-04 10:12 UTC, Harald Reindl

Description Harald Reindl 2018-01-24 01:39:26 UTC Comment hidden (abuse)
Comment 1 Harald Reindl 2018-01-24 01:46:32 UTC Comment hidden (abuse)
Comment 2 Rex Dieter 2018-01-24 12:45:43 UTC
I can't seem to reproduce it with a minimal test case; I'll try with the same options you're using now.

fwiw, it looks like drop-font-tags was indeed removed recently. Previously the docs contained this:

-        TidyDropFontTags,             0,
-        "Deprecated; <em>do not use</em>. This option is destructive to "
-        "<code>&lt;font&gt;</code> tags, and it will be removed from future "
-        "versions of Tidy. Use the <code>clean</code> option instead. "
-        "<br/>"

Comment 3 Harald Reindl 2018-01-24 12:52:36 UTC Comment hidden (abuse)
Comment 4 Rex Dieter 2018-01-24 13:30:57 UTC
Ah, you're using php-tidy here?  Silly me, I'll try that (and triage to that component for now, while we're investigating)

In the meantime, I've revoked the tidy bodhi updates until we can get to the bottom of this.

Comment 5 Harald Reindl 2018-01-24 13:38:10 UTC Comment hidden (abuse)
Comment 6 Rex Dieter 2018-01-24 13:56:28 UTC
So is
-----
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"><html><head><meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"><title>Tidy</title></head><body><a href="bla test">Test</a></body></html>


<body>
<a href="bla%20test">Test</a>
</body>
------

the expected or wrong output from test.php?


I'll reassign this back to tidy then if you're not using fedora's php-tidy

Comment 7 Rex Dieter 2018-01-24 13:57:57 UTC
Because I'm not sure if this counts as reproducing your issue or not.


$ rpm -q  libtidy php-tidy
libtidy-5.4.0-3.fc27.x86_64
php-tidy-7.1.13-1.fc27.x86_64

$ php test.php 
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"><html><head><meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"><title>Tidy</title></head><body><a href="bla test">Test</a></body></html>


<body>
<a href="bla%20test">Test</a>
</body>

Comment 8 Rex Dieter 2018-01-24 13:58:31 UTC
arg, sorry, somewhere my test env downgraded libtidy

Comment 9 Rex Dieter 2018-01-24 14:01:04 UTC
OK, now I'm getting slightly different results,


$ php test.php 
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"><html><head><meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"><title>Tidy</title></head><body><a href="bla test">Test</a></body></html>


<body>

<a href="bla%20test">
Test</a>
</body>

Comment 10 Harald Reindl 2018-01-24 14:04:19 UTC Comment hidden (abuse)
Comment 11 Rex Dieter 2018-01-24 14:11:06 UTC
OK, so we're each clearly getting different results.

expected:
<body>
<a href="bla%20test">Test</a>
</body>

you:
<body>

<a href="bla">
</a>
</body>

me:
<body>

<a href="bla%20test">
Test</a>
</body>

Comment 12 Harald Reindl 2018-01-24 14:13:31 UTC
that makes me worry even more about the new tidy build, because all the autotests were written more than a year ago and their results never changed across F24/F25/F26, whatever the libtidy build, nor with PHP 7.0, 7.1 and 7.2

Comment 13 Rex Dieter 2018-01-24 14:13:45 UTC
Sorry for the bz spam, but this is interesting:

if I remove all the custom config options, I now get output:
<body>
<a href="bla%20test">Test</a>
</body>

win!  So, going to bisect through those configs to see which one or set of them helps trigger it.

Comment 14 Rex Dieter 2018-01-24 14:18:35 UTC
The trigger for me is
  'wrap'                        => 0,

with that present I get:
<body>

<a href="bla%20test">
Test</a>
</body>


without it:
<body>
<a href="bla%20test">Test</a>
</body>
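
The search through the config options mentioned in comments 13 and 14 can be sketched in PHP roughly as follows. This is a leave-one-out search rather than a true bisection, and it is only an illustration: find_trigger_options and tidy_renderer are hypothetical names that do not appear in test.php, and the 'latin1' encoding argument is an assumption borrowed from the call in comment 26.

```php
<?php
/**
 * Hypothetical helper: return the names of the options whose removal
 * makes $render($config) produce $expected.
 *
 * $render is any callable that takes a config array and returns a string.
 */
function find_trigger_options(array $config, callable $render, string $expected): array
{
    $triggers = [];
    foreach (array_keys($config) as $name) {
        $reduced = $config;
        unset($reduced[$name]);           // drop exactly one option
        if ($render($reduced) === $expected) {
            $triggers[] = $name;          // output is back to the expected form
        }
    }
    return $triggers;
}

/**
 * Hypothetical renderer backed by php-tidy (needs the tidy extension
 * when the returned closure is actually called).
 */
function tidy_renderer(string $html): callable
{
    return function (array $config) use ($html): string {
        $tidy = tidy_parse_string($html, $config, 'latin1');
        if ($tidy === false) {
            return '';                    // parsing failed, nothing to compare
        }
        tidy_clean_repair($tidy);
        return (string) tidy_get_output($tidy);
    };
}
```

With the config array from test.php, find_trigger_options($config, tidy_renderer($html), $expected) would presumably report 'wrap' on an affected system, going by the result in this comment; that has not been verified here.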

Comment 15 Harald Reindl 2018-01-24 14:23:32 UTC
interesting - now it no longer removes the link text and also doesn't break the href, but it still inserts crazy line breaks - at least both our results are identical now

most of the custom options are there to make sure that no wrapping happens - however, i need identical output, because rewriting the hashes of the autotest results doesn't scale as a way to verify before/after an upgrade of Fedora/libtidy/PHP/PHP code, and it brings the danger that you enshrine a wrong result in the new hashes and things stay broken unnoticed until customers complain, which is exactly the opposite of what the tests are written for

[harry@srv-rhsoft:/downloads]$ ls
total 232K
-rw-r----- 1 harry verwaltung 1,9K 2018-01-24 15:15 test.php
-rw-r----- 1 harry verwaltung 226K 2018-01-24 15:14 libtidy-5.6.0-2.fc27.x86_64.rpm
______________________________________

[harry@srv-rhsoft:/downloads]$ php test.php
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"><html><head><meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"><title>Tidy</title></head><body><a href="bla test">Test</a></body></html>


<body>
<a href="bla%20test">Test</a>
</body>
______________________________________

[harry@srv-rhsoft:/downloads]$ sudo rpm -Uvh libtidy-5.6.0-2.fc27.x86_64.rpm
______________________________________

[harry@srv-rhsoft:/downloads]$ php test.php
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"><html><head><meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"><title>Tidy</title></head><body><a href="bla test">Test</a></body></html>


<body>

<a href="bla%20test">
Test</a>
</body>

Comment 16 Harald Reindl 2018-01-24 14:25:28 UTC
'wrap' => 0

confirmed - setting it to 0 or 1 has the same effect -> wrapping is enabled
that's not how on/off options are supposed to work :-)

Comment 17 Harald Reindl 2018-01-24 14:30:44 UTC
i correct myself - wrap is an integer option which defaults to 68, and the value 0 is supposed to disable auto-wrapping altogether

http://tidy.sourceforge.net/docs/quickref.html
wrap 	Integer 	68

Comment 18 Harald Reindl 2018-01-24 14:33:26 UTC
but even if i remove the wrap option, way too many tests break
i'll try to analyze what's different in the follow-up cases, stay tuned

FAILED: NOT ALL TESTS PASSED
[24-Jan-2018 15:31:47 Europe/Vienna] PHP Notice:  check failed in /mnt/data/www/thelounge.net/contentlounge/cms/autotests/api_data.php on line 152
[24-Jan-2018 15:31:47 Europe/Vienna] PHP Notice:  check failed in /mnt/data/www/thelounge.net/contentlounge/cms/autotests/api_data.php on line 159
[24-Jan-2018 15:31:47 Europe/Vienna] PHP Notice:  check failed in /mnt/data/www/thelounge.net/contentlounge/cms/autotests/api_data.php on line 166
[24-Jan-2018 15:31:47 Europe/Vienna] PHP Notice:  check failed in /mnt/data/www/thelounge.net/contentlounge/cms/autotests/api_data.php on line 173
[24-Jan-2018 15:31:47 Europe/Vienna] PHP Notice:  check failed in /mnt/data/www/thelounge.net/contentlounge/cms/autotests/api_data.php on line 180
[24-Jan-2018 15:31:47 Europe/Vienna] PHP Notice:  check failed in /mnt/data/www/thelounge.net/contentlounge/cms/autotests/api_data.php on line 187
[24-Jan-2018 15:31:47 Europe/Vienna] PHP Notice:  check failed in /mnt/data/www/thelounge.net/contentlounge/cms/autotests/api_data.php on line 194
[24-Jan-2018 15:31:47 Europe/Vienna] PHP Notice:  cl_api->data->test() failed in /mnt/data/www/thelounge.net/contentlounge/cms/autotest.php on line 351
[24-Jan-2018 15:31:47 Europe/Vienna] PHP Notice:  cl_api->data->test(): Notice: check failed in /mnt/data/www/thelounge.net/contentlounge/cms/autotests/api_data.php on line 152

Notice: check failed in /mnt/data/www/thelounge.net/contentlounge/cms/autotests/api_data.php on line 159

Notice: check failed in /mnt/data/www/thelounge.net/contentlounge/cms/autotests/api_data.php on line 166

Notice: check failed in /mnt/data/www/thelounge.net/contentlounge/cms/autotests/api_data.php on line 173

Notice: check failed in /mnt/data/www/thelounge.net/contentlounge/cms/autotests/api_data.php on line 180

Notice: check failed in /mnt/data/www/thelounge.net/contentlounge/cms/autotests/api_data.php on line 187

Notice: check failed in /mnt/data/www/thelounge.net/contentlounge/cms/autotests/api_data.php on line 194 in /mnt/data/www/thelounge.net/contentlounge/cms/autotest.php on line 370
[24-Jan-2018 15:31:48 Europe/Vienna] PHP Notice:  output hash: 9701bc28129c1a0e7e82c5c996a74765 in /mnt/data/www/thelounge.net/contentlounge/cms/autotests/api_youtube.php on line 101
[24-Jan-2018 15:31:48 Europe/Vienna] PHP Notice:  cl_api->youtube->test() failed in /mnt/data/www/thelounge.net/contentlounge/cms/autotest.php on line 351
[24-Jan-2018 15:31:48 Europe/Vienna] PHP Notice:  cl_api->youtube->test(): Notice: output hash: 9701bc28129c1a0e7e82c5c996a74765 in /mnt/data/www/thelounge.net/contentlounge/cms/autotests/api_youtube.php on line 101 in /mnt/data/www/thelounge.net/contentlounge/cms/autotest.php on line 370
[24-Jan-2018 15:31:49 Europe/Vienna] PHP Notice:  'bewerten.php' checksum mismatch: cf2182722f2e8a61acf1b5581f212376 in /mnt/data/www/thelounge.net/contentlounge/cms/autotests/api_fotos.php on line 51
[24-Jan-2018 15:31:49 Europe/Vienna] PHP Notice:  cl_api->fotos->test() failed in /mnt/data/www/thelounge.net/contentlounge/cms/autotest.php on line 351
[24-Jan-2018 15:31:49 Europe/Vienna] PHP Notice:  cl_api->fotos->test(): Notice: 'bewerten.php' checksum mismatch: cf2182722f2e8a61acf1b5581f212376 in /mnt/data/www/thelounge.net/contentlounge/cms/autotests/api_fotos.php on line 51 in /mnt/data/www/thelounge.net/contentlounge/cms/autotest.php on line 370
[24-Jan-2018 15:31:50 Europe/Vienna] PHP Notice:  check faild: d1583ec0597919a0e6f4e53a081ec65c in /mnt/data/www/thelounge.net/contentlounge/cms/autotests/custom/tidy.php on line 40
[24-Jan-2018 15:31:50 Europe/Vienna] PHP Notice:  check faild: d24925f3d31c1fb7e6c2a9fe69fed1d8 in /mnt/data/www/thelounge.net/contentlounge/cms/autotests/custom/tidy.php on line 49
[24-Jan-2018 15:31:50 Europe/Vienna] PHP Notice:  CUSTOM-TEST: tidy.php
Notice: check faild: d1583ec0597919a0e6f4e53a081ec65c in /mnt/data/www/thelounge.net/contentlounge/cms/autotests/custom/tidy.php on line 40

Notice: check faild: d24925f3d31c1fb7e6c2a9fe69fed1d8 in /mnt/data/www/thelounge.net/contentlounge/cms/autotests/custom/tidy.php on line 49 in /mnt/data/www/thelounge.net/contentlounge/cms/autotests/suite/custom.php on line 55

Comment 19 Harald Reindl 2018-01-24 14:45:09 UTC
well, without the wrap option the default setting of 68 hits - the wrap option is seriously broken, and until that's fixed it makes no sense to test anything else, because it's hard to focus when you need to check byte-for-byte whether the result makes any sense and it's "just" a wrapping issue

the "href" is supposed to be in the same line as the closing </a> instead of a wrap before it

$tidy_text = "<!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.01 Transitional//EN\" \"http://www.w3.org/TR/html4/loose.dtd\"><html><head><meta http-equiv=\"Content-Type\" content=\"text/html; charset=ISO-8859-1\"><title>Tidy</title></head><body><script>self.location.href='test.php?a=1&b=2';</script><a class=\"test1\"class='test2'style=\"padding:10px;\" style='margin:10px;'href=\"bla test\">Test</a></body></html>\n";
$cleaned = cl_autotest_tidy($tidy_text);
exit($cleaned);

<body>
<script>
self.location.href='test.php?a=1&b=2';
</script><a class="test1 test2" style="padding:10px; margin:10px;"
href="bla%20test">Test</a>
</body>
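
To tell quickly whether a failed check is "just" a wrapping issue or a real content change, the expected and actual output can be compared after normalizing whitespace. A minimal sketch, assuming that whitespace next to tag boundaries is insignificant (which does not hold inside <pre> and similar elements); classify_mismatch is a hypothetical helper, not part of the autotest suite:

```php
<?php
/**
 * Hypothetical helper: classify the difference between two tidy outputs.
 * Returns 'identical', 'whitespace-only' or 'content'.
 */
function classify_mismatch(string $expected, string $actual): string
{
    if ($expected === $actual) {
        return 'identical';
    }

    // Crude normalization: collapse whitespace runs to one space,
    // then drop whitespace directly after '>' and directly before '<'.
    $normalize = function (string $s): string {
        $s = (string) preg_replace(['/\s+/', '/>\s+/', '/\s+</'], [' ', '>', '<'], $s);
        return trim($s);
    };

    return $normalize($expected) === $normalize($actual)
        ? 'whitespace-only'
        : 'content';
}
```

Applied to the three outputs listed in comment 11, this classifies the "me" result as whitespace-only and the "you" result as a content difference when each is compared against the expected output. It is a diagnostic aid only and does not replace the byte-for-byte hash checks.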

Comment 20 Rex Dieter 2018-01-24 14:49:19 UTC
agreed.

Interestingly, I cannot seem to reproduce the problem with the 'tidy' command line client using the same options either, so it would appear this is specific to php-tidy somehow.

I guess I'm going to have to backport the CVE fixes to 5.4.0 at least in the short-term.

Comment 21 Ben Cotton 2018-11-27 17:22:40 UTC
This message is a reminder that Fedora 27 is nearing its end of life.
On 2018-Nov-30  Fedora will stop maintaining and issuing updates for
Fedora 27. It is Fedora's policy to close all bug reports from releases
that are no longer maintained. At that time this bug will be closed as
EOL if it remains open with a Fedora  'version' of '27'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 27 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora 
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 22 Harald Reindl 2018-12-04 10:12:39 UTC
Created attachment 1511243
test.php to see the difference between 5.4 and 5.6

* test.php contains base64 content and the expected result
* libtidy-5.4.htm is the output that stayed constant for years, even before tidy5
* libtidy-5.6.htm is the current output
* libtidy-54-56.diff is a diff between both outputs

you simply can't write autotests with such huge changes

where the 3.3333333 from https://bugzilla.redhat.com/show_bug.cgi?id=1609520 comes from is currently unclear; it seems to be the result of other content manipulation within the php application, a side effect triggered by the tidy 5.6 output

Comment 23 Rex Dieter 2018-12-05 15:11:50 UTC
I'd encourage you to engage tidy upstream if possible.  That will likely yield the best and quickest results.

Comment 24 Harald Reindl 2018-12-05 15:15:25 UTC
https://github.com/htacg/tidy-html5/issues/780#issuecomment-444219573

i thought that's what distributions are for: not having to deal with each and every upstream....

Comment 25 Rex Dieter 2018-12-05 16:24:26 UTC
thanks

Comment 26 Harald Reindl 2018-12-10 16:56:57 UTC
https://github.com/htacg/tidy-html5/issues/780#issuecomment-445885150
https://bugs.php.net/bug.php?id=77278

libtidy 5.6 is touching LOCALE somewhere without a proper reset - period
look at the comma instead of a dot after typecasting a float value

libtidy-5.6.0-2.fc28.x86_64.rpm
[harry@srv-rhsoft:~]$ php /downloads/tidy-debug.php
PAGES-SUMMARY: 3,3333333333333
PAGES-SUMMARY: 3,3333333333333
CORRUPTION!

libtidy-5.4.0-4.fc28.20181003.rh.x86_64.rpm
[harry@srv-rhsoft:~]$ php /downloads/tidy-debug.php
PAGES-SUMMARY: 3.3333333333333
PAGES-SUMMARY: 4
OK

/** comment out this line and everything is fine with libtidy-5.6.0-2.fc28.x86_64 too */
$tidy = tidy_parse_string('bla', [], 'latin1');

/** the code below is completely unrelated to tidy, and tidy must not change its behavior */
$conn = mysqli_init();
mysqli_real_connect($conn, $host, $user, $pwd, $db);
$result = mysqli_query($conn, "select SQL_CALC_FOUND_ROWS * from cl_autotest_youtube_items where yi_cid='1' and yi_aktiv='1' order by yi_sort asc limit 0, 3");
$count_summary = mysqli_fetch_row(mysqli_query($conn, 'select SQL_NO_CACHE found_rows()'))[0];
/** the float-to-string cast below is where the decimal comma shows up */
$pages_summary = (string)($count_summary / 3);
echo "PAGES-SUMMARY: $pages_summary\n";
/** round up to the next full page */
if($pages_summary > (int)$pages_summary)
{
    $pages_summary = (int)$pages_summary + 1;
}
echo "PAGES-SUMMARY: $pages_summary\n";
if((int)$pages_summary !== 4)
{
    echo "CORRUPTION!\n";
}
else
{
    echo "OK\n";
}
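
One way to check the locale claim directly is to query LC_NUMERIC before and after the tidy call, and to compute the page count in integer arithmetic so that no float-to-string cast is involved. A minimal sketch; numeric_locale_delta and page_count are hypothetical helpers that are not part of tidy-debug.php, and the 10 rows used in the note below are an assumption inferred from 3.333... × 3:

```php
<?php
/**
 * Hypothetical helper: run $fn and report whether it changed LC_NUMERIC.
 * Passing "0" to setlocale() only queries the current setting.
 */
function numeric_locale_delta(callable $fn): array
{
    $before = setlocale(LC_NUMERIC, "0");
    $fn();
    $after = setlocale(LC_NUMERIC, "0");

    return [
        'before'  => $before,
        'after'   => $after,
        'changed' => $before !== $after,
    ];
}

/**
 * Hypothetical helper: number of pages needed for $rows rows, rounded up.
 * Uses integer arithmetic only, so the decimal separator never matters.
 */
function page_count(int $rows, int $perPage): int
{
    if ($perPage <= 0) {
        throw new InvalidArgumentException('perPage must be positive');
    }
    return intdiv($rows + $perPage - 1, $perPage);
}
```

Calling numeric_locale_delta(function () { tidy_parse_string('bla', [], 'latin1'); }) would show whether the tidy call alters LC_NUMERIC on a given system. page_count(10, 3) yields 4 regardless of the locale, which sidesteps the symptom but not the underlying locale change.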

Comment 27 Ben Cotton 2019-05-02 20:09:08 UTC
This message is a reminder that Fedora 28 is nearing its end of life.
On 2019-May-28 Fedora will stop maintaining and issuing updates for
Fedora 28. It is Fedora's policy to close all bug reports from releases
that are no longer maintained. At that time this bug will be closed as
EOL if it remains open with a Fedora 'version' of '28'.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 28 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora 
version before this bug is closed as described in the policy above.

Comment 28 Ben Cotton 2019-10-31 19:26:28 UTC
This message is a reminder that Fedora 29 is nearing its end of life.
Fedora will stop maintaining and issuing updates for Fedora 29 on 2019-11-26.
It is Fedora's policy to close all bug reports from releases that are no longer
maintained. At that time this bug will be closed as EOL if it remains open with a
Fedora 'version' of '29'.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 29 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora 
version before this bug is closed as described in the policy above.

Comment 29 Ben Cotton 2019-11-27 22:23:46 UTC
Fedora 29 changed to end-of-life (EOL) status on 2019-11-26. Fedora 29 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

