Bug 123142 - Bogus UTF-8 char data can be input
Status: CLOSED CURRENTRELEASE
Product: Bugzilla
Classification: Community
Component: Bugzilla General
Version: 3.2
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Assigned To: PnT DevOps Devs
David Lawrence
https://bugzilla.redhat.com/bugzilla/...
Keywords: i18n
Depends On:
Blocks:
Reported: 2004-05-12 18:37 EDT by Toshiyuki Takamiya
Modified: 2015-06-01 21:18 EDT (History)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2008-11-04 16:53:26 EST


Attachments
Perl subroutines to check for good UTF-8/chars (3.05 KB, text/plain)
2004-05-12 18:39 EDT, Eido Inoue
this one has an improved not_a_char subroutine (2.16 KB, text/plain)
2004-05-12 18:45 EDT, Eido Inoue

Description Eido Inoue 2004-05-12 18:37:45 EDT
Description of problem:
Misbehaving browsers can submit non-UTF-8 data in text fields

How reproducible:
always

Steps to Reproduce:
1. ask warren to do what we did for bug 122992 :)

Actual results:
Some non-UTF-8 in the description

Expected results:
Bugzilla should not allow bad character data, and the database should
be cleaned of any non-utf8 text

Additional info:

Additionally, after the data has been checked for well-formed UTF-8,
the checked string should be run through NFKC($string) from the
Unicode::Normalize module, so that most browsers have a chance of
displaying the text should some valid-but-wacko Unicode make it
through. This may be computationally expensive, though.
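
The check-then-normalize step proposed above could look roughly like the following. This is a minimal sketch, not Bugzilla's code and not the code from the attachments: the helper name clean_field is hypothetical, and it assumes the field arrives as raw bytes. It relies only on the core Encode and Unicode::Normalize modules.

```perl
use strict;
use warnings;
use Encode ();
use Unicode::Normalize qw(NFKC);

# Hypothetical helper (name is an assumption, not from Bugzilla or the
# attachments). Takes raw bytes from a form field, returns the decoded,
# NFKC-normalized string, or undef if the bytes are not well-formed UTF-8.
sub clean_field {
    my ($bytes) = @_;

    # 'UTF-8' (with the hyphen) selects Encode's strict decoder.
    # FB_CROAK makes decode() die on malformed input; LEAVE_SRC keeps
    # $bytes unmodified.
    my $text = eval {
        Encode::decode('UTF-8', $bytes,
            Encode::FB_CROAK() | Encode::LEAVE_SRC());
    };
    return unless defined $text;

    # Compatibility decomposition followed by canonical composition.
    return NFKC($text);
}
```

For example, clean_field("\xEF\xAC\x81") (the UTF-8 bytes of the ligature U+FB01) returns the two-character string "fi", while clean_field("\xFF\xFE") returns undef because the bytes are not valid UTF-8. Note that NFKC changes the stored text, so whether that is acceptable for bug descriptions is a separate decision.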
Comment 1 Eido Inoue 2004-05-12 18:39:12 EDT
Created attachment 100201
Perl subroutines to check for good UTF-8/chars

Simple Perl subroutine that checks for good UTF-8 and does a very simple sanity
check on the Unicode (no noncharacter code points, as of Unicode 4.0).
Comment 2 Eido Inoue 2004-05-12 18:45:24 EDT
Created attachment 100203
this one has an improved not_a_char subroutine

Slightly shorter and faster not_a_char function. It also checks for chars >
U+10FFFF, which are obsoleted as of Unicode 3.0.
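
The attachment itself is not reproduced here, so the following is only an illustrative sketch of what a not_a_char check of this kind might look like, not the attached code. It assumes the input is a numeric code point; has_bad_char is a hypothetical companion that scans an already decoded string.

```perl
use strict;
use warnings;

# Illustrative sketch only, not the code from attachment 100203.
# Returns 1 if the code point is a Unicode noncharacter or lies beyond
# U+10FFFF, and 0 otherwise.
sub not_a_char {
    my ($cp) = @_;
    return 1 if $cp > 0x10FFFF;                  # outside the Unicode code space
    return 1 if $cp >= 0xFDD0 && $cp <= 0xFDEF;  # contiguous noncharacter block
    return 1 if ($cp & 0xFFFE) == 0xFFFE;        # U+xxFFFE and U+xxFFFF in every plane
    return 0;
}

# Hypothetical helper: true if any character in a decoded string fails
# the check above.
sub has_bad_char {
    my ($text) = @_;
    for my $ch (split //, $text) {
        return 1 if not_a_char(ord $ch);
    }
    return 0;
}
```

The mask test covers the last two code points of all 17 planes in one comparison, which is one way a check like this can stay short.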
Comment 3 David Lawrence 2006-04-08 14:03:35 EDT
Red Hat's current Bugzilla version is 2.18. I am moving all older open bugs to
this version. Any bugs filed against older versions will need to be re-verified
to confirm that they are still bugs. This will also help me sort them better.
Comment 5 David Lawrence 2008-09-16 12:50:56 EDT
Red Hat Bugzilla is now using version 3.2 of the Bugzilla codebase, so this bug will need to be re-verified against the new release. With the updated code, this bug may no longer be relevant or may already have been fixed.
Updating bug version to 3.2.
