Bug 123142 - Bogus UTF-8 char data can be input
Summary: Bogus UTF-8 char data can be input
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Bugzilla
Classification: Community
Component: Bugzilla General
Version: 3.2
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: PnT DevOps Devs
QA Contact: David Lawrence
URL: https://bugzilla.redhat.com/bugzilla/...
Whiteboard:
Depends On:
Blocks:
Reported: 2004-05-12 22:37 UTC by Toshiyuki Takamiya
Modified: 2015-06-02 01:18 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2008-11-04 21:53:26 UTC
Embargoed:


Attachments:
Perl subroutines to check for good UTF-8/chars (3.05 KB, text/plain)
2004-05-12 22:39 UTC, Eido Inoue
this one has an improved not_a_char subroutine (2.16 KB, text/plain)
2004-05-12 22:45 UTC, Eido Inoue

Description Eido Inoue 2004-05-12 22:37:45 UTC
Description of problem:
Misbehaving browsers can submit non-UTF-8 data in text fields

How reproducible:
always

Steps to Reproduce:
1. ask warren to do what we did for bug 122992 :)

Actual results:
Some non-UTF-8 data ends up in the bug description

Expected results:
Bugzilla should reject bad character data, and the database should
be cleaned of any existing non-UTF-8 text

Additional info:

Additionally, after the data has been checked for UTF-8 validity, the
checked string should be run through the NFKC($string) function in the
Unicode::Normalize module, although this may be computationally
expensive. Normalizing gives most browsers a chance of displaying the
text should some valid-but-wacko Unicode make it through.
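
The normalization step suggested above can be sketched as follows. This is a Python illustration, not the Perl proposed in this bug: `unicodedata.normalize("NFKC", ...)` stands in for `NFKC($string)` from Unicode::Normalize, and the function name `normalize_checked` is hypothetical. It assumes the input has already been decoded and checked for UTF-8 validity.

```python
import unicodedata


def normalize_checked(text: str) -> str:
    """Apply NFKC to text that has already passed the UTF-8 check.

    NFKC replaces compatibility forms (ligatures, fullwidth letters and
    so on) with their ordinary equivalents and then composes the result.
    """
    return unicodedata.normalize("NFKC", text)


if __name__ == "__main__":
    # U+FB01 (LATIN SMALL LIGATURE FI) and U+FF21 (FULLWIDTH LATIN
    # CAPITAL LETTER A) are valid Unicode, but some fonts lack them.
    print(normalize_checked("\ufb01le"))
    print(normalize_checked("\uff21BC"))
```

Because NFKC folds compatibility characters, it changes the stored text, so it may be worth applying only to free-text fields where exact round-tripping of the original code points is not required.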

Comment 1 Eido Inoue 2004-05-12 22:39:12 UTC
Created attachment 100201
Perl subroutines to check for good UTF-8/chars

Simple Perl subroutine that checks for well-formed UTF-8 and does a very simple
sanity check on the Unicode (no noncharacter code points, as of Unicode 4.0).
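
The attachment itself is not reproduced in this report. As a rough illustration of the well-formedness half of that check, here is a minimal Python sketch (not the attached Perl). The function name `is_valid_utf8` is hypothetical, and the sketch relies on Python 3's strict UTF-8 decoder to reject malformed input.

```python
def is_valid_utf8(raw: bytes) -> bool:
    """Return True only if raw is well-formed UTF-8.

    Python 3's strict decoder rejects stray continuation bytes,
    truncated sequences, overlong encodings, encoded surrogates and
    sequences that would decode above U+10FFFF.
    """
    try:
        raw.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True


if __name__ == "__main__":
    samples = {
        "UTF-8 text": "日本語".encode("utf-8"),
        "Shift_JIS text": b"\x93\xfa\x96\x7b",
        "overlong '/'": b"\xc0\xaf",
        "truncated sequence": b"\xe3\x81",
    }
    for label, data in samples.items():
        print(label, is_valid_utf8(data))
```

A form handler could call such a check on every submitted field and refuse the request when it fails, rather than storing the bytes as-is.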

Comment 2 Eido Inoue 2004-05-12 22:45:24 UTC
Created attachment 100203
this one has an improved not_a_char subroutine

Slightly shorter and faster not_a_char function. It also checks for chars >
U+10FFFF, which are obsolete as of Unicode 3.0.
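
For readers without access to the attachment, a function of this kind might look like the sketch below. It is written in Python rather than Perl and is not the attached code. It borrows the name `not_a_char` from the comment above, while `has_noncharacter` is a hypothetical helper. The ranges follow the Unicode definition of the 66 noncharacters: U+FDD0 to U+FDEF plus the last two code points of each of the 17 planes.

```python
def not_a_char(cp: int) -> bool:
    """Return True if cp is out of range or a Unicode noncharacter."""
    # Anything outside U+0000..U+10FFFF is not a Unicode code point.
    if cp < 0 or cp > 0x10FFFF:
        return True
    # The contiguous noncharacter block U+FDD0..U+FDEF.
    if 0xFDD0 <= cp <= 0xFDEF:
        return True
    # U+xxFFFE and U+xxFFFF at the end of every plane.
    return (cp & 0xFFFE) == 0xFFFE


def has_noncharacter(text: str) -> bool:
    """Return True if any code point in text fails not_a_char."""
    return any(not_a_char(ord(ch)) for ch in text)


if __name__ == "__main__":
    print(not_a_char(0x41))
    print(not_a_char(0xFFFE))
    print(not_a_char(0x110000))
    print(has_noncharacter("ok\ufdd0"))
```

The bit mask keeps the per-plane test to a single comparison, which is one way to get the "shorter and faster" shape the comment describes.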

Comment 3 David Lawrence 2006-04-08 18:03:35 UTC
Red Hat's current Bugzilla version is 2.18. I am moving all older open bugs to
this version. Any bugs filed against older versions will need to be re-verified
as still being bugs. This will also help me sort them better.

Comment 5 David Lawrence 2008-09-16 16:50:56 UTC
Red Hat Bugzilla is now using version 3.2 of the Bugzilla codebase and therefore this bug will need to be re-verified against the new release. With the updated code this bug may no longer be relevant or may have been fixed in the new code.
Updating bug version to 3.2.

