Bug 104203

Summary: problem when writing french characters to mysql database
Product: [Retired] Red Hat Linux Reporter: Elie De Brauwer <elie>
Component: phpAssignee: Joe Orton <jorton>
Status: CLOSED NOTABUG QA Contact: David Lawrence <dkl>
Severity: medium Docs Contact:
Priority: medium    
Version: 9CC: elie
Target Milestone: ---   
Target Release: ---   
Hardware: i686   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2003-09-12 10:47:18 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Elie De Brauwer 2003-09-11 07:38:29 UTC
Ok, so I have the following situation,  a mysql is running:

mysql> status;
--------------
mysql  Ver 11.18 Distrib 3.23.54, for redhat-linux-gnu (i386)

Connection id:          5554
Current database:       <snip>
Current user:           <snip>
Current pager:          stdout
Using outfile:          ''
Server version:         3.23.54
Protocol version:       10
Connection:             Localhost via UNIX socket
Client characterset:    latin1
Server characterset:    latin1
UNIX socket:            /var/lib/mysql/mysql.sock
Uptime:                 3 days 17 min 41 sec


Suppose we have this table: 

mysql> describe response_text;
+-------------+------------------+------+-----+---------+-------+
| Field       | Type             | Null | Key | Default | Extra |
+-------------+------------------+------+-----+---------+-------+
| response_id | int(10) unsigned |      | PRI | 0       |       |
| question_id | int(10) unsigned |      | PRI | 0       |       |
| response    | text             | YES  |     | NULL    |       |
+-------------+------------------+------+-----+---------+-------+

And we insert a bunch of french characters:

mysql> insert into response_text (response_id,question_id,response) values
(9999,9999,"éèàâïôê");

Now we read it back:
mysql> select response from response_text where question_id=9999;
+----------+
| response |
+----------+
| éèàâïôê  |
+----------+
1 row in set (0.00 sec)

And it works prefectly, now there is also installed the following:
$ rpm -qa | grep php
php-mysql-4.2.2-17.2
php-4.2.2-17.2
$ rpm -qa | grep httpd
httpd-2.0.40-21.3

Everything is working properly except when i create a php file that contains 
the following code:
<?
        $dblink = mysql_connect(<snip><snip><snip>);
        mysql_select_db("<snip>");
        mysql_query("insert INTO response_text  
        response_id,question_id,response) "
        . "VALUES(99999,99999,\"" . $_POST['test'] . "\")");

        echo "<br>Now i'm reading it back from the database:<br>\n ";

        $res = mysql_query("select * from response_text where 
        question_id=99999");
        while($row = mysql_fetch_array($res)){
                var_dump($row);
                echo "<br>";
        }
        mysql_free_result($res);

//      mysql_query("delete from response_text where question_id=99999");

        mysql_close($dblink);
?>

Now what does this script do ? It simply puts the value in $_POST['test'] into 
the database, reads it and deletes it (i commented the delete out).
Now, when i enter the string with the accents, it goes perfectly and the output 
is:

Now i'm reading it back from the database:
array(6) { [0]=> string(5) "99999" ["response_id"]=> string(5) "99999" [1]=> 
string(5) "99999" ["question_id"]=> string(5) "99999" [2]=> string
(15) "éèàâïôê " ["response"]=> string(15) "éèàâïôê " } 

Exactly as we suspected, but notice that it isn't deleted and it is still in 
the database when i go to look into the database: 
mysql> select response from response_text where question_id=99999;
+-----------------+
| response        |
+-----------------+
| éèà âïôê  |
+-----------------+
1 row in set (0.00 sec)

It dumps this gibberish. 
A php program (phpESP) which does a query to the database and sends a mail 
containing the contents of the database also contains these gibberish 
characters (which means the conversion fails from mysql->php sometimes).

This only applies to charactes containing accents. Normal characters work fine.


I tried googling, php.net bug database and bugzilla but couldn't come up with 
an answer.
I also tried to reproduce it on my debian unstable box which failed. (debian 
box was running mysql 4.0.14, php 4.3.2)

Comment 1 Joe Orton 2003-09-11 08:27:45 UTC
There's no error handling here at all, can you add

  or die("mysql_blah failed " . mysql_error());

after each mysql_blah statement?

Comment 2 Elie De Brauwer 2003-09-11 09:04:46 UTC
The code is ran on a production machine where mysql/php/apache work so the or 
die's won't change much but anyhow here's the code:

        $dblink = mysql_connect(SNIP) or die ("foo");
        mysql_select_db(SNIP) or die("bar");
        mysql_query("insert INTO response_text 
(response_id,question_id,response) "
        . "VALUES(99999,99999,\"" . $_POST['test'] . "\")") or die("this");
        echo "<br>Now i'm reading it back from the database:<br>\n ";
        $res = mysql_query("select * from response_text where 
question_id=99999")or die("is");
        while($row = mysql_fetch_array($res) or die("a waste ".mysql_error())){
                var_dump($row);
                echo "<br>";
        }
        mysql_free_result($res) or die("of");

//      mysql_query("delete from response_text where question_id=99999");

        mysql_close($dblink) or die("time");


The output is: 

Now i'm reading it back from the database:
array(6) { [0]=> string(5) "99999" ["response_id"]=> string(5) "99999" [1]=> 
string(5) "99999" ["question_id"]=> string(5) "99999" [2]=> string(16) " 
éèàâïôê " ["response"]=> string(16) " éèàâïôê " } 
a waste 

Which is normal because that's the entire idea of the while() loop, now when i 
delete the or die in the while everything runs perfectly as expected. 

(I know i don't have to write code like that but it was a quick test to locate 
the problem and to attach it to this bugreport)

Comment 3 Joe Orton 2003-09-12 10:28:35 UTC
The important question is whether or not the mysql_query("delete ..." )
fails when you run it - have you tried the "or die" on that statement?

Comment 4 Elie De Brauwer 2003-09-12 10:32:32 UTC
the problem is that the sentence is transformed. The delete isn't necessary. 
The delete is everywhere commented out and has nothing to see with the problem 
here. 

If you really insist i will add or die to the command but that isn't very 
usefull imo

Comment 5 Joe Orton 2003-09-12 10:43:23 UTC
Oh, I see, it wasn't clear exactly what bug you were reporting.

The characters you've pasted above with the à symbols look like a UTF-8
character sequence does when viewed in a non-UTF-8 locale.

What is your $LANG setting for the terminal where you are running the mysql
command?  If you are using a non-UTF-8 locale for your terminal, and your PHP
script is inserting UTF-8-encoded characters into the MySQL database, this would
be expected behaviour.

Comment 6 Joe Orton 2003-09-12 10:47:18 UTC
You can use the PHP function utf8_decode() on the input string before the INSERT
if you don't want UTF-8 strings in your database.