Bug 104203 - problem when writing french characters to mysql database
problem when writing french characters to mysql database
Status: CLOSED NOTABUG
Product: Red Hat Linux
Classification: Retired
Component: php (Show other bugs)
9
i686 Linux
medium Severity medium
: ---
: ---
Assigned To: Joe Orton
David Lawrence
:
Depends On:
Blocks:
  Show dependency treegraph
 
Reported: 2003-09-11 03:38 EDT by Elie De Brauwer
Modified: 2007-04-18 12:57 EDT (History)
1 user (show)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2003-09-12 06:47:18 EDT
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)

  None (edit)
Description Elie De Brauwer 2003-09-11 03:38:29 EDT
Ok, so I have the following situation,  a mysql is running:

mysql> status;
--------------
mysql  Ver 11.18 Distrib 3.23.54, for redhat-linux-gnu (i386)

Connection id:          5554
Current database:       <snip>
Current user:           <snip>
Current pager:          stdout
Using outfile:          ''
Server version:         3.23.54
Protocol version:       10
Connection:             Localhost via UNIX socket
Client characterset:    latin1
Server characterset:    latin1
UNIX socket:            /var/lib/mysql/mysql.sock
Uptime:                 3 days 17 min 41 sec


Suppose we have this table: 

mysql> describe response_text;
+-------------+------------------+------+-----+---------+-------+
| Field       | Type             | Null | Key | Default | Extra |
+-------------+------------------+------+-----+---------+-------+
| response_id | int(10) unsigned |      | PRI | 0       |       |
| question_id | int(10) unsigned |      | PRI | 0       |       |
| response    | text             | YES  |     | NULL    |       |
+-------------+------------------+------+-----+---------+-------+

And we insert a bunch of french characters:

mysql> insert into response_text (response_id,question_id,response) values
(9999,9999,"éèàâïôê");

Now we read it back:
mysql> select response from response_text where question_id=9999;
+----------+
| response |
+----------+
| éèàâïôê  |
+----------+
1 row in set (0.00 sec)

And it works prefectly, now there is also installed the following:
$ rpm -qa | grep php
php-mysql-4.2.2-17.2
php-4.2.2-17.2
$ rpm -qa | grep httpd
httpd-2.0.40-21.3

Everything is working properly except when i create a php file that contains 
the following code:
<?
        $dblink = mysql_connect(<snip><snip><snip>);
        mysql_select_db("<snip>");
        mysql_query("insert INTO response_text  
        response_id,question_id,response) "
        . "VALUES(99999,99999,\"" . $_POST['test'] . "\")");

        echo "<br>Now i'm reading it back from the database:<br>\n ";

        $res = mysql_query("select * from response_text where 
        question_id=99999");
        while($row = mysql_fetch_array($res)){
                var_dump($row);
                echo "<br>";
        }
        mysql_free_result($res);

//      mysql_query("delete from response_text where question_id=99999");

        mysql_close($dblink);
?>

Now what does this script do ? It simply puts the value in $_POST['test'] into 
the database, reads it and deletes it (i commented the delete out).
Now, when i enter the string with the accents, it goes perfectly and the output 
is:

Now i'm reading it back from the database:
array(6) { [0]=> string(5) "99999" ["response_id"]=> string(5) "99999" [1]=> 
string(5) "99999" ["question_id"]=> string(5) "99999" [2]=> string
(15) "éèàâïôê " ["response"]=> string(15) "éèàâïôê " } 

Exactly as we suspected, but notice that it isn't deleted and it is still in 
the database when i go to look into the database: 
mysql> select response from response_text where question_id=99999;
+-----------------+
| response        |
+-----------------+
| éèà âïôê  |
+-----------------+
1 row in set (0.00 sec)

It dumps this gibberish. 
A php program (phpESP) which does a query to the database and sends a mail 
containing the contents of the database also contains these gibberish 
characters (which means the conversion fails from mysql->php sometimes).

This only applies to charactes containing accents. Normal characters work fine.


I tried googling, php.net bug database and bugzilla but couldn't come up with 
an answer.
I also tried to reproduce it on my debian unstable box which failed. (debian 
box was running mysql 4.0.14, php 4.3.2)
Comment 1 Joe Orton 2003-09-11 04:27:45 EDT
There's no error handling here at all, can you add

  or die("mysql_blah failed " . mysql_error());

after each mysql_blah statement?
Comment 2 Elie De Brauwer 2003-09-11 05:04:46 EDT
The code is ran on a production machine where mysql/php/apache work so the or 
die's won't change much but anyhow here's the code:

        $dblink = mysql_connect(SNIP) or die ("foo");
        mysql_select_db(SNIP) or die("bar");
        mysql_query("insert INTO response_text 
(response_id,question_id,response) "
        . "VALUES(99999,99999,\"" . $_POST['test'] . "\")") or die("this");
        echo "<br>Now i'm reading it back from the database:<br>\n ";
        $res = mysql_query("select * from response_text where 
question_id=99999")or die("is");
        while($row = mysql_fetch_array($res) or die("a waste ".mysql_error())){
                var_dump($row);
                echo "<br>";
        }
        mysql_free_result($res) or die("of");

//      mysql_query("delete from response_text where question_id=99999");

        mysql_close($dblink) or die("time");


The output is: 

Now i'm reading it back from the database:
array(6) { [0]=> string(5) "99999" ["response_id"]=> string(5) "99999" [1]=> 
string(5) "99999" ["question_id"]=> string(5) "99999" [2]=> string(16) " 
éèàâïôê " ["response"]=> string(16) " éèàâïôê " } 
a waste 

Which is normal because that's the entire idea of the while() loop, now when i 
delete the or die in the while everything runs perfectly as expected. 

(I know i don't have to write code like that but it was a quick test to locate 
the problem and to attach it to this bugreport)
Comment 3 Joe Orton 2003-09-12 06:28:35 EDT
The important question is whether or not the mysql_query("delete ..." )
fails when you run it - have you tried the "or die" on that statement?
Comment 4 Elie De Brauwer 2003-09-12 06:32:32 EDT
the problem is that the sentence is transformed. The delete isn't necessary. 
The delete is everywhere commented out and has nothing to see with the problem 
here. 

If you really insist i will add or die to the command but that isn't very 
usefull imo
Comment 5 Joe Orton 2003-09-12 06:43:23 EDT
Oh, I see, it wasn't clear exactly what bug you were reporting.

The characters you've pasted above with the à symbols look like a UTF-8
character sequence does when viewed in a non-UTF-8 locale.

What is your $LANG setting for the terminal where you are running the mysql
command?  If you are using a non-UTF-8 locale for your terminal, and your PHP
script is inserting UTF-8-encoded characters into the MySQL database, this would
be expected behaviour.
Comment 6 Joe Orton 2003-09-12 06:47:18 EDT
You can use the PHP function utf8_decode() on the input string before the INSERT
if you don't want UTF-8 strings in your database.

Note You need to log in before you can comment on or make changes to this bug.