Ok, so I have the following situation, a mysql is running: mysql> status; -------------- mysql Ver 11.18 Distrib 3.23.54, for redhat-linux-gnu (i386) Connection id: 5554 Current database: <snip> Current user: <snip> Current pager: stdout Using outfile: '' Server version: 3.23.54 Protocol version: 10 Connection: Localhost via UNIX socket Client characterset: latin1 Server characterset: latin1 UNIX socket: /var/lib/mysql/mysql.sock Uptime: 3 days 17 min 41 sec Suppose we have this table: mysql> describe response_text; +-------------+------------------+------+-----+---------+-------+ | Field | Type | Null | Key | Default | Extra | +-------------+------------------+------+-----+---------+-------+ | response_id | int(10) unsigned | | PRI | 0 | | | question_id | int(10) unsigned | | PRI | 0 | | | response | text | YES | | NULL | | +-------------+------------------+------+-----+---------+-------+ And we insert a bunch of french characters: mysql> insert into response_text (response_id,question_id,response) values (9999,9999,"éèà âïôê"); Now we read it back: mysql> select response from response_text where question_id=9999; +----------+ | response | +----------+ | éèà âïôê | +----------+ 1 row in set (0.00 sec) And it works prefectly, now there is also installed the following: $ rpm -qa | grep php php-mysql-4.2.2-17.2 php-4.2.2-17.2 $ rpm -qa | grep httpd httpd-2.0.40-21.3 Everything is working properly except when i create a php file that contains the following code: <? $dblink = mysql_connect(<snip><snip><snip>); mysql_select_db("<snip>"); mysql_query("insert INTO response_text response_id,question_id,response) " . "VALUES(99999,99999,\"" . $_POST['test'] . "\")"); echo "<br>Now i'm reading it back from the database:<br>\n "; $res = mysql_query("select * from response_text where question_id=99999"); while($row = mysql_fetch_array($res)){ var_dump($row); echo "<br>"; } mysql_free_result($res); // mysql_query("delete from response_text where question_id=99999"); mysql_close($dblink); ?> Now what does this script do ? It simply puts the value in $_POST['test'] into the database, reads it and deletes it (i commented the delete out). Now, when i enter the string with the accents, it goes perfectly and the output is: Now i'm reading it back from the database: array(6) { [0]=> string(5) "99999" ["response_id"]=> string(5) "99999" [1]=> string(5) "99999" ["question_id"]=> string(5) "99999" [2]=> string (15) "éèà âïôê " ["response"]=> string(15) "éèà âïôê " } Exactly as we suspected, but notice that it isn't deleted and it is still in the database when i go to look into the database: mysql> select response from response_text where question_id=99999; +-----------------+ | response | +-----------------+ | éèà âïôê | +-----------------+ 1 row in set (0.00 sec) It dumps this gibberish. A php program (phpESP) which does a query to the database and sends a mail containing the contents of the database also contains these gibberish characters (which means the conversion fails from mysql->php sometimes). This only applies to charactes containing accents. Normal characters work fine. I tried googling, php.net bug database and bugzilla but couldn't come up with an answer. I also tried to reproduce it on my debian unstable box which failed. (debian box was running mysql 4.0.14, php 4.3.2)
There's no error handling here at all, can you add or die("mysql_blah failed " . mysql_error()); after each mysql_blah statement?
The code is ran on a production machine where mysql/php/apache work so the or die's won't change much but anyhow here's the code: $dblink = mysql_connect(SNIP) or die ("foo"); mysql_select_db(SNIP) or die("bar"); mysql_query("insert INTO response_text (response_id,question_id,response) " . "VALUES(99999,99999,\"" . $_POST['test'] . "\")") or die("this"); echo "<br>Now i'm reading it back from the database:<br>\n "; $res = mysql_query("select * from response_text where question_id=99999")or die("is"); while($row = mysql_fetch_array($res) or die("a waste ".mysql_error())){ var_dump($row); echo "<br>"; } mysql_free_result($res) or die("of"); // mysql_query("delete from response_text where question_id=99999"); mysql_close($dblink) or die("time"); The output is: Now i'm reading it back from the database: array(6) { [0]=> string(5) "99999" ["response_id"]=> string(5) "99999" [1]=> string(5) "99999" ["question_id"]=> string(5) "99999" [2]=> string(16) " éèà âïôê " ["response"]=> string(16) " éèà âïôê " } a waste Which is normal because that's the entire idea of the while() loop, now when i delete the or die in the while everything runs perfectly as expected. (I know i don't have to write code like that but it was a quick test to locate the problem and to attach it to this bugreport)
The important question is whether or not the mysql_query("delete ..." ) fails when you run it - have you tried the "or die" on that statement?
the problem is that the sentence is transformed. The delete isn't necessary. The delete is everywhere commented out and has nothing to see with the problem here. If you really insist i will add or die to the command but that isn't very usefull imo
Oh, I see, it wasn't clear exactly what bug you were reporting. The characters you've pasted above with the à symbols look like a UTF-8 character sequence does when viewed in a non-UTF-8 locale. What is your $LANG setting for the terminal where you are running the mysql command? If you are using a non-UTF-8 locale for your terminal, and your PHP script is inserting UTF-8-encoded characters into the MySQL database, this would be expected behaviour.
You can use the PHP function utf8_decode() on the input string before the INSERT if you don't want UTF-8 strings in your database.