Bug 1186681 - MySQL 5.5 cartridge does not support storing of emojis (4 byte utf chars)
Summary: MySQL 5.5 cartridge does not support storing of emojis (4 byte utf chars)
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: OpenShift Online
Classification: Red Hat
Component: Image
Version: 2.x
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: low
Target Milestone: ---
Target Release: ---
Assignee: Abhishek Gupta
QA Contact: libra bugs
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2015-01-28 10:00 UTC by matzew
Modified: 2017-05-31 18:22 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-05-31 18:22:11 UTC
Target Upstream Version:
Embargoed:



Links
System: Red Hat Issue Tracker
ID: AGPUSH-1246
Private: 0
Priority: Major
Status: Closed
Summary: Openshift MySQL cartridge does not support Emoji
Last Updated: 2017-06-01 00:02:51 UTC

Description matzew 2015-01-28 10:00:01 UTC
Description of problem:
The MySQL 5.5 cartridge used here:
https://openshift.redhat.com/app/console/application_type/cart!jboss-unified-push-1

does not support emojis (4-byte UTF-8 characters).

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. Create the cartridge
2. Set up a device
3. Push a message containing an emoji

Actual results:

The push message is not stored in MySQL, and an exception is visible on the JBoss console (seen via rhc tail). Details are here:

https://issues.jboss.org/browse/AGPUSH-1246

Expected results:

The push message is stored in MySQL.

Additional info:

Comment 1 matzew 2015-01-28 10:12:28 UTC
Wondering what version of the driver is used - locally I used:

mysql-connector-java:5.1.18

Comment 2 Maciej Szulik 2015-01-28 11:12:19 UTC
I'm assigning this bug to xPaaS team, since they own this cartridge.

Comment 3 matzew 2015-01-28 11:42:32 UTC
Well, the problem is not really the mobile cartridge. The problem is that the MySQL server is NOT configured with utf8mb4.

Comment 4 JBoss JIRA Server 2015-01-28 15:30:40 UTC
Matthias Wessendorf <matthias> updated the status of jira AGPUSH-1246 to Coding In Progress

Comment 5 Maciej Szulik 2015-01-28 16:56:21 UTC
The default encoding for newly created databases is utf8 (see [1]). We'd like to keep that default. Since each user is able to log in to their gear and change the database encoding using the commands from [2], I'm closing this as a won't fix.

[1] https://github.com/openshift/origin-server/blob/master/cartridges/openshift-origin-cartridge-mysql/bin/post_install#L14
[2] http://dba.stackexchange.com/questions/8239/how-to-easily-convert-utf8-tables-to-utf8mb4-in-mysql-5-5/21684#21684
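
For reference, the conversion described in [2] comes down to a handful of ALTER statements. The following is a minimal sketch that only builds those statements as strings. The class and method names, the example database and table names, and the choice of the utf8mb4_unicode_ci collation are illustrative assumptions, not something the cartridge provides.

```java
import java.util.ArrayList;
import java.util.List;

class Utf8mb4Migration {
    // Assumed collation; utf8mb4_general_ci is another common choice.
    static final String COLLATION = "utf8mb4_unicode_ci";

    // Quotes a MySQL identifier with backticks, doubling any embedded backtick.
    static String quote(String identifier) {
        return "`" + identifier.replace("`", "``") + "`";
    }

    // Changes the default character set used for tables created later.
    static String alterDatabase(String database) {
        return "ALTER DATABASE " + quote(database)
                + " CHARACTER SET = utf8mb4 COLLATE = " + COLLATION;
    }

    // Converts an existing table and its text columns to utf8mb4.
    static String convertTable(String table) {
        return "ALTER TABLE " + quote(table)
                + " CONVERT TO CHARACTER SET utf8mb4 COLLATE " + COLLATION;
    }

    // One database-level statement followed by one statement per table.
    static List<String> migrationStatements(String database, List<String> tables) {
        List<String> statements = new ArrayList<>();
        statements.add(alterDatabase(database));
        for (String table : tables) {
            statements.add(convertTable(table));
        }
        return statements;
    }
}
```

Each returned string could be run from the mysql client on the gear or through a JDBC Statement. On MySQL 5.5 with InnoDB, indexed VARCHAR columns longer than 191 characters may need shortening first, because utf8mb4 uses up to 4 bytes per character against a 767-byte index key limit.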

Comment 6 matzew 2015-01-28 16:58:06 UTC
When the cartridge is launched, why can we not change the MySQL server that will be used (and created) on this cart?

We do need that.

Comment 7 matzew 2015-01-29 08:16:36 UTC
Here is some more info:
http://info.michael-simons.eu/2013/01/21/java-mysql-and-multi-byte-utf-8-support/

But locally, I don't even set this, and it works. What version of the mysql driver is used?

Comment 8 Stijn de Witt 2015-06-15 14:54:04 UTC
Please have a look at this:
http://dev.mysql.com/doc/connector-j/en/connector-j-reference-charsets.html

Setting the Character Encoding
The character encoding between client and server is automatically detected upon connection. You specify the encoding on the server using the character_set_server for server versions 4.1.0 and newer, and character_set system variable for server versions older than 4.1.0. The driver automatically uses the encoding specified by the server. For more information, see Server Character Set and Collation.

For example, to use 4-byte UTF-8 character sets with Connector/J, configure the MySQL server with character_set_server=utf8mb4, and leave characterEncoding out of the Connector/J connection string. Connector/J will then autodetect the UTF-8 setting.

To override the automatically detected encoding on the client side, use the characterEncoding property in the URL used to connect to the server.
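
Since the quoted documentation says the driver follows the server setting, it can help to check what the server actually reports. The sketch below is only an illustration: the class and method names are made up, and the host, port and database values are placeholders. It builds a connection URL without characterEncoding and reads character_set_server over an already open connection.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

class ServerCharsetCheck {
    // Builds a Connector/J URL with no characterEncoding parameter,
    // leaving the encoding to the driver's detection of the server setting.
    static String jdbcUrl(String host, int port, String database) {
        return "jdbc:mysql://" + host + ":" + port + "/" + database;
    }

    // Returns the value of character_set_server, or null if the server
    // returns no row for it.
    static String serverCharset(Connection connection) throws SQLException {
        try (Statement statement = connection.createStatement();
             ResultSet rows = statement.executeQuery(
                     "SHOW VARIABLES LIKE 'character_set_server'")) {
            // SHOW VARIABLES returns two columns: the variable name and its value.
            return rows.next() ? rows.getString(2) : null;
        }
    }
}
```

If serverCharset returns anything other than utf8mb4, the autodetection described above would not be expected to pick the 4-byte character set.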


HERE IS MY PROBLEM:

* The OpenShift cartridge sets the `character_set_server` to `latin1`
* I should be able to override the `characterEncoding` property on the URL to override this, however it does not accept `utf8mb4`:

java.sql.SQLException: Unsupported character encoding 'utf8mb4'
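
One plausible reading of this exception is that characterEncoding expects a Java charset name rather than a MySQL character set name, and utf8mb4 is not a charset the JVM knows. The check below is a small sketch of that idea. The class and method names are hypothetical, and whether the driver performs exactly this kind of lookup is an assumption.

```java
import java.nio.charset.Charset;

class EncodingNameCheck {
    // True if the JVM recognises the given name as a charset.
    // Syntactically illegal or null names are treated as unsupported.
    static boolean isJavaCharset(String name) {
        try {
            return Charset.isSupported(name);
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("UTF-8:   " + isJavaCharset("UTF-8"));
        System.out.println("utf8mb4: " + isJavaCharset("utf8mb4"));
    }
}
```

Passing a Java name such as UTF-8 should avoid the exception, but with drivers of this era that alone is not expected to select utf8mb4 unless the server itself is configured for it.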

MySQL does not seem to think this needs fixing:
http://bugs.mysql.com/bug.php?id=64823

So now I'm stuck. It seems we can't use utf8mb4 on OpenShift due to this??

"The default encoding for newly created databases is utf8 (see [1]). We'd like to keep that default."

Why would you like to keep that default? It is wrong.
Please note that `utf8` on MySQL is not the good old utf8 that we all know and love. Instead it's a broken, 3-byte, MySQL-specific encoding that MySQL calls `utf8`. The encoding we all know and love as `utf8` is actually called `utf8mb4` in MySQL.
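
To make the 3-byte limit concrete, here is a short sketch with hypothetical class and method names. It reports how many bytes a string needs in standard UTF-8 and whether every character would fit in an encoding limited to 3 bytes per character. Code points above U+FFFF, such as the emoji U+1F600, need 4 bytes.

```java
import java.nio.charset.StandardCharsets;

class Utf8Width {
    // Number of bytes the string occupies when encoded as standard UTF-8.
    static int utf8Bytes(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    // True if no code point lies above U+FFFF, meaning each character
    // needs at most 3 bytes in UTF-8.
    static boolean fitsInThreeByteUtf8(String text) {
        return text.codePoints()
                .noneMatch(Character::isSupplementaryCodePoint);
    }

    public static void main(String[] args) {
        String euro = "\u20AC";        // U+20AC, 3 bytes in UTF-8
        String grin = "\uD83D\uDE00";  // U+1F600, 4 bytes in UTF-8
        System.out.println(utf8Bytes(euro) + " " + fitsInThreeByteUtf8(euro));
        System.out.println(utf8Bytes(grin) + " " + fitsInThreeByteUtf8(grin));
    }
}
```

A check like fitsInThreeByteUtf8 could be used to flag messages that a utf8 column would not be able to store as sent.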

Read more:
https://stijndewitt.wordpress.com/2015/06/15/use-mysql-utf8mb4-if-you-want-full-unicode-support/

Comment 9 Eric Paris 2017-05-31 18:22:11 UTC
We apologize, however, we do not plan to address this report at this time. The majority of our active development is for the v3 version of OpenShift. If you would like for Red Hat to reconsider this decision, please reach out to your support representative. We are very sorry for any inconvenience this may cause.

