Table Special Character issue after MySQL Upgrade

I have below table in mysql 5.7.36 database.

CREATE TABLE message_table (
id int(11) NOT NULL AUTO_INCREMENT COMMENT ‘primary key for community messages’,
icon varchar(255) NOT NULL DEFAULT ‘’ COMMENT ‘this field will contain IPFS hash’,
subject varchar(255) DEFAULT ‘’ COMMENT 'encrypted document data to uniquely identify ',
details text CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci COMMENT ‘this field will contains the body of community messages’,
PRIMARY KEY (id),
KEY message_owner_nci (message_owner)
) ENGINE=InnoDB AUTO_INCREMENT=1234 DEFAULT CHARSET=utf8

In MYSQL 5.7.36 in This table’s ‘details’ column is set to utf8mb4 with the utf8mb4_unicode_ci collation to store special characters like emojis and other unique symbols.

Recently, we upgraded our MySQL from version 5.7.36 to 8.0.36 using the mysqldump command. After importing the data into MySQL 8, we altered the table’s character set to DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci.
However, after this alteration, we encountered data issues in the ‘message_table’. The ‘details’ column displays ‘???’ instead of the special characters.

show create table in mysql 8.0.36

CREATE TABLE message_table (
id int NOT NULL AUTO_INCREMENT COMMENT ‘primary key for community messages’,
icon varchar(255) NOT NULL DEFAULT ‘’ COMMENT ‘this field will contain IPFS hash’,
subject varchar(255) DEFAULT ‘’ COMMENT 'encrypted document data to uniquely identify ',
details text CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci COMMENT ‘this field will contains the body of community messages’,
PRIMARY KEY (id),
KEY message_owner_nci (message_owner)
) ENGINE=InnoDB AUTO_INCREMENT=1234 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;

Could you please help me resolve this issue?

Verify the data at all stages to identify where the conversion loss is happening.

On 5.7, SELECT HEX(details) WHERE id = X
Then check the mysqldump output for id = X
Then, repeat the above SQL on 8 after import. Somewhere, something is causing conversion issues.

Hi Matt,

Thank you for your response.

Here is the output we observed:

  • MySQL 5.7: F09F989DF09F9890EFB88F
  • mysqldump: 'ð<9f><98><9d>ð<9f><98><90>ï¸<8f>'
  • MySQL 8: 3F3F3F3F3F3F3F3FEFB88F

Could you please advise on how to resolve this issue?

I was unable to reproduce this

mysql [localhost:8035] {msandbox} (test) > CREATE TABLE message_table ( id int NOT NULL AUTO_INCREMENT, icon varchar(255) NOT NULL DEFAULT '', subject varchar(255) DEFAULT '', details text CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci, PRIMARY KEY (id)) ENGINE=InnoDB AUTO_INCREMENT=1234 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;
Query OK, 0 rows affected (0.04 sec)

mysql [localhost:8035] {msandbox} (test) > INSERT INTO message_table VALUES (1, 'my icon', 'my subject', UNHEX('F09F989DF09F9890EFB88F'));
Query OK, 1 row affected (0.01 sec)

mysql [localhost:8035] {msandbox} (test) > SELECT * FROM message_table;
+----+---------+------------+-------------+
| id | icon    | subject    | details     |
+----+---------+------------+-------------+
|  1 | my icon | my subject | 😝😐️           |
+----+---------+------------+-------------+
1 row in set (0.00 sec)

$ ~/dbdeployer/opt/mysql/8.0.35/bin/mysqldump -h 127.0.0.1 -P8035 -u msandbox -pmsandbox test message_table >dump.sql

$ ./use test <dump.sql

mysql [localhost:8035] {msandbox} (test) > SELECT * FROM message_table;
+----+---------+------------+-------------+
| id | icon    | subject    | details     |
+----+---------+------------+-------------+
|  1 | my icon | my subject | 😝😐️           |
+----+---------+------------+-------------+
1 row in set (0.00 sec)

Your mysqldump is clearly not correct. You might want to add a set names or something to the running mysqldump command.

Hi Matt,

You were right. I tried to back up only a single table from MySQL 5.7.39, and when I checked the mysqldump, it showed the emojis correctly. However, during the upgrade, I exported the full schema structure and data into separate files. The command I used to back up the schema’s data is below. During the full schema backup, the mysqldump file showed junk characters,strange behavior? but when I backed up a single table, it displayed the emojis correctly.

It seems that the mysqldump behavior is such that when I export only a table, it exports the data properly. However, when I export the full schema and data, it shows junk characters in the mysqldump file.

You can reproduce this issue using the following command:

Command used to take schema backup and mysqldump file contain juck char:
mysqldump -uroot -p --no-create-info --no-create-db --routines=0 --events=0 --triggers=0 --max_allowed_packet=512M --single-transaction --databases db_name_contain_table > db_name.sql

Command used to take only single table and it shows proper data in dumpfile:

mysqldump -uroot -p --no-create-info --no-create-db --routines=0 --events=0 --triggers=0 --max_allowed_packet=512M --single-transaction homebase_stage tablename > dump_table.sql

It might be because we were using UTF-8 as the default server and database character set in MySQL 5.7. To store special characters in MySQL 5.7, we changed the only column character set to utf8mb4. When exporting only the table, it exports the data correctly. However, when exporting the whole schema, it adds junk characters because the database character set is UTF-8.