CREATE TABLE message_table ( id int(11) NOT NULL AUTO_INCREMENT COMMENT ‘primary key for community messages’, icon varchar(255) NOT NULL DEFAULT ‘’ COMMENT ‘this field will contain IPFS hash’, subject varchar(255) DEFAULT ‘’ COMMENT 'encrypted document data to uniquely identify ', details text CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci COMMENT ‘this field will contains the body of community messages’,
PRIMARY KEY (id),
KEY message_owner_nci (message_owner)
) ENGINE=InnoDB AUTO_INCREMENT=1234 DEFAULT CHARSET=utf8
In MYSQL 5.7.36 in This table’s ‘details’ column is set to utf8mb4 with the utf8mb4_unicode_ci collation to store special characters like emojis and other unique symbols.
Recently, we upgraded our MySQL from version 5.7.36 to 8.0.36 using the mysqldump command. After importing the data into MySQL 8, we altered the table’s character set to DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci.
However, after this alteration, we encountered data issues in the ‘message_table’. The ‘details’ column displays ‘???’ instead of the special characters.
show create table in mysql 8.0.36
CREATE TABLE message_table ( id int NOT NULL AUTO_INCREMENT COMMENT ‘primary key for community messages’, icon varchar(255) NOT NULL DEFAULT ‘’ COMMENT ‘this field will contain IPFS hash’, subject varchar(255) DEFAULT ‘’ COMMENT 'encrypted document data to uniquely identify ', details text CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci COMMENT ‘this field will contains the body of community messages’,
PRIMARY KEY (id),
KEY message_owner_nci (message_owner)
) ENGINE=InnoDB AUTO_INCREMENT=1234 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;
Verify the data at all stages to identify where the conversion loss is happening.
On 5.7, SELECT HEX(details) WHERE id = X
Then check the mysqldump output for id = X
Then, repeat the above SQL on 8 after import. Somewhere, something is causing conversion issues.
You were right. I tried to back up only a single table from MySQL 5.7.39, and when I checked the mysqldump, it showed the emojis correctly. However, during the upgrade, I exported the full schema structure and data into separate files. The command I used to back up the schema’s data is below. During the full schema backup, the mysqldump file showed junk characters,strange behavior? but when I backed up a single table, it displayed the emojis correctly.
It seems that the mysqldump behavior is such that when I export only a table, it exports the data properly. However, when I export the full schema and data, it shows junk characters in the mysqldump file.
You can reproduce this issue using the following command:
Command used to take schema backup and mysqldump file contain juck char:
mysqldump -uroot -p --no-create-info --no-create-db --routines=0 --events=0 --triggers=0 --max_allowed_packet=512M --single-transaction --databases db_name_contain_table > db_name.sql
Command used to take only single table and it shows proper data in dumpfile:
It might be because we were using UTF-8 as the default server and database character set in MySQL 5.7. To store special characters in MySQL 5.7, we changed the only column character set to utf8mb4. When exporting only the table, it exports the data correctly. However, when exporting the whole schema, it adds junk characters because the database character set is UTF-8.