Character Set & Collation

Character Set & Collation

ℹ️
This article is a rewrite of Claudio Kuenzler’s excellent blog post.
Ensure you have a backup of your database before starting on these instructions.

The following article will walk you through changing an existing databases character set and collation. For this article, we’ll move the database from using the UTF8 character set and its associated collation to using UTF8mb4.

A character set is a set of symbols and encodings. A collation is a set of rules for comparing characters in a character set. For more information see MySQL’s Reference Manual.

Update Database Defaults

Login to MySQL and update the database’s default character set and collation.

# login to MySQL command-line interface.
mysql -u root -p

# change [DATABASE] to the name of the database you'd like to modify.
mysql > ALTER DATABASE [DATABASE] DEFAULT CHARACTER SET = utf8mb4 DEFAULT COLLATE = utf8mb4_bin;

Update Existing Tables & Columns

Download and save the following script onto the server running MySQL. Ensure the database variable is updated to the name of the database that will be modified.

ℹ️
The script is available on my References GitHub Repository.
# change [DATABASE] to the name of the database you'd like to modify.

use information_schema;
SELECT concat("ALTER DATABASE `",table_schema,"` CHARACTER SET = utf8mb4 COLLATE = utf8mb4_bin;") as _sql
FROM `TABLES` where table_schema like "[DATABASE]" and TABLE_TYPE='BASE TABLE' group by table_schema;
SELECT concat("ALTER TABLE `",table_schema,"`.`",table_name,"` CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_bin;") as _sql  
FROM `TABLES` where table_schema like "[DATABASE]" and TABLE_TYPE='BASE TABLE' group by table_schema, table_name;
SELECT concat("ALTER TABLE `",table_schema,"`.`",table_name, "` CHANGE `",column_name,"` `",column_name,"` ",data_type,"(",character_maximum_length,") CHARACTER SET utf8mb4 COLLATE utf8mb4_bin",IF(is_nullable="YES"," NULL"," NOT NULL"),";") as _sql
FROM `COLUMNS` where table_schema like "[DATABASE]" and data_type in ('varchar','char');
SELECT concat("ALTER TABLE `",table_schema,"`.`",table_name, "` CHANGE `",column_name,"` `",column_name,"` ",data_type," CHARACTER SET utf8mb4 COLLATE utf8mb4_bin",IF(is_nullable="YES"," NULL"," NOT NULL"),";") as _sql
FROM `COLUMNS` where table_schema like "[DATABASE]" and data_type in ('text','tinytext','mediumtext','longtext');

Run this script against the database. This script will create a output.sql file containing a list of SQL commands that will successfully convert the database to the new character set and collation.

mysql -u root -p < script.sql | egrep "^ALTER" > output.sql

At this point the database has not been modified. Next run the generated output.sql file against the database to successfully update the database to the new character set and collation.

mysql -u root -p < output.sql

If no errors are produced, you have successfully updated the database to the new character set and collation.