mysql utf8 collation / conversion

mysql utf8 collation / conversion

  • Written by
    Walter Doekes
  • Published on

On a clean MySQL install — on a Debian or Ubuntu system at least — the MySQL server gets the latin1_swedish_ci with latin1 character set by default. Every time you set up a new machine, you must remember to either fix the defaults in my.cnf config file or to supply character set and collation options when creating databases.

Of course you’ll opt to set this by default in my.cnf first:

[client]
default-character-set = utf8

[mysqld]
init_connect = 'SET collation_connection = utf8_unicode_ci'
default-character-set = utf8
character-set-server = utf8
collation-server = utf8_unicode_ci

Unfortunately, sometimes you forget to do this, and now you have a database in some legacy (non-utf-8) character set. Well. Here is the fix. Create this stored procedure and run it supplying the the schema_name. It converts the database(*) and the tables to utf-8 with the utf8_unicode_ci collation.

delimiter //

DROP PROCEDURE IF EXISTS convert_to_utf8;
//

CREATE PROCEDURE convert_to_utf8 (schema_name VARCHAR(64))
BEGIN
  DECLARE done INT DEFAULT 0;
  DECLARE table_name VARCHAR(64);
  DECLARE schema_cur CURSOR FOR
    SELECT t.table_name FROM information_schema.tables t
    WHERE CAST(table_schema AS CHAR CHARACTER SET utf8) COLLATE utf8_unicode_ci
        = CAST(schema_name AS CHAR CHARACTER SET utf8) COLLATE utf8_unicode_ci
      AND table_collation NOT LIKE 'utf8%';
  DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = 1;

  -- ALTER DATABASE
  SET @statement = CONCAT('ALTER DATABASE ', schema_name,
    ' CHARACTER SET utf8 COLLATE utf8_unicode_ci');
  -- "This command is not supported in the prepared statement protocol yet"
  -- PREPARE executable FROM @statement;
  -- EXECUTE executable;
  -- DEALLOCATE PREPARE executable;
  SELECT @statement AS `Run this by hand!`;

  -- ALTER TABLES
  OPEN schema_cur;
  REPEAT
    FETCH schema_cur INTO table_name;
    IF NOT done THEN
      SET @statement = CONCAT('ALTER TABLE ', schema_name, '.', table_name,
        ' CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci');
      PREPARE executable FROM @statement;
      EXECUTE executable;
      DEALLOCATE PREPARE executable;
    END IF;
  UNTIL done END REPEAT;
END;
//

After switching back the delimiter to ‘;’, you can call this stored procedure like this:

CALL convert_to_utf8('meetbees_db');

This converts the meetbees_db database to the utf-8 characters set with the utf8_unicode_ci collation.

(*) The ALTER DATABASE command must be run by hand as the MySQL version that I’ve tried it on reports that “[this] command is not supported in the prepared statement protocol yet”. That’s one command to copy-paste, so you should be able to manage that.


Back to overview Newer post: unexpanded tabs / mercurial web / diff