Wednesday, November 23, 2011

How to Convert MySQL Database Character Set from default latin1_swedish_ci to UTF8

Currently I am working with Joomla CMS (Content Management System) which is hosted in a foreign web host and its MySQL database is using latin1_swedish_ci character set. I need to convert it to UTF8 since the content of the website that I'm dealing with is in English, Chinese Traditional and Chinese Simplified.

After searching in the web, I found a php code that can do the thing I want. The solution is pretty easy (at least after I found this code), just copy and paste the code below in a php file, let's call it a converter.php file, and then on the second to fourth line of the codes below change the parameters so that it will fit to your database settings. Here's the code:

<?php
define('DB_NAME', 'putyourdbnamehere');    // Put your database name here
define('DB_USER', 'usernamehere');     // MySQL Username
define('DB_PASSWORD', 'yourpasswordhere'); // Password
define('DB_HOST', 'localhost');    // Mostly just use the localhost

function UTF8_DB_Converter_DoIt() {
 $tables = array();
 $tables_with_fields = array();

 // Since we cannot use the WordPress Database Abstraction Class (wp-db.php),
 // we have to make an a stand-alone/direct connection to the database.
 $link_id = mysql_connect(DB_HOST, DB_USER, DB_PASSWORD) or die('Error establishing a database connection');
 mysql_select_db(DB_NAME, $link_id);

 // Gathering information about tables and all the text/string fields that can be affected
 // during the conversion to UTF-8.
 $resource = mysql_query("SHOW TABLES", $link_id);
 while ( $result = mysql_fetch_row($resource) )
  $tables[] = $result[0];

 if ( !empty($tables) ) {
  foreach ( (array) $tables as $table ) {
   $resource = mysql_query("EXPLAIN $table", $link_id);
   while ( $result = mysql_fetch_assoc($resource) ) {
    if ( preg_match('/(char)|(text)|(enum)|(set)/', $result['Type']) )
     $tables_with_fields[$table][$result['Field']] = $result['Type'] . " " . ( "YES" == $result['Null'] ? "" : "NOT " ) . "NULL " .  ( !is_null($result['Default']) ? "DEFAULT '". $result['Default'] ."'" : "" );
   }
  }

  // Change all text/string fields of the tables to their corresponding binary text/string representations.
  foreach ( (array) $tables as $table )
   mysql_query("ALTER TABLE $table CONVERT TO CHARACTER SET binary", $link_id);

  // Change database and tables to UTF-8 Character set.
  mysql_query("ALTER DATABASE " . DB_NAME . " CHARACTER SET utf8", $link_id);
  foreach ( (array) $tables as $table )
   mysql_query("ALTER TABLE $table CONVERT TO CHARACTER SET utf8", $link_id);

  // Return all binary text/string fields previously changed to their original representations.
  foreach ( (array) $tables_with_fields as $table => $fields ) {
   foreach ( (array) $fields as $field_type => $field_options ) {
    mysql_query("ALTER TABLE $table MODIFY $field_type $field_options", $link_id);
   }
  }

  // Optimize tables and finally close the mysql link.
  foreach ( (array) $tables as $table )
   mysql_query("OPTIMIZE TABLE $table", $link_id);
  mysql_close($link_id);
 } else {
  die('<strong>There are no tables?</strong>');
 }

 return true;
}
UTF8_DB_Converter_DoIt();
?>
Source: http://blog.mashida.info/?p=1353 and http://bingu.net/472/latin1_swedish_ci-to-utf8_general_ci/

Save the converter.php file and then upload to your webhost www folder and execute that php by requesting that file using your browser. E.g. If your webhost domain is www.mywebhost.com, then you can request the file with www.mywebhost.com/converter.php. And boom, everything is done, your database will be converted from latin1_swedish_ci character set to UTF8. And don't forget to delete your converter.php file on your webhost since it will not be used anymore.

No comments:

Post a Comment