MySQL 3.23, 4.0, 4.1 Reference Manual :: 10 Character Set Support :: 10.9 Upgrading Character Sets from MySQL 4.0 :: 10.9.2 Converting 4.0 Character Columns to 4.1 Format
  • MySQL 3.23, 4.0, 4.1 Reference Manual

  • 10.9 Upgrading Character Sets from MySQL 4.0
  • 10.9.1 4.0 Character Sets and Corresponding 4.1 Character Set/Collation Pairs
  • 10.9.2 Converting 4.0 Character Columns to 4.1 Format


10.9.2. Converting 4.0 Character Columns to 4.1 Format

Normally, the server runs using the latin1 character set by default. If you have been storing column data that actually is in some other character set that the 4.1 server supports directly, you can convert the column. However, you should avoid trying to convert directly from latin1 to the "real" character set. This may result in data loss. Instead, convert the column to a binary data type, and then from the binary type to a non-binary type with the desired character set. Conversion to and from binary involves no attempt at character value conversion and preserves your data intact. For example, suppose that you have a 4.0 table with three columns that are used to store values represented in latin1, latin2, and utf8:

CREATE TABLE t
(
    latin1_col CHAR(50),
    latin2_col CHAR(100),
    utf8_col CHAR(150)
);

For MySQL 4.1, you want to convert this table to leave latin1_col alone but change the latin2_col and utf8_col columns to have character sets of latin2 and utf8. Before upgrading to 4.1, back up your table, then convert the columns as follows:

ALTER TABLE t MODIFY latin2_col BINARY(100);
ALTER TABLE t MODIFY utf8_col BINARY(150);

Then, after upgrading to 4.1, complete the conversion by issuing these statements:

ALTER TABLE t MODIFY latin2_col CHAR(100) CHARACTER SET latin2;
ALTER TABLE t MODIFY utf8_col CHAR(150) CHARACTER SET utf8;

The first two statements “remove” the character set information from the latin2_col and utf8_col columns. The second two statements assign the proper character sets to the two columns.

If you like, you can combine the to-binary conversions and from-binary conversions into single statements. In MySQL 4.0, do this:

ALTER TABLE t
    MODIFY latin2_col BINARY(100),
    MODIFY utf8_col BINARY(150);

After upgrading to 4.1, do this:

ALTER TABLE t
    MODIFY latin2_col CHAR(100) CHARACTER SET latin2,
    MODIFY utf8_col CHAR(150) CHARACTER SET utf8;

If you can ensure that the tables will not otherwise be modified before you perform the character set conversion, you can issue all of the ALTER TABLE statements after upgrading to MySQL 4.1.

If you specified attributes when creating a column initially, you should also specify them when altering the table with ALTER TABLE. For example, if you specified NOT NULL and an explicit DEFAULT value, you should also provide them in the ALTER TABLE statement. Otherwise, the resulting column definition will not include those attributes.


User Comments

Posted by shimon doodkin on February 1 2005 5:16pm[Delete] [Edit]

<?php
/* $Id: mysqlupgrade.php,v 1.3 2005/01/31 22:04:02 shimon Exp $ */
// upgrade CHARACTER SET for MySQL 4.1.0 +
//
// Did you export all databases including mysql database before runing this file ?
//
// known bug of this program it dont know to treat FULLTEXT index
//
//by Shimon Doodkin shimon_d@hotmail.com

$conn = mysql_connect("localhost", "mashovim.co.il", "***");
$printonly=true; //change this to false to alter on the fly
$charset="hebrew";
$collate="hebrew_general_ci";
$altertablecharset=true;
$alterdatabasecharser=true;

function
PMA_getDbCollation($db)
{
$sq='SHOW CREATE DATABASE `'.$db.'`;';
$res = mysql_query($sq);
if(!
$res) echo "\n\n".$sq."\n".mysql_error()."\n\n"; else
if(
$row = mysql_fetch_assoc($res))
{
  
$tokenized = explode(' ', $row[1]);
  unset(
$row, $res, $sql_query);
  for (
$i = 1; $i + 3 < count($tokenized); $i++)
  {
   if (
$tokenized[$i] == 'DEFAULT' && $tokenized[$i + 1] == 'CHARACTER' && $tokenized[$i + 2] == 'SET')
   {
    if (isset(
$tokenized[$i + 5]) && $tokenized[$i + 4] == 'COLLATE')
    {
     return array(
$tokenized [$i + 3],$tokenized[$i + 5]); // We found the collation!
    
}
    else
    {
     return array(
$tokenized [$i + 3]);
    }
   }
  }
}
return
'';
}

?>

<META HTTP-EQUIV="CONTENT-TYPE" CONTENT="text/html; charset=windows-1251<? //remember to change it if needed ?>" />
<xmp>
<?

$rs2
= mysql_query("SHOW DATABASES");
if(!
$rs2) echo "\n\n".$sq."\n".mysql_error()."\n\n"; else
while (
$data2 = mysql_fetch_row($rs2))
{
$db=$data2[0];
$db_cha=PMA_getDbCollation($db);
if (
$db!='mysql' /* &&( $db=='mydb_x' || $db=='mydb_y' ) */ ) // limit to database(s) - $db=='mydb_x' || $db=='mydb_y' || $db=='mydb_z'
if ( substr($db_cha[0],0,4)!='utf8' ) // limit to charset
{
mysql_select_db($db);
$rs = mysql_query("SHOW TABLES");
if(!
$rs) echo "\n\n".$sq."\n".mysql_error()."\n\n"; else
while (
$data = mysql_fetch_row($rs))
{
$rs1 = mysql_query("show FULL columns from $data[0]");
if(!
$rs1) echo "\n\n".$sq."\n".mysql_error()."\n\n"; else
while (
$data1 = mysql_fetch_assoc($rs1))
{
  if(
in_array(array_shift(split("\\(",$data1['Type'],2)),array(
//'national char',
//'nchar',
//'national varchar',
//'nvarchar',
'char',
'varchar',
'tinytext',
'text',
'mediumtext',
'longtext',
'enum',
'set'
  
)))
  {
   if(
substr($data1['Collation'],0,4)!='utf8') // limit to charset
   
{
    
$sq="ALTER TABLE `$data[0]` CHANGE `".$data1['Field'].'` `'.$data1['Field'].'` '.$data1['Type'].' CHARACTER SET binary '.($data1['Default']==''?'':($data1['Default']=='NULL'?' DEFAULT NULL':' DEFAULT \''.mysql_escape_string($data1['Default']).'\'')).($data1['Null']=='YES'?' NULL ':' NOT NULL').';';
    if(!
$printonly&&!mysql_query($sq)) echo "\n\n".$sq."\n".mysql_error()."\n\n";
    else
    {
     echo (
$sq."\n") ;
     
$sq="ALTER TABLE `$data[0]` CHANGE `".$data1['Field'].'` `'.$data1['Field'].'` '.$data1['Type']." CHARACTER SET $charset ".($collate==''?'':"COLLATE $collate").($data1['Default']==''?'':($data1['Default']=='NULL'?' DEFAULT NULL':' DEFAULT \''.mysql_escape_string($data1['Default']).'\'')).($data1['Null']=='YES'?' NULL ':' NOT NULL').($data1['Comment']==''?'':' COMMENT \''.mysql_escape_string($data1['Comment']).'\'').';';
     if(!
$printonly&&!mysql_query($sq)) echo "\n\n".$sq."\n".mysql_error()."\n\n";
     else echo (
$sq."\n") ;
    }
   }
  }
}
if(
$altertablecharset)
{
  
/*
  $sq='ALTER TABLE `'.$data[0]."` DEFAULT CHARACTER SET binary";
  echo ($sq."\n") ;
  if(!mysql_query($sq)) echo "\n\n".$sq."\n".mysql_error()."\n\n";
  */
  
$sq='ALTER TABLE `'.$data[0]."` DEFAULT CHARACTER SET $charset ".($collate==''?'':"COLLATE $collate");
  echo (
$sq."\n") ;
    if(!
$printonly)
  if(!
mysql_query($sq)) echo "\n\n".$sq."\n".mysql_error()."\n\n";
}
}
if(
$alterdatabasecharser)
{
  
/*
  $sq='ALTER DATABASE `'.$data2[0]."` DEFAULT CHARACTER SET binary";
  echo ($sq."\n") ;
  if(!mysql_query($sq)) echo "\n\n".$sq."\n".mysql_error()."\n\n";
  */
  
$sq='ALTER DATABASE `'.$data2[0]."` DEFAULT CHARACTER SET $charset ".($collate==''?'':"COLLATE $collate");
  echo (
$sq."\n") ;
    if(!
$printonly)
  if(!
mysql_query($sq)) echo "\n\n".$sq."\n".mysql_error()."\n\n";
}
}
}
?>

</xmp>

Posted by Mariusz Gniazdowski on February 5 2005 12:27pm[Delete] [Edit]

Is this MODIFY query correct? Only way that i was able to get proper sorting on utf8/utf8_polish_ci table is:

ALTER TABLE t1 CHANGE c1 c1 BLOB;
ALTER TABLE t1 CHANGE c1 c1 VARCHAR(100) CHARACTER SET utf8;

like said on page:
http://dev.mysql.com/doc/mysql/en/alter-table.html

Add your own comment.





real_vine@hotmail.com