Several character set and collation system variables relate to a client's interaction with the server. Some of these have been mentioned in earlier sections:
一些字符集和collation,是用户client's对Server的相互作用。有些在前面已经提及了:
The server character set and collation can be determined from
the values of the character_set_server
and
collation_server
system variables.
服务器(server)的字符集和排序(collation)是被 character_set_server
和 collation_server
这两个系统变量system variables的值決定
The character set and collation of the default database can be
determined from the values of the
character_set_database
and
collation_database
system variables.
数据库的默认(default) 字符集(character se)t和 排序(collation) 是被 character_set_database
和 collation_database
这两个系统变量system variables的值決定
Additional character set and collation system variables are
involved in handling traffic for the connection between a client
and the server. Every client has connection-related character set
and collation system variables.
¯`'•.¸(•¿•).•:*¨¨*:•.(°_°)~*~get it ??? ~*~(°_°).•:*¨¨*:•.(•¿•)¸.•'´¯
Consider what a “connection” is: It's what you make
when you connect to the server. The client sends SQL statements,
such as queries, over the connection to the server. The server
sends responses, such as result sets, over the connection back to
the client.
This leads to several questions about character set
and collation handling for client connections, each of which can
be answered in terms of system variables:
想想"连接connection"是什么:是你连到服务器Server时作的事情。客户端Client通过这条连接发送SQL语句(statements),
比如查询(queries),通过 connection到server, 服务器则送回回应,像是结果集合 result sets, 通过连接给客户端client,
这导致
了客户端连接client connections处理字符集和collation的一些问题,它们每个都可以按照系统变量system variables来回答:
For Example : system variables指的就是 X SET character_set_client = x; SET character_set_results = x; SET character_set_connection = x;
character_set_client
system variable to be the character set in which statements
are sent by the client. character_set_client
这个变量variable, 来 作为客户端发送查询所用的字符集
MySQL Server预期接收到的数据是 latin1
但Client( Bigfat) 端藉着php传来的资料却是gb语系
例如 :Client( Bigfat)送出一些中文资料, 却不使用character_set_client
告诉MySQL Server 这送来的资料是gb2312或big5语系,结果 当然是乱码,
因为MySQL Server字符集的默认值default=latin1,
造成乱码的问题很多 光看这些 MySQL 中文參考手冊的翻譯, 还不足以解决你的问题,我尽可能多用例子說明
首先 你应该对电脑的编码方式 有清楚的概念
character_set_connection
and
collation_connection
system variables. It
converts statements sent by the client from
character_set_client
to
character_set_connection
(except for string
literals that have an introducer such as
_latin1
or _utf8
).
character_set_connection
和 collation_connection
这两个变量variables。
它把客户端送来的查询 从 character_set_client
转换成
character_set_connection
(除了latin1或者utf8 的字符串)collation_connection
is important for
comparisons of literal strings. For comparisons of strings
with column values, collation_connection
does not matter because columns have their own collation,
which has a higher collation precedence. collation_connection
对于比较字符串literal strings非常
重要,对于 列columns值比较字符串是没有关系does not matter,因为列columns有自己的collation(拥有更高优先) mysql4.1.x及以后都增加了的语言编码自动转换功能,它把客户端送来的查询 从character_set_client
转换成character_set_connection
光看文字的表面 我也搞不太清楚MySQL转换的机制,为什麽要转换 ? MySQL数据库只是储存空间,放进去的是什么,那么拿出来的 就该是什么。 转换 这样不是容易出现乱码 ? MySQL 4.0没有character sett字符集and排序(collation) 数据储存在表 table, 也没有support字符集的转换(converting). 要转换character set需要更改server's character set,然後使用外部工具去转换data. MySQL 4.0升级到MySQL 4.1 由于大部分用户使用的是latin语系问题不大 但导致中文语系问题的出现和复杂化, 有了新规定character_set_client
所以 MySQL 4.1 要求客户端必须指定这个字符集, 若你都不说(keep silence就是默认), MySQL就会按照默认值latin1 处理, MySQL再问你:character_set_connection
转换成什麽 ? 若你说:I have right to keep silence, MySQL就会把你的 "东东" ,转换为数据库默认的 "东西" 可能你存进MySQL是Big5的10000元, 出来变成latin1_swedish的10000元
如果 你不希望发生这种事, 可以去看看原作者的解释, 或继续看下去 明白下面的解释 Jim Winstead that lead web developer with MySQL AB
character_set_results
system variable
indicates the character set in which the server returns query
results to the client. This includes result data such as
column values, and result metadata such as column names. character_set_results
变量指示了送回给客户端 结果result集合的字符集,这包括了列column值,或者列column名 等结果数据result data
You can fine-tune the settings for these variables, or you can
depend on the defaults (in which case, you can skip the rest of
this section).
你可以调整这些变量的值,或者就使用默认的(那样 你就可以跳過以下部分)
There are two statements that affect the connection character
sets:
有两个语句statements作用(affect)连接字符集设置:
SET NAMES 'charset_name
' SET CHARACTER SETcharset_name
SET NAMES 'big5'
MySQL保存时就不知道用什么编码来保存数据,因此产生了乱码!
SET NAMES
indicates what character set the
client will use to send SQL statements to the server.
Thus,
SET NAMES 'cp1251'
tells the server
“future incoming messages from this client are in character
set cp1251
.”
It also specifies the
character set that the server should use for sending results back
to the client. (For example, it indicates what character set to
use for column values if you use a SELECT
statement.)
SET NAMES
指示:客户端送到server的SQL语句里是什么character set。
因此,SET NAMES 'cp1251' 就告诉server "将來future incoming的信息, 从这个客户端送来的信息将是使用'cp1251'这个字符集。
这也指定了server送回的结果所用的字符集,(例如 如果你用了一个select语句 它会指出列值column values
拥有的字符集)
A SET NAMES '
statement is equivalent to these three statements:
x
'
SET NAMES '
语句相当于下面三个语句: x
'
SET character_set_client =x
; SET character_set_results =x
; SET character_set_connection =x
;
Setting character_set_connection
to
x
also sets
collation_connection
to the default collation
for x
.
SET CHARACTER SET
is similar to SET
NAMES
but sets the connection character set and
collation to be those of the default database.
但connection character set and collation 是转换成默认值, 当然是 瑞典文 latin1_swedish_ci
A SET
CHARACTER SET
statement is equivalent to these three statements: x
SET
CHARACTER SET
语句相当于这三个语句: x
SET character_set_client =x
; SET character_set_results =x
; SET collation_connection = @@collation_database;
SET CHARACTER SET
is similar类似 to SET
NAMES
,
比较上面6句 红色statements 只有以下2句不同:
SET character_set_connection = x
;
SET collation_connection = @@collation_database;
叁考下面例子: mysql> SET NAMES 'gbk' ; Query OK, 0 rows affected (0.00 sec) mysql> SHOW VARIABLES LIKE '%collation%' ; +----------------------+-------------------+ | Variable_name | Value | +----------------------+-------------------+ | collation_connection | gbk_chinese_ci | | collation_database | latin1_swedish_ci | | collation_server | latin1_swedish_ci | +----------------------+-------------------+ 3 rows in set (0.02 sec) mysql> SET CHARACTER SET gbk ; Query OK, 0 rows affected (0.00 sec) mysql> SHOW VARIABLES LIKE '%collation%' ; +----------------------+-------------------+ | Variable_name | Value | +----------------------+-------------------+ | collation_connection | latin1_swedish_ci | | collation_database | latin1_swedish_ci | | collation_server | latin1_swedish_ci | +----------------------+-------------------+ 3 rows in set (0.00 sec) mysql> SHOW VARIABLES LIKE '%character%' ; +--------------------------+-------------------------------+ | Variable_name | Value | +--------------------------+-------------------------------+ | character_set_client | gbk | | character_set_connection | latin1 | | character_set_database | latin1 | | character_set_results | gbk | | character_set_server | latin1 | | character_set_system | utf8 | | character_sets_dir | E:\wamp\mysql\share\charsets\ | +--------------------------+-------------------------------+ 7 rows in set (0.00 sec)
有些參考書上建议这麽做 : mysql> SET collation_connection = 'big5_chinese_ci' ; Query OK, 0 rows affected (0.00 sec) mysql> SHOW VARIABLES LIKE '%collation%' ; +----------------------+-------------------+ | Variable_name | Value | +----------------------+-------------------+ | collation_connection | big5_chinese_ci | | collation_database | latin1_swedish_ci | | collation_server | latin1_swedish_ci | +----------------------+-------------------+ 3 rows in set (0.00 sec) It apply in PHP : mysqli_query($link, 'SET CHARACTER SET big5'); mysqli_query($link, "SET collation_connection = 'big5_chinese_ci'");
Setting collation_connection
also sets
character_set_connection
to the character set
associated with the collation.
When a client connects, it sends to the server the name of the
character set that it wants to use. The server uses the name to
set the character_set_client
,
character_set_results
, and
character_set_connection
system variables.
In
effect, the server performs a SET NAMES
operation using the character set name.
当一个客户连接,它向Server发送它想使用的字符集的名字,Server用这名字把character_set_client
, character_set_results
, and character_set_connection
这些变量设置成那个字符集,
实际上,Server使用字符集名字执行了SET NAMES
操作
With the mysql client, it is not necessary to
execute SET NAMES
every time you start up if
you want to use a character set different from the default. You
can add the --default-character-set
option
setting to your mysql statement line, or in
your option file.
For example, the following option file setting
changes the three character set variables set to
koi8r
each time you invoke
mysql:
如果你不想用默认字符集, mysql 客户端 不需要每次启动时执行SET NAMES
。
你可以加上 --default-character-set
在 mysql执行语句行, 或在option file(configuration file)加上,
比如,下面的option file(my.ini)选项文件设置 使你每次执行mysql程序时把三句字符集变量
改成 koi8r:
[mysql] default-character-set=koi8r
当启动MySQL Server配置文件Configuration File会被读取,
Windows平台 放置在 C:\Program Files\MySQL\MySQL Server 5.1或 C:\WINDOWS\my.ini 或C:\wamp\mysql\my.ini
Unix平台 放置在 /etc/my.cnf
如果找不到可以用find命令找一下:
起动mysqld时须指定 --default-character-set=gb2312
#或default-character-set=utf8
#或default-character-set=gbk,gb2312,big5,utf8(加上 # 為註解,沒有作用)
然后 重新启动MySQL server,就生效了。
在mysqld配置项中加入下面配置:
# The default character set that will be used when a new schema or table is
# created and no character set is defined
default-character-set=latin1
若要正确显示中文繁、简、日文、韩文 使用utf8
default-character-set=utf8当你建立一个新的表格,若未指定字集(character set),预设字型latin1将被使用that is the same as saying --character-set-server=latin1
创建数据库时,你可以指定字符集,如果没指定,就使用数据库服务器的字符集的默认值(default)
指定MySQL数据库服务器的字符集把 my.ini 改成 default-character-set=utf8 然后打入 show variables like '%character%'; 就会看到下图, 注意 原本默认Value全是 latin1 mysql> show variables like '%character%'; +--------------------------+-------------------------------+ | Variable_name | Value | +--------------------------+-------------------------------+ | character_set_client | latin1 | | character_set_connection | latin1 | | character_set_database | utf8 | | character_set_results | latin1 | | character_set_server | utf8 | | character_set_system | utf8 | | character_sets_dir | E:\wamp\mysql\share\charsets\ | +--------------------------+-------------------------------+ 7 rows in set (0.00 sec) mysql> show variables like '%collation%'; +----------------------+-------------------+ | Variable_name | Value | +----------------------+-------------------+ | collation_connection | latin1_swedish_ci | | collation_database | utf8_general_ci | | collation_server | utf8_general_ci | +----------------------+-------------------+ 3 rows in set (0.00 sec) mysql>
可以看到 _database, _server的值 都改变, 这是MySQL server的my.ini 在你的掌控下
简单么 ?
但有时MySQL server在远方,就不简单 你就只能用PHP 去控制 MySQL server
Example: Suppose that column1
is defined as
CHAR(5) CHARACTER SET latin2
. If you do not say
SET NAMES
or SET CHARACTER
SET
, then for SELECT column1 FROM t
,
the server sends back all the values for
column1
using the character set that the client
specified when it connected. On the other hand, if you say
SET NAMES 'latin1'
or SET CHARACTER
SET latin1
before issuing the SELECT
statement, the server converts the latin2
values to latin1
just before sending results
back. Conversion may be lossy if there are characters that are not
in both character sets.
例如:假设column1
定义是 CHAR(5) CHARACTER SET latin2
。如果你不用SET NAMES
或者 SET CHARACTER
SET
,那么对于你的 SELECT column1 FROM t
请求,Server
会把column1
的所有值用连接建立时 客户端指定的字符集 送回。另一方面,如果你
用了 SET NAMES 'latin1'
or SET CHARACTER
SET latin1
`,那么在送回结果之前,
Server会把 latin2
的值转成latin1
,如果 有字符 不在两种字符集里,
转化Conversion可能会失败 。
If you do not want the server to perform any conversion of result
sets, set character_set_results
to
NULL
:
如果你不希望Server作任何转换,就把 character_set_results
设置成 NULL
SET character_set_results = NULL;
上面提到 你若不希望MySQL server 把Big5的10000元 转换成latin1_swedish的10000元 可以把Conversion(转换)关掉,使用 mysql> SET character_set_results = NULL; Example: column1的数据是gbk
,如果你用了SET NAMES 'gb2312'
orSET CHARACTER SET gb2312
查询column1的数据 结果会是??? ??? ? 这种乱码 因为在送回结果之前,Server会把gbk
的值转成gb2312
Note: Currently, UCS-2 cannot be
used as a client character set, which means that SET
NAMES 'ucs2'
does not work.
To see the values of the character set and collation system variables that apply to your connection, use these statements:
SHOW VARIABLES LIKE 'character_set%'; SHOW VARIABLES LIKE 'collation%';
Get it ?
清楚上面的游戏规则 ?
但不能 保证不乱码,还必须确定 网页自身的编码,Web page里指定的编码,与MySQL处理的编码是统一的
User Comments
This example are usable for russian users who want to have windows-1251 encoding on the site and koi8-r encoding into the database:
set CHARACTER SET cp1251_koi8
If you are wondering why -despite all UTF8 settings- you still don't get non-ASCII characters right, it might be the case that the _connection_ character set is still standard latin1.
To change the connection charset permanently to UTF-8, add the following line in the [mysqld] section:
要改变connection charset永久 成UTF-8 把下面这行加入[mysqld] :
[mysqld]
init-connect='SET NAMES utf8'
init代表 initial初始
The other way to let MySQL know what connection charset you intend to use is per-connection based. After a connection is established (with host, name, password), add the following two lines in your application:
把下面这二行加入你的应用程式 application:
SET NAMES utf8;
SET CHARACTER_SET utf8;
The last hint is given most of the time, but not everybody is happy to change every application (esp. when some lazy add-on and extension programmers use their own connection stuff instead of the (PHP) application.
mysql> SET character_set_client = x;
mysql> SET character_set_results = x;
mysql> SET collation_connection = @@collation_database;
When a client connects, it sends to the server the name of the character set that it wants to use. The server sets the character_set_client, character_set_results, and character_set_connection variables to that character set. (In effect, the server performs a SET NAMES operation using the character set.)
With the mysql client, it is not necessary to execute SET NAMES every time you start up if you want to use a character set different from the default. You can add the --default-character-set option setting to your mysql statement line, or in your option file. For example, the following option file setting changes the three character set variables set to koi8r each time you run mysql:
[mysql]
default-character-set=koi8r
Example: Suppose that column1 is defined as CHAR(5) CHARACTER SET latin2. If you do not say SET NAMES or SET CHARACTER SET, then for SELECT column1 FROM t, the server sends back all the values for column1 using the character set that the client specified when it connected. On the other hand, if you say SET NAMES 'latin1' or SET CHARACTER SET latin1, then just before sending results back, the server converts the latin2 values to latin1. Conversion may be lossy if there are characters that are not in both character sets.
If you do not want the server to perform any conversion, set character_set_results to NULL:
mysql> SET character_set_results = NULL;