Handling UTF-8 correctly when working with MySQL from Perl
2013-06-03 15:12:35 阿炯

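Encoding problems of all kinds come up constantly in day-to-day programming, and they are especially noticeable when working with a database. To keep everything consistent and in line with the mainstream encoding, use utf8 from end to end; this resolves the encoding problems along the way.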

Note: the script file itself must be saved in utf8 encoding.

First, the encoding of the MySQL database
First off, make sure the database is using UTF8.
mysql> STATUS;

This command will show you the “Db characterset”. If it’s not “utf8” then:
mysql> ALTER DATABASE database_name CHARACTER SET utf8;

Although the database is now using UTF8 your tables and columns might be set to something else. Check the table’s character set by doing this:
mysql> SHOW CREATE TABLE table_name;

This will show you more information about your table than a simple “DESCRIBE” will. The last part will show the table’s CHARSET. If it’s not “utf8”:
mysql> ALTER TABLE table_name CONVERT TO CHARACTER SET utf8;

Individual columns can still be set to something else. Running SHOW FULL COLUMNS FROM table_name; lists the collation of each column, from which its character set follows. This is how to convert a column to UTF8 (in this case, a VARCHAR(255) column):
mysql> ALTER TABLE table_name MODIFY column_name VARCHAR(255) CHARACTER SET utf8;

To add to the fun, your connection to the database itself has its own encoding. We’ll see more of this later, but if you’re using MySQL on the command line, entering this command:
mysql> SET NAMES utf8;

will mean that your current session is in UTF8. If you’re using PHPMyAdmin, you might not have to worry about this(?). I expect your command line client also has its own encoding…
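
The session settings can also be checked from Perl. The following is only a minimal sketch, assuming an already connected DBI handle is passed in; the helper name non_utf8_connection_vars is hypothetical. It looks at the three session variables that SET NAMES changes and reports any that are not utf8 (or utf8mb3/utf8mb4).

```perl
use strict;
use warnings;

# Hypothetical helper: returns a hash ref of the session character-set
# variables that are NOT utf8/utf8mb3/utf8mb4.
# Expects a connected DBI handle (assumption: MySQL driver).
sub non_utf8_connection_vars {
    my ($dbh) = @_;

    # Each returned row is [ Variable_name, Value ].
    my $rows = $dbh->selectall_arrayref("SHOW VARIABLES LIKE 'character_set_%'");

    # SET NAMES changes exactly these three session variables.
    my %wanted = map { $_ => 1 } qw(
        character_set_client
        character_set_connection
        character_set_results
    );

    my %bad;
    for my $row (@$rows) {
        my ($name, $value) = @$row;
        next unless $wanted{$name};
        $bad{$name} = $value unless $value =~ /\Autf8(?:mb[34])?\z/;
    }
    return \%bad;
}
```

If the returned hash is empty right after connecting, the session is talking UTF8 in both directions; any key that does show up names the setting that still needs attention.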

Declaring utf8 when connecting to the database

It may have nothing to do with Perl. Check to make sure you're using a UTF-8 character set in the pertinent MySQL table columns.

It's worth noting that if you're running a version of DBD::mysql new enough (3.0008 on), you can do the following: $dbh->{'mysql_enable_utf8'} = 1; and then everything's decode()ed/encode()ed for you on the way out from/in to DBI.

Enable UTF8 when you connect to the database, like this:
my $dbh = DBI->connect("dbi:mysql:dbname=db_name", "db_user", "db_pass", {RaiseError => 0, PrintError => 0, mysql_enable_utf8 => 1}) or die "Connect to database failed.";

$dbh->do(qq{SET NAMES 'utf8';});

This definitely saves the day when accessing a database declared as UTF-8. Take note, though: if you are going to do any Perl processing on data obtained from the database, it is wise to decode it into a Perl character string explicitly with Encode's decode(), because SET NAMES on its own does not do this for you.

Here is an example:
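my %mydbattr = (
    RaiseError        => 1,
    AutoCommit        => 1,
    mysql_enable_utf8 => 1,
);

# $pdsn (the DSN string) and $dbcfg (the credentials) are defined elsewhere in the script.
my $pdbh = DBI->connect_cached($pdsn, $dbcfg->{user}, $dbcfg->{password}, \%mydbattr);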
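
To keep these settings in one place, the connection can be wrapped in a small helper. This is just a sketch under stated assumptions: the names utf8_connect_args and utf8_connect, the config keys (dbname, host, user, password) and the DSN layout are all made up for illustration, and DBI is loaded lazily so that the argument builder can be tried without a database.

```perl
use strict;
use warnings;

# Hypothetical helper: builds the argument list for DBI->connect.
# The config keys and the DSN layout are illustrative assumptions.
sub utf8_connect_args {
    my (%cfg) = @_;

    my $dsn = "dbi:mysql:dbname=$cfg{dbname}";
    $dsn .= ";host=$cfg{host}" if defined $cfg{host};

    my %attr = (
        RaiseError        => 1,    # die on errors instead of returning undef
        AutoCommit        => 1,
        mysql_enable_utf8 => 1,    # let the driver handle UTF-8 decoding/encoding
    );
    return ($dsn, $cfg{user}, $cfg{password}, \%attr);
}

# Hypothetical wrapper: connects and also issues SET NAMES explicitly.
sub utf8_connect {
    my (%cfg) = @_;
    require DBI;    # loaded lazily; only needed when actually connecting
    my $dbh = DBI->connect(utf8_connect_args(%cfg));
    $dbh->do("SET NAMES 'utf8'");
    return $dbh;
}
```

A call would then look like utf8_connect(dbname => 'db_name', user => 'db_user', password => 'db_pass'). With RaiseError enabled a failed connection dies on its own, so no extra "or die" is needed.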

Script contents and output encoding in Perl
$utfstring = decode('utf8',$string_from_db);
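
What the decode() call changes is easiest to see with length(). A minimal sketch, in which the byte string is hard-coded to stand in for a value fetched without mysql_enable_utf8 (an assumption made for illustration):

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Six raw bytes: the UTF-8 encoding of the two characters U+4E2D U+6587.
# This stands in for a value fetched from the database as plain bytes.
my $string_from_db = "\xE4\xB8\xAD\xE6\x96\x87";

# Turn the bytes into a Perl character string.
my $utfstring = decode('UTF-8', $string_from_db);

my $byte_len = length $string_from_db;    # counts bytes: 6
my $char_len = length $utfstring;         # counts characters: 2

# Encoding again gives back the original bytes, e.g. before writing to the database.
my $round_trip = encode('UTF-8', $utfstring);
```

Functions such as length, substr and regular expressions only behave per character on the decoded string, which is why the decoding step matters before any processing.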

Of course, for proper I/O handling of utf8 strings (reading, printing, writing to output), remember to set
use open ':utf8';
and
binmode STDOUT, ":utf8";

the latter being essential for printing out utf8 strings.
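
The effect of such an output layer can be tried without a terminal. In this sketch an in-memory filehandle stands in for STDOUT, and ':encoding(UTF-8)' is used, which is the stricter, validating relative of ':utf8'; both choices are assumptions made for illustration.

```perl
use strict;
use warnings;

# Two characters, written with escapes so that the example does not depend
# on how this file itself is saved.
my $text = "\x{4E2D}\x{6587}";

# An in-memory filehandle stands in for STDOUT or a file on disk.
my $buffer = '';
open(my $out, '>:encoding(UTF-8)', \$buffer) or die "Cannot open buffer: $!";
print {$out} $text;
close($out) or die "Cannot close buffer: $!";

# $buffer now holds the encoded bytes that would have reached the terminal.
my $bytes_written = length $buffer;
```

Without the layer, Perl would emit a "Wide character in print" warning for the same string, which is usually the first sign that binmode is missing.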

Check first whether the string is already flagged as utf8, then convert
utf8::decode( $row{'item_title'} ) unless utf8::is_utf8( $row{'item_title'} );
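
The same guard can be wrapped in a function so that it is safe to apply more than once. A minimal sketch; the name ensure_decoded is hypothetical, and the row below is hard-coded in place of a real fetch.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a decoded copy of $value and leaves values that
# already carry Perl's internal UTF8 flag untouched.
sub ensure_decoded {
    my ($value) = @_;
    return $value if !defined $value || utf8::is_utf8($value);
    utf8::decode($value);    # works in place on the copy; leaves invalid UTF-8 as it was
    return $value;
}

# Raw bytes, as they might arrive from the driver (assumption for illustration).
my %row = ( item_title => "\xE4\xB8\xAD\xE6\x96\x87" );

$row{item_title} = ensure_decoded($row{item_title});    # now 2 characters
my $again        = ensure_decoded($row{item_title});    # second call changes nothing
```

Keep in mind that utf8::is_utf8 only reports Perl's internal flag, not whether the data is valid UTF-8, so this guard is a convenience and not a substitute for knowing where each string came from.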

An example:
use utf8;
use 5.010;
use CGI qw/:standard/;
use strict;
use Encode;

binmode(STDOUT,':encoding(utf8)');

# Initialize the CGI environment
my $q=CGI->new;

# Set the HTTP response header
print $q->header({-type=>'text/html',-charset=>'utf-8'});

With this in place, Chinese text is displayed and processed correctly both in the terminal and on web pages.