Introduction to Perl File Operations
2013-04-27 09:57:11 阿炯

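This article is an introductory reference to file operations in Perl. It covers the basic file-handling functions and methods and should help you understand and work with Perl I/O.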

About Filehandles

A filehandle is a named internal Perl structure that associates a physical file with a name; in other words, it is the name of an I/O connection. Reading data from a file or writing data to it always goes through a filehandle. Any filehandle is capable of read/write access, so you can read from and update any file or device associated with one. When you associate a filehandle with a file, however, you can specify the mode in which it is opened.

Perl provides three predefined filehandles: STDIN, STDOUT, and STDERR, which represent standard input, standard output, and standard error.

A file can be opened in Perl in any of the following ways:
open FILEHANDLE, EXPR
open FILEHANDLE

sysopen FILEHANDLE, FILENAME, MODE, PERMS
sysopen FILEHANDLE, FILENAME, MODE

Parameters:
FILEHANDLE: the filehandle, which identifies the open file.
EXPR: an expression made up of the file name and the access mode.
MODE: the file access mode.
PERMS: the permission bits.

Opening and Closing Files

The open and sysopen forms listed above can open any new or existing file in Perl. FILEHANDLE is the filehandle that open associates with the file, and EXPR is an expression containing the file name and the mode in which to open it.

The following statement opens file.txt in read-only mode. The less-than sign (<) indicates that the file is to be opened read-only:
open(DATA, "<file.txt");

Here DATA is the filehandle used to read the file. The following example opens a file and prints its contents to the screen:

open(DATA, "<file.txt");
while(<DATA>){
 print "$_";
}

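The open Function

open returns a false value when it fails, so real code should check the result. The version below opens file.txt read-only (<) through the filehandle DATA, stops with the system error message held in $! if the file cannot be opened, and otherwise prints the file's contents:
open(DATA, "<file.txt") or die "Cannot open file.txt: $!";

while(<DATA>){
   print "$_";
}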

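The following code opens file.txt for writing (>):
open(DATA, ">file.txt") or die "Cannot open file.txt: $!";

> means write mode.

To open a file for both reading and writing, put a + sign before the > or < character:
open(DATA, "+<file.txt") or die "Cannot open file.txt: $!";

This form keeps the file's existing contents. To discard them first, use +> instead:
open DATA, "+>file.txt" or die "Cannot open file.txt: $!";

To append data to a file, open it in append mode before writing:
open(DATA,">>file.txt") || die "Cannot open file.txt: $!";

>> appends data to the end of an existing file. To also read from the file you are appending to, add a + sign:
open(DATA,"+>>file.txt") || die "Cannot open file.txt: $!";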

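The following table lists the access modes:

Mode         Description
< or r       Read-only access; the file pointer starts at the beginning of the file.
> or w       Write access; truncates the file to zero length and tries to create it if it does not exist.
>> or a      Write access; the file pointer starts at the end of the file (append). Tries to create the file if it does not exist.
+< or r+     Read and write access; the file pointer starts at the beginning of the file.
+> or w+     Read and write access; truncates the file to zero length and tries to create it if it does not exist.
+>> or a+    Read and write access; the file pointer starts at the end of the file (append). Tries to create the file if it does not exist.

The letter forms (r, w, a, and so on) are the C fopen-style names for the same modes. The built-in open accepts only the symbol forms.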
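The difference between >, >> and < can be seen in a small experiment. The sketch below is only an illustration: it uses the three-argument form of open with lexical filehandles (a style the examples above do not use), and the helper names put_line and get_lines and the temporary file name are made up for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Work in a throwaway directory so the sketch never touches real files.
my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/modes.txt";    # hypothetical file name

# Write one line to $file using the given mode ('>' or '>>').
sub put_line {
    my ($mode, $file, $line) = @_;
    open(my $fh, $mode, $file) or die "Cannot open $file with mode $mode: $!";
    print {$fh} $line;
    close($fh) or die "Cannot close $file: $!";
    return;
}

# Read the whole file back as a list of lines.
sub get_lines {
    my ($file) = @_;
    open(my $fh, '<', $file) or die "Cannot open $file: $!";
    my @lines = <$fh>;
    close($fh) or die "Cannot close $file: $!";
    return @lines;
}

put_line('>',  $path, "first\n");     # creates the file
put_line('>>', $path, "second\n");    # appending keeps "first"
my @after_append = get_lines($path);

put_line('>',  $path, "third\n");     # > truncates before writing
my @after_truncate = get_lines($path);

print "after >>: ", scalar(@after_append),   " line(s)\n";
print "after >:  ", scalar(@after_truncate), " line(s)\n";
```

Keeping the mode in its own argument means a file name that happens to start with > or < is never mistaken for a mode.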


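The sysopen Function

sysopen is similar to open, but it takes its arguments in a different form. The O_* mode constants it uses are exported by the Fcntl module.

The following example opens a file for reading and writing (the equivalent of +<filename):
use Fcntl;
sysopen(DATA, "file.txt", O_RDWR);

To empty the file before updating it, write:
sysopen(DATA, "file.txt", O_RDWR|O_TRUNC );

Use O_CREAT to create a new file, O_WRONLY for write-only mode, and O_RDONLY for read-only mode.

The PERMS argument is an octal value giving the permissions of a newly created file. It defaults to 0666 and is modified by the process umask.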

The following table lists the possible mode values:

Value         Description
O_RDWR        Read and write; the file pointer starts at the beginning of the file.
O_RDONLY      Read only; the file pointer starts at the beginning of the file.
O_WRONLY      Write only; on its own it neither creates nor truncates the file.
O_CREAT       Create the file if it does not exist.
O_APPEND      Append to the file.
O_TRUNC       Truncate the file to zero length.
O_EXCL        Used with O_CREAT, fail if the file already exists; this makes it a test for whether the file exists.
O_NONBLOCK    Non-blocking I/O: an operation either succeeds or returns an error immediately instead of blocking.
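As an illustration of combining these flags, the sketch below uses O_CREAT together with O_EXCL so that a file is created only if it does not already exist. The helper name create_exclusively, the file name, and the 0644 permissions are assumptions chosen for the example.

```perl
use strict;
use warnings;
use Fcntl;                       # exports O_WRONLY, O_CREAT, O_EXCL, ...
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/once.txt";      # hypothetical file name

# Create $file only if it does not exist yet.
# Returns 1 on success and 0 if sysopen failed (for example, file exists).
sub create_exclusively {
    my ($file) = @_;
    sysopen(my $fh, $file, O_WRONLY | O_CREAT | O_EXCL, 0644)
        or return 0;
    print {$fh} "created by process $$\n";
    close($fh) or die "Cannot close $file: $!";
    return 1;
}

my $first  = create_exclusively($path);   # the file is new, so this succeeds
my $second = create_exclusively($path);   # the file now exists, so this fails

print "first attempt:  $first\n";
print "second attempt: $second\n";
```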


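The close Function

When you have finished with a file, close it; this flushes the I/O buffers associated with the filehandle. The syntax is:
close FILEHANDLE
close

FILEHANDLE is the filehandle to close. close returns true if the file was closed successfully.

close(DATA) || die "Cannot close file: $!";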

Reading and Writing Filehandles
Once you have an open filehandle, you need to be able to read and write information. There are several ways to read data from a file and write data to it.

The <FILEHANDLE> Operator

The main method of reading the information from an open filehandle is the <FILEHANDLE> operator. In a scalar context it returns a single line from the filehandle. For example:
print "What is your name?\n";
$name = <STDIN>;
print "Hello $name\n";

When you use the <FILEHANDLE> operator in a list context, it returns a list of lines from the specified filehandle. For example, to import all the lines from a file into an array:
open(DATA,"<import.txt") or die "Can't open data";
@lines = <DATA>;
close(DATA);

The getc Function
The getc function returns a single character from the specified FILEHANDLE, or STDIN if none is specified:

getc FILEHANDLE
getc
If there was an error, or the filehandle is at end of file, then undef is returned instead.

The read Function
The read function reads a block of data from a buffered filehandle. It is typically used to read binary data from a file.

read FILEHANDLE, SCALAR, LENGTH, OFFSET
read FILEHANDLE, SCALAR, LENGTH
The length of the data read is defined by LENGTH, and the data is placed at the start of SCALAR if no OFFSET is specified. Otherwise data is placed after OFFSET bytes in SCALAR. The function returns the number of bytes read on success, zero at end of file, or undef if there was an error.
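A short sketch of how LENGTH, OFFSET, and the return value interact. The ten-byte sample content and the variable names are assumptions made for the illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Prepare a small temporary file holding exactly ten bytes.
my ($tmp, $path) = tempfile(UNLINK => 1);
binmode($tmp);
print {$tmp} "ABCDEFGHIJ";
close($tmp) or die "Cannot close $path: $!";

open(my $fh, '<:raw', $path) or die "Cannot open $path: $!";

my $buf = '';
my $first  = read($fh, $buf, 4);                # no OFFSET: $buf becomes "ABCD"
my $second = read($fh, $buf, 4, length $buf);   # OFFSET 4: data lands after "ABCD"
my $third  = read($fh, my $rest, 100);          # only two bytes are left
my $at_eof = read($fh, my $none, 100);          # nothing left: returns 0

close($fh) or die "Cannot close $path: $!";

print "buffer: $buf, rest: $rest\n";
print "counts: $first $second $third $at_eof\n";
```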

The print Function
For all the different methods used for reading information from filehandles, the main function for writing information back is the print function.

print FILEHANDLE LIST
print LIST
print
The print function prints the evaluated value of LIST to FILEHANDLE, or to the current output filehandle (STDOUT by default). For example:
print "Hello World!\n";

Copying Files
The following example opens an existing file, file1.txt, reads it line by line, and writes each line to a copy, file2.txt.

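# Open file to read
open(DATA1, "<file1.txt") or die "Cannot open file1.txt: $!";
# Open new file to write
open(DATA2, ">file2.txt") or die "Cannot open file2.txt: $!";

# Copy data from one file to another.
while(<DATA1>){
   print DATA2 $_;
}
close( DATA1 );
# Closing flushes buffered output, so a failure here means the copy is incomplete.
close( DATA2 ) or die "Cannot close file2.txt: $!";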
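As an alternative to the hand-written loop, the core File::Copy module provides a copy function. This is a swapped-in technique rather than the method shown above; the sketch assumes two hypothetical file names inside a temporary directory.

```perl
use strict;
use warnings;
use File::Copy qw(copy);
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);
my $src = "$dir/file1.txt";     # hypothetical source file
my $dst = "$dir/file2.txt";     # hypothetical destination file

# Create a small source file to copy.
open(my $out, '>', $src) or die "Cannot open $src: $!";
print {$out} "line 1\nline 2\n";
close($out) or die "Cannot close $src: $!";

# copy returns true on success and sets $! on failure.
copy($src, $dst) or die "Copy failed: $!";

my $same_size = (-s $src) == (-s $dst);
print "sizes match\n" if $same_size;
```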

Renaming a File
The following example renames file1.txt to file2.txt, assuming the file is in the /usr/test directory.

rename ("/usr/test/file1.txt", "/usr/test/file2.txt" );

The rename function takes two arguments, the old name and the new name, and renames an existing file; the file being renamed must exist.

Deleting an Existing File
The following example deletes file1.txt with the unlink function.
unlink ("/usr/test/file1.txt");
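Both rename and unlink report what happened through their return values, which the one-line examples above ignore. A minimal sketch, using hypothetical file names in a temporary directory:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);
my $old = "$dir/file1.txt";     # hypothetical names
my $new = "$dir/file2.txt";

# Create the file that will be renamed.
open(my $fh, '>', $old) or die "Cannot create $old: $!";
close($fh) or die "Cannot close $old: $!";

my $renamed       = rename($old, $new);   # true: $old existed
my $renamed_again = rename($old, $new);   # false: $old is gone, $! says why

my $removed       = unlink($new);         # number of files deleted: 1
my $removed_again = unlink($new);         # nothing left to delete: 0

print "renamed, then removed $removed file\n" if $renamed;
```

In real code the usual pattern is rename(...) or die "...: $!" so that a failure is not silently ignored.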

Locating Your Position Within a File
Use the tell function to find the current position in a file and the seek function to move to a particular position inside it.

The tell Function
The first requirement is to find your position within a file, which you do using the tell function.

tell FILEHANDLE
tell
This returns the position of the file pointer, in bytes, within FILEHANDLE if specified, or the current default selected filehandle if none is specified.

The seek Function
The seek function positions the file pointer to the specified number of bytes within a file.

seek FILEHANDLE, POSITION, WHENCE
The function uses the fseek system function, and you have the same ability to position relative to three different points: the start, the end, and the current position. You do this by specifying a value for WHENCE.

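A WHENCE of zero positions relative to the start of the file, one relative to the current position, and two relative to the end of the file. For example, the following line moves the file pointer to 256 bytes from the start of the file.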

seek DATA, 256, 0;
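The following sketch combines seek and tell on a ten-byte file. It uses the named constants SEEK_SET and SEEK_END from the Fcntl module in place of the literal 0 and 2; the sample content and variable names are assumptions for the illustration.

```perl
use strict;
use warnings;
use Fcntl qw(:seek);            # SEEK_SET (0), SEEK_CUR (1), SEEK_END (2)
use File::Temp qw(tempfile);

# Prepare a temporary file containing ten known bytes.
my ($tmp, $path) = tempfile(UNLINK => 1);
binmode($tmp);
print {$tmp} "abcdefghij";
close($tmp) or die "Cannot close $path: $!";

open(my $fh, '<:raw', $path) or die "Cannot open $path: $!";

seek($fh, 3, SEEK_SET) or die "Cannot seek: $!";   # 3 bytes from the start
my $pos_before = tell($fh);                        # 3
read($fh, my $middle, 2);                          # reads "de"
my $pos_after = tell($fh);                         # 5

seek($fh, -2, SEEK_END) or die "Cannot seek: $!";  # 2 bytes before the end
read($fh, my $tail, 2);                            # reads "ij"

close($fh) or die "Cannot close $path: $!";

print "middle=$middle tail=$tail positions=$pos_before,$pos_after\n";
```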

Getting File Information
You can test certain features of a file very quickly in Perl using a series of test operators known collectively as the -X tests.

For example, to perform a quick test of a file's type, permissions, and size, you might use a script like this:
my (@description,$size);
if (-e $file){
   push @description, 'binary' if (-B _);
   push @description, 'a socket' if (-S _);
   push @description, 'a text file' if (-T _);
   push @description, 'a block special file' if (-b _);
   push @description, 'a character special file' if (-c _);
   push @description, 'a directory' if (-d _);
   push @description, 'executable' if (-x _);
   push @description, (($size = -s _)) ? "$size bytes" : 'empty';
   print "$file is ", join(', ',@description),"\n";
}

Here is the list of features which you can check for a file:

Operator Description
-A     Age of file (at script startup) in days since last access.
-B     Is it a binary file?
-C     Age of file (at script startup) in days since last inode change.
-M     Age of file (at script startup) in days since modification.
-O     Is the file owned by the real user ID?
-R     Is the file readable by the real user ID or real group?
-S     Is the file a socket?
-T     Is it a text file?
-W     Is the file writable by the real user ID or real group?
-X     Is the file executable by the real user ID or real group?
-b     Is it a block special file?
-c     Is it a character special file?
-d     Is the file a directory?
-e     Does the file exist?
-f     Is it a plain file?
-g     Does the file have the setgid bit set?
-k     Does the file have the sticky bit set?
-l     Is the file a symbolic link?
-o     Is the file owned by the effective user ID?
-p     Is the file a named pipe?
-r     Is the file readable by the effective user or group ID?
-s     Returns the size of the file, zero size = empty file.
-t     Is the filehandle opened to a TTY (terminal)?
-u     Does the file have the setuid bit set?
-w     Is the file writable by the effective user or group ID?
-x     Is the file executable by the effective user or group ID?
-z     Is the file size zero?
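A few of these operators in action. In this sketch the helper describe and the file names are hypothetical; the special filehandle _ reuses the result of the previous file test so that the file is only examined once.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir   = tempdir(CLEANUP => 1);
my $file  = "$dir/sample.txt";   # hypothetical names
my $empty = "$dir/empty.txt";

open(my $fh, '>', $file) or die "Cannot create $file: $!";
print {$fh} "hello";             # five bytes
close($fh) or die "Cannot close $file: $!";

open($fh, '>', $empty) or die "Cannot create $empty: $!";
close($fh) or die "Cannot close $empty: $!";

# Return a short description of what $path is.
sub describe {
    my ($path) = @_;
    return 'missing'   unless -e $path;   # examines the file once...
    return 'directory' if -d _;           # ...then reuses that result via _
    my $size = -s _;                      # size in bytes, false when empty
    return $size ? "file, $size bytes" : 'empty file';
}

print "$_: ", describe($_), "\n" for $file, $empty, $dir, "$dir/nope";
```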


Working with Directories
The following standard functions operate on directories.

opendir DIRHANDLE, EXPR  # Open a directory
readdir DIRHANDLE        # Read the next entry from a directory
rewinddir DIRHANDLE      # Move the position back to the beginning
telldir DIRHANDLE        # Return the current position in the directory
seekdir DIRHANDLE, POS   # Move the position to POS inside the directory
closedir DIRHANDLE       # Close a directory

The following example opens a directory and lists every entry in it.

opendir (DIR, '.') or die "Couldn't open directory, $!";
while ($file = readdir DIR){
 print "$file\n";
}
closedir DIR;

As another example, to print a sorted list of the C source files in a directory, you might use:
opendir(DIR, '.') or die "Couldn't open directory, $!";
foreach (sort grep(/^.*\.c$/,readdir(DIR))){
 print "$_\n";
}
closedir DIR;

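Use the mkdir function to create a new directory, the rmdir function to remove a directory (which must be empty), and the chdir function to change the current directory.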
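The sketch below puts mkdir, opendir, readdir, and rmdir together. It uses a lexical directory handle and skips the . and .. entries that readdir always returns; the directory and file names are made up for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $base = tempdir(CLEANUP => 1);
my $sub  = "$base/src";                      # hypothetical directory name

mkdir($sub) or die "Cannot create $sub: $!";

# Populate the directory with a few empty files.
for my $name (qw(main.c util.c notes.txt)) {
    open(my $fh, '>', "$sub/$name") or die "Cannot create $name: $!";
    close($fh) or die "Cannot close $name: $!";
}

# List the directory, leaving out the . and .. entries.
opendir(my $dh, $sub) or die "Cannot open directory $sub: $!";
my @entries = sort grep { $_ ne '.' && $_ ne '..' } readdir($dh);
closedir($dh);

my @c_files = grep { /\.c\z/ } @entries;

my $removed_full  = rmdir($sub);             # fails: directory is not empty
unlink(map { "$sub/$_" } @entries);          # delete the files first
my $removed_empty = rmdir($sub);             # now it succeeds

print "entries: @entries\n";
print "C files: @c_files\n";
```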

------------------------------
Usage Examples
Remove trailing linefeeds with chomp
To remove the trailing "\n" from a line, call chomp. It removes only the line ending (the current value of $/), not other trailing whitespace.
my $line = <$fh>;
chomp $line;
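A small sketch contrasting chomp with a whitespace-stripping substitution. The sample string and variable names are assumptions for the illustration.

```perl
use strict;
use warnings;

my $line = "alpha  \n";                   # two trailing spaces, then a newline

chomp(my $chomped = $line);               # removes only the "\n"
(my $trimmed = $line) =~ s/\s+\z//;       # removes all trailing whitespace

printf "chomped: [%s]\n", $chomped;
printf "trimmed: [%s]\n", $trimmed;
```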

Change your line delimiter with $/

It's possible to change your input record separator, $/. It is set to "\n" by default.
Set $/ to the empty string ("") to read a paragraph at a time. Set $/ to undef to read the entire file at once. See perlvar for details.
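The three settings can be compared on the same input. This sketch reads from an in-memory filehandle (opened on a reference to a string) so that it needs no file on disk; the helper read_records and the sample text are hypothetical.

```perl
use strict;
use warnings;

my $text = "a1\na2\n\nb1\n\n\nc1\n";

# Read all records from $text using $sep as the input record separator.
sub read_records {
    my ($sep) = @_;
    open(my $fh, '<', \$text) or die "Cannot open in-memory file: $!";
    local $/ = $sep;            # restored automatically when the sub returns
    my @records = <$fh>;
    close($fh);
    return @records;
}

my @lines      = read_records("\n");    # line mode (the default)
my @paragraphs = read_records('');      # paragraph mode
my @whole      = read_records(undef);   # slurp mode

print "lines: ",      scalar(@lines),      "\n";
print "paragraphs: ", scalar(@paragraphs), "\n";
print "whole: ",      scalar(@whole),      "\n";
```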

Slurp an entire file at once
Typically you'll see novices read a file using one of these two methods:
open (FILE,$filename) || die "Cannot open '$filename': $!";
undef $/;
my $file_as_string = <FILE>;

OR
open (FILE,$filename) || die "Cannot open '$filename': $!";
my $file_as_string = join '', <FILE>;

Of those two, choose the former. The second one reads all the lines into an array, and then glomps together a big string. The first one just reads into a string, without creating the intervening list of lines. The best way yet is like so:
my $file_as_string = do{
open( my $fh, $filename ) or die "Can't open $filename: $!";
local $/ = undef;
<$fh>;
};

The do() block returns the last value evaluated in the block. This method localizes the $/ so that it gets set back outside the scope of the block. Without localizing $/, it retains the value being set to it and another piece of code might not be expecting it to have been set to undef. Here's another way:
use File::Slurp qw( read_file );
my $file_as_string = read_file( $filename );

File::Slurp is a handy CPAN module for reading or writing a whole file in a single call.

Get lists of files with glob()
Use standard shell globbing patterns to get a list of files.
my @files = glob( "*" );

Pass them through grep to do quick filtering. For example, to get files and not directories:
my @files = grep { -f } glob( "*" );

Use unlink to remove a file
The Perl built-in delete deletes elements from a hash, not files from the filesystem.
my %stats;
$stats{filename} = 'foo.txt';
unlink $stats{filename}; # RIGHT: Removes "foo.txt" from the filesystem
delete $stats{filename}; # WRONG: Removes the "filename" element from %stats

The term "unlink" comes from the Unix idea of removing a link to the file from the directory nodes.

Use Unix-style directories under Windows
Even though Unix uses paths like /usr/local/bin and Windows uses C:\foo\bar\bat, you can still use forward slashes in your filenames.
 my $filename = 'C:/foo/bar/bat';
 open( my $fh, '<', $filename ) or die "Can't open $filename: $!";

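Backslashes are still allowed, but they need care inside double quotes, where \t is the escape sequence for a tab. Had the filename been written as:
 my $filename = "C:\tmp";

$filename would contain five characters: 'C', ':', a tab character, 'm' and 'p'. Instead, it should be written as one of: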
 my $filename = 'C:\tmp';
 my $filename = "C:\\tmp";
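The character counts can be checked directly with length. A minimal sketch; the variable names are made up for the illustration.

```perl
use strict;
use warnings;

my $double  = "C:\tmp";     # \t is interpolated as a tab
my $single  = 'C:\tmp';     # single quotes keep the backslash
my $escaped = "C:\\tmp";    # an escaped backslash inside double quotes
my $forward = 'C:/tmp';     # forward slash, no escaping needed

printf "%-8s %d characters\n", 'double',  length $double;
printf "%-8s %d characters\n", 'single',  length $single;
printf "%-8s %d characters\n", 'escaped', length $escaped;
printf "%-8s %d characters\n", 'forward', length $forward;
```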

------------------------------
Source for this section: Perl File Handling: open, read, write and close files


------------------------------
I/O Operations

This section is reproduced from Robert Stockton's personal home page, with thanks to the original author.

binmode(FILEHANDLE)
binmode FILEHANDLE

Arranges for the file to be read in "binary" mode in operating systems that distinguish between binary and text files. Files that are not read in binary mode have CR LF sequences translated to LF on input and LF translated to CR LF on output. Binmode has no effect under Unix. If FILEHANDLE is an expression, the value is taken as the name of the filehandle.

close(FILEHANDLE)
close FILEHANDLE

Closes the file or pipe associated with the file handle. You don't have to close FILEHANDLE if you are immediately going to do another open on it, since open will close it for you. (See open.) However, an explicit close on an input file resets the line counter ($.), while the implicit close done by open does not. Also, closing a pipe will wait for the process executing on the pipe to complete, in case you want to look at the output of the pipe afterwards. Closing a pipe explicitly also puts the status value of the command into $?. Example:

open(OUTPUT, '|sort >foo'); # pipe to sort
...    # print stuff to output
close OUTPUT; # wait for sort to finish
open(INPUT, 'foo'); # get sort's results

FILEHANDLE may be an expression whose value gives the real filehandle name.

dbmclose(ASSOC_ARRAY)
dbmclose ASSOC_ARRAY

Breaks the binding between a dbm file and an associative array. The values remaining in the associative array are meaningless unless you happen to want to know what was in the cache for the dbm file. This function is only useful if you have ndbm.

dbmopen(ASSOC,DBNAME,MODE)

This binds a dbm or ndbm file to an associative array. ASSOC is the name of the associative array. (Unlike normal open, the first argument is NOT a filehandle, even though it looks like one). DBNAME is the name of the database (without the .dir or .pag extension). If the database does not exist, it is created with protection specified by MODE (as modified by the umask). If your system only supports the older dbm functions, you may perform only one dbmopen in your program. If your system has neither dbm nor ndbm, calling dbmopen produces a fatal error.

Values assigned to the associative array prior to the dbmopen are lost. A certain number of values from the dbm file are cached in memory. By default this number is 64, but you can increase it by preallocating that number of garbage entries in the associative array before the dbmopen. You can flush the cache if necessary with the reset command.

If you don't have write access to the dbm file, you can only read associative array variables, not set them. If you want to test whether you can write, either use file tests or try setting a dummy array entry inside an eval, which will trap the error.

Note that functions such as keys() and values() may return huge array values when used on large dbm files. You may prefer to use the each() function to iterate over large dbm files. Example:

# print out history file offsets
dbmopen(HIST,'/usr/lib/news/history',0666);
while (($key,$val) = each %HIST) {
    print $key, ' = ', unpack('L',$val), "\n";
}
dbmclose(HIST);

eof(FILEHANDLE)
eof()
eof

Returns 1 if the next read on FILEHANDLE will return end of file, or if FILEHANDLE is not open. FILEHANDLE may be an expression whose value gives the real filehandle name. (Note that this function actually reads a character and then ungetc's it, so it is not very useful in an interactive context.) An eof without an argument returns the eof status for the last file read. Empty parentheses () may be used to indicate the pseudo file formed of the files listed on the command line, i.e. eof() is reasonable to use inside a while (<>) loop to detect the end of only the last file. Use eof(ARGV) or eof without the parentheses to test EACH file in a while (<>) loop. Examples:

# insert dashes just before last line of last file
while (<>) {
    if (eof()) {
        print "--------------\n";
    }
    print;
}

# reset line numbering on each input file
while (<>) {
    print "$.\t$_";
    if (eof) {# Not eof().
        close(ARGV);
    }
}
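The same test works on a single named filehandle. The sketch below mirrors the first example but calls eof on one lexical handle; it reads from an in-memory file, and the sample text is an assumption.

```perl
use strict;
use warnings;

my $text = "one\ntwo\nthree\n";
open(my $fh, '<', \$text) or die "Cannot open in-memory file: $!";

my @out;
while (my $line = <$fh>) {
    chomp $line;
    # eof($fh) is true once the line just read was the last one.
    push @out, '--------------' if eof($fh);
    push @out, $line;
}
close($fh);

print "$_\n" for @out;
```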

fcntl(FILEHANDLE,FUNCTION,SCALAR)

Implements the fcntl(2) function. You'll probably have to say
require "fcntl.ph";    # probably /usr/local/lib/perl/fcntl.ph

first to get the correct function definitions. If fcntl.ph doesn't exist or doesn't have the correct definitions you'll have to roll your own, based on your C header files such as <sys/fcntl.h>. (There is a perl script called h2ph that comes with the perl kit which may help you in this.) Argument processing and value return works just like ioctl below. Note that fcntl will produce a fatal error if used on a machine that doesn't implement fcntl(2).

fileno(FILEHANDLE)
fileno FILEHANDLE

Returns the file descriptor for a filehandle. Useful for constructing bitmaps for select(). If FILEHANDLE is an expression, the value is taken as the name of the filehandle.

flock(FILEHANDLE,OPERATION)

Calls flock(2) on FILEHANDLE. See manual page for flock(2) for definition of OPERATION. Returns true for success, false on failure. Will produce a fatal error if used on a machine that doesn't implement flock(2). Here's a mailbox appender for BSD systems.

use Fcntl qw(:flock SEEK_END);    # LOCK_SH, LOCK_EX, LOCK_NB, LOCK_UN

sub lock_mbox {
    flock(MBOX, LOCK_EX) || die "Can't lock mailbox: $!";
    # and, in case someone appended
    # while we were waiting...
    seek(MBOX, 0, SEEK_END) || die "Can't seek: $!";
}

sub unlock_mbox {
    flock(MBOX, LOCK_UN) || die "Can't unlock mailbox: $!";
}

open(MBOX, ">>/usr/spool/mail/$ENV{'USER'}") || die "Can't open mailbox: $!";

lock_mbox();
print MBOX $msg,"\n\n";
unlock_mbox();

getc(FILEHANDLE)
getc FILEHANDLE
getc

Returns the next character from the input file attached to FILEHANDLE, or undef at end of file or on error. If FILEHANDLE is omitted, reads from STDIN.

ioctl(FILEHANDLE,FUNCTION,SCALAR)

Implements the ioctl(2) function. You'll probably have to say
require "ioctl.ph"; # probably /usr/local/lib/perl/ioctl.ph

first to get the correct function definitions. If ioctl.ph doesn't exist or doesn't have the correct definitions you'll have to roll your own, based on your C header files such as <sys/ioctl.h>. (There is a perl script called h2ph that comes with the perl kit which may help you in this.) SCALAR will be read and/or written depending on the FUNCTION--a pointer to the string value of SCALAR will be passed as the third argument of the actual ioctl call. (If SCALAR has no string value but does have a numeric value, that value will be passed rather than a pointer to the string value. To guarantee this to be true, add a 0 to the scalar before using it.) The pack() and unpack() functions are useful for manipulating the values of structures used by ioctl(). The following example sets the erase character to DEL.

require 'ioctl.ph';
$sgttyb_t = "ccccs"; # 4 chars and a short
if (ioctl(STDIN,$TIOCGETP,$sgttyb)) {
    @ary = unpack($sgttyb_t,$sgttyb);
    $ary[2] = 127;
    $sgttyb = pack($sgttyb_t,@ary);
    ioctl(STDIN,$TIOCSETP,$sgttyb) || die "Can't ioctl: $!";
}

The return value of ioctl (and fcntl) is as follows:

if OS returns:        perl returns:
    -1                    undefined value
    0                     string "0 but true"
    anything else         that number

Thus perl returns true on success and false on failure, yet you can still easily determine the actual value returned by the operating system:
($retval = ioctl(...)) || ($retval = -1);
printf "System returned %d\n", $retval;

open(FILEHANDLE,EXPR)
open(FILEHANDLE)
open FILEHANDLE

Opens the file whose filename is given by EXPR, and associates it with FILEHANDLE. If FILEHANDLE is an expression, its value is used as the name of the real filehandle wanted. If EXPR is omitted, the scalar variable of the same name as the FILEHANDLE contains the filename. If the filename begins with "<" or nothing, the file is opened for input. If the filename begins with ">", the file is opened for output. If the filename begins with ">>", the file is opened for appending. (You can put a '+' in front of the '>' or '<' to indicate that you want both read and write access to the file.) If the filename begins with "|", the filename is interpreted as a command to which output is to be piped, and if the filename ends with a "|", the filename is interpreted as command which pipes input to us. (You may not have a command that pipes both in and out.) Opening '-' opens STDIN and opening '>-' opens STDOUT. Open returns non-zero upon success, the undefined value otherwise. If the open involved a pipe, the return value happens to be the pid of the subprocess. Examples:

$article = 100;
open article || die "Can't find article $article: $!\n";
while (<article>) {...
open(LOG, '>>/usr/spool/news/twitlog'); # (log is reserved)

open(article, "caesar <$article |"); # decrypt article

open(extract, "|sort >/tmp/Tmp$$"); # $$ is our process#

# process argument list of files along with any includes

foreach $file (@ARGV) {
    do process($file, 'fh00');    # no pun intended
}

sub process {
    local($filename, $input) = @_;
    $input++; # this is a string increment
    unless (open($input, $filename)) {
        print STDERR "Can't open $filename: $!\n";
        return;
    }
    while (<$input>) { # note use of indirection
        if (/^#include "(.*)"/) {
            do process($1, $input);
            next;
        }
        ...        # whatever
    }
}

You may also, in the Bourne shell tradition, specify an EXPR beginning with ">&", in which case the rest of the string is interpreted as the name of a filehandle (or file descriptor, if numeric) which is to be duped and opened. You may use & after >, >>, <, +>, +>> and +<. The mode you specify should match the mode of the original filehandle. Here is a script that saves, redirects, and restores STDOUT and STDERR:

#!/usr/bin/perl
open(SAVEOUT, ">&STDOUT");
open(SAVEERR, ">&STDERR");

open(STDOUT, ">foo.out") || die "Can't redirect stdout";
open(STDERR, ">&STDOUT") || die "Can't dup stdout";

select(STDERR); $| = 1; # make unbuffered
select(STDOUT); $| = 1; # make unbuffered

print STDOUT "stdout 1\n";    # this works for
print STDERR "stderr 1\n";     # subprocesses too

close(STDOUT);
close(STDERR);

open(STDOUT, ">&SAVEOUT");
open(STDERR, ">&SAVEERR");

print STDOUT "stdout 2\n";
print STDERR "stderr 2\n";

If you open a pipe on the command "-", i.e. either "|-" or "-|", then there is an implicit fork done, and the return value of open is the pid of the child within the parent process, and 0 within the child process. (Use defined($pid) to determine if the open was successful.) The filehandle behaves normally for the parent, but i/o to that filehandle is piped from/to the STDOUT/ STDIN of the child process. In the child process the filehandle isn't opened--i/o happens from/to the new STDOUT or STDIN. Typically this is used like the normal piped open when you want to exercise more control over just how the pipe command gets executed, such as when you are running setuid, and don't want to have to scan shell commands for metacharacters. The following pairs are more or less equivalent:

open(FOO, "|tr '[a-z]' '[A-Z]'");
open(FOO, "|-") || exec 'tr', '[a-z]', '[A-Z]';

open(FOO, "cat -n '$file'|");
open(FOO, "-|") || exec 'cat', '-n', $file;

Explicitly closing any piped filehandle causes the parent process to wait for the child to finish, and returns the status value in $?. Note: on any operation which may do a fork, unflushed buffers remain unflushed in both processes, which means you may need to set $| to avoid duplicate output.
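Newer versions of Perl 5 also accept a list form of the piped open, which is not described in the manual text quoted here. It runs the command without a shell, much like the fork-and-exec pairs above. The sketch below assumes such a Perl and uses the running interpreter ($^X) as the child command so that it does not depend on external programs.

```perl
use strict;
use warnings;

# '-|' means: run the command and read its standard output.
# The command and its arguments are passed as a list, so no shell
# is involved and metacharacters need no quoting.
open(my $child, '-|', $^X, '-e', 'print uc("hello from child\n")')
    or die "Cannot start child process: $!";

my @output = <$child>;

# close waits for the child and leaves its exit status in $?.
close($child) or die "Child process failed, status $?";

print "child said: $output[0]";
```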

The filename that is passed to open will have leading and trailing whitespace deleted. In order to open a file with arbitrary weird characters in it, it's necessary to protect any leading and trailing whitespace thusly:
$file =~ s#^(\s)#./$1#;
open(FOO, "< $file\0");

pipe(READHANDLE,WRITEHANDLE)

Opens a pair of connected pipes like the corresponding system call. Note that if you set up a loop of piped processes, deadlock can occur unless you are very careful. In addition, note that perl's pipes use stdio buffering, so you may need to set $| to flush your WRITEHANDLE after each command, depending on the application. [Requires version 3.0 patchlevel 9.]

print(FILEHANDLE LIST)
print(LIST)
print FILEHANDLE LIST
print LIST
print

Prints a string or a comma-separated list of strings. Returns non-zero if successful. FILEHANDLE may be a scalar variable name, in which case the variable contains the name of the filehandle, thus introducing one level of indirection. (NOTE: If FILEHANDLE is a variable and the next token is a term, it may be misinterpreted as an operator unless you interpose a + or put parens around the arguments.) If FILEHANDLE is omitted, prints by default to standard output (or to the last selected output channel--see select()). If LIST is also omitted, prints $_ to STDOUT. To set the default output channel to something other than STDOUT use the select operation. Note that, because print takes a LIST, anything in the LIST is evaluated in an array context, and any subroutine that you call will have one or more of its expressions evaluated in an array context. Also be careful not to follow the print keyword with a left parenthesis unless you want the corresponding right parenthesis to terminate the arguments to the print--interpose a + or put parens around all the arguments.

printf(FILEHANDLE LIST)
printf(LIST)
printf FILEHANDLE LIST
printf LIST

Equivalent to a "print FILEHANDLE sprintf(LIST)".

read(FILEHANDLE,SCALAR,LENGTH,OFFSET)
read(FILEHANDLE,SCALAR,LENGTH)

Attempts to read LENGTH bytes of data into variable SCALAR from the specified FILEHANDLE. Returns the number of bytes actually read, or undef if there was an error. SCALAR will be grown or shrunk to the length actually read. An OFFSET may be specified to place the read data at some other place than the beginning of the string. This call is actually implemented in terms of stdio's fread call. To get a true read system call, see sysread.

seek(FILEHANDLE,POSITION,WHENCE)

Randomly positions the file pointer for FILEHANDLE, just like the fseek() call of stdio. FILEHANDLE may be an expression whose value gives the name of the filehandle. Returns 1 upon success, 0 otherwise.

select(FILEHANDLE)
select

Returns the currently selected filehandle. Sets the current default filehandle for output, if FILEHANDLE is supplied. This has two effects: first, a write or a print without a filehandle will default to this FILEHANDLE. Second, references to variables related to output will refer to this output channel. For example, if you have to set the top of form format for more than one output channel, you might do the following:
select(REPORT1);
$^ = 'report1_top';
select(REPORT2);
$^ = 'report2_top';

FILEHANDLE may be an expression whose value gives the name of the actual filehandle. Thus:
$oldfh = select(STDERR); $| = 1; select($oldfh);
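The save-and-restore pattern looks like this when a plain print is redirected to a file. A minimal sketch; the temporary file and variable names are assumptions.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($log, $path) = tempfile(UNLINK => 1);

my $old = select($log);     # print without a filehandle now goes to $log
$| = 1;                     # $| applies to the currently selected handle
print "to the log file\n";
select($old);               # restore the previous default handle

close($log) or die "Cannot close $path: $!";

# Read the file back to confirm where the output went.
open(my $in, '<', $path) or die "Cannot open $path: $!";
my $logged = <$in>;
close($in);

print "log contains: $logged";
```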

sprintf(FORMAT,LIST)

Returns a string formatted by the usual printf conventions. Modern Perl 5 releases also support the * character, which takes a field width or precision from the argument list.

sysread(FILEHANDLE,SCALAR,LENGTH,OFFSET)
sysread(FILEHANDLE,SCALAR,LENGTH)

Attempts to read LENGTH bytes of data into variable SCALAR from the specified FILEHANDLE, using the system call read(2). It bypasses stdio, so mixing this with other kinds of reads may cause confusion. Returns the number of bytes actually read, or undef if there was an error. SCALAR will be grown or shrunk to the length actually read. An OFFSET may be specified to place the read data at some other place than the beginning of the string.

syswrite(FILEHANDLE,SCALAR,LENGTH,OFFSET)
syswrite(FILEHANDLE,SCALAR,LENGTH)

Attempts to write LENGTH bytes of data from variable SCALAR to the specified FILEHANDLE, using the system call write(2). It bypasses stdio, so mixing this with prints may cause confusion. Returns the number of bytes actually written, or undef if there was an error. An OFFSET may be specified to write data starting from some place other than the beginning of the string.
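A sketch of how OFFSET works in each direction: for syswrite it selects where in SCALAR the written data starts, and for sysread it selects where in SCALAR the incoming data is placed. The sample strings and names are assumptions.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($tmp, $path) = tempfile(UNLINK => 1);
close($tmp) or die "Cannot close $path: $!";

my $data = "abcdefgh";

open(my $out, '>:raw', $path) or die "Cannot open $path: $!";
# Write 3 bytes of $data, starting at offset 2: "cde".
my $written = syswrite($out, $data, 3, 2);
close($out) or die "Cannot close $path: $!";

open(my $in, '<:raw', $path) or die "Cannot open $path: $!";
my $buf = "XY";
# Ask for up to 10 bytes and place them after the existing "XY".
my $got = sysread($in, $buf, 10, length $buf);
close($in);

print "wrote $written bytes, read $got bytes, buffer is $buf\n";
```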

tell(FILEHANDLE)
tell FILEHANDLE
tell

Returns the current file position for FILEHANDLE. FILEHANDLE may be an expression whose value gives the name of the actual filehandle. If FILEHANDLE is omitted, assumes the file last read.

write(FILEHANDLE)
write(EXPR)
write

Writes a formatted record (possibly multi-line) to the specified file, using the format associated with that file. By default the format for a file is the one having the same name as the filehandle, but the format for the current output channel (see select) may be set explicitly by assigning the name of the format to the $~ variable.

Top of form processing is handled automatically: if there is insufficient room on the current page for the formatted record, the page is advanced by writing a form feed, a special top-of-page format is used to format the new page header, and then the record is written. By default the top-of-page format is the name of the filehandle with "_TOP" appended, but it may be dynamically set to the format of your choice by assigning the name to the $^ variable while the filehandle is selected. The number of lines remaining on the current page is in variable $-, which can be set to 0 to force a new page.

If FILEHANDLE is unspecified, output goes to the current default output channel, which starts out as STDOUT but may be changed by the select operator. If the FILEHANDLE is an EXPR, then the expression is evaluated and the resulting string is used to look up the name of the FILEHANDLE at run time. For more on formats, see the section on formats later on.

Note that write is NOT the opposite of read.

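Some uses of the open pragma:

There are generally two ways to specify an encoding: in the mode part when opening the filehandle, or by applying the binmode operator to the handle. The open pragma shown below sets default layers instead.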

use open IN, ':encoding(UTF-8)';
use open OUT => ':encoding(UTF-8)';

use open IN => ":crlf", OUT => ":bytes";

# Use the same encoding for input and output
use open IO => "...";
use open ':encoding(UTF-8)';

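The standard filehandles are already open when the program starts, so add the :std subpragma to switch them to the chosen layers as well. On its own :std has no effect; it must be combined with a layer:
use open ':std', ':encoding(UTF-8)';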

Recommended usage

use open qw< IN :bytes OUT :utf8 :std >;
use open ":std", IN => ":bytes", OUT => ":utf8";

use Encode qw< encode decode >;
use warnings qw< FATAL all >;
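The two per-handle ways of specifying an encoding mentioned earlier (in the open mode, or with binmode) can be sketched as follows. The sample text and temporary file are assumptions for the illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($tmp, $path) = tempfile(UNLINK => 1);
close($tmp) or die "Cannot close $path: $!";

my $text = "caf\x{e9}";     # four characters; the last is e with an acute accent

# Way 1: name the encoding layer in the mode when opening the handle.
open(my $out, '>:encoding(UTF-8)', $path) or die "Cannot open $path: $!";
print {$out} $text;
close($out) or die "Cannot close $path: $!";

my $bytes_on_disk = -s $path;   # the accented letter takes two bytes in UTF-8

# Way 2: open the handle first, then apply the layer with binmode.
open(my $in, '<', $path) or die "Cannot open $path: $!";
binmode($in, ':encoding(UTF-8)') or die "Cannot set layer: $!";
my $back = <$in>;
close($in);

printf "%d characters, %d bytes on disk\n", length($back), $bytes_on_disk;
```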


------------------------------

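I/O idioms recommended in the official reference documentation
I/O Operators

The following lines are equivalent ways to read STDIN line by line and print each line: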

while (defined($_ = <STDIN>)) { print; }
while ($_ = <STDIN>) { print; }
while (<STDIN>) { print; }

for (;<STDIN>;) { print; }
print while defined($_ = <STDIN>);
print while ($_ = <STDIN>);
print while <STDIN>;