Perl File-Finding Modules: File::Find::Rule
2013-07-09 21:50:27
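The previous post covered File::Find, the file-finding module that ships with Perl. This post introduces another finder module, which can be seen as an improved version of it: it offers more API calls and a more pleasant interface.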
File::Find::Rule - Alternative interface to File::Find. It allows you to build rules which specify the desired files and directories.
Matching Rules
---------------
name(@patterns) - (match by name)
Specifies names that should match. May be globs or regular expressions.
$set->name( '*.mp3', '*.ogg' ); # mp3s or oggs
$set->name( qr/\.(mp3|ogg)$/ ); # the same as a regex
$set->name( 'foo.bar' ); # just things named foo.bar
---------------
-X tests - (file test operators)
Synonyms are provided for each of the -X tests. See "-X" in perlfunc for details. None of these methods take arguments.
---------------
stat tests - (tests on file attributes)
The following stat based methods are provided: dev, ino, mode, nlink, uid, gid, rdev, size, atime, mtime, ctime, blksize, and blocks. See "stat" in perlfunc for details.
Each of these can take a number of targets, which will follow Number::Compare semantics.
$rule->size( 7 ); # exactly 7
$rule->size( ">7Ki" ); # larger than 7 * 1024 bytes
$rule->size( ">=7" )->size( "<=90" ); # between 7 and 90, inclusive
$rule->size( 7, 9, 42 ); # 7, 9 or 42
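
The Number::Compare semantics above can be checked with a small experiment. This is only a sketch: the scratch directory and the file names (tiny.dat, mid.dat, big.dat) are hypothetical, chosen so that each file has a known size.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;
use File::Find::Rule;

# Scratch directory with files of known sizes (all names are hypothetical).
my $dir   = tempdir( CLEANUP => 1 );
my %sizes = ( 'tiny.dat' => 7, 'mid.dat' => 90, 'big.dat' => 8 * 1024 );
for my $name ( keys %sizes ) {
    my $path = File::Spec->catfile( $dir, $name );
    open my $fh, '>', $path or die "Cannot write $path: $!";
    binmode $fh;
    print {$fh} 'x' x $sizes{$name};
    close $fh or die "Cannot close $path: $!";
}

# Chained size() calls are ANDed together: 7 <= size <= 90 bytes.
my @small = File::Find::Rule->file->size('>=7')->size('<=90')
                            ->relative->in($dir);
@small = sort @small;

# "Ki" is the binary kilo, so this means "larger than 7 * 1024 bytes".
my @large = File::Find::Rule->file->size('>7Ki')->relative->in($dir);

print "7..90 bytes: @small\n";
print "over 7 KiB:  @large\n";
```

With these sizes, the first rule should pick up the 7-byte and 90-byte files, and the second only the 8 KiB file.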
---------------
any(@rules)
or(@rules) - (match if any of the given rules matches)
Allows short-circuiting boolean evaluation as an alternative to the default and-like nature of combined rules. any and or are interchangeable.
# find avis, movs, things over 200M and empty files
$rule->any( File::Find::Rule->name( '*.avi', '*.mov' ),
File::Find::Rule->size( '>200M' ),
File::Find::Rule->file->empty,
);
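
A runnable variant of the any example, as a sketch under stated assumptions: the scratch directory and the three file names are hypothetical, and the size clause is left out so that no large file has to be created.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;
use File::Find::Rule;

# Scratch directory; file names and contents are hypothetical.
my $dir     = tempdir( CLEANUP => 1 );
my %content = (
    'clip.avi'  => 'not really a video',
    'empty.log' => '',
    'notes.txt' => 'hello',
);
for my $name ( keys %content ) {
    my $path = File::Spec->catfile( $dir, $name );
    open my $fh, '>', $path or die "Cannot write $path: $!";
    print {$fh} $content{$name};
    close $fh or die "Cannot close $path: $!";
}

# Plain files that are EITHER named *.avi OR empty (-z).
my @found = File::Find::Rule->file->any(
    File::Find::Rule->name('*.avi'),
    File::Find::Rule->empty,
)->relative->in($dir);
@found = sort @found;

print "matched: @found\n";
```

notes.txt satisfies neither sub-rule, so it should be the only file left out.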
---------------
none(@rules)
not(@rules) - (match only if none of the given rules matches)
Negates a rule. (The inverse of any.) none and not are interchangeable.
# files that aren't 8.3 safe
$rule->file->not( $rule->new->name( qr/^[^.]{1,8}(\.[^.]{0,3})?$/ ) );
---------------
prune - (do not descend any further, as in File::Find)
Traverse no further. This rule always matches.
---------------
discard
Don't keep this file. This rule always matches.
---------------
exec(\&subroutine( $shortname, $path, $fullname) ) - (run a user-supplied subroutine on each file; its return value decides whether the file matches)
Allows user-defined rules. Your subroutine will be invoked with $_ set to the current short name, and with parameters of the name, the path you're in, and the full relative filename.
Return a true value if your rule matched.
# get things with long names
$rules->exec( sub { length > 20 } );
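
To make the three parameters concrete, here is a sketch that records what the callback receives. The scratch directory, the file names and the %seen hash are assumptions made for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;
use File::Find::Rule;

# Scratch directory with one short and one long file name (hypothetical).
my $dir = tempdir( CLEANUP => 1 );
for my $name ( 'short.txt', 'a_rather_long_file_name.txt' ) {
    my $path = File::Spec->catfile( $dir, $name );
    open my $fh, '>', $path or die "Cannot write $path: $!";
    close $fh or die "Cannot close $path: $!";
}

my %seen;    # short name => [ directory, full name ] as passed to the callback
my @long = File::Find::Rule->file->exec(
    sub {
        my ( $shortname, $path, $fullname ) = @_;
        $seen{$shortname} = [ $path, $fullname ];
        return length($shortname) > 20;    # true means "this file matches"
    }
)->relative->in($dir);

print "long names: @long\n";
print "$_ => @{ $seen{$_} }\n" for sort keys %seen;
```

Only the file whose short name is longer than 20 characters should be returned, while %seen shows the arguments for every file the callback was asked about.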
---------------
grep(@specifiers) - (greps file contents; File::Find has no equivalent, so this makes content searches convenient)
Opens a file and tests it one line at a time.
For each line it evaluates each of the specifiers, stopping at the first successful match. A specifier may be a regular expression or a subroutine. The subroutine will be invoked with the same parameters as an ->exec subroutine.
It is possible to provide a set of negative specifiers by enclosing them in anonymous arrays. Should a negative specifier match, the iteration is aborted and the clause fails. For example:
$rule->grep( qr/^#!.*\bperl/, [ sub { 1 } ] );
Is a passing clause if the first line of a file looks like a perl shebang line.
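
The sketch below exercises both forms of grep: a plain content search, and the shebang check with the negative specifier that stops after the first line. The scratch directory, file names and contents are hypothetical.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;
use File::Find::Rule;

# Scratch directory; file names and contents are hypothetical.
my $dir     = tempdir( CLEANUP => 1 );
my %content = (
    'script'    => "#!/usr/bin/perl\nprint 1;\n",       # shebang on line 1
    'notes.txt' => "see below\n#!/usr/bin/perl\n",      # shebang on line 2 only
    'page.txt'  => "<html></html>\n",
);
for my $name ( keys %content ) {
    my $path = File::Spec->catfile( $dir, $name );
    open my $fh, '>', $path or die "Cannot write $path: $!";
    print {$fh} $content{$name};
    close $fh or die "Cannot close $path: $!";
}

# The negative specifier [ sub { 1 } ] fails the clause as soon as a line
# does not match the regex, so only the first line is ever examined.
my @scripts = File::Find::Rule->file
    ->grep( qr/^#!.*\bperl/, [ sub { 1 } ] )
    ->relative->in($dir);

# Plain content search: *.txt files with a line containing "html".
my @html = File::Find::Rule->file->name('*.txt')->grep(qr/html/)
    ->relative->in($dir);

print "perl scripts: @scripts\n";
print "html files:   @html\n";
```

notes.txt contains a shebang, but not on its first line, so the first rule should reject it.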
---------------
maxdepth($level) - (maximum directory depth to descend; with 1, only the entries directly inside the starting directory are processed)
Descend at most $level (a non-negative integer) levels of directories below the starting point.
May be invoked many times per rule, but only the most recent value is used.
---------------
mindepth($level) - (minimum directory depth at which tests are applied)
Do not apply any tests at levels less than $level (a non-negative integer).
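
A sketch of how the two depth limits behave on a small tree. The directory layout (top.txt, a/mid.txt, a/b/deep.txt) is hypothetical. The starting directory itself counts as level 0, so its direct entries are at level 1.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Path qw(make_path);
use File::Spec;
use File::Find::Rule;

# Build a three-level tree in a scratch directory (layout is hypothetical).
my $dir = tempdir( CLEANUP => 1 );
make_path( File::Spec->catdir( $dir, 'a', 'b' ) );
for my $parts ( ['top.txt'], [ 'a', 'mid.txt' ], [ 'a', 'b', 'deep.txt' ] ) {
    my $path = File::Spec->catfile( $dir, @$parts );
    open my $fh, '>', $path or die "Cannot write $path: $!";
    close $fh or die "Cannot close $path: $!";
}

# maxdepth(1): do not descend into subdirectories of the starting point.
my @shallow = File::Find::Rule->file->maxdepth(1)->relative->in($dir);

# mindepth(2): skip everything directly inside the starting point.
my @deep = File::Find::Rule->file->mindepth(2)->relative->in($dir);
@deep = sort @deep;

print "maxdepth(1): @shallow\n";
print "mindepth(2): @deep\n";
```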
---------------
extras(\%extras) - (pass behaviour options for the search as a hash, similar to the %options of File::Find)
Specifies extra values to pass through to File::Find::find as part of the options hash.
For example this allows you to specify following of symlinks like so:
my $rule = File::Find::Rule->extras({ follow => 1 });
May be invoked many times per rule, but only the most recent value is used.
---------------
relative
Trim the leading portion of any path found, so that results are relative to the directory being searched.
---------------
not_* - (negated forms of the methods described above)
Negated version of the rule. An effective shorthand related to ! in the procedural interface.
$foo->not_name('*.pl');
$foo->not( $foo->new->name('*.pl' ) );
Query Methods
---------------
in(@directories) - (returns the matching files and directories)
Evaluates the rule, returns a list of paths to matching files and directories.
---------------
start(@directories) - (sets the directories to search and starts the find)
Starts a find across the specified directories. Matching items may then be queried using "match". This allows you to use a rule as an iterator.
my $rule = File::Find::Rule->file->name("*.jpeg")->start( "/web" );
while ( defined ( my $image = $rule->match ) ) {
...
}
---------------
match - (returns the next matching file)
Returns the next file which matches, false if there are no more.
Examples
use File::Find::Rule;
my $directory = '.';
# Get all subdirectories under the given directory
my @subdirs = File::Find::Rule->directory->in($directory);
# Find all files with the '.pl' extension in the Perl module search path
my @plfiles = File::Find::Rule->file()->name('*.pl')->in(@INC);
# Find files with a 'txt' or 'pl' extension that contain the string 'html'
my $rule=File::Find::Rule->new;
$rule->file;
$rule->name(qr/\.(txt|pl)$/);
$rule->grep(qr/html/);
my @files=$rule->in('.');
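# Find Perl scripts
my $finder = File::Find::Rule->or(
    File::Find::Rule->name( '*.pl' ),
    File::Find::Rule->exec(
        sub {
            # Treat a file as a Perl script if its first line is a perl shebang.
            if ( open my $fh, '<', $_ ) {
                my $shebang = <$fh>;
                close $fh;
                # An empty file yields undef, so guard before matching.
                return defined $shebang && $shebang =~ /^#!.*\bperl/;
            }
            return 0;
        }
    ),
);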
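# Ignore 'CVS' directories
my $rule = File::Find::Rule->new;
$rule->or(
    # Match directories named CVS, stop descending into them, and drop them
    $rule->new->directory->name('CVS')->prune->discard,
    # Everything else matches
    $rule->new,
);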
Reference:
File::Find::Rule