File::Find::Rule, a Perl File-Finding Module
2013-07-09 21:50:27

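An earlier post covered File::Find, Perl's built-in file-finding module. This post covers another one, File::Find::Rule, which can be seen as an improved version of it: it offers more API calls and is more pleasant to use.
File::Find::Rule - Alternative interface to File::Find. It allows you to build rules which specify the desired files and directories.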

Matching Rules
---------------
name(@patterns)-(name matching)
Specifies names that should match. May be globs or regular expressions.

$set->name( '*.mp3', '*.ogg' ); # mp3s or oggs
$set->name( qr/\.(mp3|ogg)$/ ); # the same as a regex
$set->name( 'foo.bar' );        # just things named foo.bar
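
The three forms above can be tried against a scratch directory. This is a minimal sketch, assuming File::Find::Rule is installed from CPAN; the temporary directory and the file names are made up for illustration.

```perl
use strict;
use warnings;
use File::Find::Rule;
use File::Spec;
use File::Temp qw(tempdir);

# Build a throwaway directory holding a few empty files (hypothetical names).
my $dir = tempdir( CLEANUP => 1 );
for my $name (qw(a.mp3 b.ogg c.txt foo.bar)) {
    open my $fh, '>', File::Spec->catfile( $dir, $name )
        or die "Cannot create $name: $!";
    close $fh;
}

# Glob patterns: a name matches if any one of the patterns matches.
my @by_glob = sort { $a cmp $b }
    File::Find::Rule->file->name( '*.mp3', '*.ogg' )->relative->in($dir);

# The same selection written as a single regular expression.
my @by_regex = sort { $a cmp $b }
    File::Find::Rule->file->name( qr/\.(mp3|ogg)$/ )->relative->in($dir);

# A pattern without wildcards matches that exact name only.
my @exact = File::Find::Rule->file->name('foo.bar')->relative->in($dir);

print "glob:  @by_glob\n";
print "regex: @by_regex\n";
print "exact: @exact\n";
```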

---------------
-X tests-(file test operators)
Synonyms are provided for each of the -X tests. See "-X" in perlfunc for details. None of these methods take arguments.
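
As a rough illustration, the sketch below uses four of these synonyms: file (-f), directory (-d), empty (-z) and nonempty (-s). It assumes File::Find::Rule is installed; the scratch directory and file names are hypothetical.

```perl
use strict;
use warnings;
use File::Find::Rule;
use File::Spec;
use File::Temp qw(tempdir);

# Scratch tree: one empty file, one file with content, one subdirectory.
my $dir = tempdir( CLEANUP => 1 );
mkdir File::Spec->catdir( $dir, 'sub' ) or die "Cannot create sub: $!";

open my $empty_fh, '>', File::Spec->catfile( $dir, 'empty.log' )
    or die "Cannot create empty.log: $!";
close $empty_fh;

open my $data_fh, '>', File::Spec->catfile( $dir, 'data.txt' )
    or die "Cannot create data.txt: $!";
print {$data_fh} "hello\n";
close $data_fh;

# Each test method takes no arguments; chained tests are ANDed together.
my @dirs     = File::Find::Rule->directory->relative->in($dir);
my @empty    = File::Find::Rule->file->empty->relative->in($dir);
my @nonempty = File::Find::Rule->file->nonempty->relative->in($dir);

print "directories: @dirs\n";
print "empty files: @empty\n";
print "non-empty files: @nonempty\n";
```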

---------------
stat tests-(file attribute tests)
The following stat based methods are provided: dev, ino, mode, nlink, uid, gid, rdev, size, atime, mtime, ctime, blksize, and blocks. See "stat" in perlfunc for details.

Each of these can take a number of targets, which will follow Number::Compare semantics.

$rule->size( 7 ); # exactly 7
$rule->size( ">7Ki" );  # larger than 7 * 1024 bytes (Ki = 1024, Mi = 1024 * 1024)
$rule->size( ">=7" )->size( "<=90" );  # between 7 and 90, inclusive
$rule->size( 7, 9, 42 );  # 7, 9 or 42
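
To see the Number::Compare semantics in action, here is a minimal sketch that creates files of known sizes and filters on them. It assumes File::Find::Rule is installed; the file names and sizes are invented for the example.

```perl
use strict;
use warnings;
use File::Find::Rule;
use File::Spec;
use File::Temp qw(tempdir);

# Create files with exact byte sizes (hypothetical names).
my $dir   = tempdir( CLEANUP => 1 );
my %bytes = ( 'small.bin' => 10, 'mid.bin' => 2048, 'big.bin' => 5000 );
for my $name ( keys %bytes ) {
    open my $fh, '>', File::Spec->catfile( $dir, $name )
        or die "Cannot create $name: $!";
    binmode $fh;
    print {$fh} 'x' x $bytes{$name};
    close $fh;
}

# ">1Ki" means strictly larger than 1024 bytes.
my @over_1ki = sort { $a cmp $b }
    File::Find::Rule->file->size('>1Ki')->relative->in($dir);

# Two chained size() calls are ANDed: an inclusive range.
my @in_range = sort { $a cmp $b }
    File::Find::Rule->file->size('>=10')->size('<=2048')->relative->in($dir);

# Several targets in one call are ORed: exactly 10 or exactly 5000 bytes.
my @either = sort { $a cmp $b }
    File::Find::Rule->file->size( 10, 5000 )->relative->in($dir);

print "over 1Ki: @over_1ki\n";
print "10..2048: @in_range\n";
print "10 or 5000: @either\n";
```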

---------------
any(@rules)
or(@rules)-(match if any of the given rules matches)

Allows short-circuiting boolean evaluation as an alternative to the default and-like nature of combined rules. any and or are interchangeable.

# find avis, movs, things over 200M and empty files
$rule->any( File::Find::Rule->name( '*.avi', '*.mov' ),
  File::Find::Rule->size( '>200M' ),
  File::Find::Rule->file->empty,
);
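
A scaled-down, runnable variant of the same idea is sketched below: it keeps files that are either named *.avi or empty. It assumes File::Find::Rule is installed, and the file names are hypothetical.

```perl
use strict;
use warnings;
use File::Find::Rule;
use File::Spec;
use File::Temp qw(tempdir);

# Scratch files: name => content (an empty string gives an empty file).
my $dir     = tempdir( CLEANUP => 1 );
my %content = (
    'clip.avi'  => 'fake video data',
    'empty.txt' => '',
    'notes.txt' => 'some notes',
);
for my $name ( keys %content ) {
    open my $fh, '>', File::Spec->catfile( $dir, $name )
        or die "Cannot create $name: $!";
    print {$fh} $content{$name};
    close $fh;
}

# file AND ( name matches *.avi OR file is empty )
my @hits = sort { $a cmp $b }
    File::Find::Rule->file->any(
        File::Find::Rule->name('*.avi'),
        File::Find::Rule->empty,
    )->relative->in($dir);

print "matched: @hits\n";
```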

---------------
none(@rules)
not(@rules)-(match only if none of the given rules matches)

Negates a rule. (The inverse of any.) none and not are interchangeable.

# files that aren't 8.3 safe
$rule->file->not( $rule->new->name( qr/^[^.]{1,8}(\.[^.]{0,3})?$/ ) );

---------------
prune-(stop descending into the current directory, same as in File::Find)
Traverse no further. This rule always matches.

---------------
discard
Don't keep this file. This rule always matches.

---------------
exec(\&subroutine( $shortname, $path, $fullname) )-(run your own code against each file; the rule matches when the code returns true)
Allows user-defined rules. Your subroutine will be invoked with $_ set to the current short name, and with parameters of the name, the path you're in, and the full relative filename.

Return a true value if your rule matched.

# get things with long names
$rule->exec( sub { length > 20 } );
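
The sketch below shows both ways of reading the current entry inside an exec callback: through $_ and through the three parameters. It assumes File::Find::Rule is installed; the directory layout is made up for illustration.

```perl
use strict;
use warnings;
use File::Find::Rule;
use File::Spec;
use File::Temp qw(tempdir);

# Scratch tree with one long file name and one file inside docs/.
my $dir = tempdir( CLEANUP => 1 );
mkdir File::Spec->catdir( $dir, 'docs' ) or die "Cannot create docs: $!";
for my $parts ( ['short.txt'], ['a_rather_long_file_name.txt'], [ 'docs', 'readme.txt' ] ) {
    open my $fh, '>', File::Spec->catfile( $dir, @$parts )
        or die "Cannot create @$parts: $!";
    close $fh;
}

# $_ holds the short name, so a bare length() tests the name's length.
my @long = File::Find::Rule->file->exec( sub { length > 20 } )
    ->relative->in($dir);

# The parameters carry the short name, the containing directory
# and the full path of the entry being tested.
my @in_docs = File::Find::Rule->file->exec(
    sub {
        my ( $shortname, $path, $fullname ) = @_;
        return $path =~ m{/docs\z};
    }
)->relative->in($dir);

print "long names: @long\n";
print "inside docs: @in_docs\n";
```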

---------------
grep(@specifiers)-(search file contents; plain File::Find has no such operation, so this makes content-based searches convenient)
Opens a file and tests it one line at a time.

For each line it evaluates each of the specifiers, stopping at the first successful match. A specifier may be a regular expression or a subroutine. The subroutine will be invoked with the same parameters as an ->exec subroutine.

It is possible to provide a set of negative specifiers by enclosing them in anonymous arrays. Should a negative specifier match the iteration is aborted and the clause is failed. For example:

$rule->grep( qr/^#!.*\bperl/, [ sub { 1 } ] );
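This clause passes only if the first line of the file looks like a perl shebang line: the negative specifier matches every line, so the search is aborted as soon as the first line fails the regular expression.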
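
The difference between a plain specifier and one guarded by a negative specifier can be checked with a few small files. This is a sketch under the assumption that File::Find::Rule is installed; the file names and contents are invented.

```perl
use strict;
use warnings;
use File::Find::Rule;
use File::Spec;
use File::Temp qw(tempdir);

# Scratch files: only script.pl has a perl shebang on its FIRST line.
my $dir     = tempdir( CLEANUP => 1 );
my %content = (
    'script.pl' => "#!/usr/bin/perl\nprint 1;\n",
    'notes.txt' => "see the perl docs\n#!/usr/bin/perl\n",
    'run.sh'    => "#!/bin/sh\necho hi\n",
);
for my $name ( keys %content ) {
    open my $fh, '>', File::Spec->catfile( $dir, $name )
        or die "Cannot create $name: $!";
    print {$fh} $content{$name};
    close $fh;
}

# Positive specifier only: any line mentioning perl is enough.
my @mentions = sort { $a cmp $b }
    File::Find::Rule->file->grep(qr/perl/)->relative->in($dir);

# The always-true negative specifier stops the scan after line one,
# so the shebang has to be on the first line.
my @shebang = File::Find::Rule->file
    ->grep( qr/^#!.*\bperl/, [ sub { 1 } ] )->relative->in($dir);

print "mention perl: @mentions\n";
print "perl shebang: @shebang\n";
```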

---------------
maxdepth($level)-(maximum directory depth to descend; with 1, only the entries directly inside the starting directory are processed)
Descend at most $level (a non-negative integer) levels of directories below the starting point.

May be invoked many times per rule, but only the most recent value is used.

---------------
mindepth($level)-(minimum directory depth at which tests are applied)
Do not apply any tests at levels less than $level (a non-negative integer).
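
The following sketch shows how the two depth limits interact on a three-level tree, where entries directly inside the starting directory count as level 1. It assumes File::Find::Rule is installed; the directory layout is hypothetical.

```perl
use strict;
use warnings;
use File::Find::Rule;
use File::Path qw(make_path);
use File::Spec;
use File::Temp qw(tempdir);

# Scratch tree: top.txt (level 1), a/mid.txt (level 2), a/b/deep.txt (level 3).
my $dir = tempdir( CLEANUP => 1 );
make_path( File::Spec->catdir( $dir, 'a', 'b' ) );
for my $parts ( ['top.txt'], [ 'a', 'mid.txt' ], [ 'a', 'b', 'deep.txt' ] ) {
    open my $fh, '>', File::Spec->catfile( $dir, @$parts )
        or die "Cannot create @$parts: $!";
    close $fh;
}

# Do not descend into subdirectories at all.
my @top_only = File::Find::Rule->file->maxdepth(1)->relative->in($dir);

# Skip everything directly inside the starting directory.
my @below_top = sort { $a cmp $b }
    File::Find::Rule->file->mindepth(2)->relative->in($dir);

# Both limits together select exactly one level.
my @level_two = File::Find::Rule->file->mindepth(2)->maxdepth(2)
    ->relative->in($dir);

print "maxdepth 1: @top_only\n";
print "mindepth 2: @below_top\n";
print "level 2 only: @level_two\n";
```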

---------------
extras(\%extras)-(pass search options as a hash, similar to the %options hash of File::Find)
Specifies extra values to pass through to File::Find::find as part of the options hash.

For example this allows you to specify following of symlinks like so:
my $rule = File::Find::Rule->extras({ follow => 1 });

May be invoked many times per rule, but only the most recent value is used.

---------------
relative
Trim the leading portion of any path found, so that results are relative to the starting directory.
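
A small side-by-side comparison, as a sketch that assumes File::Find::Rule is installed and uses a made-up directory layout:

```perl
use strict;
use warnings;
use File::Find::Rule;
use File::Spec;
use File::Temp qw(tempdir);

# Scratch tree containing a single file, sub/x.txt.
my $dir = tempdir( CLEANUP => 1 );
mkdir File::Spec->catdir( $dir, 'sub' ) or die "Cannot create sub: $!";
open my $fh, '>', File::Spec->catfile( $dir, 'sub', 'x.txt' )
    or die "Cannot create x.txt: $!";
close $fh;

# Default: each path starts with the directory handed to in().
my @full = File::Find::Rule->file->in($dir);

# With relative: the starting directory is trimmed from each path.
my @rel = File::Find::Rule->file->relative->in($dir);

print "default:  @full\n";
print "relative: @rel\n";
```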

---------------
not_*-(negated forms of the rules above)
Negated version of the rule. An effective shorthand related to ! in the procedural interface. The two calls below are equivalent:

$foo->not_name('*.pl');
$foo->not( $foo->new->name('*.pl' ) );
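
That the two spellings select the same files can be checked with a short sketch (assuming File::Find::Rule is installed; the file names are hypothetical):

```perl
use strict;
use warnings;
use File::Find::Rule;
use File::Spec;
use File::Temp qw(tempdir);

# Scratch directory with one .pl file and two others.
my $dir = tempdir( CLEANUP => 1 );
for my $name (qw(a.pl b.txt c.pm)) {
    open my $fh, '>', File::Spec->catfile( $dir, $name )
        or die "Cannot create $name: $!";
    close $fh;
}

# Shorthand form: not_ prefixed to an existing rule name.
my @short = sort { $a cmp $b }
    File::Find::Rule->file->not_name('*.pl')->relative->in($dir);

# Long form: wrap a separate rule object in not().
my @long = sort { $a cmp $b }
    File::Find::Rule->file->not( File::Find::Rule->name('*.pl') )
    ->relative->in($dir);

print "not_name: @short\n";
print "not(...): @long\n";
```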

Query Methods

---------------
in(@directories)-(return the matching files and directories)
Evaluates the rule, returns a list of paths to matching files and directories.

---------------
start(@directories)-(set the directories to search and begin the search)
Starts a find across the specified directories. Matching items may then be queried using "match". This allows you to use a rule as an iterator.

my $rule = File::Find::Rule->file->name("*.jpeg")->start( "/web" );
while ( defined ( my $image = $rule->match ) ) {
    ...
}

---------------
match-(return the next matching file)
Returns the next file which matches, false if there are no more.
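
The iterator pattern built from start and match can be run end to end as follows. This is a minimal sketch, assuming File::Find::Rule is installed; the image file names are invented.

```perl
use strict;
use warnings;
use File::Basename qw(basename);
use File::Find::Rule;
use File::Spec;
use File::Temp qw(tempdir);

# Scratch directory with two .jpeg files and one .png file.
my $dir = tempdir( CLEANUP => 1 );
for my $name (qw(one.jpeg two.jpeg three.png)) {
    open my $fh, '>', File::Spec->catfile( $dir, $name )
        or die "Cannot create $name: $!";
    close $fh;
}

# start() runs the search; match() then hands back one path per call
# and returns undef once the results are exhausted.
my $rule = File::Find::Rule->file->name('*.jpeg')->start($dir);

my @seen;
while ( defined( my $image = $rule->match ) ) {
    push @seen, basename($image);
}
@seen = sort @seen;

print "images: @seen\n";
```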

Examples


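use File::Find::Rule;

my $directory = '.';
# Get every subdirectory under the given directory (the starting directory itself is included)
my @subdirs = File::Find::Rule->directory->in($directory);
# Find all files with a '.pl' extension in the Perl module search path
my @plfiles = File::Find::Rule->file()->name('*.pl')->in(@INC);
# Find files with a 'txt' or 'pl' extension that contain the string 'html'
my $rule = File::Find::Rule->new;
$rule->file;
$rule->name(qr/\.(txt|pl)$/);
$rule->grep(qr/html/);
my @files = $rule->in('.');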

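Find Perl scripts
my $finder = File::Find::Rule->or(
    File::Find::Rule->name( '*.pl' ),
    File::Find::Rule->exec(
        sub {
            # Treat a file as Perl when its first line is a perl shebang
            if (open my $fh, '<', $_) {
                my $shebang = <$fh>;
                close $fh;
                # An empty file yields undef, so guard before matching
                return defined $shebang && $shebang =~ /^#!.*\bperl/;
            }
            return 0;
        }
    ),
);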

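Ignore 'CVS' directories (replace 'CVS' with '.svn' to skip Subversion directories instead)
my $rule = File::Find::Rule->new;
$rule->or(
    # A directory named CVS: do not descend into it and do not report it
    $rule->new->directory->name('CVS')->prune->discard,
    # Everything else matches
    $rule->new,
);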

Reference:
File::Find::Rule