Counting lines and words across multiple files with Perl


Simple line counting

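#!/usr/bin/perl
# <> reads every file named on the command line, one line at a time;
# $ARGV holds the name of the file currently being read.
while(<>){
        $count{$ARGV}++;
}
foreach $file (sort keys %count){
        print "$file has $count{$file}\n";
}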

Note the neat use of a hash here: it is keyed by file name, so each file gets its own counter.

Output:

[root@rs1 root]# ./count.pl statistics tcp_deny.txt nohup.out serverstatistics
nohup.out has 3346
serverstatistics has 11
statistics has 22
tcp_deny.txt has 1
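
To see why keying the hash on $ARGV works: the <> operator reads each file listed in @ARGV in turn and sets $ARGV to the name of the file it is currently reading. The sketch below is only an illustration under my own assumptions; the two temporary files and the count_lines helper are made up for the demo and are not part of the script above.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: count lines per file the same way the script does,
# by pointing <> at a list of file names.
sub count_lines {
    my %count;
    local @ARGV = @_;          # <> reads the files listed in @ARGV
    while (<>) {
        $count{$ARGV}++;       # $ARGV is the name of the current file
    }
    return %count;
}

# Create two small sample files (invented for this demo).
my $dir = tempdir(CLEANUP => 1);
my %sample = ('a.txt' => "one\ntwo\nthree\n", 'b.txt' => "only\n");
for my $name (keys %sample) {
    open my $fh, '>', "$dir/$name" or die "cannot write $name: $!";
    print $fh $sample{$name};
    close $fh;
}

my %count = count_lines("$dir/a.txt", "$dir/b.txt");
foreach my $file (sort keys %count) {
    print "$file has $count{$file}\n";
}
```

An empty file never sets a counter, so it would not appear in the report at all.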

The output above is sorted by file name. Next, sort by line count instead.

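#!/usr/bin/perl
while(<>){
        $count{$ARGV}++;
}
foreach $file (sort by_count keys %count){
        print "$file has $count{$file}\n";
}
# Comparison routine for sort: ascending by line count.
# sort places the two elements being compared in $a and $b.
sub by_count {
        $count{$a} <=> $count{$b}
}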

Note that sort is given the name of a subroutine here, which defines the ordering.

To sort from largest to smallest instead, use  $count{$b} <=> $count{$a}

Output:

[root@rs1 root]# ./count.pl statistics tcp_deny.txt nohup.out serverstatistics
tcp_deny.txt has 1
serverstatistics has 11
statistics has 22
nohup.out has 3346
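
The comparison can also be written inline as a block, and chained with "or" to break ties between equal counts. A minimal sketch, assuming made-up counts (the file names and numbers below are sample data, not taken from the run above):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample data invented for this demo: file name => line count.
my %count = ('b.log' => 5, 'a.log' => 5, 'c.log' => 40, 'd.log' => 1);

# Largest count first; files with equal counts fall back to name order.
# $a and $b are the two elements sort is currently comparing.
my @ordered = sort { $count{$b} <=> $count{$a} or $a cmp $b } keys %count;

foreach my $file (@ordered) {
    print "$file has $count{$file}\n";
}
```

Without the tie-break, files with the same count can come out in any order, because hash keys have no fixed order.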

The program above is not all that useful on its own, so let's adapt it to count words.

Change the body of the while loop:

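while(<>){
        # grep drops the empty leading field that split returns
        # when a line starts with a non-word character
        @words=grep { length } split /\W+/;
        # an array in numeric context yields its element count
        $count{$ARGV}+=@words;
}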

With the print line also changed to end in "words", this gives the word count per file, still sorted by count:

[root@rs1 root]# ./count.pl statistics tcp_deny.txt nohup.out serverstatistics
tcp_deny.txt has 18 words
serverstatistics has 44 words
statistics has 202 words
nohup.out has 39069 words
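
One thing to watch when splitting on /\W+/: if a line starts with a non-word character (leading spaces, for example), split returns an empty first field, which inflates the count by one for that line. The sketch below compares a plain split with two ways of keeping only real words; the sample line is made up for the demo.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $line = "  load average: 0.15\n";   # sample line, starts with spaces

# split keeps an empty leading field when the line starts with a separator.
my @raw = split /\W+/, $line;

# Two ways to get only the real words.
my @filtered = grep { length } split /\W+/, $line;
my @matched  = $line =~ /\w+/g;

print scalar(@raw), " fields from plain split\n";
print scalar(@filtered), " words after grep\n";
print scalar(@matched), " words from a \\w+ match\n";
```

Note that \w only matches letters, digits and underscore, so "0.15" counts as two words under either approach.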

Next, count how many times each word occurs across all the files.

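#!/usr/bin/perl
while(<>){
        # grep drops the empty field split returns for a line
        # that starts with a non-word character
        @words=grep { length } split /\W+/;
        foreach $word (@words){
                $count{$word}++;
        }
}
foreach $word (sort by_count keys %count){
        print "$word occurs $count{$word} times\n";
}
# ascending by number of occurrences
sub by_count {
        $count{$a} <=> $count{$b}
}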


Part of the output (the full listing is too long):

[root@rs1 root]# ./count.pl statistics tcp_deny.txt nohup.out serverstatistics
IS6_26_ occurs 1 times
4518452 occurs 1 times
798 occurs 1 times
ORGCODE22_0_ occurs 1 times
load occurs 1 times
col_0_0_ occurs 1 times
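
Since the full listing is long and starts with the rarest words, it can be handy to print only the most frequent ones. A rough sketch, assuming a hypothetical top_words helper and a made-up sample string (neither is part of the script above):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the $n most frequent words from a hash of counts.
sub top_words {
    my ($n, %count) = @_;
    # Most frequent first; equal counts fall back to alphabetical order.
    my @sorted = sort { $count{$b} <=> $count{$a} or $a cmp $b } keys %count;
    $n = @sorted if $n > @sorted;     # never ask for more words than exist
    return @sorted[0 .. $n - 1];
}

# Sample text invented for this demo.
my $text = "load load load col col 798";
my %count;
$count{$_}++ for $text =~ /\w+/g;

foreach my $word (top_words(2, %count)) {
    print "$word occurs $count{$word} times\n";
}
```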

Next, count the occurrences of each word in each file.

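#!/usr/bin/perl
while(<>){
        # grep drops the empty field split returns for a line
        # that starts with a non-word character
        @words=grep { length } split /\W+/;
        foreach $word (@words){
                # first key: the word; second key: the file it was seen in
                $count{$word}{$ARGV}++;
        }
}
foreach $word (sort keys %count){
        foreach $file (sort keys %{$count{$word}}){
                print "$word occurs $count{$word}{$file} times in $file\n";
        }
}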

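Note the two-dimensional hash: $count{$word} is a reference to a hash, which is why the inner loop dereferences it with %{$count{$word}}.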
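
The inner hashes never have to be created by hand: Perl creates them the first time a nested key is used (autovivification). A small sketch with made-up word and file pairs:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my %count;

# Sample (word, file) pairs invented for this demo.
my @seen = (['load', 'a.log'], ['load', 'a.log'],
            ['load', 'b.log'], ['col',  'b.log']);

foreach my $pair (@seen) {
    my ($word, $file) = @$pair;
    # The inner hash is created automatically on first use.
    $count{$word}{$file}++;
}

my $kind = ref $count{load};            # the value is a hash reference
my $n    = $count{load}{'a.log'};
print "type of \$count{load}: $kind\n";
print "load in a.log: $n\n";

# exists checks an inner key without creating a counter for it.
my $has = exists $count{col}{'a.log'} ? "yes" : "no";
print "col in a.log? $has\n";
```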

Now make the output easier to read:

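#!/usr/bin/perl
while(<>){
        # grep drops the empty field split returns for a line
        # that starts with a non-word character
        @words=grep { length } split /\W+/;
        foreach $word (@words){
                $count{$word}{$ARGV}++;
        }
}
foreach $word (sort keys %count){
        # one line per word: map builds "file:count" strings, join adds the commas
        print "$word:    ",join(",",map "$_:$count{$word}{$_}",keys %{$count{$word}}),"\n";
}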

The use of map and join here is worth studying.

It produces output like this:

1:    nohup.out:8,statistics:3,serverstatistics:1
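
Taken apart, the map and join step looks like this. It is a sketch with made-up counts for a single word; the keys are sorted here only so the result is predictable, whereas the script above uses the hash's own key order.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample per-file counts for one word, invented for this demo.
my %per_file = ('nohup.out' => 8, 'statistics' => 3);

# map turns each file name into a "name:count" string ...
my @pairs = map "$_:$per_file{$_}", sort keys %per_file;

# ... and join glues those strings together with commas.
my $line = join(",", @pairs);

print "$line\n";
```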


Posted by supersun on April 17, 2007, 11:00
