在查日志的时候经常会遇到文件的去重,排序获得想要的结果,下面我们就来看看具体的案例:
文本行去重:测试文件 test.txt
Hello World.
Apple and Nokia.
Hello World.
I wanna buy an Apple device.
The Iphone of Apple company.
Hello World.
The Iphone of Apple company.
My name is Friendfish.
Hello World.
Apple and Nokia.
第一步:排序
由于uniq命令只能对相邻行进行去重复操作,所以在进行去重前,先要对文本行进行排序,使重复行集中到一起。
$ sort test.txt
Apple and Nokia.
Apple and Nokia.
Hello World.
Hello World.
Hello World.
Hello World.
I wanna buy an Apple device.
My name is Friendfish.
The Iphone of Apple company.
The Iphone of Apple company.
第二步:去掉相邻的重复行
$ sort test.txt | uniq
Apple and Nokia.
Hello World.
I wanna buy an Apple device.
My name is Friendfish.
The Iphone of Apple company.
第三步:文本行去重并按重复次数排序
(1)首先,对文本行进行去重并统计重复次数(uniq命令加-c选项可以实现对重复次数进行统计。)。
$ sort test.txt | uniq -c
2 Apple and Nokia.
4 Hello World.
1 I wanna buy an Apple device.
1 My n