NAME

fileSimilars - Similar files locator


SYNOPSIS

  [perl -S] fileSimilars.pl [--level=1] [dirs...]

Files with similar sizes and similar names are flagged as suspected duplicates.


DESCRIPTION

What describes it better than actual output? Here is an example listing of suspected duplicate files:

  ## =========
          1574 PopupTest.java          /home/tong/.../examples/chap10
          1561 CardLayoutTest.java     /home/tong/.../examples/chap1
          1570 PopupButtonFrame.class  /home/tong/.../examples/chap6
  ## =========
         22984 BinderyHelloWorld.jpg  /home/tong/...
         17509 MacHelloWorld.gif      /home/tong/...

The first column is the file size, the second the name, and the third the path. The motto behind the listing: I would rather have my program overkill (flag files that turn out not to be duplicates) than miss something that would otherwise take me years to notice.
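
To make the "similar size, similar name" idea concrete, here is a minimal sketch in Python. It is not the script's actual Perl implementation, and this page does not state how fileSimilars.pl measures similarity. The function names, the use of difflib, and both thresholds are assumptions chosen only for illustration.

```python
import os
from difflib import SequenceMatcher

# Hypothetical thresholds; the real script's criteria are not documented here.
SIZE_RATIO_MIN = 0.4   # smaller size must be at least 40% of the larger
NAME_RATIO_MIN = 0.5   # difflib similarity ratio of the name stems


def similar_size(a: int, b: int, min_ratio: float = SIZE_RATIO_MIN) -> bool:
    """True when the smaller size is at least min_ratio of the larger one."""
    if a == 0 and b == 0:
        return True
    return min(a, b) / max(a, b) >= min_ratio


def similar_name(a: str, b: str, min_ratio: float = NAME_RATIO_MIN) -> bool:
    """Compare lower-cased file names with the last extension removed."""
    stem_a = os.path.splitext(a)[0].lower()
    stem_b = os.path.splitext(b)[0].lower()
    return SequenceMatcher(None, stem_a, stem_b).ratio() >= min_ratio


def is_suspicious(size_a: int, name_a: str, size_b: int, name_b: str) -> bool:
    """Flag a pair only when both the sizes and the names look alike."""
    return similar_size(size_a, size_b) and similar_name(name_a, name_b)
```

Loose thresholds follow the motto above: a false alarm costs a glance, a missed duplicate can go unnoticed for years.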

By default, fileSimilars.pl assumes that similar files within the same folder are OK. Hence you will not get duplicate warnings for generated files (such as .o, .class, .aux, and .dvi files) or other file series.

Once you are sure that there are no duplicates across different folders and want fileSimilars.pl to dig further, specify the --level=1 command-line switch (or -l 1). This is useful for weeding out similar mp3 files within the same folder, or files downloaded from big sites that offer the same content in different packaging, e.g.:

  ## =========
         66138 jdc-src.tar.gz  .../ftp.ora.com/published/oreilly/java/javadc
        147904 jdc-src.zip     .../ftp.ora.com/published/oreilly/java/javadc
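
The effect of the level switch on same-folder pairs can be sketched as follows. This is again an illustrative Python sketch, not the script's code: the function name keep_pair is hypothetical, the default level of 0 is an assumption, and the paths in the usage lines are made up.

```python
import os


def keep_pair(path_a: str, path_b: str, level: int = 0) -> bool:
    """Decide whether a flagged pair should be reported.

    Assumed behavior: at the default level (taken to be 0 here),
    pairs living in the same folder are skipped; at level 1 or
    higher they are reported as well.
    """
    same_folder = os.path.dirname(path_a) == os.path.dirname(path_b)
    return level >= 1 or not same_folder


# Hypothetical paths, for illustration only.
a = "/mirror/javadc/jdc-src.tar.gz"
b = "/mirror/javadc/jdc-src.zip"
print(keep_pair(a, b))           # same folder, default level
print(keep_pair(a, b, level=1))  # same folder, --level=1
```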


AUTHOR

 @Author:  SUN, Tong <suntong at users sourceforge net>
 @HomeURL: http://xpt.sourceforge.net/


SEE ALSO

File::Compare(3), File::Find::Duplicates(3)

perl(1).


COPYRIGHT

Copyright (c) 1997-2003 Tong SUN. All rights reserved.


TODO