# BEGIN SourceDeps(oneline):
BuildRequires: perl(Exporter.pm) perl(ExtUtils/MakeMaker.pm) perl(Test.pm) perl(Text/Soundex.pm)
# END SourceDeps(oneline)
%define module_version 1.7
%define module_name File-Similars
%define _unpackaged_files_terminate_build 1
BuildRequires: rpm-build-perl perl-devel perl-podlators

Name: perl-%module_name
Version: 1.7
Release: alt1
Summary: Fast similar-files finder
Group: Development/Perl
License: perl
Url: %CPAN %module_name

Source0: http://cpan.org.ua/authors/id/S/SU/SUNTONG/%module_name-%module_version.tgz
BuildArch: noarch

%description
Extremely fast file similarity checker. It uses an advanced soundex vector
algorithm to determine the similarity between files. In general, if there
are n files, each having approximately m words, the order of computation
is merely

  O(n^2 * m)

which is hundreds of times faster than any existing file fingerprinting
technology.

The self-test output will help you understand what the module does and what
to expect from its results.

  $ make test
  PERL_DL_NONLAZY=1 /usr/bin/perl "-Iblib/lib" "-Iblib/arch" test.pl
  1..4
  # Running under perl version 5.010000 for linux
  # Current time local: Wed Oct 29 11:35:06 2008
  # Current time GMT:   Wed Oct 29 15:35:06 2008
  # Using Test.pm version 1.25
  # Testing File::Searcher::Similars version 1.23
  
  == Testing 1, files under test/ subdir:
  
    9 test/(eBook) GNU - Python Standard Library 2001.pdf
    3 test/CardLayoutTest.java
    5 test/GNU - 2001 - Python Standard Library.pdf
    4 test/GNU - Python Standard Library (2001).rar
    9 test/LayoutTest.java
    3 test/PopupTest.java
    2 test/Python Standard Library.zip
    5 test/TestLayout.java
  ok 1
  
  Note:
  
  - The fileSimilars.pl script will pick out similar files from them in the
    next test.
  - Let's assume that the numbers represent the file sizes in KB.
  
  == Testing 2 result should be:
  
  ## =========
             3 'CardLayoutTest.java' 'test/'
             5 'TestLayout.java' 'test/'
  
  ## =========
             4 'GNU - Python Standard Library (2001).rar' 'test/'
             5 'GNU - 2001 - Python Standard Library.pdf' 'test/'
  ok 2
  
  Note:
  
  - There are 2 groups of similar files picked out by the script.
    The second group makes more sense.
  - The similar files are picked because their file names look similar.
  - However, the file size plays an important role as well.
  - There are 2 files in the second similar files group.
  - The file 'Python Standard Library.zip' is not considered to be similar to
    the group because its size is not similar to the group.
  
  == Testing 3, if Python.zip is bigger, result should be:
  
  ## =========
             3 'CardLayoutTest.java' 'test/'
             5 'TestLayout.java' 'test/'
  
  ## =========
             4 'Python Standard Library.zip' 'test/'
             4 'GNU - Python Standard Library (2001).rar' 'test/'
             5 'GNU - 2001 - Python Standard Library.pdf' 'test/'
  ok 3
  
  Note:
  
  - There are 3 files in the second similar files group.
  - The file 'Python Standard Library.zip' is now in the 2nd similar files
    group because its size is now similar to the group.
  
  == Testing 4, if Python.zip is even bigger, result should be:
  
  ## =========
             3 'CardLayoutTest.java' 'test/'
             5 'TestLayout.java' 'test/'
  
  ## =========
             4 'GNU - Python Standard Library (2001).rar'       'test/'
             5 'GNU - 2001 - Python Standard Library.pdf'       'test/'
             6 'Python Standard Library.zip'                    'test/'
             9 '(eBook) GNU - Python Standard Library 2001.pdf' 'test/'
  ok 4
  
  Note:
  
  - There are 4 files in the second similar files group.
  - The file 'Python Standard Library.zip' is still in the group.
  - But this time, because it is also considered to be similar to the
    '(eBook)' .pdf file (since their sizes are now similar, 6 vs 9), that
    .pdf is now included in the 2nd group as a 4th file.
  - If the size of file 'Python Standard Library.zip' is 12(KB), then the
    second similar files group will be split into two. Do you know why and
    which files each group will contain?

The File::Searcher::Similars package comes with a fully functional demo
script fileSimilars.pl. Please refer to its help file for further
explanations.

This package is highly customizable. Refer to the hash variable %%config
and/or the 3 arrwash_ functions for customization hints.

%package scripts
Summary: %module_name scripts
Group: Development/Perl
Requires: %{?epoch:%epoch:}%name = %version-%release

%description scripts
Scripts for %module_name.


%prep
%setup -n %module_name-%module_version

%build
%perl_vendor_build INSTALLMAN1DIR=%_man1dir

%install
%perl_vendor_install

%files
%doc README COPYING README.html Changes
%perl_vendor_privlib/F*

%files scripts
%_bindir/*
%_man1dir/*

%changelog
