%define module_name SWISH-Filter
# BEGIN SourceDeps(oneline):
BuildRequires: perl(Data/Dump.pm) perl(ExtUtils/MakeMaker.pm) perl(File/Temp.pm) perl(Getopt/Long.pm) perl(MIME/Types.pm) perl(Module/Pluggable.pm) perl(Pod/Usage.pm) perl(Test/LeakTrace.pm) perl(Test/More.pm) perl(URI.pm)
# END SourceDeps(oneline)
%define _unpackaged_files_terminate_build 1
BuildRequires: rpm-build-perl perl-devel perl-podlators

Name: perl-%module_name
Version: 0.191
Release: alt2
Summary: Filter documents for indexing with Swish-e
Group: Development/Perl
License: perl
Url: %CPAN %module_name

Source0: http://mirror.yandex.ru/mirrors/cpan/authors/id/K/KA/KARMAN/%{module_name}-%{version}.tar.gz
BuildArch: noarch

%description
SWISH::Filter provides a unified way to convert documents into a type that
Swish-e can index.  Individual filters are installed as separate subclasses (modules).
For example, there might be a filter that converts from PDF format to HTML
format.

SWISH::Filter is a framework that relies on other packages to do the heavy lifting
of converting non-text documents to text.  Additional helper
programs or Perl modules may need to be installed to use SWISH::Filter to filter
documents.  For example, to filter PDF documents you must install the `Xpdf'
package.

The filters are automatically loaded when `SWISH::Filter->new()' is
called.  Filters define a type and priority that determine the processing
order of the filter.  Filters are processed in this sort order until a filter
accepts the document for filtering.  Each filter uses the document's content
type to decide whether it should handle the current document.  The content
type is determined by the file's suffix if not supplied by the calling
program.

The individual filters are not designed to be used as separate modules.  All
access to the filters is through this SWISH::Filter module.

Normally, once a document is filtered, processing stops.  A filter can filter
the document and then set a flag saying that filtering should continue (for
example, a filter that uncompresses an MS Word document before passing it on
to the filter that converts from MS Word to text).  In this way, filters can
be pipelined.  All this should be transparent to the end user.

The idea of SWISH::Filter is that new filters can be created, and then
downloaded and installed to provide new filtering capabilities.  For example,
if you needed to index MS Excel documents, you might be able to download a
filter from the Swish-e site, and the next time you ran indexing, MS Excel
documents would be indexed.

The SWISH::Filter setup can be used with -S prog or -S http.  It works best
with the -S prog method because the filter modules only need to be loaded and
compiled one time.  The -S prog program spider.pl will automatically use
SWISH::Filter when spidering with default settings (using "default" as the
first parameter to spider.pl).

The -S http indexing method uses a Perl helper script called swishspider.
swishspider has been updated to work with SWISH::Filter, but (unlike
spider.pl) does not contain a "use lib" line to point to the location of
SWISH::Filter.  This means that by default swishspider will not use
SWISH::Filter for filtering.  The reason is that swishspider runs for every
URL fetched, and loading the filters for each document can be slow.  The
recommended way of spidering is using -S prog with spider.pl, but if
-S http is desired, the way to enable SWISH::Filter is to set PERL5LIB before
running swish so that swishspider will be able to locate the SWISH::Filter
module.  Here's one way to set PERL5LIB with the bash shell:

  $ export PERL5LIB=`swish-filter-test -path`

%prep
%setup -q -n %{module_name}-%{version}

%build
%perl_vendor_build

%install
%perl_vendor_install

%files
%doc README Changes example
%perl_vendor_privlib/S*

%changelog
