%define module_name Fsdb
# BEGIN SourceDeps(oneline):
BuildRequires: /usr/bin/groff /usr/bin/pod2html perl(ExtUtils/MakeMaker.pm) perl(HTML/Parser.pm) perl(IO/Compress/Bzip2.pm) perl(IO/Compress/Gzip.pm) perl(IO/Compress/Xz.pm) perl(IO/Uncompress/AnyUncompress.pm) perl(IPC/Cmd.pm) perl(Pod/Usage.pm) perl(Test/More.pm) perl(Test/Pod.pm) perl(Test/Pod/Coverage.pm) perl(Text/CSV_XS.pm) perl(XML/Simple.pm) perl(YAML/XS.pm)
# END SourceDeps(oneline)
%define _unpackaged_files_terminate_build 1
BuildRequires: rpm-build-perl perl-devel perl-podlators

Name: perl-%module_name
Version: 3.4
Release: alt1
Summary: A flat-text database for shell scripting
Group: Development/Perl
License: GPL
Url: %CPAN %module_name

Source0: http://mirror.yandex.ru/mirrors/cpan/authors/id/J/JO/JOHNH/%{module_name}-%{version}.tar.gz
BuildArch: noarch

%description
Fsdb, the flatfile streaming database, is a package of commands
for manipulating flat-ASCII databases from
shell scripts.  Fsdb is useful to process medium amounts of data (with
very little data you'd do it by hand, with megabytes you might want a
real database).
Fsdb was known as Jdb from 1991 to Oct. 2008.

Fsdb is very good at doing things like:

=over 4

=item *

extracting measurements from experimental output

=item *

examining data to address different hypotheses

=item *

joining data from different experiments

=item *

eliminating/detecting outliers

=item *

computing statistics on data
(mean, confidence intervals, correlations, histograms)

=item *

reformatting data for graphing programs

=back

Fsdb is built around the idea of a flat text file as a database.
Fsdb files (by convention, with the extension .fsdb),
have a header documenting the schema (what the columns mean),
and then each line represents a database record (or row).

For example:

_#fsdb experiment duration
_ufs_mab_sys 37.2
_ufs_mab_sys 37.3
_ufs_rcp_real 264.5
_ufs_rcp_real 277.9

This is a simple file with four experiments (the rows),
each with a description and a run time
in the first and second columns.
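
To make the header-plus-rows layout concrete, here is a minimal sketch in
plain shell and awk of reading such a file by column name.  It is only an
illustration, not how Fsdb is implemented: the col_mean helper is
hypothetical, and it assumes whitespace-separated fields and a header line
that begins with #fsdb.

```shell
# col_mean NAME [EXPERIMENT]: read an Fsdb-style table on stdin and print the
# mean of column NAME, optionally restricted to rows whose "experiment"
# column equals EXPERIMENT.  Hypothetical helper, not part of Fsdb.
col_mean() {
    awk -v want="$1" -v exp_filter="${2:-}" '
        # Header line: remember the position of every column name.
        NR == 1 && $1 == "#fsdb" {
            for (i = 2; i <= NF; i++) col[$i] = i - 1
            if (!(want in col)) {
                print "no such column: " want > "/dev/stderr"
                exit 1
            }
            next
        }
        /^#/ { next }    # skip comment lines
        exp_filter != "" && $(col["experiment"]) != exp_filter { next }
        { sum += $(col[want]); n++ }
        END { if (n > 0) print sum / n }
    '
}

# Mean duration of the ufs_mab_sys rows from the example above; prints 37.25.
col_mean duration ufs_mab_sys <<'EOF'
#fsdb experiment duration
ufs_mab_sys 37.2
ufs_mab_sys 37.3
ufs_rcp_real 264.5
ufs_rcp_real 277.9
EOF
```

The point of the sketch is that the header, not the script, decides where
each column lives; the script only ever mentions the name duration.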

Rather than hand-code scripts to do each special case, Fsdb provides
higher-level functions.  Although it's often easy to throw together a
custom script to do any single task, I believe that there are several
advantages to using this library:

=over 4

=item *

these programs provide a higher level interface than plain Perl, so

=over 4

=item **

Fewer lines of simpler code:

    dbrow '_experiment eq "ufs_mab_sys"' | dbcolstats duration

Picks out just one type of experiment and computes statistics on it,
rather than:

    while (<>) { @F = split; $sum+=$F[1]; $ss+=$F[1]**2; $n++; }
    $mean = $sum / $n; $std_dev = ...

in dozens of places.

=back

=item *

the library uses names for columns, so

=over 4

=item **

No more `$F[1]', use `_duration'.

=item **

New columns, or columns in a different order?  No changes to your scripts!

=back

Thus, if your experiment gets more complicated with a size parameter,
and your log changes to:

_#fsdb experiment size duration
_ufs_mab_sys 1024 37.2
_ufs_mab_sys 1024 37.3
_ufs_rcp_real 1024 264.5
_ufs_rcp_real 1024 277.9
_ufs_mab_sys 2048 45.3
_ufs_mab_sys 2048 44.2

Then the previous scripts still work, even though duration is
now the third column, not the second.

=item *

A series of actions is self-documenting (each program records what it does).

=over 4

=item **

No more wondering what hacks were used to compute the
final data, just look at the comments at the end
of the output.

=back

For example, the commands

    dbrow '_experiment eq "ufs_mab_sys"' | dbcolstats duration

add to the end of the output the lines:

    #    | dbrow _experiment eq "ufs_mab_sys"
    #    | dbcolstats duration

=item *

The library is mature: it supports large datasets, handles corner cases
and errors, and is backed by an automated test suite.

=over 4

=item **

No more puzzling about bad output because your custom script
skimped on error checking.

=item **

No more memory thrashing when you try to sort ten million records.

=back

=item *

Fsdb-2.x supports Perl scripting (in addition to shell scripting),
with libraries to do Fsdb input and output, and easy support for pipelines.
The shell script

    dbcol name test1 | dbroweval '_test1 += 5;'

can be written in Perl as:

    dbpipeline(dbcol(qw(name test1)), dbroweval('_test1 += 5;'));

=back

(The disadvantage is that you need to learn what functions Fsdb provides.)
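
The points above about named columns and self-documenting output can be
sketched in plain shell and awk.  This is an illustration under assumptions,
not Fsdb's own code: pick_col is a hypothetical stand-in for a tool such as
dbcol, fields are assumed to be whitespace-separated, and the trailer format
only imitates the comment lines shown earlier.

```shell
# pick_col NAME: print column NAME from an Fsdb-style table on stdin, carry
# existing comment lines to the end, and append a trailer recording this step.
# Hypothetical helper; the real tools are dbcol, dbrow, and friends.
pick_col() {
    awk -v want="$1" '
        # Header line: find the requested column by name, not by position.
        NR == 1 && $1 == "#fsdb" {
            for (i = 2; i <= NF; i++) if ($i == want) idx = i - 1
            print "#fsdb", want
            next
        }
        /^#/ { trailer[++t] = $0; next }    # carry earlier trailers along
        { print $idx }
        END {
            for (i = 1; i <= t; i++) print trailer[i]
            print "#  | pick_col " want
        }
    '
}

old='#fsdb experiment duration
ufs_mab_sys 37.2
ufs_mab_sys 37.3'

new='#fsdb experiment size duration
ufs_mab_sys 1024 37.2
ufs_mab_sys 1024 37.3'

# Both layouts yield the same output, even though duration moved from the
# second column to the third.
echo "$old" | pick_col duration
echo "$new" | pick_col duration
```

Each call ends its output with a comment line naming the step that produced
it, so chaining several such filters leaves a record of the whole pipeline
at the end of the data.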

Fsdb is built on flat-ASCII databases.  By storing data in simple text
files and processing it with pipelines, it is easy to experiment (in
the shell) and look at the output.  
To the best of my knowledge, the original implementation of
this idea was `/rdb', a commercial product described in the book
*UNIX relational database management: application development in the UNIX environment*
by Rod Manis, Evan Schaffer, and Robert Jorgensen (and
also at the web page http://www.rdb.com/).  Fsdb is an incompatible
re-implementation of their idea without any accelerated indexing or
forms support.  (But it's free, and probably has better statistics!)

Fsdb-2.x supports threading and will exploit multiple processors or cores,
and provides Perl-level support for input, output, and threaded-pipelines.

Fsdb-2.x requires Perl 5.8 to run.  
All commands have manual pages and provide usage with the `--help' option.
All commands are backed by an automated test suite.

The most recent version of Fsdb is available on the web at
http://www.isi.edu/~johnh/SOFTWARE/FSDB/index.html.

%package scripts
Summary: %module_name scripts
Group: Development/Perl
Requires: %name = %{?epoch:%epoch:}%version-%release

%description scripts
Command-line scripts and their manual pages for %module_name.

%prep
%setup -q -n %{module_name}-%{version}

%build
%perl_vendor_build

%install
%perl_vendor_install

%files
%doc README COPYING README.html
%perl_vendor_privlib/F*

%files scripts
%_man1dir/*
%_bindir/*

%changelog
