new Catalog - Catalog of SQL Phrasebooks

I have a philosophical issue with SQL Phrasebooks in general. This may
have to do with the types of applications I tend to write, so please
take me with a grain of salt.

My applications tend to have a DBI wrapper that provides two pieces of
functionality:
1) Business-level functions for accessing datastores (databases and other stuff)
2) Functionality for generating SQL on the fly, using somewhat complex
graph theory

The first is for things like "Given this username and password, can
the person log in?". The second is for reports. My reports aren't your
grandma's reports - you have a list of options to choose from and the
app figures out what SQL to run from there, including all the table
joins.

So, what need do I have for a phrasebook? I can understand, to some
degree, putting the statements from the first grouping into a
phrasebook, but they're already behind a layer of abstraction and in
one place. So, doesn't that mean I've satisfied the benefits of a
phrasebook?

Rob

Post by Terrence Brannon
What is a SQL Phrasebook?
A Perl module which separates SQL from your Perl code, making it
accessible via a unique piece of keyed text. Think of it as
Locale::KeyedText for relational databases.
Why would I need a SQL Phrasebook
* Better software engineering
Good software is designed in a DWIM fashion, where you spend time
doing what you mean as opposed to making repeated DWIS efforts. Any
large software project should not spend time defining sql in a
scalar and then accessing that scalar. This clutters your core code,
making it harder to follow, among other things. These tools provide
support for creating a separate library.
* Knowledge of tables accessed
It is important to know what tables are being accessed in your
project. This is important for numerous reasons. First, for
statistical purposes, highly accessed tables might need tuning.
Having all of your SQL in one place and having your code base
searchable for a single label simplifies the process of
understanding how your code is accessing tables. Second, if a table
changes, you need to know which SQL was using that table. Having all
the SQL in one place makes it easy to handle table changes.
* Testing
It is much easier to run model tests if all of your SQL is accessed
in the same fashion. It is also easier to group the tests by concept
* Tuning
If a database expert can stare at all of the SQL for a project in
one place, it makes it easier for her to decide how to restructure
the entire SQL set.... or even more exciting, he might create
several SQL sets for different purposes and the Perl programmer can
haul in the appropriate SQL set tuned for his task, all without
modifying the Perl code!
* Reuse
You drop your SQL in the library once, use as many times as you want
List of SQL Phrasebooks
* SQL::Catalog by Terrence Brannon
* Class::Phrasebook::SQL by Rani Pinchuk
* Data::Phrasebook by Barbie
* DBIx::Librarian by Jason May
* SQL::Library by Doug Gorley
Modules with a larger purpose
* DBIx::SQLEngine by Matthew Simon Ryan Cavaletto
* Class::DBI/Ima::DBI
* DBIx::Recordset
ERRATA
This file is
<http://www.metaperl.com/article-pod/Catalog-sql_phrasebooks>
AUTHOR
--
Carter's Compass: I know I'm on the right track when,
by deleting something, I'm adding functionality.
_______________________________________________
sw-design mailing list
http://metaperl.com/cgi-bin/mailman/listinfo/sw-design

Terrence Brannon

2005-04-11 13:42:37 UTC

Post by Rob Kinyon
I have a philosophical issue with SQL Phrasebooks in general. This may
have to do with the types of applications I tend to write, so please
take me with a grain of salt.
My applications tend to have a DBI wrapper that provides two pieces of

0) Cataloging your database connection information,
e.g. DBIx::Connect, Config::DBI, DBIx::DBH, DBIx::Password

Post by Rob Kinyon
1) Business-level functions for accessing datastores (databases and other stuff)

that sounds like DBIx::AnyDBD to me:

use DBIx::AnyDBD;

my $db = DBIx::AnyDBD->connect("dbi:Oracle:sid1",
"user", "pass", {}, "MyClass");

my $foo = $db->sales_to_date('2005-03');
my $blee = $db->month_to_date(2004-12');

Post by Rob Kinyon
2) Functionality for generating SQL on the fly, using somewhat complex

graph theory

I remember Ovid mentioning he did a similar thing at his previous
company. I don't believe either of you have released these powerful
modules to CPAN.

Post by Rob Kinyon
The first is for things like "Given this username and password, can
the person log in?". The second is for reports. My reports aren't your
grandma's reports - you have a list of options to choose from and the
app figures out what SQL to run from there, including all the table
joins.

any code samples to look at?

Post by Rob Kinyon
So, what need do I have for a phrasebook? I can understand, to some
degree, putting the statements from the first grouping into a
phrasebook, but they're already behind a layer of abstraction and in
one place. So, doesn't that mean I've satisfied the benefits of a
phrasebook?

yes, I see your point. You custom-generate SQL
dyanmically. SQL::Phrasebooks are for storing static SQL. They are
completely inappropriate for dynamic SQL.

--
Carter's Compass: I know I'm on the right track when,
by deleting something, I'm adding functionality.

Rob Kinyon

2005-04-11 14:29:21 UTC

Post by Rob Kinyon
My applications tend to have a DBI wrapper that provides two pieces of

0) Cataloging your database connection information,
e.g. DBIx::Connect, Config::DBI, DBIx::DBH, DBIx::Password

Config::DBI is something I would use in the future, and those modules
are useful in their own right and can be delegated to.

Post by Rob Kinyon
1) Business-level functions for accessing datastores (databases and other stuff)

use DBIx::AnyDBD;
my $db = DBIx::AnyDBD->connect("dbi:Oracle:sid1",
"user", "pass", {}, "MyClass");
my $foo = $db->sales_to_date('2005-03');
my $blee = $db->month_to_date(2004-12');

You're still writing the MyClass::Oracle and MyClass::MySQL. And,
that's not the point of encapsulating at this level. The point is to
write your business-level code in a business-speak. DBIx::AnyDBD
doesn't facilitate that. In fact, it somewhat complicates that effort.

Post by Rob Kinyon
2) Functionality for generating SQL on the fly, using somewhat complex

graph theory

I remember Ovid mentioning he did a similar thing at his previous
company. I don't believe either of you have released these powerful
modules to CPAN.

I have a few problems:
1) API
How should this be designed to be used? I know how we used it in the
one place I wrote it for. Once I write it for somewhere else, maybe
I'll have a better idea of general usage. But, this isn't a simple API
task.

2) Schema normalization
My code allowed for selectively denormalized tables, but the general
solution requires normalizations to at least 3rd normal, if not
Boyce-Codd. Otherwise, you cannot guarantee that the correct joins
will be made. (It's a relative of the halting problem.)

3) Schema discovery
I wrote mine to require that the tables and joins be listed in some
HoH. But, schema discovery (including joins based on FK's) would be
preferable. How should this be done? How should the user be able to
override it?

In other words, I whipped something out and never wrote any
general-purpose requirements for it. It did what I needed it to do,
but requires more infrastructure than a CPAN module should rightly
have. If you want to help me figure out the correct API / user
interface, I would love to release my code.

any code samples to look at?

Kinda-sorta. Basically, you would have the following structure:
1) Report base class
2) Report child class that implements the actual report
3) Parameters that the report child class declares as needed, each
with its own class

The report child class would specify the invariants that made this the
"Sales Report" - the select clauses and the minimally-sufficient where
clauses. Maybe a few groupbys. Then, the base class would handle the
parameters. This is where it would become the "Sales Report for Green
Widgets in the Foo Region grouped by Branch for the months of Janurary
and February sorted by total sales ascending". The parameters would
simply be added to the SQL statement as select, where, and group by
clauses. Then, the fancy-shmancy SQL builder would run, pulling in
tables and join statements as necessary into the FROM clauses. It
would run, then it would call an inflater function in the child class.
That inflater would take the flat dataset from the SQL statement and
convert it into whatever it needed to. This included calling other SQL
statements, maybe a few id->name functions, or whatever. It would then
call the parent's sorting routine, so it would sorted correctly.

Finally, it goes to the base-class's print function, which determines
if it should be HTML, PDF, or XLS, and outputs it appropriately.

The code for that is about 2000 lines long, across roughly 50 classes.
What would you like to look at?

yes, I see your point. You custom-generate SQL
dyanmically. SQL::Phrasebooks are for storing static SQL. They are
completely inappropriate for dynamic SQL.

About half of my SQL statements are usually static. But, I've already
abstracted them out of the business logic. So, why do I need a
phrasebook? In fact, I think I'm past a phrasebook in that the
business logic doesn't even know that the data is coming from a RDBMS.
All it cares is that the $foo object can get the needed data.

Rob

Adam Kennedy

2005-04-12 02:00:04 UTC

Post by Rob Kinyon
2) Functionality for generating SQL on the fly, using somewhat complex

graph theory

I remember Ovid mentioning he did a similar thing at his previous
company. I don't believe either of you have released these powerful
modules to CPAN.

And count me in as another person doing the same thing and not releasing
it to CPAN.

Quite frankly, you already HAVE stuff to do this in CPAN. It's
practically just another Class::DBI OO->DB persistant datastore for the
most part.

Class::DBI
SQL::Routine
Tangram
Rosetta
etc
etc
etc

CPAN has a number of perfectly workable modules as it is.

FWIW, in my case I've tried on a number of occasions to split it out to
CPAN, about 20-30 of my CPAN modules are spin-offs from my system, but
unfortunately, the SQL generation is tied to the data type objects,
which is tied to the widget library, which is all tied to the metadata
subsystem.

Summary, it simply can't be broken down into small enough pieces.

I would imagine others are in the same situation.

Adam K

Rob Kinyon

2005-04-12 02:57:51 UTC

Post by Adam Kennedy
Quite frankly, you already HAVE stuff to do this in CPAN. It's
practically just another Class::DBI OO->DB persistant datastore for the
most part.
Class::DBI
SQL::Routine
Tangram
Rosetta
etc
etc
etc
CPAN has a number of perfectly workable modules as it is.

I don't think any of them are the round peg I was looking for.

C::DBI is too slow. (I had to search million++ row tables with
multiple joins and return back in under 2 seconds on a low-midsize
machine.)

SQL::Routine / Rosetta (they're linked) are just waaaay too
complicated. The learning curve is so steep that it was easier for me
to roll my own than to understand their stuff.

Tangram is a OO-inheritance-to-relational mapper (which is dumb to
apply to the relational world, but that's another topic).

I tried to get my code into C::DBI, to facilitate multi-table joins
(which it doesn't do very well right now), but was put off by the
complexity of C::DBI (which I've never used past sandbox).

Post by Adam Kennedy
FWIW, in my case I've tried on a number of occasions to split it out to
CPAN, about 20-30 of my CPAN modules are spin-offs from my system, but
unfortunately, the SQL generation is tied to the data type objects,
which is tied to the widget library, which is all tied to the metadata
subsystem.
Summary, it simply can't be broken down into small enough pieces.

I'm not sure that's exactly right. I know my stuff is in very small
pieces. I think the problem is that there isn't very good
interoperability between my pieces and your pieces. No-one decomposes
the problem very well.

Now, it may be a good idea to get you, me, Ovid, and whomever else
involved in a solid decomposition of the problemspace and really lay
out some good specs and a solid API. I'd be up for it ... Adam?
Curtis? Anyone else?

Rob

Adam Kennedy

2005-04-12 03:17:16 UTC

Post by Rob Kinyon
Now, it may be a good idea to get you, me, Ovid, and whomever else
involved in a solid decomposition of the problemspace and really lay
out some good specs and a solid API. I'd be up for it ... Adam?
Curtis? Anyone else?

Sure, I'd be happy to chip in, although I have to say that I'm so
unbelievably overloaded at the moment, the prospect of adding yet
another major project makes me tired just thinking about it.

Once I've finished and 1.0'ed at least 2-3 of...

- Parsing Perl without perl (PPI)
- Parameter coercion (Param::Coerce)
- Data Products in CPAN (Data::Package)
- CVS Monitor 0.7
- My code generation startup
- My automotive industry startup
- Sleep

...THEN get back to me on a large database'ey project :)

Or maybe we can look at this as a Perl 6 module? :)

Adam K

Dave Rolsky

2005-04-12 05:30:48 UTC

Post by Rob Kinyon
C::DBI is too slow. (I had to search million++ row tables with
multiple joins and return back in under 2 seconds on a low-midsize
machine.)
SQL::Routine / Rosetta (they're linked) are just waaaay too
complicated. The learning curve is so steep that it was easier for me
to roll my own than to understand their stuff.
Tangram is a OO-inheritance-to-relational mapper (which is dumb to
apply to the relational world, but that's another topic).
I tried to get my code into C::DBI, to facilitate multi-table joins
(which it doesn't do very well right now), but was put off by the
complexity of C::DBI (which I've never used past sandbox).

Did you look at Alzabo at all? I think it does a much better job than
C::DBI of handling the generation of complex queries programmatically.
It's less complex than Rosetta, from what I can tell, but moreso than
C::DBI. And it's not an OO-RDBMS type of tool at all, even less so than
C::DBI.

-dave

/*===================================================
VegGuide.Org www.BookIRead.com
Your guide to all that's veg. My book blog
===================================================*/

Terrence Brannon

2005-04-12 10:59:47 UTC

Post by Rob Kinyon
C::DBI is too slow. (I had to search million++ row tables with
multiple joins and return back in under 2 seconds on a low-midsize
machine.)

yes, it is heavily and eagerly object-oriented. The thing I like about
DBIx::SQLEngine (http://search.cpan.org/~evo/DBIx-SQLEngine-0.93/) is
that it has a very direct convenience layer to DBI.

Post by Rob Kinyon
SQL::Routine / Rosetta (they're linked) are just waaaay too
complicated. The learning curve is so steep that it was easier for me
to roll my own than to understand their stuff.

There is a new player in the game which looks rather nice: (Rose::DB)

http://search.cpan.org/~jsiracusa/Rose-DB-0.0142/

Post by Rob Kinyon
I tried to get my code into C::DBI, to facilitate multi-table joins
(which it doesn't do very well right now), but was put off by the
complexity of C::DBI (which I've never used past sandbox).

I find CDBI very straightforward and convenient. The docs are a bit
light at times. And unfortunately there is not good handling of the
vendor dependant syntax for LIMIT and GROUP BY. I don't like having to
tell it the relation to other tables and I don't trust the loader
modules. Alzabo wins big here because of the schema design tool: you
have visual confirmation that it understands your database.

Post by Dave Rolsky
Did you look at Alzabo at all? I think it does a much better job than
C::DBI of handling the generation of complex queries
programmatically.

Well Class::DBI::AbstractSearch makes SQL::Abstract available to
CDBI. It doesn't appear easy to go from form data (or let's just stay
a hash of query data) to query generation
in Alzabo. For example, you can take HTML::Mason's %ARGS and submit it
directly to DBIx::Recordset->Search() to issue a SQL query.

Post by Dave Rolsky
It's less complex than Rosetta, from what I can tell, but moreso
than C::DBI. And it's not an OO-RDBMS type of tool at all, even
less so than C::DBI.

The nail in the coffin for my interest in Alzabo came when I asked if
I could catalog and formulate my own pure SQL and use it with Alzabo
and the answer was no... this feels very limiting to me.

The most complete comparison of DBI wrappers I have seen is here:

http://search.cpan.org/~evo/DBIx-SQLEngine-0.93/SQLEngine/Docs/Related.pod

Is there an update on Alzabo's ability to easily switch between dev
and production database connection information? If so, then the table
in that doc needs updating.

--
Carter's Compass: I know I'm on the right track when,
by deleting something, I'm adding functionality.

Dave Rolsky

2005-04-12 16:15:50 UTC

Post by Terrence Brannon
Well Class::DBI::AbstractSearch makes SQL::Abstract available to
CDBI. It doesn't appear easy to go from form data (or let's just stay
a hash of query data) to query generation
in Alzabo. For example, you can take HTML::Mason's %ARGS and submit it
directly to DBIx::Recordset->Search() to issue a SQL query.

Well, it's hardly all that difficult. Assuming you're doing a one table

Post by Terrence Brannon
The nail in the coffin for my interest in Alzabo came when I asked if
I could catalog and formulate my own pure SQL and use it with Alzabo
and the answer was no... this feels very limiting to me.

Err, the answer is yes.

my $dbh = $schema->driver->handle;

Post by Terrence Brannon
http://search.cpan.org/~evo/DBIx-SQLEngine-0.93/SQLEngine/Docs/Related.pod
Is there an update on Alzabo's ability to easily switch between dev
and production database connection information? If so, then the table
in that doc needs updating.

If by that you're referring to what called "named configs" in that table,
then no. But bolting it on is easy enough, since connections are simply a
hash of parameters.

$schema->connect( My::Config->production_config );

$schema->connect( My::Config->dev_config );

I do this sort of thing all the time.

-dave

/*===================================================
VegGuide.Org www.BookIRead.com
Your guide to all that's veg. My book blog
===================================================*/

John Siracusa

2005-04-13 15:46:00 UTC

Post by Terrence Brannon
There is a new player in the game which looks rather nice: (Rose::DB)
http://search.cpan.org/~jsiracusa/Rose-DB-0.0142/

I think Rose::DB::Object is the module you want. Rose::DB::Object uses
Rose::DB for database abstraction, but Rose::DB doesn't model entire
schemas a la Alzabo. Right now, both modules are more focused on
object-relational mapping at the table, row, and column levels.

C::DBI is too slow. (I had to search million++ row tables with multiple
joins and return back in under 2 seconds on a low-midsize machine.)

FWIW, Rose::DB::Object is faster that Class::DBI for loading and saving
individual rows and fetching objects referenced by foreign keys. I
haven't benchmarked the multiple object searchs, but considering that
Rose::DB::Object::Manager's iterator doesn't fetch all rows before
starting the iteration like Class:DBI does, I suspect it'll fare well.
It also supports certain kinds of joins during searches.

Post by Terrence Brannon
I find CDBI very straightforward and convenient. The docs are a bit
light at times. And unfortunately there is not good handling of the
vendor dependant syntax for LIMIT and GROUP BY.

Rose::DB::Object::Manager handles one oddball LIMIT syntax so far (the
"FIRST ..." syntax from Informix) but only handles the standard GROUP
BY syntax.

It does, however, support AbstractSearch-like stuff, including nested
boolean logic (and more, e.g., searching for values inside Informix SET
columns and querying columns in auto-joined tables), but with the twist
that it can (optionally) parse and reformat your query values for you
based on the column types they reference, or even the particulars of
the object methods that service them. It's very nice to be able to be
able to throw, say, DateTime objects in as query parameters, let me
tell you :)

Rose::DB::Object[::Manager] also handles the auto-inlining of query
values that can't be bound using some DBD drivers at this time (e.g.,
"CURRENT" when using DBD::Informix). Example from the POD:

CREATE TABLE test (d DATETIME YEAR TO SECOND);

INSERT INTO test (d) VALUES (CURRENT); # okay

$sth = $dbh->prepare('INSERT INTO test (d) VALUES (?)');
$sth->execute('CURRENT'); # boom!

"DBD::Informix::st execute failed: SQL: -1262: Non-numeric
character in datetime or interval."

That used to drive me nuts.

Anyway, I plan to create a tutorial POD for Rose::DB::Object
eventually. In the meantime, please check it out. I find it very
useful and pleasant to use (surprise :)

-John

Rob Kinyon

2005-04-12 13:12:06 UTC

I've spent about an hour going through the Alzabo docs and I'm about
20% done and haven't even downloaded it. It looks amazingly powerful
and I think it solves the problem Adam, Curtis, and I each
independently solved, but in a different way.

The only problem I see with Alzabo (and CDBI and similar tools) is
that you have to design your project with it in mind. Converting an
existing codebase to Alzabo or CDBI is going to be impossible without
a complete rewrite.

Now, this isn't an objection to Alzabo specifically -
Class::MakeMethods and similar tools are the same way. I just don't
have that many new projects where I can really play with it that I
wouldn't use DBM::Deep for. :-)

Adam Kennedy

2005-04-13 00:41:58 UTC