PyTable RDBMS Wrapper

PyTable is a relational database wrapper based on BasicProperty's rich property-based modeling system.  PyTable provides a "thick" wrapper for PostgreSQL (and MySQL) database adapters which fills in missing functionality in the adapter layer to provide a uniform rich external API.

PyTable is not an Object-Relational mapper as such. Instead, it is a middle-level interface which takes care of much of the bookkeeping normally required of an application using the DB-API.  Code using PyTable is still focused on RDBMS development; it simply enjoys a number of considerable shortcuts in the development process.

PyTable was originally created as part of the wxPython Properties Distribution, as an attempt to enable simple RDBMS-based applications.  The code is under daily use, but is not yet widely adopted.  It is released under a BSD-style license and can be downloaded from the project page on SourceForge; you will need the BasicProperty package installed to use it.  There is not a lot of documentation at the moment, but the pydoc reference is available.

Usage

PyTable is not a very large project; it is basically just a wrapper around the DB-API to make DB-API-like programming a little easier.  However, the patterns of programming you use with PyTable tend to differ from those you'd use with raw DB-API code, as PyTable's bookkeeping services make it very convenient to refer to the schema objects in order to direct and control your SQL queries.

Creating Specifiers and Connections

PyTable generally uses propertied objects to represent database-related objects, including the specifiers that describe a database connection.  The DBSpecifier object is responsible for looking up the appropriate database driver, and using that driver to create a database connection.

For example, let's create a database specifier for connecting to our local database:

"""Simple connection example

This just shows how to use a dbspecifier
object to connect to a particular database.

Note: you need a database named test running
on the specified host, with given user and
password to run this script!
"""
from pytable import dbspecifier

specifier = dbspecifier.DBSpecifier(
    drivername = "PyPgSQL",
    host = "localhost",
    user = "test",
    password = "password",
    database = "test",
)

driver, connection = specifier.connect( )
print "Driver:", driver
print "Connection:", connection

Now, just to show that this really is a regular DB-API-style interface underneath, we will do something you would normally never do with PyTable: use the raw connection/cursor to prove that the connection is active.

cursor = connection.cursor()
cursor.execute( """SELECT 42;""" )
print "Life:", cursor.fetchall()

And that's it.  Selecting a different database is simply a matter of specifying a different drivername (see the driver table below).  DBSpecifier objects can be stored in pickles, so you can allow your users to edit them and save the edited values to reconnect to the same database, as sketched below.
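For instance, here is a minimal sketch of round-tripping a specifier through a pickle; the connection values are the same illustrative ones used above:

import pickle
from pytable import dbspecifier

specifier = dbspecifier.DBSpecifier(
    drivername = "PyPgSQL",
    host = "localhost",
    user = "test",
    password = "password",
    database = "test",
)

# store the serialised specifier (e.g. in a configuration file)...
stored = pickle.dumps( specifier )
# ...then restore it later to reconnect to the same database
restored = pickle.loads( stored )
driver, connection = restored.connect( )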

Driver Name    Implementation Class                          Description
SQLite         pytable.pysqlite.sqlitedriver.SQLiteDriver    SQLite via PySQLite
MkSQL          pytable.mk.mkdriver.MkDriver                  Metakit via MkSQL (unfinished)
PyPgSQL        pytable.pypgsql.pgdriver.PGDriver             PostgreSQL via PyPgSQL
PyGreSQL       pytable.pygresql.pgdriver.PGDriver            PostgreSQL via PyGreSQL
psycopg        pytable.psycopg.psycodriver.PsycoDriver       PostgreSQL via psycopg
MySQL          pytable.mysql.mydriver.MyDriver               MySQL via MySQLdb

Creating Schemas Manually

Most of the time when using PyTable, you work with a predefined database schema tree.  This tree allows for looking up particular tables/fields/constraints at runtime, associating particular row/result-set classes with tables, and generally storing most of the relevant information about your database design in a convenient in-memory format.

For most application development, you create the database schema directly, declaring each table and field in Python code, along with the constraints and comments for each table as you go.  This example shows how you would create such a schema.

"""Example of manually creating a database schema

There are two normal ways to create database schemas
for use with pytable. This approach, (manually creating
the description using the schemabuilder module) is the
more common "application" development pattern, while the
reverse engineering mechanism is more commonly used for
"scripts" which need to deal with existing databases.
"""
from pytable.schemabuilder import *

The schemabuilder module is just a convenience wrapper around the classes defined in the dbschema module.  You will likely want to refer to the dbschema pydoc reference to see all of the properties associated with the various schema sub-types.  You can also check the table at the end of this section, which shows the schema classes and their associated schemabuilder aliases.


schema = database(
    name = "test",
    comment = """A simple testing database""",

Most schema objects have a comment property.  Currently these are used solely for documentation purposes, but eventually they may generate "COMMENT" declarations in the database.

    tables = [
        table(
            "pets",
            comment = """Storage for simple pets information""",
            fields = [
                field(
                    "pet_name", "text", 0, """The name of the pet""",
                    defaultValue = "'stringValue'",

The positional arguments to the "field" function are: field-name, SQL-field-type, SQL-field-size, and comment.

Note the use of the raw SQL syntax for the defaultValue declaration.  This allows you to include arbitrary SQL code, including calls out to sequences, functions and the like.

                    constraints = [ primary(), notNull() ],

Constraints on the field are specified as a list of constraint objects.

                ),
                field(
                    "pet_age", "integer", 0, """The age of the pet""",
                    constraints = [ notNull() ],
                ),
            ],
        ),
        table(
            "houses",
            comment = """Storage for simple house information""",
            fields = [
                field(
                    "house_id", "serial", 0, """Unique house identifier""",

The use of the serial data type limits this schema to use on PostgreSQL databases.  PyTable doesn't know anything about the data type, and won't warn you about such situations until you try to move the schema to another database.

                    constraints = [ primary(), notNull() ],
                ),
                field(
                    "type", "text", 0, """The type of the house""",
                    constraints = [ notNull() ],
                ),
            ],
            defaultRecords = [
                { 'type': 'cage' },
                { 'type': 'dog house' },
                { 'type': 'dog basket' },
                { 'type': 'cat basket' },
                { 'type': 'bowl' },
                { 'type': 'aquarium' },
            ],

The defaultRecords property stores a list of dictionaries (or dictionary-like objects) which are used to generate INSERT statements during SQL generation.  As of version 0.5.9, the INSERT statements are produced after all other table-related statements.

            indices = [
                index( unique=1, fields=('type', ) ),
            ],
        ),
        table(
            "house_pets",
            comment = """Stupid word-play mapping pet: house""",
            fields = [
                field(
                    'house_id', 'integer', 0, """Reference to the house""",
                    constraints = [
                        foreignKey(
                            "houses", # uses primary key by default...
                            onDelete = "CASCADE",
                            onUpdate = "CASCADE",
                        ),

Foreign key constraints are specified much the same as any other constraint; the first two positional arguments are the foreign table and the list of foreign fields to which the foreign key refers.  If you are using the primary key of the foreign table (as above), you may omit the foreign-fields argument entirely.  The onDelete and onUpdate properties of the constraint specify the reactions to referenced-record deletion and referenced-record key-value change, respectively.

                        notNull(),
                    ],
                ),
                field(
                    'pet_name', 'text', 0, """Reference to the pet""",
                ),
            ],
            constraints = [
                foreignKey(
                    "pets", # foreign table which constrains
                    ("pet_name",), # foreign fields which constrain
                    fields = ("pet_name",), # local fields constrained
                    onDelete = "SET NULL",
                    onUpdate = "CASCADE",
                ),

Table constraints are specified in almost the same way as field constraints, the difference being that the "fields" property of the constraint is now required, to specify which local fields are being constrained.

            ],
        ),
    ],
)
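
Once built, the schema is an ordinary tree of propertied objects.  As a quick illustrative sketch (using the same property names that were passed to the builder functions above), you can walk that tree like so:

# walk the schema tree, printing names and comments
for table in schema.tables:
    print 'Table:', table.name, '-', table.comment
    for field in table.fields:
        print '    Field:', field.name, '-', field.comment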
Class / Schema Builder Names (definition) / Description

DatabaseSchema
    database( name, tables=(), comment="", **named )
    Schema for an overall database object

IndexSchema
    index( fields=(), unique=1, **named )
    Schema for defining a particular index (on a particular table)

TableSchema
    table( name, fields=(), comment="", **named )
    High-level representation of a table's structure/schema

FieldSchema
    field( name, dbDataType, displaySize=0, comment="", **named )
    Schema for a particular field of a table

SequenceSchema
    sequence( name, comment="", **named ); alias: count
    A sequence object used for implementing e.g. serial fields

ForeignKeyConstraint
    foreignKey( foreignTable, foreignFields=(), comment="", **named ); aliases: foreign, references
    Foreign-key constraint for a field or table

CheckConstraint
    check( expression, **named )
    An SQL-statement CHECK constraint

PrimaryConstraint
    primary; alias: primaryKey
    (Possibly multi-field) primary-key unique constraint

UniqueConstraint
    unique
    (Possibly multi-field) unique-value constraint

NotNullConstraint
    notNull
    A field's not-null constraint

Generating SQL Statements from a Schema

Since we have manually created our schema above, we would likely want to generate SQL statements for building the schema inside a database.  This process is fairly straightforward: an SQLCreateStatements object is instantiated, and then called with the schema as its argument.  It produces a sequence of SQL statements which will create the schema through a database connection.  (Note that the SQL statements do not include creating the database itself, only the tables and sequences within the database schema.)

"""Generate DB structure from manual declaration

This example imports the previously created
example schema and generates the SQL statements
required to build that schema.

The sqlgeneration module provides the actual SQL
statement generation. The sqlquery module is
used to run the SQL statements.
"""
from pytable import dbspecifier, sqlgeneration, sqlquery
import example_schema

specifier = dbspecifier.DBSpecifier(
    drivername = "PyPgSQL",
    host = "localhost",
    user = "test",
    password = "password",
    database = "test",
)

driver, connection = specifier.connect( )

generator = sqlgeneration.SQLCreateStatements(
    # driver can be used to customise what's generated...
    driver,
)
# call the generator with the schema as argument
statements = generator( example_schema.schema )
for statement in statements:
    print statement # just so we see what's being executed
    sqlquery.SQLQuery( sql = statement )( connection )

# as with the DB-API, autoCommit is false by default, so
# everything we've done will disappear when we exit...

In real-world situations, you would normally use SQLDropStatements, SQLCreateStatements, SQLGrantStatements and SQLRevokeStatements (in that order) to drop any previous tables/sequences, rebuild the schema, and establish some useful access restrictions.  All of the statement generators work in the same way, save that the Grant and Revoke generators have three extra properties: privileges (a list of privilege-description strings), users (a list of user/group names) and isGroup (whether the names are groups or users).  A sketch of the full cycle follows.
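This is a minimal sketch of that cycle.  It assumes the driver, connection and example_schema from the example above, assumes the generators accept their properties as constructor keywords (as propertied objects generally do in this package), and uses illustrative privilege and user values:

from pytable import sqlgeneration, sqlquery

generators = [
    sqlgeneration.SQLDropStatements( driver ),
    sqlgeneration.SQLCreateStatements( driver ),
    sqlgeneration.SQLGrantStatements(
        driver,
        privileges = [ 'SELECT', 'INSERT', 'UPDATE', 'DELETE' ], # illustrative privilege strings
        users = [ 'test' ], # illustrative user name
        isGroup = 0,
    ),
]
for generator in generators:
    for statement in generator( example_schema.schema ):
        sqlquery.SQLQuery( sql = statement )( connection )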

Query Objects (Usage)

Generating SQL query strings tends to be a rather tedious and error-prone task.  However, people who work with databases professionally do tend to know how to sling SQL, so we'd like to let them use that SQL knowledge to query their databases.  In particular, we need a way to build up query strings from two different types of elements: fragments of raw SQL syntax (such as where-clauses) on the one hand, and data values which must be properly escaped by the driver on the other.

PyTable's approach to this set of problems is the SQLQuery object.  This class is both a base class from which you can derive "complex" queries, and a readily-used mechanism for performing simple queries.

SQLQuery objects have two levels of escaping/substitution of named parameters: fragments of SQL syntax are substituted directly into the query string during a first pass (single-percent formatting, e.g. %(whereClause)s), while data values are passed to the driver for proper escaping (doubled-percent formatting, e.g. %%(pet_name)s).

SQLQuery objects are also callable objects which can have their __call__ method overridden, and/or can have their "processResults" method overridden to alter how the result cursor is treated.

Here's a simple example of using an SQLQuery object without subclassing:

import build, example_schema
from pytable import sqlquery

query = sqlquery.SQLQuery(
    sql = """INSERT INTO
        pets(pet_name,pet_age)
    VALUES
        (%%(pet_name)s,%%(pet_age)s);""",
)

Note the use of %% formatting for raw data values.  We don't want to substitute the values in as raw SQL (a single %); we want them to go through the driver's data-value escaping.

query(
    build.connection,

Note the need to pass in a connection or cursor object as the first parameter.

    pet_name = 'Tweety',
    pet_age = 3,
)

cursor = sqlquery.SQLQuery(
sql = """SELECT * FROM pets %(whereClause)s;""",

Note the single % for fragments of SQL syntax.

)(
    build.connection,
    whereClause = "WHERE pet_name = %(name)s",

Note that, within fragments which are substituted into the query, data-value substitution parameters need only a single %.  In other words, the fragment shows up after the first substitution pass, so it needs only the single escaping.

    name = "Tweety",
)
for row in cursor:
    print 'Row:', row

This last works because the dbcursor class provides an iteration method.

The SQLQuery object has two properties: sql, which stores the SQL query string, and debug, a boolean indicating whether the query should print debugging information to the console when run.  The debugging information is the query after its first substitution pass (i.e. just before the driver is given the data values to escape), together with the set of arguments to the query at that point.  More precisely, the debug print happens just before the cursor.execute operation.
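For instance, a minimal sketch of enabling debugging (reusing the build module's connection from above):

query = sqlquery.SQLQuery(
    sql = """SELECT * FROM pets;""",
    debug = 1, # print the substituted query and its arguments before execution
)
cursor = query( build.connection )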

Query Objects (Subclassing)

SQLQuery objects are designed to be sub-classed.  In particular, we would often like to be able to specify the SQL statement, pre-processing of the query arguments, and/or post-processing of the query results as a single reusable object.

Normally we sub-class the query objects in order to create a reusable query which requires a great deal of parameterisation or post-processing of the results, and which is used from a number of different places.  Customising the SQLQuery object normally occurs by overriding either __call__ or processResults.

The first sample demonstrates overriding __call__ to pre-process query arguments, in this case, to allow either the primary key of a table, or a row-record for the table to be passed as the determinant for a WHERE clause.

from pytable import sqlquery

class FindHouses( sqlquery.SQLQuery ):
    """Queries database for houses by pet_name or pet
    """
    sql = """SELECT h.*
    FROM
        houses h, house_pets hp
    WHERE
        hp.pet_name = %%(pet)s -- note use of doubled percent signs!
    AND
        h.house_id = hp.house_id
    ;"""
    def __call__( self, cursor, pet, **named ):
        """Coerce pet to pet_name and then execute query"""
        if not isinstance( pet, (str,unicode)):
            name = getattr( pet, 'pet_name', None )
            if name is None or not isinstance( name, (str,unicode)):
                raise TypeError( """%r instance %s could not be coerced to a pet_name"""%(
                    pet.__class__, pet,
                ))
            pet = name
        return super( FindHouses, self ).__call__(
            cursor,
            pet = pet,
            **named
        )

Note that the sql property of the sqlquery.SQLQuery class is being shadowed by the "sql" attribute defined in the subclass.

The second sample demonstrates overriding processResults to return something other than the default cursor object from a query:

class ListDatabases( sqlquery.SQLQuery ):
    """Queries PostgreSQL server for list of database-names

    returns a simple list of string names
    """
    sql = """SELECT datname FROM pg_database
    WHERE datname != 'template1' AND datname != 'template0'
    """
    def processResults( self, cursor, **namedarguments ):
        """Read database name list from cursor"""
        return [ row[0] for row in cursor.fetchall() ]

processResults receives the cursor which executed the query as its first positional argument, and all other arguments to the query as named arguments.  Putting both subclasses to work looks something like this:
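(A short illustrative sketch, assuming the connection and the sample records from the earlier examples.)

# look up the houses for a pet by name; the default processResults
# returns the cursor, which supports iteration...
for row in FindHouses()( connection, pet = 'Tweety' ):
    print 'House:', row

# ...while ListDatabases post-processes the cursor into a list of names
print 'Databases:', ListDatabases()( connection )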

LazyResultSet and DBResultSet

The result of an SQLQuery is normally a DBCursor object.  This is a fairly minimal wrapper around a DB-API cursor, providing generator-based iteration and a pointer back to the connection from which the cursor was created.  Often we want more list-like operation, particularly the ability to access results in random-access order, and we'd often like to do this without loading the whole result-set at once.

LazyResultSet provides these semantics, loading and wrapping rows from the cursor and caching the wrapped rows to allow for random-access operation.  The DBResultSet object simply adds a pointer to a table-schema to which attribute access is delegated.
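To make the lazy-caching idea concrete, here is a purely illustrative sketch (not the library's actual implementation) of random access over a forward-only cursor:

class IllustrativeLazyRows( object ):
    """Illustrative only: cache rows from a cursor for random access"""
    def __init__( self, cursor ):
        self.cursor = cursor
        self.rows = []
    def __getitem__( self, index ):
        # fetch (and cache) rows until the requested index is available
        while index >= len( self.rows ):
            row = self.cursor.fetchone()
            if row is None:
                raise IndexError( index )
            self.rows.append( row )
        return self.rows[ index ]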

Table and Field Implementation Classes

The TableSchema and FieldSchema objects and their methods, along with the DBRow and DBProperty classes, are documented in the pydoc reference.

Database Introspection Facilities

PyTable has support for querying PostgreSQL, SQLite and MySQL databases to obtain general schema information about the current database.  These queries include listing the active databases, the tables within the current database, and the fields within those tables.  The introspection facilities are used like so:

driver, connection = specifier.connect()
# not every driver implements every introspection method,
# hence the hasattr guards before each call...
if hasattr( driver, 'listDatabases' ):
    result = driver.listDatabases( connection )
    assert result, """0 databases on server? %s"""%(specifier)
if hasattr( driver, 'listTables' ):
    tables = driver.listTables( connection )
if hasattr( driver, 'listIndices' ):
    for table in tables:
        result = driver.listIndices( connection, tableName=table )
if hasattr( driver, 'attrDescription' ):
    for table in tables:
        cursor = driver.attrDescription( connection, tableName=table )
if hasattr( driver, 'attrDefault' ):
    for table in tables:
        cursor = driver.attrDefault( connection, tableName=table )

Changes

There are two projects within the PyTable SourceForge project: the core PyTable package, and a Zope Database Adapter (DA).  This changelog does not cover changes from before the split out of the wxPython Properties Distribution.

PyTable

Version 0.8.3

Version 0.8.2

Version 0.8.1

Version 0.8.0

Version 0.7.12

Version 0.7.11

Version 0.7.10

Version 0.7.9

Version 0.7.8

Version 0.7.7

Version 0.7.6

Version 0.7.5

Version 0.7.2

Version 0.7.1

Version 0.7.0

Version 0.6.7

Version 0.6.6

Version 0.6.5

Version 0.5.9

Version 0.5.8

Version 0.5.7

PyTableDA (Zope Database Adapter for PyTable)

Version 0.1.3

Version 0.1.2

Version 0.1.1

Version 0.1.0

License

PyTable RDBMS Middleware
Copyright (c) 2002-2004, Michael C. Fletcher
All rights reserved.

THIS SOFTWARE IS NOT FAULT TOLERANT AND SHOULD NOT BE USED IN ANY
SITUATION ENDANGERING HUMAN LIFE OR PROPERTY.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:

Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.

Redistributions in binary form must reproduce the above
copyright notice, this list of conditions and the following
disclaimer in the documentation and/or other materials
provided with the distribution.

The name of Michael C. Fletcher may not be used to endorse or
promote products derived from this software without specific
prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
COPYRIGHT HOLDERS AND CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED
OF THE POSSIBILITY OF SUCH DAMAGE.