(Last updated 2006.09.08 at 16:35 UTC)
The Firebird relational database engine has a large feature set, conforms closely to SQL standards, and is flexible enough to operate either as a standalone server or as an embedded library on diverse platforms. In spite of this versatility, the database is easy to use--almost self-managing.
The Python programming language supports numerous paradigms, is suitable for constructing both small and large programs, and integrates well with native C and C++ libraries. Despite the versatility of the language, well written Python code achieves an exceptional lucidity that has led some to call the language "executable pseudocode".
These two top-flight software tools intersect in a library named KInterbasDB. KInterbasDB implements Python's standard Database API 2.0, but also extends far beyond, to cover almost all of Firebird's extensive native client API. KInterbasDB strives to deliver the power of Firebird into the hands of the Python programmer without compromising the qualities of either tool.
This Usage Guide is not a tutorial on Python, SQL, or Firebird; rather, it is a topical presentation of KInterbasDB's feature set, with example code to demonstrate basic usage patterns. This guide is meant to be consumed in conjunction with the Python Database API Specification and the Firebird documentation, especially the professional, seven-volume manual for Firebird's commercial ancestor, Interbase®.
The table of contents presents a structural overview of this document.
DATETIME type comparison singleton
KInterbasDB's deferred loading of dynamic type translators causes this singleton to behave in violation of the standard until the kinterbasdb.init function has been called (whether explicitly or implicitly). For more information, see this section.
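If a program needs the singleton to behave standardly from the start, it can initialize KInterbasDB explicitly. A minimal sketch; the type_conv code shown is an assumption (see the documentation of kinterbasdb.init for the actual codes):

import kinterbasdb

# Explicit initialization. The type_conv code here is an assumed example;
# consult the kinterbasdb.init documentation for the supported codes.
kinterbasdb.init(type_conv=200)

# From this point on, the DATETIME comparison singleton behaves standardly.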
Cursor class, nextset method
This method is not implemented, because the database engine does not support opening multiple result sets simultaneously with a single cursor.
Cursor class, arraysize attribute
As required by the spec, the value of this attribute is observed with respect to the fetchmany method. However, changing the value of this attribute does not make any difference in fetch efficiency, because the database engine only supports fetching a single row at a time.
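A brief sketch of that interaction (assuming an existing connection con and the languages table from the tutorial later in this guide):

# A minimal sketch; assumes an existing connection `con` and the
# languages table from the tutorial later in this guide.
cur = con.cursor()
cur.execute("select name, year_released from languages")

cur.arraysize = 10           # fetchmany with no argument now returns up to 10 rows
firstBatch = cur.fetchmany() # equivalent to cur.fetchmany(10)

# Only the batch size returned by fetchmany changes; the engine still
# fetches one row at a time internally, so efficiency is unaffected.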
setinputsizes method
Although this method is present, it does nothing, as allowed by the spec.
setoutputsize method
Although this method is present, it does nothing, as allowed by the spec.
KInterbasDB offers a large feature set beyond the minimal requirements of the Python DB API. Most of these extensions are documented in the section of this document entitled Native Database Engine Features and Extensions Beyond the Python DB API.
This section attempts to document only those features that overlap with the DB API, or are too insignificant to warrant their own subsection elsewhere.
connect function
This function supports the following optional keyword arguments in addition to those required by the spec:
role - for connecting to a database with a specific SQL role (see page 92 of the Interbase® 6 Operations Guide for a discussion of Interbase® roles).
Example:
kinterbasdb.connect(dsn='host:/path/database.db', user='limited_user', password='pass', role='MORE_POWERFUL_ROLE')
charset - for explicitly specifying the character set of the connection. See page 221 of the Interbase® 6 Data Definition Guide for a list of available character sets, and this FAQ for information on handling extended character sets with KInterbasDB.
Example:
kinterbasdb.connect(dsn='host:/path/database.db', user='sysdba', password='pass', charset='UNICODE_FSS')
dialect - for explicitly specifying the SQL dialect of the connection. In KInterbasDB 2.x, the default dialect was 1 (the compatibility dialect for Interbase® 5.5 and earlier). In KInterbasDB 3.x, the default dialect is 3 (the most featureful dialect, ideal for Interbase® 6.0+ and Firebird). If you want to connect to Interbase® 5.5 or earlier, you must explicitly set this argument's value to 1. Dialect 2 is a transitional dialect that is normally used only during ports from IB < 6 to IB >= 6 or Firebird.
Example:
kinterbasdb.connect(dsn='host:/path/database.db', user='sysdba', password='pass', dialect=1)
Connection class
charset attribute (read-only)
The character set of the connection (set via the charset parameter of kinterbasdb.connect).
See page 221 of the Interbase® 6 Data Definition Guide for a list of available character sets, and this FAQ for information on handling extended character sets with KInterbasDB.
dialect attribute
This integer attribute indicates which SQL dialect the connection is using. You should not change a connection's dialect; instead, discard the connection and establish a new one with the desired dialect. For more information, see the documentation of the dialect argument of the connect function.
server_version attribute (read-only)
The version string of the database server to which this connection is connected. For example, a connection to Firebird 1.0 on Windows has the following server_version:
WI-V6.2.794 Firebird 1.0
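These read-only attributes can be inspected directly on a live connection; a minimal sketch (the connection parameters are hypothetical):

import kinterbasdb

con = kinterbasdb.connect(dsn='host:/path/database.db',
    user='sysdba', password='pass'
  )

print 'Server version:', con.server_version
print 'Charset       :', con.charset # as set via the charset parameter, if any
print 'Dialect       :', con.dialect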
execute_immediate method
Executes a statement without caching its prepared form. The statement must not be of a type that returns a result set. In most cases (especially cases in which the same statement--perhaps a parameterized statement--is executed repeatedly), it is better to create a cursor using the connection's cursor method, then execute the statement using one of the cursor's execute methods.
Arguments:
sql - string containing the SQL statement to execute.
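A minimal sketch of execute_immediate (the DDL statement and table name are hypothetical):

# Assumes an existing connection `con`.
# execute_immediate suits one-off statements that return no result set:
con.execute_immediate("create table temp_journal (entry varchar(100))")
con.commit()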
precision_mode attribute
Although this attribute is present in KInterbasDB 3.1+ and works in a backward-compatible fashion, it is deprecated in favor of the more general dynamic type translation feature.
commit and rollback methods
The commit and rollback methods accept an optional boolean parameter retaining (default False) that indicates whether the transactional context of the transaction being resolved should be recycled. For details, see the Advanced Transaction Control: Retaining Operations section of this document.
The rollback method accepts an optional string parameter savepoint that causes the transaction to roll back only as far as the designated savepoint, rather than rolling back entirely. For details, see the Advanced Transaction Control: Savepoints section of this document.
Cursor class
description attribute
KInterbasDB makes absolutely no guarantees about description except those required by the Python Database API Specification 2.0 (that is, description is either None or a sequence of 7-element sequences). Therefore, client programmers should not rely on description being an instance of a particular class or type.
KInterbasDB provides several named positional constants to be used as indices into a given element of description. The contents of all description elements are defined by the DB API spec; these constants are provided merely for convenience:

DESCRIPTION_NAME
DESCRIPTION_TYPE_CODE
DESCRIPTION_DISPLAY_SIZE
DESCRIPTION_INTERNAL_SIZE
DESCRIPTION_PRECISION
DESCRIPTION_SCALE
DESCRIPTION_NULL_OK
Here is an example of accessing the name of the first field in the description of cursor cur:
nameOfFirstField = cur.description[0][kinterbasdb.DESCRIPTION_NAME]
For more information, see the documentation of Cursor.description in the DB API Specification.
rowcount attribute
Although KInterbasDB's Cursors implement this attribute, the database engine's own support for the determination of "rows affected"/"rows selected" is quirky. The database engine only supports the determination of rowcount for INSERT, UPDATE, DELETE, and SELECT statements. When stored procedures become involved, row count figures are usually not available to the client.

Determining rowcount for SELECT statements is problematic: the rowcount is reported as zero until at least one row has been fetched from the result set, and the rowcount is misreported if the result set is larger than 1302 rows. The server apparently marshals result sets internally in batches of 1302, and will misreport the rowcount for result sets larger than 1302 rows until the 1303rd row is fetched, for result sets larger than 2604 rows until the 2605th row is fetched, and so on, in increments of 1302.

As required by the Python DB API Spec, the rowcount attribute "is -1 in case no executeXX() has been performed on the cursor or the rowcount of the last operation is not determinable by the interface".
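A short sketch of the SELECT quirk described above (assuming an existing connection con and the languages table from the tutorial later in this guide):

# A sketch; assumes an existing connection `con` and the languages
# table from the tutorial later in this guide.
cur = con.cursor()
cur.execute("select * from languages")
print cur.rowcount # reported as 0: no row has been fetched yet
cur.fetchone()
print cur.rowcount # now reflects the engine's (possibly approximate) count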
fetch* methods
KInterbasDB makes absolutely no guarantees about the return value of the fetchone/fetchmany/fetchall methods except that it is a sequence indexed by field position. KInterbasDB makes absolutely no guarantees about the return value of the fetchonemap/fetchmanymap/fetchallmap methods (documented below) except that it is a mapping of field name to field value. Therefore, client programmers should not rely on the return value being an instance of a particular class or type.
fetchonemap method
This method is just like the standard fetchone method of the DB API, except that it returns a mapping of field name to field value, rather than a sequence.
fetchmanymap method
This method is just like the standard fetchmany method of the DB API, except that it returns a sequence of mappings of field name to field value, rather than a sequence of sequences.
fetchallmap method
This method is just like the standard fetchall method of the DB API, except that it returns a sequence of mappings of field name to field value, rather than a sequence of sequences.
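A short sketch of the mapping-based fetch methods (assuming an existing cursor cur and the languages table from the tutorial later in this guide):

# Assumes an existing cursor `cur` and the languages table from the
# tutorial later in this guide.
cur.execute("select name, year_released from languages")
row = cur.fetchonemap() # a mapping rather than a sequence
print row['name'], row['year_released']

cur.execute("select name, year_released from languages")
for row in cur.fetchallmap():
    print '%(name)s was released in %(year_released)d.' % row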
iter/itermap methods
These methods are equivalent to the fetchall and fetchallmap methods, respectively, except that they return iterators rather than materialized sequences. iter and itermap are exercised in this example.
This brief tutorial aims to get the reader started by demonstrating elementary usage of KInterbasDB. It is not a comprehensive Python Database API tutorial, nor is it comprehensive in its coverage of anything else.
The numerous advanced features of KInterbasDB are covered in another section of this document, which is not in a tutorial format, though it is replete with examples.
A database connection is typically established with code such as this:
import kinterbasdb

# The server is named 'bison'; the database file is at '/temp/test.db'.
con = kinterbasdb.connect(dsn='bison:/temp/test.db', user='sysdba', password='pass')

# Or, equivalently:
con = kinterbasdb.connect(
    host='bison', database='/temp/test.db',
    user='sysdba', password='pass'
  )
Suppose we want to connect to an Interbase® 5.5 server, specifying UNICODE_FSS as the character set of the connection:
import kinterbasdb

con = kinterbasdb.connect(
    dsn='bison:/temp/test.db',
    user='sysdba', password='pass',
    dialect=1,            # necessary for Interbase® < 6.0
    charset='UNICODE_FSS' # specify a character set for the connection
  )
For this section, suppose we have a table defined and populated by the following SQL code:
create table languages
(
  name          varchar(20),
  year_released integer
);

insert into languages (name, year_released) values ('C', 1972);
insert into languages (name, year_released) values ('Python', 1991);
This example shows the simplest way to print the entire contents of the languages table:
import kinterbasdb

con = kinterbasdb.connect(dsn='/temp/test.db', user='sysdba', password='masterkey')

# Create a Cursor object that operates in the context of Connection con:
cur = con.cursor()

# Execute the SELECT statement:
cur.execute("select * from languages order by year_released")

# Retrieve all rows as a sequence and print that sequence:
print cur.fetchall()
Sample output:
[('C', 1972), ('Python', 1991)]
Here's another trivial example that demonstrates various ways of fetching a single row at a time from a SELECT-cursor:
import kinterbasdb

con = kinterbasdb.connect(dsn='/temp/test.db', user='sysdba', password='masterkey')
cur = con.cursor()

SELECT = "select name, year_released from languages order by year_released"

# 1. Iterate over the rows available from the cursor, unpacking the
# resulting sequences to yield their elements (name, year_released):
cur.execute(SELECT)
for (name, year_released) in cur:
    print '%s has been publicly available since %d.' % (name, year_released)

# 2. Equivalently:
cur.execute(SELECT)
for row in cur:
    print '%s has been publicly available since %d.' % (row[0], row[1])

# 3. Using mapping-iteration rather than sequence-iteration:
cur.execute(SELECT)
for row in cur.itermap():
    print '%(name)s has been publicly available since %(year_released)d.' % row
Sample output:
C has been publicly available since 1972.
Python has been publicly available since 1991.
C has been publicly available since 1972.
Python has been publicly available since 1991.
C has been publicly available since 1972.
Python has been publicly available since 1991.
The following program is a simplistic table printer (applied in this example to languages):
import kinterbasdb as k

TABLE_NAME = 'languages'
SELECT = 'select * from %s order by year_released' % TABLE_NAME

con = k.connect(dsn='/temp/test.db', user='sysdba', password='masterkey')
cur = con.cursor()
cur.execute(SELECT)

# Print a header.
for fieldDesc in cur.description:
    print fieldDesc[k.DESCRIPTION_NAME].ljust(fieldDesc[k.DESCRIPTION_DISPLAY_SIZE]),
print # Finish the header with a newline.
print '-' * 78

# For each row, print the value of each field left-justified within
# the maximum possible width of that field.
fieldIndices = range(len(cur.description))
for row in cur:
    for fieldIndex in fieldIndices:
        fieldValue = str(row[fieldIndex])
        fieldMaxWidth = cur.description[fieldIndex][k.DESCRIPTION_DISPLAY_SIZE]
        print fieldValue.ljust(fieldMaxWidth),
    print # Finish the row with a newline.
Sample output:
NAME                 YEAR_RELEASED
------------------------------------------------------------------------------
C                    1972
Python               1991
Let's insert more languages:
import kinterbasdb

con = kinterbasdb.connect(dsn='/temp/test.db', user='sysdba', password='masterkey')
cur = con.cursor()

newLanguages = [
    ('Lisp',  1958),
    ('Dylan', 1995),
  ]

cur.executemany("insert into languages (name, year_released) values (?, ?)",
    newLanguages
  )

# The changes will not be saved unless the transaction is committed explicitly:
con.commit()
Note the use of a parameterized SQL statement above. When dealing with repetitive statements, this is much faster and less error-prone than assembling each SQL statement manually. (You can read more about parameterized SQL statements in the section on Prepared Statements.)
After running the insertion program above, the table printer from the previous example would print:
NAME                 YEAR_RELEASED
------------------------------------------------------------------------------
Lisp                 1958
C                    1972
Python               1991
Dylan                1995
Interbase® and Firebird support stored procedures written in a proprietary procedural SQL language. IB/FB stored procedures can have input parameters and/or output parameters. Some databases support input/output parameters, where the same parameter is used for both input and output; IB/FB does not support this.
It is important to distinguish between procedures that return a result set and procedures that populate and return their output parameters exactly once. Conceptually, the latter "return their output parameters" like a Python function, whereas the former "yield result rows" like a Python generator.
IB/FB's server-side procedural SQL syntax
makes no such distinction, but client-side SQL code (and C API code)
must.
A result set is retrieved from a stored procedure by SELECTing from the procedure, whereas output parameters are retrieved with an EXECUTE PROCEDURE statement.
To retrieve a result set from a stored procedure with KInterbasDB, use code such as this:
cur.execute("select output1, output2 from the_proc(?, ?)", (input1, input2))

# Ordinary fetch code here, such as:
for row in cur:
    ... # process row

con.commit() # If the procedure had any side effects, commit them.
To execute a stored procedure and access its output parameters, use code such as this:
cur.callproc("the_proc", (input1, input2))

# If there are output parameters, retrieve them as though they were the
# first row of a result set. For example:
outputParams = cur.fetchone()

con.commit() # If the procedure had any side effects, commit them.
This latter is not very elegant; it would be preferable to access the procedure's output parameters as the return value of Cursor.callproc. The Python DB API specification requires the current behavior, however.
The Firebird engine stores a database in a fairly straightforward manner: as a single file or, if desired, as a segmented group of files.
The engine supports dynamic database creation via the SQL statement CREATE DATABASE, which is documented on page 49 of the Interbase® 6 Language Reference.

The engine also supports dropping (deleting) databases dynamically, but dropping is a more complicated operation than creating, for several reasons: an existing database may be in use by users other than the one who requests the deletion, it may have supporting objects such as temporary sort files, and it may even have dependent shadow databases. Although the database engine recognizes a DROP DATABASE SQL statement, support for that statement is limited to the isql command-line administration utility. However, the engine supports the deletion of databases via an API call, which KInterbasDB exposes to Python (see below).
KInterbasDB supports dynamic database creation and deletion via the module-level function create_database and the method Connection.drop_database. These are documented below, then demonstrated by a brief example.
create_database (function; member of kinterbasdb)
Creates a database according to the supplied CREATE DATABASE statement (a string, passed as the function's first argument; the statement itself specifies the database file, user, and password, as in the example below).
drop_database (method; member of kinterbasdb.Connection)
Deletes the database to which the connection is attached, and does so in a responsible fashion. This method has no arguments.
Example program:
import kinterbasdb

con = kinterbasdb.create_database(
    "create database '/temp/db.db' user 'sysdba' password 'pass'"
  )
con.drop_database()
The database engine features a distributed, interprocess communication mechanism based on messages called database events. Chapter 11 of the Interbase® 6 API Guide describes database events this way:
[A database event is] a message passed from a trigger or stored procedure to an application to announce the occurrence of a specified condition or action, usually a database change such as an insertion, modification, or deletion of a record.
The Interbase® [and Firebird] event mechanism enables applications to respond to actions and database changes made by other, concurrently running applications without the need for those applications to communicate directly with one another, and without incurring the expense of CPU time required for periodic polling to determine if an event has occurred.
Anything that can be accomplished with database events can also be implemented using other techniques, so why bother with events? Since you've chosen to write database-centric programs in Python rather than assembly language, you probably already know the answer to this question, but let's illustrate.
A typical application for database events is the handling of administrative messages. Suppose you have an administrative message database with a messages table, into which various applications insert timestamped status reports. It may be desirable to react to these messages in diverse ways, depending on the status they indicate: to ignore them, to initiate the update of dependent databases upon their arrival, to forward them by e-mail to a remote administrator, or even to set off an alarm so that on-site administrators will know a problem has occurred.
It is undesirable to tightly couple the program whose status is being reported (the message producer) to the program that handles the status reports (the message handler). There are obvious losses of flexibility in doing so. For example, the message producer may run on a separate machine from the administrative message database and may lack access rights to the downstream reporting facilities (e.g., network access to the SMTP server, in the case of forwarded e-mail notifications). Additionally, the actions required to handle status reports may themselves be time-consuming and error-prone, as in accessing a remote network to transmit e-mail.
In the absence of database event support, the message handler would probably be implemented via polling. Polling is simply the repetition of a check for a condition at a specified interval. In this case, the message handler would check in an infinite loop to see whether the most recent record in the messages table was more recent than the last message it had handled. If so, it would handle the fresh message(s); if not, it would go to sleep for a specified interval, then loop.
The polling-based implementation of the message handler is fundamentally flawed. Polling is a form of busy-wait; the check for new messages is performed at the specified interval, regardless of the actual activity level of the message producers. If the polling interval is lengthy, messages might not be handled within a reasonable time period after their arrival; if the polling interval is brief, the message handler program (and there may be many such programs) will waste a large amount of CPU time on unnecessary checks.
The database server is necessarily aware of the exact moment when a new message arrives. Why not let the message handler program request that the database server send it a notification when a new message arrives? The message handler can then efficiently sleep until the moment its services are needed. Under this event-based scheme, the message handler becomes aware of new messages at the instant they arrive, yet it does not waste CPU time checking in vain for new messages when there are none available.
Recall from Chapter 11 of the Interbase® 6 API Guide that
[A database event is] a message passed from a trigger or stored procedure to an application to announce the occurrence of a specified condition or action, usually a database change such as an insertion, modification, or deletion of a record.
To notify any interested listeners that a specific event has occurred, issue the POST_EVENT statement (see page 176 of the Interbase® 6 Language Reference). The POST_EVENT statement has one parameter: the name of the event to post.
In the preceding example of the administrative message database, POST_EVENT might be used from an after insert trigger on the messages table, like this:
create trigger trig_messages_handle_insert
for messages
after insert
as
begin
POST_EVENT 'new_message';
end
Note that the physical notification of the client process does not occur until the transaction in which the POST_EVENT took place is actually committed. Therefore, multiple events may conceptually occur before the client process is physically informed of even one occurrence.
Furthermore, the database engine makes no guarantee that clients will be informed of events in the same groupings in which they conceptually occurred. If, within a single transaction, an event named event_a is posted once and an event named event_b is posted once, the client may receive those posts in separate "batches", despite the fact that they occurred in the same conceptual unit (a single transaction). This also applies to multiple occurrences of the same event within a single conceptual unit: the physical notifications may arrive at the client separately.
Note: If you don't care about the gory details of event notification, skip to the section that describes KInterbasDB's Python-level event handling API.
The Interbase®/Firebird C client library offers two forms of event notification. The first form is synchronous notification, by way of the function isc_wait_for_event. This form is admirably simple for a C programmer to use, but is inappropriate as a basis for KInterbasDB's event support, chiefly because it's not sophisticated enough to serve as the basis for a comfortable Python-level API.
The other form of event notification offered by the database client library is asynchronous, by way of the functions isc_que_events (note that the name of that function is misspelled), isc_cancel_events, and others.
The details are as nasty as they are numerous, but the essence of using asynchronous notification from C is as follows:
1. Call isc_event_block to create a formatted binary buffer that will tell the server which events the client wants to listen for.
2. Call isc_que_events (passing the buffer created in the previous step) to inform the server that the client is ready to receive event notifications, and provide a callback that will be asynchronously invoked when one or more of the registered events occurs.
3. [The thread that called isc_que_events to initiate event listening must now do something else.]
4. When the callback is invoked, call the isc_event_counts function to determine how many times each of the registered events has occurred since the last call to isc_event_counts (if any).
5. [The callback is invoked in a thread started by the database client library, not in the thread that called isc_que_events.]
6. Call isc_que_events again in order to receive future notifications. Future notifications will invoke the callback again, effectively "looping" the callback thread back to Step 4.
The KInterbasDB database event API comprises the following: the method Connection.event_conduit and the class EventConduit.
event_conduit (method; member of kinterbasdb.Connection)
Creates a conduit (an instance of EventConduit) through which event notifications will flow into the Python program.

Arguments: a single argument, a sequence of string event names that the conduit will listen for.
EventConduit: __init__ (method; member of kinterbasdb.EventConduit)
The EventConduit class is not designed to be instantiated directly by the client programmer; instead, create instances via the Connection.event_conduit method, so that each conduit operates in the context of a particular connection.
wait (method; member of kinterbasdb.EventConduit)
Blocks the calling thread until at least one of the events occurs, or the specified timeout (if any) expires. If one or more event notifications has arrived since the last call to wait, this method retrieves a notification from the head of the conduit's internal queue and returns immediately.

The names of the relevant events were supplied to the Connection.event_conduit method when the conduit was created, as in:

conduit = connection.event_conduit( ('event_a', 'event_b') )
conduit.wait()

Arguments:
timeout - optional number of seconds to wait before giving up; by default, the method waits indefinitely.

Returns:
None if the wait timed out; otherwise a dictionary that maps each relevant event name to the number of times that event occurred. In the code snippet above, if event_a occurred once and event_b did not occur at all, the return value would be:

{ 'event_a': 1, 'event_b': 0 }
close (method; member of kinterbasdb.EventConduit)
Cancels the standing request for this conduit to be notified of events. After this method has been called, the conduit is useless and should be discarded. This method has no arguments.
flush (method; member of kinterbasdb.EventConduit)
This method allows the Python programmer to manually clear any event notifications that have accumulated in the conduit's internal queue. From the moment the conduit is created by the Connection.event_conduit method, notifications of any relevant events that occur accumulate asynchronously in the conduit's internal queue, whether or not the programmer collects them with wait. This method has no arguments.

Returns:
The number of event notifications that were flushed from the queue. The "number of event notifications" is not necessarily the same as the "number of event occurrences", since a single notification can indicate multiple occurrences of a given event (see the return value of the wait method).
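A brief sketch of flush and close (the event name new_message is hypothetical):

# Assumes an existing connection `con`; the event name is hypothetical.
conduit = con.event_conduit( ('new_message',) )

# ... time passes; the program was busy and never called wait() ...

# Discard any notifications that accumulated in the meantime:
discarded = conduit.flush()
print 'Discarded %d stale notification(s).' % discarded

conduit.close() # cancel the standing notification request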
The following code (a SQL table definition, a SQL trigger definition, and two Python programs) demonstrates KInterbasDB-based event notification.
The example is based on a database at 'localhost:/temp/test.db', which contains a simple table named test_table. test_table has an after insert trigger that posts several events. Note that the trigger posts test_event_a twice, test_event_b once, and test_event_c once.
The Python event handler program connects to the database and establishes an EventConduit in the context of that connection. As specified by the list of RELEVANT_EVENTS passed to event_conduit, the event conduit will concern itself only with events named test_event_a and test_event_b. Next, the program calls the conduit's wait method without a timeout; it will wait infinitely until at least one of the relevant events is posted in a transaction that is subsequently committed.
The Python event producer program simply connects to the database, inserts a row into test_table, and commits the transaction. Notice that except for the printed comment, no code in the producer makes any mention of events--the events are posted as an implicit consequence of the row's insertion into test_table. The insertion into test_table causes the trigger to conceptually post events, but those events are not physically sent to interested listeners until the transaction is committed. When the commit occurs, the handler program returns from the wait call and prints the notification that it received.
SQL table definition:
create table test_table (a integer)
SQL trigger definition:
create trigger trig_test_insert_event for test_table
after insert
as
begin
    post_event 'test_event_a';
    post_event 'test_event_b';
    post_event 'test_event_c';
    post_event 'test_event_a';
end
Python event handler program:
import kinterbasdb

RELEVANT_EVENTS = ['test_event_a', 'test_event_b']

con = kinterbasdb.connect(dsn='localhost:/temp/test.db', user='sysdba', password='pass')
conduit = con.event_conduit(RELEVANT_EVENTS)

print 'HANDLER: About to wait for the occurrence of one of %s...\n' % RELEVANT_EVENTS
result = conduit.wait()
print 'HANDLER: An event notification has arrived:'
print result

conduit.close()
Python event producer program:
import kinterbasdb

con = kinterbasdb.connect(dsn='localhost:/temp/test.db', user='sysdba', password='pass')
cur = con.cursor()

cur.execute("insert into test_table values (1)")

print 'PRODUCER: Committing transaction that will cause event notification to be sent.'
con.commit()
Event producer output:
PRODUCER: Committing transaction that will cause event notification to be sent.
Event handler output (assuming that the handler was already started and waiting when the event producer program was executed):
HANDLER: About to wait for the occurrence of one of ['test_event_a', 'test_event_b']...

HANDLER: An event notification has arrived:
{'test_event_a': 2, 'test_event_b': 1}
Notice that there is no mention of test_event_c in the result dictionary received by the event handler program. Although test_event_c was posted by the after insert trigger, the event conduit in the handler program was created to listen only for test_event_a and test_event_b events.
Remember that if an EventConduit is left active (not yet closed or garbage collected), notifications for any registered events that actually occur will continue to accumulate in the EventConduit's internal queue even if the Python programmer doesn't call EventConduit.wait to receive the notifications or EventConduit.flush to clear the queue. The ill-informed may misinterpret this behavior as a memory leak in KInterbasDB; it is not.
Also, do not use a connection that communicates with the server via the local protocol in a process that handles database events. The database client library implements the local protocol on some platforms in such a way that deadlocks may arise in bizarre places if you do. This no-LOCAL prohibition is not limited to connections that are used as the basis for event conduits; it applies to all connections throughout the process.
So why doesn't KInterbasDB protect the Python programmer from this mistake? Because the event handling thread is started by the database client library, and it operates beyond the synchronization domain of KInterbasDB at times.
Note: The restrictions on the number of active EventConduits in a process, and on the number of event names that a single EventConduit can listen for, have been removed in KInterbasDB 3.2.
Connection timeouts allow the programmer to request that a connection be automatically closed after a specified period of inactivity. The simplest uses of connection timeouts are trivial, as demonstrated by the following snippet:
import kinterbasdb

con = kinterbasdb.connect(dsn=r'localhost:D:\temp\test.db',
    user='sysdba', password='masterkey',
    timeout={'period': 120.0} # time out after 120.0 seconds of inactivity
  )
...
The connection created in the example above is eligible to be automatically closed by KInterbasDB if it remains idle for at least 120.0 consecutive seconds. KInterbasDB does not guarantee that the connection will be closed immediately when the specified period has elapsed. On a busy system, there might be a considerable delay between the moment a connection becomes eligible for timeout and the moment KInterbasDB actually closes it. However, the thread that performs connection timeouts is programmed in such a way that on a lightly loaded system, it acts almost instantaneously to take advantage of a connection's eligibility for timeout.
After a connection has timed out, KInterbasDB reacts to attempts to reactivate the severed connection in a manner dependent on the state of the connection when it timed out. Consider the following example program:
import time
import kinterbasdb

con = kinterbasdb.connect(dsn=r'localhost:D:\temp\test.db',
    user='sysdba', password='masterkey',
    timeout={'period': 3.0}
  )
cur = con.cursor()

cur.execute("recreate table test (a int, b char(1))")
con.commit()

cur.executemany("insert into test (a, b) values (?, ?)",
    [(1, 'A'), (2, 'B'), (3, 'C')]
  )
con.commit()

cur.execute("select * from test")
print 'BEFORE:', cur.fetchall()

cur.execute("update test set b = 'X' where a = 2")

time.sleep(6.0)

cur.execute("select * from test")
print 'AFTER: ', cur.fetchall()
So, should the example program print
BEFORE: [(1, 'A'), (2, 'B'), (3, 'C')]
AFTER:  [(1, 'A'), (2, 'X'), (3, 'C')]
or
BEFORE: [(1, 'A'), (2, 'B'), (3, 'C')]
AFTER:  [(1, 'A'), (2, 'B'), (3, 'C')]
or should it raise an exception? The answer is more complex than one might think.
First of all, we cannot guarantee much about the example program's behavior because there is a race condition between the obvious thread that's executing the example code (which we'll call "UserThread" for the rest of this section) and the KInterbasDB-internal background thread that actually closes connections that have timed out ("TimeoutThread"). If the operating system were to suspend UserThread just after the kinterbasdb.connect call for more than the specified timeout period of 3.0 seconds, the TimeoutThread might close the connection before UserThread had performed any preparatory operations on the database. Although such a scenario is extremely unlikely when more "realistic" timeout periods such as 1800.0 seconds (30 minutes) are used, it is important to consider. We'll explore solutions to this race condition later.
The likely (but not guaranteed) behavior of the example program is that UserThread will complete all preparatory database operations, including the cur.execute("update test set b = 'X' where a = 2") statement, then go to sleep for not less than 6.0 seconds. Not less than 3.0 seconds after UserThread executes that statement, TimeoutThread is likely to close the connection because it has become eligible for timeout.
The crucial issue is how TimeoutThread should resolve the transaction that UserThread left open on con, and what should happen when UserThread reawakens and tries to execute the cur.execute("select * from test") statement, since the transaction that UserThread left open will no longer be active.
In the context of a particular client program, it is not possible for KInterbasDB to know the best way for TimeoutThread to react when it encounters a connection that is eligible for timeout, but has an unresolved transaction. For this reason, KInterbasDB's connection timeout system offers callbacks that the client programmer can use to guide the TimeoutThread's actions, or to log information about connection timeout patterns.
The client programmer can supply a "before timeout" callback that accepts a single dictionary parameter and returns an integer code to indicate how the TimeoutThread should proceed when it finds a connection eligible for timeout. Within the dictionary, KInterbasDB provides the following entries:
'dsn': The dsn parameter that was passed to kinterbasdb.connect when the connection was created.

'has_transaction': A boolean that indicates whether the connection has an unresolved transaction.

'active_secs': A float that indicates how many seconds elapsed between the point when the connection attached to the server and the last client program activity on the connection.

'idle_secs': A float that indicates how many seconds have elapsed since the last client program activity on the connection. This value will not be less than the specified timeout period, and is likely to be only a fraction of a second longer.
Based on those data, the user-supplied callback should return one of the following codes:
kinterbasdb.CT_VETO:
Directs the TimeoutThread not to close the connection at the current time, and not to reconsider timing the connection out until at least another timeout period has passed. For example, if a connection was created with a timeout period of 120.0 seconds, and the user-supplied "before callback" returns kinterbasdb.CT_VETO, the TimeoutThread will not reconsider timing out that particular connection until at least another 120.0 seconds have elapsed.
kinterbasdb.CT_NONTRANSPARENT ("nontransparent rollback"):
Directs the TimeoutThread to roll back the connection's unresolved transaction (if any), then close the connection. Any future attempt to use the connection will raise a kinterbasdb.ConnectionTimedOut exception.
kinterbasdb.CT_ROLLBACK ("transparent rollback"):
Directs the TimeoutThread to roll back the connection's unresolved transaction (if any), then close the connection. Upon any future attempt to use the connection, KInterbasDB will attempt to transparently reconnect to the database and "resume where it left off" insofar as possible. Of course, network problems and the like could prevent KInterbasDB's attempt at transparent resumption from succeeding. Also, highly state-dependent objects such as open result sets, BlobReaders, and PreparedStatements cannot be used transparently across a connection timeout.
kinterbasdb.CT_COMMIT ("transparent commit"):
Directs the TimeoutThread to commit the connection's unresolved transaction (if any), then close the connection. Upon any future attempt to use the connection, KInterbasDB will attempt to transparently reconnect to the database and "resume where it left off" insofar as possible.
If the user does not supply a "before timeout" callback, KInterbasDB considers the timeout transparent only if the connection does not have an unresolved transaction.
If the user-supplied "before timeout" callback returns anything other than one of the codes listed above, or if it raises an exception, the TimeoutThread will act as though kinterbasdb.CT_NONTRANSPARENT had been returned.
You might have noticed that the input dictionary to the "before timeout" callback does not include a reference to the kinterbasdb.Connection object itself. This is a deliberate design decision intended to steer the client programmer away from writing callbacks that take a long time to complete, or that manipulate the kinterbasdb.Connection instance directly. See the caveats section for more information.
The client programmer can supply an "after timeout" callback that accepts a single dictionary parameter. Within that dictionary, KInterbasDB currently provides the following entries:
'dsn': The dsn parameter that was passed to kinterbasdb.connect when the connection was created.

'active_secs': A float that indicates how many seconds elapsed between the point when the connection attached to the server and the last client program activity on the connection.

'idle_secs': A float that indicates how many seconds elapsed between the last client program activity on the connection and the moment the TimeoutThread closed the connection.
KInterbasDB only calls the "after timeout" callback after the connection has actually been closed by the TimeoutThread. If the "before timeout" callback returns kinterbasdb.CT_VETO to cancel the timeout attempt, the "after timeout" callback will not be called.
KInterbasDB discards the return value of the "after timeout" callback, and ignores any exceptions.
The same caveats that apply to the "before timeout" callback also apply to the "after timeout" callback.
Manipulating the Connection object that is being timed out (or any of that connection's subordinate objects such as Cursors, BlobReaders, or PreparedStatements) from the timeout callbacks is strictly forbidden.
CT_VETO
The following program registers a "before timeout" callback that unconditionally returns kinterbasdb.CT_VETO, which means that the TimeoutThread never times the connection out. Although an "after timeout" callback is also registered, it will never be called.
import time
import kinterbasdb

def callback_before(info):
    print
    print 'callback_before called; input parameter contained:'
    for key, value in info.items():
        print '  %s: %s' % (repr(key).ljust(20), repr(value))
    print
    # Unconditionally veto any timeout attempts:
    return kinterbasdb.CT_VETO

def callback_after(info):
    assert False, 'This will never be called.'

con = kinterbasdb.connect(dsn=r'localhost:D:\temp\test.db',
    user='sysdba', password='masterkey',
    timeout={
        'period': 3.0,
        'callback_before': callback_before,
        'callback_after': callback_after,
      }
  )
cur = con.cursor()

cur.execute("recreate table test (a int, b char(1))")
con.commit()

cur.executemany("insert into test (a, b) values (?, ?)",
    [(1, 'A'), (2, 'B'), (3, 'C')]
  )
con.commit()

cur.execute("select * from test")
print 'BEFORE:', cur.fetchall()

cur.execute("update test set b = 'X' where a = 2")

time.sleep(6.0)

cur.execute("select * from test")
rows = cur.fetchall()
# The value of the second column of the second row of the table is still 'X',
# because the transaction that changed it from 'B' to 'X' remains active.
assert rows[1][1] == 'X'
print 'AFTER: ', rows
Sample output:
BEFORE: [(1, 'A'), (2, 'B'), (3, 'C')]

callback_before called; input parameter contained:
  'dsn'               : 'localhost:D:\\temp\\test.db'
  'idle_secs'         : 3.0
  'has_transaction'   : True

AFTER:  [(1, 'A'), (2, 'X'), (3, 'C')]
timeout_authorizer
The example programs for CT_NONTRANSPARENT, CT_ROLLBACK, and CT_COMMIT rely on the TimeoutAuthorizer class from the module below to guarantee that the TimeoutThread will not time the connection out before the preparatory code has executed.
import threading
import kinterbasdb

class TimeoutAuthorizer(object):
    def __init__(self, opCodeWhenAuthorized):
        self.currentOpCode = kinterbasdb.CT_VETO
        self.opCodeWhenAuthorized = opCodeWhenAuthorized
        self.lock = threading.Lock()

    def authorize(self):
        self.lock.acquire()
        try:
            self.currentOpCode = self.opCodeWhenAuthorized
        finally:
            self.lock.release()

    def __call__(self, info):
        self.lock.acquire()
        try:
            return self.currentOpCode
        finally:
            self.lock.release()
CT_NONTRANSPARENT
import threading, time
import kinterbasdb
import timeout_authorizer

authorizer = timeout_authorizer.TimeoutAuthorizer(kinterbasdb.CT_NONTRANSPARENT)
connectionTimedOut = threading.Event()

def callback_after(info):
    print
    print 'The connection was closed nontransparently.'
    print
    connectionTimedOut.set()

con = kinterbasdb.connect(dsn=r'localhost:D:\temp\test.db',
    user='sysdba', password='masterkey',
    timeout={
        'period': 3.0,
        'callback_before': authorizer,
        'callback_after': callback_after,
      }
  )
cur = con.cursor()

cur.execute("recreate table test (a int, b char(1))")
con.commit()

cur.executemany("insert into test (a, b) values (?, ?)",
    [(1, 'A'), (2, 'B'), (3, 'C')]
  )
con.commit()

cur.execute("select * from test")
print 'BEFORE:', cur.fetchall()

cur.execute("update test set b = 'X' where a = 2")

authorizer.authorize()
connectionTimedOut.wait()

# This will raise a kinterbasdb.ConnectionTimedOut exception because the
# before callback returned kinterbasdb.CT_NONTRANSPARENT:
cur.execute("select * from test")
Sample output:
BEFORE: [(1, 'A'), (2, 'B'), (3, 'C')]

The connection was closed nontransparently.

Traceback (most recent call last):
  File "connection_timeouts_ct_nontransparent.py", line 42, in ?
    cur.execute("select * from test")
kinterbasdb.ConnectionTimedOut: (0, 'A transaction was still unresolved when this connection timed out, so it cannot be transparently reactivated.')
CT_ROLLBACK
import threading, time
import kinterbasdb
import timeout_authorizer

authorizer = timeout_authorizer.TimeoutAuthorizer(kinterbasdb.CT_ROLLBACK)
connectionTimedOut = threading.Event()

def callback_after(info):
    print
    print 'The unresolved transaction was rolled back; the connection has been'
    print ' closed transparently.'
    print
    connectionTimedOut.set()

con = kinterbasdb.connect(dsn=r'localhost:D:\temp\test.db',
    user='sysdba', password='masterkey',
    timeout={
        'period': 3.0,
        'callback_before': authorizer,
        'callback_after': callback_after,
      }
  )
cur = con.cursor()

cur.execute("recreate table test (a int, b char(1))")
con.commit()

cur.executemany("insert into test (a, b) values (?, ?)",
    [(1, 'A'), (2, 'B'), (3, 'C')]
  )
con.commit()

cur.execute("select * from test")
print 'BEFORE:', cur.fetchall()

cur.execute("update test set b = 'X' where a = 2")

authorizer.authorize()
connectionTimedOut.wait()

# The value of the second column of the second row of the table will have
# reverted to 'B' when the transaction that changed it to 'X' was rolled back.
# The cur.execute call on the next line will transparently reactivate the
# connection, which was timed out transparently.
cur.execute("select * from test")
rows = cur.fetchall()
assert rows[1][1] == 'B'
print 'AFTER: ', rows
Sample output:
BEFORE: [(1, 'A'), (2, 'B'), (3, 'C')]

The unresolved transaction was rolled back; the connection has been
 closed transparently.

AFTER:  [(1, 'A'), (2, 'B'), (3, 'C')]
CT_COMMIT
import threading, time
import kinterbasdb
import timeout_authorizer

authorizer = timeout_authorizer.TimeoutAuthorizer(kinterbasdb.CT_COMMIT)
connectionTimedOut = threading.Event()

def callback_after(info):
    print
    print 'The unresolved transaction was committed; the connection has been'
    print ' closed transparently.'
    print
    connectionTimedOut.set()

con = kinterbasdb.connect(dsn=r'localhost:D:\temp\test.db',
    user='sysdba', password='masterkey',
    timeout={
        'period': 3.0,
        'callback_before': authorizer,
        'callback_after': callback_after,
      }
  )
cur = con.cursor()

cur.execute("recreate table test (a int, b char(1))")
con.commit()

cur.executemany("insert into test (a, b) values (?, ?)",
    [(1, 'A'), (2, 'B'), (3, 'C')]
  )
con.commit()

cur.execute("select * from test")
print 'BEFORE:', cur.fetchall()

cur.execute("update test set b = 'X' where a = 2")

authorizer.authorize()
connectionTimedOut.wait()

# The modification of the value of the second column of the second row of the
# table from 'B' to 'X' will have persisted, because the TimeoutThread
# committed the transaction before it timed the connection out.
# The cur.execute call on the next line will transparently reactivate the
# connection, which was timed out transparently.
cur.execute("select * from test")
rows = cur.fetchall()
assert rows[1][1] == 'X'
print 'AFTER: ', rows
Sample output:
BEFORE: [(1, 'A'), (2, 'B'), (3, 'C')]

The unresolved transaction was committed; the connection has been
 closed transparently.

AFTER:  [(1, 'A'), (2, 'X'), (3, 'C')]
For the sake of simplicity, KInterbasDB lets the Python programmer ignore transaction management to the greatest extent allowed by the Python Database API Specification 2.0. The specification says, "if the database supports an auto-commit feature, this must be initially off". At a minimum, therefore, it is necessary to call the commit method of the connection in order to persist any changes made to the database. Transactions left unresolved by the programmer will be rolled back when the connection is garbage collected.
Remember that because of ACID, every data manipulation operation in the Interbase®/Firebird database engine takes place in the context of a transaction, including operations that are conceptually "read-only", such as a typical SELECT.
The client programmer of KInterbasDB establishes a transaction implicitly by using any SQL execution method, such as Connection.execute_immediate, Cursor.execute, or Cursor.callproc.
Although KInterbasDB allows the programmer to pay little attention to transactions, it also exposes the full complement of the database engine's advanced transaction control features: transaction parameters, retaining transactions, savepoints, and distributed transactions.
The database engine offers the client programmer an optional facility called transaction parameter buffers (TPBs) for tweaking the operating characteristics of the transactions he initiates. These include characteristics such as "whether the transaction has read and write access to tables, or read-only access, and whether or not other simultaneously active transactions can share table access with the transaction" (IB 6 API Guide, page 62).
In addition to the implicit transaction initiation mentioned in the introduction of this section, KInterbasDB allows the programmer to start transactions explicitly via the Connection.begin method. Connections have a default_tpb attribute that can be changed to set the default TPB for all transactions subsequently started on the connection. Alternatively, if the programmer only wants to set the TPB for a single transaction, he can start a transaction explicitly via the Connection.begin method and pass a TPB for that single transaction.
For details about TPB construction, see Chapter 5 of the Interbase® 6 API Guide. In particular, page 63 of that document presents a table of possible TPB elements--single bytes that the C API defines as constants whose names begin with isc_tpb_. KInterbasDB makes all of those TPB constants available (under the same names) as module-level constants in the form of single-character strings. A transaction parameter buffer is handled in C as a character array; KInterbasDB requires that TPBs be constructed as Python strings. Since the constants in the kinterbasdb.isc_tpb_* family are single-character Python strings, they can simply be concatenated to create a TPB.
The following program uses explicit transaction initiation and TPB construction to establish an unobtrusive transaction for read-only access to the database:
import kinterbasdb

con = kinterbasdb.connect(dsn='localhost:/temp/test.db', user='sysdba', password='pass')

# Construct a TPB by concatenating single-character strings (bytes)
# from the kinterbasdb.isc_tpb_* family.
customTPB = (
      kinterbasdb.isc_tpb_read
    + kinterbasdb.isc_tpb_read_committed
    + kinterbasdb.isc_tpb_rec_version
  )

# Explicitly start a transaction with the custom TPB:
con.begin(tpb=customTPB)

# Now read some data using cursors:
...

# Commit the transaction with the custom TPB. Future transactions
# opened on con will not use a custom TPB unless it is explicitly
# passed to con.begin every time, as it was above, or
# con.default_tpb is changed to the custom TPB, as in:
#   con.default_tpb = customTPB
con.commit()
The commit and rollback methods of kinterbasdb.Connection accept an optional boolean parameter retaining (default False) to indicate whether to recycle the transactional context of the transaction being resolved by the method call. If retaining is True, the infrastructural support for the transaction active at the time of the method call will be "retained" (efficiently and transparently recycled) after the database server has committed or rolled back the conceptual transaction.
In code that commits or rolls back frequently, "retaining" the transaction yields considerably better performance. However, retaining transactions must be used cautiously because they can interfere with the server's ability to garbage collect old record versions. For details about this issue, read the "Garbage" section of this document by Ann Harrison.
For more information about retaining transactions, see page 291 of the Interbase® 6 API Guide.
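A minimal sketch of a retaining commit (the table name log_entries is hypothetical):

# Assumes an existing connection `con`, a cursor `cur`, and a
# hypothetical table named log_entries.
cur.execute("insert into log_entries values (?)", ['first entry'])
con.commit(retaining=True) # commit, but recycle the transactional context

cur.execute("insert into log_entries values (?)", ['second entry'])
con.commit()               # an ordinary, non-retaining commit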
Firebird 1.5 introduced support for transaction savepoints. Savepoints are named, intermediate control points within an open transaction that can later be rolled back to, without affecting the preceding work. Multiple savepoints can exist within a single unresolved transaction, providing "multi-level undo" functionality.
Although Firebird savepoints are fully supported from SQL alone via the SAVEPOINT 'name' and ROLLBACK TO 'name' statements, KInterbasDB also exposes savepoints at the Python API level for the sake of convenience.
The method Connection.savepoint(name) establishes a savepoint with the specified name. To roll back to a specific savepoint, call the Connection.rollback method and provide a value (the name of the savepoint) for the optional savepoint parameter. If the savepoint parameter of Connection.rollback is not specified, the active transaction is cancelled in its entirety, as required by the Python Database API Specification.
The following program demonstrates savepoint manipulation via the KInterbasDB API, rather than raw SQL.
import kinterbasdb

con = kinterbasdb.connect(dsn='localhost:/temp/test.db', user='sysdba', password='pass')
cur = con.cursor()

cur.execute("recreate table test_savepoints (a integer)")
con.commit()

print 'Before the first savepoint, the contents of the table are:'
cur.execute("select * from test_savepoints")
print ' ', cur.fetchall()

cur.execute("insert into test_savepoints values (?)", [1])
con.savepoint('A')
print 'After savepoint A, the contents of the table are:'
cur.execute("select * from test_savepoints")
print ' ', cur.fetchall()

cur.execute("insert into test_savepoints values (?)", [2])
con.savepoint('B')
print 'After savepoint B, the contents of the table are:'
cur.execute("select * from test_savepoints")
print ' ', cur.fetchall()

cur.execute("insert into test_savepoints values (?)", [3])
con.savepoint('C')
print 'After savepoint C, the contents of the table are:'
cur.execute("select * from test_savepoints")
print ' ', cur.fetchall()

con.rollback(savepoint='A')
print 'After rolling back to savepoint A, the contents of the table are:'
cur.execute("select * from test_savepoints")
print ' ', cur.fetchall()

con.rollback()
print 'After rolling back entirely, the contents of the table are:'
cur.execute("select * from test_savepoints")
print ' ', cur.fetchall()
The output of the example program is shown below.
Before the first savepoint, the contents of the table are:
  []
After savepoint A, the contents of the table are:
  [(1,)]
After savepoint B, the contents of the table are:
  [(1,), (2,)]
After savepoint C, the contents of the table are:
  [(1,), (2,), (3,)]
After rolling back to savepoint A, the contents of the table are:
  [(1,)]
After rolling back entirely, the contents of the table are:
  []
KInterbasDB supports distributed transactions (transactions that span multiple connections) via the kinterbasdb.ConnectionGroup class; see the brief example program below.
import kinterbasdb

# Establish multiple connections the usual way:
con1 = kinterbasdb.connect(dsn='weasel:/temp/test.db', user='sysdba', password='pass')
con2 = kinterbasdb.connect(dsn='coyote:/temp/test.db', user='sysdba', password='pass')

# Create a ConnectionGroup to associate multiple connections in such a
# way that they can participate in a distributed transaction.
# !!!
# NO TWO MEMBERS OF A SINGLE CONNECTIONGROUP SHOULD BE ATTACHED TO THE SAME DATABASE!
# !!!
group = kinterbasdb.ConnectionGroup( connections=(con1, con2) )

# Start a distributed transaction involving all of the members of the group
# (con1 and con2 in this case) with one of the following approaches:
#   - Call group.begin()
#   - Call con1.begin(); the operation will "bubble upward" and apply to the group.
#   - Call con2.begin(); the operation will "bubble upward" and apply to the group.
#   - Just start executing some SQL statements on either con1 or con2.
#     A transaction will be started implicitly; it will be a distributed
#     transaction because con1 and con2 are members of a ConnectionGroup.
group.begin()

# Perform some database changes the usual way (via cursors on con1 and con2):
...

# Commit or roll back the distributed transaction by calling the commit
# or rollback method of the ConnectionGroup itself, or the commit or
# rollback method of any member connection (con1 or con2 in this case).
group.commit()

# Unless you want to perform another distributed transaction, disband the
# group so that member connections can operate independently again.
group.clear()
Notes:

While a Connection belongs to a ConnectionGroup, any calls to the connection's transactional methods (begin, prepare, commit, rollback) will "bubble upward" to apply to the distributed transaction shared by the group as a whole.

Connections can be dynamically added to and removed from a ConnectionGroup, provided that neither the group nor the connection itself has an unresolved transaction at the time of the addition/removal.

No two members of a single ConnectionGroup should be attached to the same database!
KInterbasDB converts bound parameters marked with a ? in SQL code in a standard way. However, the module also offers several extensions to standard parameter binding, intended to make client code more readable and more convenient to write.
The database engine treats most SQL data types in a weakly typed fashion: the engine may attempt to convert the raw value to a different type, as appropriate for the current context. For instance, the SQL expressions 123 (integer) and '123' (string) are treated equivalently when the value is to be inserted into an integer field; the same applies when '123' and 123 are to be inserted into a varchar field.
This weak typing model is quite unlike Python's dynamic yet strong typing. Although weak typing is regarded with suspicion by most experienced Python programmers, the database engine is in certain situations so aggressive about its typing model that KInterbasDB must compromise in order to remain an elegant means of programming the database engine.
An example is the handling of "magic values" for date and time fields.
The database engine interprets certain string values such as
'yesterday'
and 'now'
as having special meaning in
a date/time context.
If KInterbasDB did not accept strings as the values of parameters destined
for storage in date/time fields, the resulting code would be awkward.
Consider the difference between the two Python snippets
below, which insert a row containing an integer and a timestamp into a
table defined with the following DDL statement:
create table test_table (i int, t timestamp)
i = 1
t = 'now'

sqlWithMagicValues = "insert into test_table (i, t) values (?, '%s')" % t
cur.execute( sqlWithMagicValues, (i,) )
i = 1
t = 'now'

cur.execute( "insert into test_table (i, t) values (?, ?)", (i, t) )
If KInterbasDB did not support weak parameter typing, string parameters that the database engine is to interpret as "magic values" would have to be rolled into the SQL statement in a separate operation from the binding of the rest of the parameters, as in the first Python snippet above. Implicit conversion of parameter values from strings allows the consistency evident in the second snippet, which is both more readable and more general.
It should be noted that KInterbasDB does not perform the conversion from string itself. Instead, it passes that responsibility to the database engine by changing the parameter metadata structure dynamically at the last moment, then restoring the original state of the metadata structure after the database engine has performed the conversion.
A secondary benefit is that when one uses KInterbasDB to import large amounts of data from flat files into the database, the incoming values need not necessarily be converted to their proper Python types before being passed to the database engine. Eliminating this intermediate step may accelerate the import process considerably, although other factors such as the chosen connection protocol and the deactivation of indexes during the import are more consequential. For bulk import tasks, the database engine's external tables also deserve consideration. External tables can be used to suck semi-structured data from flat files directly into the relational database without the intervention of an ad hoc conversion program.
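A rough sketch of the external-table approach (the file path, record layout, and table names below are illustrative only, and the server's configuration must permit access to the external file):

# Illustrative only: pull fixed-length records from a flat file into a
# regular table by way of an external table.
cur.execute("""
    create table ext_import external file '/temp/import_data.txt'
      (a char(10), b char(25))
  """)
con.commit()
cur.execute("insert into real_table select a, b from ext_import")
con.commit()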
Dynamic type translators are conversion functions registered by the Python programmer to transparently convert database field values to and from their internal representation.
The client programmer can choose to ignore translators altogether, in
which case KInterbasDB will manage them behind the scenes.
Otherwise, the client programmer can use any of several
standard type translators
included with KInterbasDB, register custom translators, or
set the translators to None
to deal directly with the
KInterbasDB-internal representation of the data type.
When translators have been registered for a specific SQL data
type, Python objects on their way into a database field of that type
will be passed through the input translator before they are presented
to the database engine; values on their way out of the database into
Python will be passed through the corresponding output translator.
Output and input translation for a given type is usually implemented
by two different functions.
Translators are registered with the [set|get]_type_trans_[in|out]
methods of Connection
and Cursor
.
The set_type_trans_[in|out]
methods accept a single argument:
a mapping of type name to translator.
The get_type_trans_[in|out]
methods return a copy of the
translation table. Cursor
s inherit their
Connection
's translation settings, but can override
them without affecting the connection or other cursors (much as
subclasses can override the methods of their base classes).
The following code snippet installs an input translator for fixed
point types (NUMERIC
/DECIMAL
SQL types)
into a connection:
con.set_type_trans_in( {'FIXED': fixed_input_translator_function} )
The following method call retrieves the type translation table for
con
:
con.get_type_trans_in()
The method call above would return a translation table (dictionary) such as this:
{
  'DATE':      <function date_conv_in at 0x00920648>,
  'TIMESTAMP': <function timestamp_conv_in at 0x0093E090>,
  'FIXED':     <function <lambda> at 0x00962DB0>,
  'TIME':      <function time_conv_in at 0x009201B0>
}
Notice that although the sample code registered only one type
translator, there are four listed in the mapping returned by the
get_type_trans_in
method. By default, KInterbasDB uses dynamic
type translation to implement the conversion of
DATE
, TIME
, TIMESTAMP
,
NUMERIC
, and DECIMAL
values.
For the source code locations of KInterbasDB's reference translators,
see the
table
in the next section.
In the sample above, a translator is registered under the key
'FIXED'
, but Firebird has no SQL data type named
FIXED
. The following table lists the names of the
database engine's SQL data types in the left column, and the
corresponding KInterbasDB-specific key under which client programmers can
register translators in the right column.
Mapping of SQL Data Type Names to Translator Keys

SQL Data Type Name                       | Translator Key
CHAR / VARCHAR (charset not UNICODE_FSS) | 'TEXT'
CHAR / VARCHAR (charset UNICODE_FSS)     | 'TEXT_UNICODE'
BLOB                                     | 'BLOB'
SMALLINT / INTEGER / BIGINT              | 'INTEGER'
FLOAT / DOUBLE PRECISION                 | 'FLOATING'
NUMERIC / DECIMAL                        | 'FIXED'
DATE                                     | 'DATE'
TIME                                     | 'TIME'
TIMESTAMP                                | 'TIMESTAMP'
Dynamic type translation has
eliminated KInterbasDB's
dependency on mx.DateTime
. Although KInterbasDB will continue
to use mx.DateTime
as its default date/time representation
for the sake of backward compatibility, dynamic type translation allows
users to conveniently deal with database date/time values in terms of the
new standard library module datetime
, or any other representation
they care to write translators for.
Dynamic type translation also allows NUMERIC
/DECIMAL
values to be transparently represented as
decimal.Decimal
objects rather than scaled integers, which is much more convenient.
For backward compatibility, NUMERIC
/DECIMAL
values are still represented by default as Python float
s,
and the older API based on the Connection.precision_mode
attribute is still present.
However, all of these representations are now implemented "under the hood" via
dynamic type translation.
Reference implementations of all of the translators discussed above are provided with KInterbasDB, in these modules:
Reference Translators Included with KInterbasDB:

- kinterbasdb.typeconv_datetime_stdlib (date/time translators based on the standard library datetime module)
- kinterbasdb.typeconv_datetime_mx (date/time translators based on mx.DateTime)
- kinterbasdb.typeconv_backcompat (the backward-compatible configuration that mimics KInterbasDB 3.0)
- the other kinterbasdb.typeconv_* modules, which cover the remaining translator keys
Below is a table that specifies the required argument and return value
signatures of input and output converters for the various translator
keys.
Python's native types map perfectly to
'TEXT'
,
'TEXT_UNICODE'
,
'BLOB'
,
'INTEGER'
,
and 'FLOATING'
types, so in those cases the translator signatures are very simple.
The signatures for
'FIXED'
,
'DATE'
,
'TIME'
,
and 'TIMESTAMP'
are not as simple because Python (before 2.4) lacks native types to
represent these values with both precision and convenience.
KInterbasDB handles 'FIXED'
values internally as scaled
integers; the date and time types as tuples.
KInterbasDB itself uses translators implemented according to the rules in
the table below; the code for these reference translators can be found
in the Python modules named kinterbasdb.typeconv_*
(see the
table
in the previous section for details).
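To make the scaled-integer representation concrete, here is a small illustration of the arithmetic involved (the helper functions below are hypothetical, not part of KInterbasDB):

import decimal

def fixed_to_scaled(value, scale):
    # A NUMERIC(18,2) field has scale -2, so 4.53 is stored as the integer 453.
    return int(value * 10 ** -scale)

def scaled_to_fixed(scaled, scale):
    return decimal.Decimal(scaled) / 10 ** -scale

assert fixed_to_scaled(decimal.Decimal('4.53'), -2) == 453
assert scaled_to_fixed(453, -2) == decimal.Decimal('4.53')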
Signature Specifications for Input and Output Translators
import datetime # Python 2.3 standard library module
import kinterbasdb
import kinterbasdb.typeconv_datetime_stdlib as tc_dt

def connect(*args, **kwargs):
    """
    This wrapper around kinterbasdb.connect creates connections that use
    the datetime module (which entered the standard library in Python 2.3)
    for both input and output of DATE, TIME, and TIMESTAMP database fields.

    This wrapper simply registers kinterbasdb's official date/time
    translators for the datetime module, which reside in the
    kinterbasdb.typeconv_datetime_stdlib module.

    An equivalent set of translators for mx.DateTime (which kinterbasdb
    uses by default for backward compatibility) resides in the
    kinterbasdb.typeconv_datetime_mx module.

    Note that because cursors inherit their connection's dynamic type
    translation settings, cursors created upon connections returned by
    this function will also use the datetime module.
    """
    con = kinterbasdb.connect(*args, **kwargs)
    con.set_type_trans_in({
        'DATE':      tc_dt.date_conv_in,
        'TIME':      tc_dt.time_conv_in,
        'TIMESTAMP': tc_dt.timestamp_conv_in,
      })
    con.set_type_trans_out({
        'DATE':      tc_dt.date_conv_out,
        'TIME':      tc_dt.time_conv_out,
        'TIMESTAMP': tc_dt.timestamp_conv_out,
      })
    return con

def _test():
    con = connect(dsn='localhost:/temp/test.db', user='sysdba', password='pass')
    cur = con.cursor()

    # Retrieve the current timestamp of the database server.
    cur.execute("select current_timestamp from rdb$database")
    curStamp = cur.fetchone()[0]
    print 'The type of curStamp is', type(curStamp)
    print 'curStamp is', curStamp

    # Create a test table with a single TIMESTAMP column.
    con.execute_immediate("recreate table test_stamp (a timestamp)")
    con.commit()

    # Insert a timestamp into the database, then retrieve it.
    py23StandardLibTimestamp = datetime.datetime.now()
    cur.execute("insert into test_stamp values (?)", (py23StandardLibTimestamp,))

    cur.execute("select * from test_stamp")
    stamp = cur.fetchone()[0]
    print 'The type of stamp is', type(stamp)
    print 'stamp is', stamp

if __name__ == '__main__':
    _test()
Sample output:
The type of curStamp is <type 'datetime.datetime'>
curStamp is 2003-05-20 03:55:42
The type of stamp is <type 'datetime.datetime'>
stamp is 2003-05-20 03:55:42
kinterbasdb.init Convenience Codes
KInterbasDB has existed since 1998, five years before the
datetime
module was available in the Python standard library.
Therefore, KInterbasDB's default representation for date and time
values is the mx.DateTime
module. This representation is
recommended by the Python DB API 2.0 Specification, and was an entirely
sensible choice during the many years before the advent of the standard library
datetime
module.
Now that the datetime
module is available in the standard
library, many KInterbasDB users prefer it to mx.DateTime
.
For the sake of backward-compatibility, it is necessary to continue to use
mx.DateTime
by default, but it's both burdensome and wasteful
to import mx.DateTime
in programs that don't use it.
To address this situation, KInterbasDB's type translation
initialization code defers the choice of a default set of translators until the
kinterbasdb.init
function is called. A client program can
explicitly call kinterbasdb.init
to forestall the import of
mx.DateTime
.
The kinterbasdb.init
function takes a keyword argument
type_conv
, which controls KInterbasDB's initial choice of
type translators. type_conv
can be either an integer or an
object that has all of the attributes named in
kinterbasdb.BASELINE_TYPE_TRANSLATION_FACILITIES
(an example
of such an object is the module kinterbasdb.typeconv_backcompat
).
If type_conv
is an integer, it will cause KInterbasDB to use one
of the following predefined type translator configurations:
type_conv integer "convenience code" | Resulting translator configuration

0 | Minimal type translators that represent date/time values as raw tuples and NUMERIC/DECIMAL values as scaled integers. Unicode values are not encoded or decoded automatically. Implemented by the kinterbasdb.typeconv_naked module.

1 (the default) | Backward-compatible type translators that represent date/time values via the mx.DateTime module and NUMERIC/DECIMAL values as Python floats. Unicode values are not encoded or decoded automatically. Implemented by the kinterbasdb.typeconv_backcompat module. This configuration, which is the default, perfectly mimics the type translation behavior of KInterbasDB 3.0.

100 | This translator configuration, which is intended for use with Python 2.3 and later, represents date/time values via the standard library module datetime. Unicode values are encoded and decoded automatically (see this FAQ for more info). Implemented by the kinterbasdb.typeconv_23plus module.

200 (the ideal) | This translator configuration represents date/time values via the standard library module datetime and NUMERIC/DECIMAL values via the standard library module decimal. Unicode values are encoded and decoded automatically (see this FAQ for more info). Implemented by the kinterbasdb.typeconv_24plus module.

199 | This translator configuration is exactly like 200, except that NUMERIC/DECIMAL values are represented as Python floats. It is fundamentally imprecise to represent fixed point values in floating point, so this convenience code is intended solely for users who wish to use the datetime module for date/time values without adopting a precise fixed point representation. Implemented by the same translators as code 200, apart from the float-based NUMERIC/DECIMAL handling.
These integer type conversion codes are defined solely for convenience.
The same functionality is available via the object variant of
type_conv
, but setting it up is more laborious for typical
translator configurations.
The deferred type translator loading scheme introduced in KInterbasDB 3.1
goes to great lengths to maintain backward compatibility.
If the client programmer does not call kinterbasdb.init
,
KInterbasDB will implicitly initialize itself in a backward-compatible manner
(type_conv=1
) the first time one of its public functions is called
or one of its public classes is instantiated.
The only known backward incompatibility is this:
the DB API type comparison singleton DATETIME
will not compare
equal to any type until the kinterbasdb.init
function has been
called (whether explicitly or implicitly).
After kinterbasdb.init
has been called, DATETIME
will compare equal to the date, time, and timestamp types that were loaded.
This issue should affect hardly any existing KInterbasDB-based programs.
The following program demonstrates the explicit use of kinterbasdb.init:
import datetime, decimal, os.path, string, sys

import kinterbasdb
kinterbasdb.init(type_conv=200)

# This program never imports mx.DateTime:
assert 'mx' not in sys.modules

def test():
    dbFilename = r'D:\temp\test-deferred.firebird'
    prepareTestDatabase(dbFilename)

    # Connect with character set UNICODE_FSS, to match the default character
    # set of the test database.
    con = kinterbasdb.connect(dsn=dbFilename,
        user='sysdba', password='masterkey', charset='UNICODE_FSS'
      )
    cur = con.cursor()

    # Create a test table.
    cur.execute("""
        create table test (
          a numeric(18,2),
          b date,
          c time,
          d timestamp,
          e varchar(50), /* Defaults to character set UNICODE_FSS. */
          f varchar(50), /* Defaults to character set UNICODE_FSS. */
          g varchar(50) character set ASCII
        )
      """)
    con.commit()

    # Create an input value for each field in the test table.
    aIn = decimal.Decimal('4.53')
    # Notice that the DB API date/time constructors in kinterbasdb generate
    # datetime-based objects instead of mx-based objects because of our earlier
    # call to kinterbasdb.init(type_conv=200).
    bIn = kinterbasdb.Date(2004,1,4)
    assert isinstance(bIn, datetime.date)
    cIn = kinterbasdb.Time(16,27,59)
    assert isinstance(cIn, datetime.time)
    dIn = kinterbasdb.Timestamp(2004,1,4, 16,27,59)
    assert isinstance(dIn, datetime.datetime)
    eIn = u'A unicod\u2211 object stored in a Unicode field.'
    fIn = 'A str object stored in a Unicode field.'
    gIn = 'A str object stored in an ASCII field.'

    print '-' * 70
    inputValues = (aIn, bIn, cIn, dIn, eIn, fIn, gIn)
    reportValues('In', inputValues)
    cur.execute("insert into test values (?,?,?,?,?,?,?)", inputValues)

    print '-' * 70
    cur.execute("select a,b,c,d,e,f,g from test")
    (aOut, bOut, cOut, dOut, eOut, fOut, gOut) = outputValues = cur.fetchone()
    reportValues('Out', outputValues)
    print '-' * 70

    # Notice that all values made the journey to and from the database intact.
    assert inputValues == outputValues

def reportValues(direction, values):
    for (val, c) in zip(values, string.ascii_lowercase[:len(values)]):
        varName = c + direction
        print '%s has type %s, value\n  %s' % (varName, type(val), repr(val))

def prepareTestDatabase(dbFilename):
    # Delete the test database if an old copy is already present.
    if os.path.isfile(dbFilename):
        conOld = kinterbasdb.connect(dsn=dbFilename,
            user='sysdba', password='masterkey'
          )
        conOld.drop_database()
    # Create the test database afresh.
    kinterbasdb.create_database("""
        create database '%s'
        user 'sysdba' password 'masterkey'
        default character set UNICODE_FSS
      """ % dbFilename
      )

if __name__ == '__main__':
    test()
Program output:
----------------------------------------------------------------------
aIn has type <class 'decimal.Decimal'>, value
  Decimal("4.53")
bIn has type <type 'datetime.date'>, value
  datetime.date(2004, 1, 4)
cIn has type <type 'datetime.time'>, value
  datetime.time(16, 27, 59)
dIn has type <type 'datetime.datetime'>, value
  datetime.datetime(2004, 1, 4, 16, 27, 59)
eIn has type <type 'unicode'>, value
  u'A unicod\u2211 object stored in a Unicode field.'
fIn has type <type 'str'>, value
  'A str object stored in a Unicode field.'
gIn has type <type 'str'>, value
  'A str object stored in an ASCII field.'
----------------------------------------------------------------------
aOut has type <class 'decimal.Decimal'>, value
  Decimal("4.53")
bOut has type <type 'datetime.date'>, value
  datetime.date(2004, 1, 4)
cOut has type <type 'datetime.time'>, value
  datetime.time(16, 27, 59)
dOut has type <type 'datetime.datetime'>, value
  datetime.datetime(2004, 1, 4, 16, 27, 59)
eOut has type <type 'unicode'>, value
  u'A unicod\u2211 object stored in a Unicode field.'
fOut has type <type 'unicode'>, value
  u'A str object stored in a Unicode field.'
gOut has type <type 'unicode'>, value
  u'A str object stored in an ASCII field.'
----------------------------------------------------------------------
Notes about Unicode handling in the example above:
- Upon input, the Python unicode object eIn was transparently encoded for storage in database field TEST.E (a VARCHAR field with character set UNICODE_FSS (that is, UTF-8)). Upon output, the UNICODE_FSS value in TEST.E was decoded transparently into the Python unicode object eOut.
- TEST.F accepted a Python str object even though it's a Unicode field. The output value fOut is a Python unicode object rather than a str.
- Although TEST.G is an ASCII field, and the input value gIn is a str, the output value gOut is a unicode object. This is because the connection's charset is UNICODE_FSS, and Firebird tries to convert every retrieved value to match that character set.
All forms of dynamic type translation discussed so far have used the
type of the database field as the basis for selecting a
translator.
KInterbasDB 3.2 also allows the client programmer to control translator
selection on the basis of a field's position within a
Cursor
.
Translator selection based on database field type is called
"typal translation", while selection based on position
is called "positional translation".
Positional translation can be enabled at the Cursor
level by including zero-based integer keys in the dictionary passed to
Cursor.set_type_trans[in|out]
. Consider the following example
program:
import kinterbasdb

con = kinterbasdb.connect(dsn=r'D:\temp\test-20.firebird',
    user='sysdba', password='masterkey'
  )
cur = con.cursor()

cur.execute("recreate table test(a int, b int, c int, d int, e float)")
con.commit()

cur.execute("insert into test values (?,?,?,?,?)", (1, 2, 3, 4, 5.0))

cur.execute("select a,b,c,d,e from test")
print 'Before translator modifications, output row is:'
print ' ', cur.fetchone()

cur.set_type_trans_out({
    'INTEGER': lambda i: i * 10,
    1:         lambda i: i * 100,
    3:         lambda i: i * 1000
  })

cur.execute("select a,b,c,d,e from test")
print 'After translator modifications, output row is:'
print ' ', cur.fetchone()
Program output:
Before translator modifications, output row is:
  (1, 2, 3, 4, 5.0)
After translator modifications, output row is:
  (10, 200, 30, 4000, 5.0)
The cur.set_type_trans_out
call in the example program specifies
that integer values retrieved by cur
should be multiplied by
10
,
then overrides that setting for specific columns:
the value in the second column (position 1
) is multiplied
by 100
,
while the value in the fourth column (position 3
) is multiplied
by 1000
.
KInterbasDB uses a cascading method of translator selection, listed below in order from highest to lowest precedence:

- Positional translation settings, which can be established only at the Cursor level, take precedence over typal translation settings.
- Cursor-level translation settings take precedence over Connection-level settings.
- Connection-level translation settings take precedence over the module-level defaults.
- The module-level defaults are established by the call to kinterbasdb.init.
If the client programmer does not call kinterbasdb.init
explicitly, KInterbasDB's internals will do so implicitly.
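A brief sketch of these precedence rules in action (assuming an open connection con; the lambdas are arbitrary markers rather than realistic translators):

con.set_type_trans_out({'INTEGER': lambda i: i + 1000})  # connection-level, typal
cur = con.cursor()
cur.set_type_trans_out({
    'INTEGER': lambda i: i + 20,  # cursor-level typal: overrides the connection-level setting
    0:         lambda i: i + 3,   # cursor-level positional: overrides both, for column 0 only
  })
# For a fetched row (0, 0, 0): column 0 yields 3, the other integer columns
# yield 20, and integer columns fetched through other cursors of con yield 1000.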
KInterbasDB converts database arrays from Python sequences (except strings) on input; to Python lists on output. On input, the Python sequence must be nested appropriately if the array field is multi-dimensional, and the incoming sequence must not fall short of its maximum possible length (it will not be "padded" implicitly--see below). On output, the lists will be nested if the database array has multiple dimensions.
Database arrays have no place in a purely relational data model, which requires that data values be atomized (that is, every value stored in the database must be reduced to elementary, non-decomposable parts). The Interbase®/Firebird implementation of database arrays, like that of most relational database engines that support this data type, is fraught with limitations.
First of all, the database engine claims to support up to 16 dimensions, but actually malfunctions catastrophically above 10 (this bug is fixed in Firebird 1.5-RC1 and later, thanks to Dmitry Yemanov).
Database arrays are of fixed size, with a predeclared number of dimensions
and number of elements per dimension. Individual array elements cannot
be set to NULL
/None
,
so the mapping between Python lists (which have dynamic length and are
therefore not normally "padded" with dummy values) and non-trivial
database arrays is clumsy.
Stored procedures cannot have array parameters.
Finally, many interface libraries, GUIs, and even the isql command line utility do not support database arrays.
In general, it is preferable to avoid using database arrays unless you have a compelling reason.
The following program inserts an array (nested Python list) into a single database field, then retrieves it.
import kinterbasdb

con = kinterbasdb.connect(dsn='localhost:/temp/test.db', user='sysdba', password='pass')

con.execute_immediate("recreate table array_table (a int[3,4])")
con.commit()

cur = con.cursor()

arrayIn = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9,10,11,12]
  ]

print 'arrayIn: %s' % arrayIn
cur.execute("insert into array_table values (?)", (arrayIn,))

cur.execute("select a from array_table")
arrayOut = cur.fetchone()[0]
print 'arrayOut: %s' % arrayOut

con.commit()
Output:
arrayIn: [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
arrayOut: [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
KInterbasDB supports the insertion and retrieval of blobs either wholly in memory ("materialized mode") or in chunks ("streaming mode") to reduce memory usage when handling large blobs. The default handling mode is "materialized"; the "streaming" method is selectable via a special case of Dynamic Type Translation.
In materialized mode, input and output blobs are represented as Python
str
objects,
with the result that the entirety of each blob's contents is loaded into
memory.
Unfortunately, flaws in the database engine's C API
prevent
automatic Unicode conversion from applying to textual blobs in the way it
applies to Unicode CHAR
and VARCHAR
fields.
In streaming mode, any Python "file-like" object is acceptable as
input for a blob parameter. Obvious examples of such objects are instances of
file
or StringIO
.
Each output blob is represented by a kinterbasdb.BlobReader
object. BlobReader
itself is a "file-like" class, so it acts
much like a file
instance opened in rb
mode.
BlobReader
adds one method not found in the "file-like"
interface: the chunks
method. BlobReader.chunks
takes a single integer parameter that specifies the number of bytes to retrieve
in each chunk (the final chunk may be smaller).
For example, if the size of the blob is 50000000
bytes,
BlobReader.chunks(2**20)
will return 47
one-megabyte chunks, and a smaller final chunk of 716928
bytes.
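The chunk arithmetic above can be verified directly:

numberOfWholeChunks, finalChunkSize = divmod(50000000, 2 ** 20)
print numberOfWholeChunks, finalChunkSize  # prints: 47 716928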
Due to the combination of CPython's deterministic finalization with careful
programming in KInterbasDB's internals, it is not strictly necessary to close
BlobReader
instances explicitly. A BlobReader
object will be automatically closed by its __del__
method when it
goes out of scope, or when its Connection
closes, whichever comes
first. However, it is always a better idea to close resources explicitly
(via try...finally
) than to rely on artifacts of the CPython
implementation. (For the sake of clarity, the example program does not follow
this practice.)
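A minimal sketch of that recommended practice (assuming a cursor cur whose BLOB output translator is already in streaming mode):

cur.execute("select a from blob_test")
reader = cur.fetchone()[0]
try:
    contents = reader.read()
finally:
    reader.close()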
The following program demonstrates blob storage and retrieval in both materialized and streaming modes.
import os.path
from cStringIO import StringIO

import kinterbasdb

con = kinterbasdb.connect(dsn=r'localhost:D:\temp\test-20.firebird',
    user='sysdba', password='masterkey'
  )
cur = con.cursor()

cur.execute("recreate table blob_test (a blob)")
con.commit()

# --- Materialized mode (str objects for both input and output) ---
# Insertion:
cur.execute("insert into blob_test values (?)", ('abcdef',))
cur.execute("insert into blob_test values (?)", ('ghijklmnop',))
# Retrieval:
cur.execute("select * from blob_test")
print 'Materialized retrieval (as str):'
print cur.fetchall()

cur.execute("delete from blob_test")

# --- Streaming mode (file-like objects for input; kinterbasdb.BlobReader
#     objects for output) ---
cur.set_type_trans_in ({'BLOB': {'mode': 'stream'}})
cur.set_type_trans_out({'BLOB': {'mode': 'stream'}})

# Insertion:
cur.execute("insert into blob_test values (?)", (StringIO('abcdef'),))
cur.execute("insert into blob_test values (?)", (StringIO('ghijklmnop'),))

f = file(os.path.abspath(__file__), 'rb')
cur.execute("insert into blob_test values (?)", (f,))
f.close()

# Retrieval using the "file-like" methods of BlobReader:
cur.execute("select * from blob_test")

readerA = cur.fetchone()[0]

print '\nStreaming retrieval (via kinterbasdb.BlobReader):'

# Python "file-like" interface:
print 'readerA.mode: "%s"' % readerA.mode
print 'readerA.closed: %s' % readerA.closed
print 'readerA.tell(): %d' % readerA.tell()
print 'readerA.read(2): "%s"' % readerA.read(2)
print 'readerA.tell(): %d' % readerA.tell()
print 'readerA.read(): "%s"' % readerA.read()
print 'readerA.tell(): %d' % readerA.tell()
print 'readerA.read(): "%s"' % readerA.read()
readerA.close()
print 'readerA.closed: %s' % readerA.closed

# The chunks method (not part of the Python "file-like" interface, but handy):
print '\nFor a blob with contents "ghijklmnop", iterating over'
print 'BlobReader.chunks(3) produces:'
readerB = cur.fetchone()[0]
for chunkNo, chunk in enumerate(readerB.chunks(3)):
    print 'Chunk %d is: "%s"' % (chunkNo, chunk)
Output:
Materialized retrieval (as str):
[('abcdef',), ('ghijklmnop',)]

Streaming retrieval (via kinterbasdb.BlobReader):
readerA.mode: "rb"
readerA.closed: False
readerA.tell(): 0
readerA.read(2): "ab"
readerA.tell(): 2
readerA.read(): "cdef"
readerA.tell(): 6
readerA.read(): ""
readerA.closed: True

For a blob with contents "ghijklmnop", iterating over
BlobReader.chunks(3) produces:
Chunk 0 is: "ghi"
Chunk 1 is: "jkl"
Chunk 2 is: "mno"
Chunk 3 is: "p"
When you define a Python function, the interpreter initially parses the textual representation of the function and generates a binary equivalent called bytecode. The bytecode representation can then be executed directly by the Python interpreter any number of times and with a variety of parameters, but the human-oriented textual definition of the function never need be parsed again.
Database engines perform a similar series of steps when executing a SQL statement. Consider the following series of statements:
cur.execute("insert into the_table (a,b,c) values ('aardvark', 1, 0.1)") ... cur.execute("insert into the_table (a,b,c) values ('zymurgy', 2147483647, 99999.999)")
If there are many statements in that series, wouldn't it make sense to "define a function" to insert the provided "parameters" into the predetermined fields of the predetermined table, instead of forcing the database engine to parse each statement anew and figure out what database entities the elements of the statement refer to? In other words, why not take advantage of the fact that the form of the statement ("the function") stays the same throughout, and only the values ("the parameters") vary? Prepared statements deliver that performance benefit and other advantages as well.
The following code is semantically equivalent to the series of insert
operations discussed
previously, except that it uses a single SQL statement that contains Firebird's
parameter marker (?
) in the slots where values are expected, then
supplies those values as Python tuples instead of constructing a textual
representation of each value and passing it to the database engine for parsing:
insertStatement = "insert into the_table (a,b,c) values (?,?,?)" cur.execute(insertStatement, ('aardvark', 1, 0.1)) ... cur.execute(insertStatement, ('zymurgy', 2147483647, 99999.999))
Only the values change as each row is inserted; the statement remains the same.
For many years, KInterbasDB has recognized situations similar to this one and
automatically reused the same prepared statement in each
Cursor.execute
call. In KInterbasDB 3.2, the scheme for
automatically reusing prepared statements has become more sophisticated,
and the API has been extended to offer the client programmer manual control
over prepared statement creation and use.
The entry point for manual statement preparation is the
Cursor.prep
method, which takes a single string parameter that
contains the SQL statement to be prepared, and returns a
kinterbasdb.PreparedStatement
object.
PreparedStatement
has no public methods, but does have the
following public read-only properties:
- sql: A reference to the string that was passed to Cursor.prep to create this PreparedStatement.
- statement_type: An integer code that can be matched against the statement type constants in the kinterbasdb.isc_info_sql_stmt_* series. The following statement type codes are currently available:
    - isc_info_sql_stmt_commit
    - isc_info_sql_stmt_ddl
    - isc_info_sql_stmt_delete
    - isc_info_sql_stmt_exec_procedure
    - isc_info_sql_stmt_get_segment
    - isc_info_sql_stmt_insert
    - isc_info_sql_stmt_put_segment
    - isc_info_sql_stmt_rollback
    - isc_info_sql_stmt_savepoint
    - isc_info_sql_stmt_select
    - isc_info_sql_stmt_select_for_upd
    - isc_info_sql_stmt_set_generator
    - isc_info_sql_stmt_start_trans
    - isc_info_sql_stmt_update
- n_input_params: The number of input parameters the statement requires.
- n_output_params: The number of output fields the statement produces.
- plan: A string representation of the execution plan generated for this statement by the database engine's optimizer. This property can be used, for example, to verify that a statement is using the expected index.
- description: A Python DB API 2.0 description sequence (of the same format as Cursor.description) that describes the statement's output parameters. Statements without output parameters have a description of None.
In addition to programmatically examining the characteristics of a SQL statement
via the properties of PreparedStatement
, the client programmer
can submit a PreparedStatement
to
Cursor.execute
or Cursor.executemany
for execution.
The code snippet below is semantically equivalent to both of the previous
snippets in this section, but it explicitly prepares the INSERT
statement in advance, then submits it to Cursor.executemany
for
execution:
insertStatement = cur.prep("insert into the_table (a,b,c) values (?,?,?)")
inputRows = [
    ('aardvark', 1, 0.1),
    ...
    ('zymurgy', 2147483647, 99999.999)
  ]
cur.executemany(insertStatement, inputRows)
The following program demonstrates the explicit use of
PreparedStatement
s. It also benchmarks explicit
PreparedStatement
reuse against KInterbasDB's automatic
PreparedStatement
reuse, and against an input strategy that
prevents PreparedStatement
reuse.
import time

import kinterbasdb

con = kinterbasdb.connect(dsn=r'localhost:D:\temp\test-20.firebird',
    user='sysdba', password='masterkey'
  )
cur = con.cursor()

# Create supporting database entities:
cur.execute("recreate table t (a int, b varchar(50))")
con.commit()
cur.execute("create unique index unique_t_a on t(a)")
con.commit()

# Explicitly prepare the insert statement:
psIns = cur.prep("insert into t (a,b) values (?,?)")
print 'psIns.sql: "%s"' % psIns.sql
print 'psIns.statement_type == kinterbasdb.isc_info_sql_stmt_insert:', (
    psIns.statement_type == kinterbasdb.isc_info_sql_stmt_insert
  )
print 'psIns.n_input_params: %d' % psIns.n_input_params
print 'psIns.n_output_params: %d' % psIns.n_output_params
print 'psIns.plan: %s' % psIns.plan

print

N = 10000
iStart = 0

# The client programmer uses a PreparedStatement explicitly:
startTime = time.time()
for i in xrange(iStart, iStart + N):
    cur.execute(psIns, (i, str(i)))
print (
    'With explicit prepared statement, performed'
    '\n %0.2f insertions per second.' % (N / (time.time() - startTime))
  )
con.commit()

iStart += N

# KInterbasDB automatically uses a PreparedStatement "under the hood":
startTime = time.time()
for i in xrange(iStart, iStart + N):
    cur.execute("insert into t (a,b) values (?,?)", (i, str(i)))
print (
    'With implicit prepared statement, performed'
    '\n %0.2f insertions per second.' % (N / (time.time() - startTime))
  )
con.commit()

iStart += N

# A new SQL string containing the inputs is submitted every time, so
# KInterbasDB is not able to implicitly reuse a PreparedStatement.  Also, in a
# more complicated scenario where the end user supplied the string input
# values, the program would risk SQL injection attacks:
startTime = time.time()
for i in xrange(iStart, iStart + N):
    cur.execute("insert into t (a,b) values (%d,'%s')" % (i, str(i)))
print (
    'When unable to reuse prepared statement, performed'
    '\n %0.2f insertions per second.' % (N / (time.time() - startTime))
  )
con.commit()

# Prepare a SELECT statement and examine its properties.  The optimizer's plan
# should use the unique index that we created at the beginning of this program.
print
psSel = cur.prep("select * from t where a = ?")
print 'psSel.sql: "%s"' % psSel.sql
print 'psSel.statement_type == kinterbasdb.isc_info_sql_stmt_select:', (
    psSel.statement_type == kinterbasdb.isc_info_sql_stmt_select
  )
print 'psSel.n_input_params: %d' % psSel.n_input_params
print 'psSel.n_output_params: %d' % psSel.n_output_params
print 'psSel.plan: %s' % psSel.plan

# The current implementation does not allow PreparedStatements to be prepared
# on one Cursor and executed on another:
print
print 'Note that PreparedStatements are not transferrable from one cursor to another:'
cur2 = con.cursor()
cur2.execute(psSel)
Output:
psIns.sql: "insert into t (a,b) values (?,?)"
psIns.statement_type == kinterbasdb.isc_info_sql_stmt_insert: True
psIns.n_input_params: 2
psIns.n_output_params: 0
psIns.plan: None

With explicit prepared statement, performed
 9551.10 insertions per second.
With implicit prepared statement, performed
 9407.34 insertions per second.
When unable to reuse prepared statement, performed
 1882.53 insertions per second.

psSel.sql: "select * from t where a = ?"
psSel.statement_type == kinterbasdb.isc_info_sql_stmt_select: True
psSel.n_input_params: 1
psSel.n_output_params: 2
psSel.plan: PLAN (T INDEX (UNIQUE_T_A))

Note that PreparedStatements are not transferrable from one cursor to another:
Traceback (most recent call last):
  File "adv_prepared_statements__overall_example.py", line 86, in ?
    cur2.execute(psSel)
kinterbasdb.ProgrammingError: (0, 'A PreparedStatement can only be used with the Cursor that originally prepared it.')
As you can see, the version that prevents the reuse of prepared statements is about five times slower--for a trivial statement. In a real application, SQL statements are likely to be far more complicated, so the speed advantage of using prepared statements would only increase.
As the timings indicate, KInterbasDB does a good job of reusing prepared
statements even if the client program is written in a style strictly compatible
with the Python DB API 2.0 (which accepts only strings--not
PreparedStatement
objects--to the Cursor.execute
method). The performance loss in this case is less than one percent.
The read/write property Cursor.name
allows the Python
programmer to perform scrolling UPDATE
s or DELETE
s
via the "SELECT ... FOR UPDATE
" syntax.
If you don't know what this means, refer to the section of the
Interbase® 6 Language Reference
that covers the SELECT
statement (page 139).
The Cursor.name
property can be ignored entirely if you don't
need to use it.
import kinterbasdb

con = kinterbasdb.connect(dsn='localhost:/temp/test.db', user='sysdba', password='pass')

curScroll = con.cursor()
curUpdate = con.cursor()

curScroll.execute("select city from addresses for update")
curScroll.name = 'city_scroller'
update = "update addresses set city=? where current of " + curScroll.name

for (city,) in curScroll:
    city = ... # make some changes to city
    curUpdate.execute( update, (city,) )

con.commit()
Database server maintenance tasks such as user management, load monitoring,
and database backup have traditionally been automated by scripting
the command-line tools gbak
, gfix
, gsec
,
and gstat
. These utilities are documented in the
Interbase® 6 Operations Guide (see
"Overview of command-line tools", page 28).
The API presented to the client programmer by these utilities is inelegant because they are, after all, command-line tools rather than native components of the client language. To address this problem, Interbase® 6 introduced a facility called the Services API, which exposes a uniform interface to the administrative functionality of the traditional command-line tools.
The native Services API, though consistent, is much lower-level than a
Pythonic API. If the native version were exposed directly, accomplishing
a given task would probably require more Python code than scripting the
traditional command-line tools. For this reason, KInterbasDB presents its own
abstraction over the native API via the kinterbasdb.services
module.
All Services API operations are performed in the context of a connection to
a specific database server, represented by the
kinterbasdb.services.Connection
class.
Connection
s are established by calling the
kinterbasdb.services.connect
function, which accepts three keyword
arguments: host
, user
, and password
.
host
is the network name of the computer on which the database
server is running; user
is the name of the database user under
whose authority the maintenance tasks are to be performed;
password
is that
user's password. Since maintenance operations are most often initiated by an
administrative user on the same computer as the database server,
host
defaults to the local computer, and user
defaults to SYSDBA
.
The three calls to kinterbasdb.services.connect
in the following
program are equivalent:
from kinterbasdb import services

con = services.connect(password='masterkey')
con = services.connect(user='sysdba', password='masterkey')
con = services.connect(host='localhost', user='sysdba', password='masterkey')
A no-argument close
method is available to explicitly terminate
a Connection
; if this is not invoked, the underlying connection
will be closed implicitly when the Connection
object is garbage
collected.
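A minimal sketch of the explicit approach:

from kinterbasdb import services

con = services.connect(password='masterkey')
try:
    print con.getServerVersion()
finally:
    con.close()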
To help client programs adapt to version changes, the service manager exposes its version number as an integer:
from kinterbasdb import services

con = services.connect(host='localhost', user='sysdba', password='masterkey')

print con.getServiceManagerVersion()
Output (on Firebird 1.5.0):
2
kinterbasdb.services
is a thick wrapper of the Services
API that can shield its users from changes in the underlying C API, so this
method is unlikely to be useful to the typical Python client programmer.
The getServerVersion
method returns the server's version string:
from kinterbasdb import services

con = services.connect(host='localhost', user='sysdba', password='masterkey')

print con.getServerVersion()
Output (on Firebird 1.5.0/Win32):
WI-V1.5.0.4290 Firebird 1.5
At first glance, the kinterbasdb.services.Connection.getServerVersion
method appears to duplicate the functionality of the
kinterbasdb.Connection.server_version
property, but when
working with Firebird, there is a difference.
kinterbasdb.Connection.server_version
is based
on a C API call (isc_database_info
) that existed long
before the introduction of the Services API in Interbase® 6.
Some programs written before the advent of Firebird test the version number in the return value of
isc_database_info
, and refuse to work if it indicates that the
server is too old. Since the first stable version of Firebird
was labeled 1.0
, this pre-Firebird version testing scheme
incorrectly concludes that (e.g.) Firebird 1.0 is older than
Interbase® 5.0.
Firebird addresses this problem by making
isc_database_info
return a "pseudo-Interbase®" version
number, whereas the Services API returns the true Firebird version, as shown:
import kinterbasdb

con = kinterbasdb.connect(dsn='localhost:C:/temp/test.db',
    user='sysdba', password='masterkey')
print 'Interbase-compatible version string:', con.server_version

import kinterbasdb.services

svcCon = kinterbasdb.services.connect(host='localhost',
    user='sysdba', password='masterkey')
print 'Actual Firebird version string:     ', svcCon.getServerVersion()
Output (on Firebird 1.5.0/Win32):
Interbase-compatible version string: WI-V6.3.0.4290 Firebird 1.5
Actual Firebird version string:      WI-V1.5.0.4290 Firebird 1.5
The getArchitecture
method returns platform information for
the server, including hardware architecture and operating system family:
from kinterbasdb import services

con = services.connect(host='localhost', user='sysdba', password='masterkey')

print con.getArchitecture()
Output (on Firebird 1.5.0/Windows 2000):
Firebird/x86/Windows NT
Unfortunately, the architecture string is almost useless because its format is irregular and sometimes outright idiotic, as with Firebird 1.5.0 running on x86 Linux:
Firebird/linux Intel
Magically, Linux becomes a hardware architecture, the ASCII store decides to hold a 31.92% off sale, and Intel grabs an unfilled niche in the operating system market.
The getHomeDir
method returns the equivalent of the
RootDirectory
setting from firebird.conf
:
from kinterbasdb import services

con = services.connect(host='localhost', user='sysdba', password='masterkey')

print con.getHomeDir()
Output (on a particular Firebird 1.5.0/Windows 2000 installation):
C:\dev\db\firebird150\
Output (on a particular Firebird 1.5.0/Linux installation):
/opt/firebird/
The getSecurityDatabasePath
method returns the location of
the server's core security database, which contains user definitions and
such.
Interbase® and Firebird 1.0 named this database isc4.gdb
,
while in Firebird 1.5 and later it's renamed to security.fdb
:
from kinterbasdb import services

con = services.connect(host='localhost', user='sysdba', password='masterkey')

print con.getSecurityDatabasePath()
Output (on a particular Firebird 1.5.0/Windows 2000 installation):
C:\dev\db\firebird150\security.fdb
Output (on a particular Firebird 1.5.0/Linux installation):
/opt/firebird/security.fdb
The database engine
uses a lock file
to coordinate interprocess communication;
getLockFileDir
returns the directory in which that file resides:
from kinterbasdb import services

con = services.connect(host='localhost', user='sysdba', password='masterkey')

print con.getLockFileDir()
Output (on a particular Firebird 1.5.0/Windows 2000 installation):
C:\dev\db\firebird150\
Output (on a particular Firebird 1.5.0/Linux installation):
/opt/firebird/
The Services API offers "a bitmask representing the capabilities currently
enabled on the server", but the only available
documentation
for this bitmask suggests that it is "reserved for future implementation".
kinterbasdb exposes this bitmask as a Python int
returned from
the getCapabilityMask
method.
To support internationalized error messages/prompts, the database engine stores
its messages in a file named interbase.msg
(Interbase®
and Firebird 1.0) or firebird.msg
(Firebird 1.5 and later).
The directory in which this file resides can be determined with the
getMessageFileDir
method.
from kinterbasdb import services

con = services.connect(host='localhost', user='sysdba', password='masterkey')

print con.getMessageFileDir()
Output (on a particular Firebird 1.5.0/Windows 2000 installation):
C:\dev\db\firebird150\
Output (on a particular Firebird 1.5.0/Linux installation):
/opt/firebird/
getConnectionCount
returns the number of active connections to
databases managed by the server.
This count only includes database
connections (such as open instances of kinterbasdb.Connection
),
not services manager connections
(such as open instances of kinterbasdb.services.Connection
).
import kinterbasdb, kinterbasdb.services

svcCon = kinterbasdb.services.connect(host='localhost', user='sysdba', password='masterkey')
print 'A:', svcCon.getConnectionCount()

con1 = kinterbasdb.connect(dsn='localhost:C:/temp/test.db', user='sysdba', password='masterkey')
print 'B:', svcCon.getConnectionCount()

con2 = kinterbasdb.connect(dsn='localhost:C:/temp/test.db', user='sysdba', password='masterkey')
print 'C:', svcCon.getConnectionCount()

con1.close()
print 'D:', svcCon.getConnectionCount()

con2.close()
print 'E:', svcCon.getConnectionCount()
On an otherwise inactive server, the example program generates the following output:
A: 0
B: 1
C: 2
D: 1
E: 0
getAttachedDatabaseNames
returns a list of the names of all
databases to which the server is maintaining at least one connection.
The database names are not guaranteed to be in any particular order.
import kinterbasdb, kinterbasdb.services

svcCon = kinterbasdb.services.connect(host='localhost', user='sysdba', password='masterkey')
print 'A:', svcCon.getAttachedDatabaseNames()

con1 = kinterbasdb.connect(dsn='localhost:C:/temp/test.db', user='sysdba', password='masterkey')
print 'B:', svcCon.getAttachedDatabaseNames()

con2 = kinterbasdb.connect(dsn='localhost:C:/temp/test2.db', user='sysdba', password='masterkey')
print 'C:', svcCon.getAttachedDatabaseNames()

con3 = kinterbasdb.connect(dsn='localhost:C:/temp/test2.db', user='sysdba', password='masterkey')
print 'D:', svcCon.getAttachedDatabaseNames()

con1.close()
print 'E:', svcCon.getAttachedDatabaseNames()

con2.close()
print 'F:', svcCon.getAttachedDatabaseNames()

con3.close()
print 'G:', svcCon.getAttachedDatabaseNames()
On an otherwise inactive server, the example program generates the following output:
A: []
B: ['C:\\TEMP\\TEST.DB']
C: ['C:\\TEMP\\TEST2.DB', 'C:\\TEMP\\TEST.DB']
D: ['C:\\TEMP\\TEST2.DB', 'C:\\TEMP\\TEST.DB']
E: ['C:\\TEMP\\TEST2.DB']
F: ['C:\\TEMP\\TEST2.DB']
G: []
The getLog
method returns the contents of the server's log file
(named interbase.log
by Interbase® and Firebird 1.0;
firebird.log
by Firebird 1.5 and later):
from kinterbasdb import services

con = services.connect(host='localhost', user='sysdba', password='masterkey')

print con.getLog()
Output (on a particular Firebird 1.5.0/Windows 2000 installation):
WEASEL (Client) Thu Jun 03 12:01:35 2004
    INET/inet_error: send errno = 10054

WEASEL (Client) Sun Jun 06 19:21:17 2004
    INET/inet_error: connect errno = 10061
The getStatistics
method returns a string containing a printout
in the same format as the output of the gstat
command-line
utility. This method has one required parameter, the location of the database
on which to compute statistics, and five optional boolean parameters for
controlling the domain of the statistics.
The section of the
Interbase® 6 Operations Guide
entitled "gstat command-line tool" (page 181)
documents gstat
's command-line options.
Rather than attempting to duplicate that documentation here,
we present a table of equivalence:
gstat command-line option | getStatistics boolean parameter
-header                   | showOnlyDatabaseHeaderPages
-log                      | showOnlyDatabaseLogPages
-data                     | showUserDataPages
-index                    | showUserIndexPages
-system                   | showSystemTablesAndIndexes
The following program presents several getStatistics
calls and their gstat
-command-line equivalents. In this
context, output is considered "equivalent" even if there are some whitespace
differences. When collecting textual output from the Services API,
kinterbasdb terminates lines with \n
regardless of the platform's
convention; gstat
is platform-sensitive.
from kinterbasdb import services

con = services.connect(user='sysdba', password='masterkey')

# Equivalent to 'gstat -u sysdba -p masterkey C:/temp/test.db':
print con.getStatistics('C:/temp/test.db')

# Equivalent to 'gstat -u sysdba -p masterkey -header C:/temp/test.db':
print con.getStatistics('C:/temp/test.db', showOnlyDatabaseHeaderPages=True)

# Equivalent to 'gstat -u sysdba -p masterkey -log C:/temp/test.db':
print con.getStatistics('C:/temp/test.db', showOnlyDatabaseLogPages=True)

# Equivalent to 'gstat -u sysdba -p masterkey -data -index -system C:/temp/test.db':
print con.getStatistics('C:/temp/test.db',
    showUserDataPages=True,
    showUserIndexPages=True,
    showSystemTablesAndIndexes=True
  )
The output of the example program is not shown here because it is quite long.
KInterbasDB offers convenient programmatic control over database backup and
restoration via the backup
and restore
methods.
At the time of this writing, released versions of Firebird/Interbase® do not implement incremental backup, so we can simplistically define backup as the process of generating and storing an archived replica of a live database, and restoration as the inverse. The backup/restoration process exposes numerous parameters, which are properly documented in Chapter 7 ("Database Backup and Restore") of the Interbase® 6 Operations Guide. The KInterbasDB API to these parameters is presented with minimal documentation in the sample code below.
The simplest form of backup
creates a single backup file
that contains everything in the database. Although the extension
'.gbk'
is conventional, it is not required.
from kinterbasdb import services

con = services.connect(user='sysdba', password='masterkey')

backupLog = con.backup('C:/temp/test.db', 'C:/temp/test_backup.gbk')
print backupLog
In the example, backupLog
is a string containing a
gbak
-style log of the backup process. It is too long to
reproduce here.
Although the return value of the backup
method is a freeform
log string, backup
will raise an exception if there is an error.
For example:
from kinterbasdb import services

con = services.connect(user='sysdba', password='masterkey')

# Pass an invalid backup path to the engine:
backupLog = con.backup('C:/temp/test.db', 'BOGUS/PATH/test_backup.gbk')
print backupLog
Traceback (most recent call last):
  File "adv_services_backup_simplest_witherror.py", line 5, in ?
    backupLog = con.backup('C:/temp/test.db', 'BOGUS/PATH/test_backup.gbk')
  File "C:\code\projects\kinterbasdb\Kinterbasdb-3.0\build\lib.win32-2.3\kinterbasdb\services.py", line 269, in backup
    return self._actAndReturnTextualResults(request)
  File "C:\code\projects\kinterbasdb\Kinterbasdb-3.0\build\lib.win32-2.3\kinterbasdb\services.py", line 613, in _actAndReturnTextualResults
    self._act(requestBuffer)
  File "C:\code\projects\kinterbasdb\Kinterbasdb-3.0\build\lib.win32-2.3\kinterbasdb\services.py", line 610, in _act
    return _ksrv.action_thin(self._C_conn, requestBuffer.render())
kinterbasdb.OperationalError: (-902, '_kiservices could not perform the action: cannot open backup file BOGUS/PATH/test_backup.gbk. ')
The database engine has built-in support for splitting the backup into multiple files, which is useful for circumventing operating system file size limits or spreading the backup across multiple discs.
KInterbasDB exposes this facility via the Connection.backup
parameters destFilenames
and destFileSizes
.
destFilenames
(the second positional parameter of
Connection.backup
) can be either a string (as in the example
above, when creating the backup as a single file) or a sequence of strings
naming each constituent file of the backup.
If destFilenames
is a string-sequence with length N
,
destFileSizes
must be a sequence of integer file sizes
(in bytes) with length N-1
. The database engine will constrain
the size of each backup constituent file named in
destFilenames[:-1]
to the corresponding size specified in
destFileSizes
; any remaining backup data will be placed in the
file named by destFilenames[-1]
.
Unfortunately, the database engine does not appear to expose any convenient
means of calculating the total size of a database backup before its creation.
The page size of the database and the number of pages in the database are
available via
kinterbasdb.Connection.database_info(kinterbasdb.isc_info_page_size, 'i')
and
kinterbasdb.Connection.database_info(kinterbasdb.isc_info_db_size_in_pages, 'i')
,
respectively, but the size of the backup file is usually smaller than the size
of the database.
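Those two items can at least establish an upper bound, as in this sketch (assuming an open kinterbasdb.Connection named con):

import kinterbasdb

# The on-disk size of the database is an upper bound on the size of its backup.
pageSize = con.database_info(kinterbasdb.isc_info_page_size, 'i')
pageCount = con.database_info(kinterbasdb.isc_info_db_size_in_pages, 'i')
print 'Database occupies %d bytes on disk.' % (pageSize * pageCount)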
There should be no harm in submitting too many constituent specifications; the
engine will write an empty header record into the excess constituents.
However, at the time of this writing, released versions of the database engine
hang the backup task if more than 11 constituents are specified (that is,
if len(destFilenames) > 11
).
KInterbasDB does not prevent the programmer from submitting more than 11
constituents, but it does issue a warning.
The following program directs the engine to split the backup
of the database at 'C:/temp/test.db'
into
'C:/temp/back01.gbk'
, a file 4096 bytes in size,
'C:/temp/back02.gbk'
, a file 16384 bytes in size,
and 'C:/temp/back03.gbk'
, a file containing the remainder
of the backup data.
from kinterbasdb import services

con = services.connect(user='sysdba', password='masterkey')

con.backup('C:/temp/test.db',
    ('C:/temp/back01.gbk', 'C:/temp/back02.gbk', 'C:/temp/back03.gbk'),
    destFileSizes=(4096, 16384)
  )
In addition to the three parameters documented previously
(positional sourceDatabase
,
positional destFilenames
,
and keyword destFileSizes
),
the Connection.backup
method accepts six boolean parameters
that control aspects of the backup process and the backup file output format.
These options are well documented beginning on page 149 of the
Interbase® 6 Operations Guide, so in this
document we present only a table of equivalence between the section caption in
the Interbase® 6 Operations Guide and the name of the boolean keyword
parameter:
IB6 Op. Guide Caption  | Connection.backup Parameter Name       | Default Value
Format                 | transportable                          | True
Metadata Only          | metadataOnly                           | False
Garbage Collection     | garbageCollect                         | True
Transactions in Limbo  | ignoreLimboTransactions                | False
Checksums              | ignoreChecksums                        | False
Convert to Tables      | convertExternalTablesToInternalTables  | True
The simplest form of restore
creates a single-file database,
regardless of whether the backup data were split across multiple files.
from kinterbasdb import services

con = services.connect(user='sysdba', password='masterkey')

restoreLog = con.restore('C:/temp/test_backup.gbk', 'C:/temp/test_restored.db')
print restoreLog
In the example, restoreLog
is a string containing a
gbak
-style log of the restoration process. It is too long to
reproduce here.
The database engine has built-in support for splitting the restored database into multiple files, which is useful for circumventing operating system file size limits or spreading the database across multiple discs.
KInterbasDB exposes this facility via the Connection.restore
parameters destFilenames
and destFilePages
.
destFilenames
(the second positional argument of
Connection.restore
) can be either a string (as in the example
above, when restoring to a single database file) or a sequence of strings
naming each constituent file of the restored database.
If destFilenames
is a string-sequence with length N
,
destFilePages
must be a sequence of integers with length
N-1
. The database engine will constrain the size of each
database constituent file named in destFilenames[:-1]
to the
corresponding page count specified in destFilePages
; any
remaining database pages will be placed in the file name by
destFilenames[-1]
.
The following program directs the engine to restore the backup file at
'C:/temp/test_backup.gbk'
into a database with three constituent
files:
'C:/temp/test_restored01.db'
,
'C:/temp/test_restored02.db'
,
and
'C:/temp/test_restored03.db'
.
The engine is instructed to place fifty user data pages in the first file,
seventy in the second, and the remainder in the third file. In practice, the
first database constituent file will be larger than
pageSize*destFilePages[0]
, because metadata pages must also be
stored in the first constituent of a multifile database.
from kinterbasdb import services

con = services.connect(user='sysdba', password='masterkey')

con.restore('C:/temp/test_backup.gbk',
    ('C:/temp/test_restored01.db', 'C:/temp/test_restored02.db', 'C:/temp/test_restored03.db'),
    destFilePages=(50, 70),
    pageSize=1024,
    replace=True
  )
These options are well documented beginning on page 155 of the Interbase® 6 Operations Guide, so in this document we present only a table of equivalence between the section caption in the Interbase® 6 Operations Guide and the name of the keyword parameter to Connection.restore:
IB6 Op. Guide Caption    | Connection.restore Parameter Name  | Connection.restore Parameter Default Value
Page Size                | pageSize                           | [use server default]
Overwrite                | replace                            | False
Commit After Each Table  | commitAfterEachTable               | False
Create Shadow Files      | doNotRestoreShadows                | False
Deactivate Indexes       | deactivateIndexes                  | False
Validity Conditions      | doNotEnforceConstraints            | False
Use All Space            | useAllPageSpace                    | False
Two additional parameters are not covered by the table above: cacheBuffers and accessModeReadOnly. cacheBuffers, an integer, specifies the default number of cache pages for the restored database; if left unspecified, the server default is used. accessModeReadOnly, a boolean (default False), specifies whether the restored database is read-only (True) or writable (False).
(XXX: not yet documented)
(XXX: not yet documented)
database_info Method

database_info (method; member of kinterbasdb.Connection)

Wraps the Interbase® C API function isc_database_info. Note that this method is a very thin wrapper around that function: it makes no attempt to interpret the raw result buffer, except when an integer result is requested.

For example, requesting

con.database_info(kinterbasdb.isc_info_user_names, 's')

will return a binary string containing a raw succession of length-name pairs. A more convenient way to access the same functionality is via the Connection.db_info method.

Arguments:

request - one of the kinterbasdb.isc_info_* information request constants.

result_type - 'i' if an integer result is expected, or 's' if the raw binary result buffer is expected (as in the example below).
import kinterbasdb

con = kinterbasdb.connect(dsn='localhost:/temp/test.db',
    user='sysdba', password='pass'
  )

# Retrieving an integer info item is quite simple.
bytesInUse = con.database_info(kinterbasdb.isc_info_current_memory, 'i')
print 'The server is currently using %d bytes of memory.' % bytesInUse

# Retrieving a string info item is somewhat more involved, because the
# information is returned in a raw binary buffer that must be parsed
# according to the rules defined in the Interbase® 6 API Guide section
# entitled "Requesting buffer items and result buffer values" (page 51).
#
# Often, the buffer contains a succession of length-string pairs
# (one byte telling the length of s, followed by s itself).
# Function kinterbasdb.raw_byte_to_int is provided to convert a raw
# byte to a Python integer (see examples below).
buf = con.database_info(kinterbasdb.isc_info_db_id, 's')

# Parse the filename from the buffer.
beginningOfFilename = 2
# The second byte in the buffer contains the size of the database filename
# in bytes.
lengthOfFilename = kinterbasdb.raw_byte_to_int(buf[1])
filename = buf[beginningOfFilename:beginningOfFilename + lengthOfFilename]

# Parse the host name from the buffer.
beginningOfHostName = (beginningOfFilename + lengthOfFilename) + 1
# The first byte after the end of the database filename contains the size
# of the host name in bytes.
lengthOfHostName = kinterbasdb.raw_byte_to_int(buf[beginningOfHostName - 1])
host = buf[beginningOfHostName:beginningOfHostName + lengthOfHostName]

print 'We are connected to the database at %s on host %s.' % (filename, host)
Sample output:
The server is currently using 8931328 bytes of memory.
We are connected to the database at C:\TEMP\TEST.DB on host WEASEL.
As you can see, extracting data with the database_info function is rather clumsy. In KInterbasDB 3.2, a higher-level means of accessing the same information is available: the Connection.db_info method. Also, the Services API (accessible to Python programmers via the kinterbasdb.services module) provides high-level support for querying database statistics and performing maintenance.
db_info Method

db_info (method; member of kinterbasdb.Connection)

High-level convenience wrapper around the database_info method that parses the output of database_info into Python objects, instead of returning raw binary buffers.

For example, requesting

con.db_info(kinterbasdb.isc_info_user_names)

returns a dictionary that maps (username -> number of open connections). If SYSDBA has one open connection to the database to which con is connected, and TEST_USER_1 has three open connections to that same database, the return value would be

{'SYSDBA': 1, 'TEST_USER_1': 3}

Arguments:

request - a single kinterbasdb.isc_info_* information request code, or a sequence of such codes. If a sequence is supplied, db_info returns a mapping of (info request code -> result), as in the second part of the example below.
import os.path

import kinterbasdb

DB_FILENAME = r'D:\temp\test-20.firebird'
DSN = 'localhost:' + DB_FILENAME

###############################################################################
# Querying an isc_info_* item that has a complex result:
###############################################################################
# Establish three connections to the test database as TEST_USER_1, and one
# connection as SYSDBA.  Then use the Connection.db_info method to query the
# number of attachments by each user to the test database.
testUserCons = []
for i in range(3):
    tCon = kinterbasdb.connect(dsn=DSN, user='test_user_1', password='pass')
    testUserCons.append(tCon)

con = kinterbasdb.connect(dsn=DSN, user='sysdba', password='masterkey')

print 'Open connections to this database:'
print con.db_info(kinterbasdb.isc_info_user_names)

###############################################################################
# Querying multiple isc_info_* items at once:
###############################################################################
# Request multiple db_info items at once, specifically the page size of the
# database and the number of pages currently allocated.  Compare the size
# computed by that method with the size reported by the file system.
# The advantages of using db_info instead of the file system to compute
# database size are:
#   - db_info works seamlessly on connections to remote databases that reside
#     in file systems to which the client program lacks access.
#   - If the database is split across multiple files, db_info includes all of
#     them.
res = con.db_info(
    [kinterbasdb.isc_info_page_size, kinterbasdb.isc_info_allocation]
  )
pagesAllocated = res[kinterbasdb.isc_info_allocation]
pageSize = res[kinterbasdb.isc_info_page_size]

print '\ndb_info indicates database size is', pageSize * pagesAllocated, 'bytes'
print 'os.path.getsize indicates size is  ', os.path.getsize(DB_FILENAME), 'bytes'
Sample output:
Open connections to this database:
{'SYSDBA': 1, 'TEST_USER_1': 3}

db_info indicates database size is 20684800 bytes
os.path.getsize indicates size is   20684800 bytes
Note: This section will not be comprehensible unless you understand the basic
characteristics of the Firebird server architectures. These are documented
in the "Classic or Superserver?" section of the
doc/Firebird-1.5-QuickStart.pdf
file included with the Firebird
distribution.
Versions of KInterbasDB prior to 3.2 imposed a global lock over all database client library calls. This lock, referred to as the Global Database API Lock (GDAL), must be active for multithreaded client programs to work correctly with versions of the Firebird client library that do not properly support concurrency. Many such versions are still in use, so the GDAL remains active by default in KInterbasDB 3.2. To determine whether the client library you're using can correctly handle concurrent database calls, read this Overview of Firebird Client Library Thread-Safety.
Note that a single client library might have different thread-safety properties depending on which protocol the client program specifies via the parameters of kinterbasdb.connect. For example, the Firebird 1.5 client library on Windows is thread-safe if the remote protocol is used, as in

kinterbasdb.connect(dsn=r'localhost:C:\temp\test.db', ...)

but is not thread-safe if the local protocol is used, as in

kinterbasdb.connect(dsn=r'C:\temp\test.db', ...)
KInterbasDB 3.2 supports three levels of concurrency:
Level 0: No lock management whatsoever
If the C preprocessor symbol ENABLE_CONCURRENCY is not defined when KInterbasDB is compiled, no lock management at all is performed at runtime. In fact, the code to initialize and manage the locks is not even compiled in.
Level 0 is intended only for compiling KInterbasDB on non-threaded builds of the Python interpreter. It would not be desirable for a client program running on a normal (threaded) build of the Python interpreter to use Level 0, so no overhead is invested in making it possible to transition to Level 0 at runtime.
Since Level 0 is intended for use in Python interpreters that have no Global Interpreter Lock (GIL), the GIL is not manipulated.
Level 1: Global Database API Lock (GDAL) is active (this is the default level)
At Level 1, a global lock serializes all calls to the database client library. This lock, called the Global Database API Lock (GDAL), is to the database client library as the GIL is to the Python interpreter: a mechanism to guarantee that at most one thread is using the database client library at any time.
Level 1 exists to support those versions of Firebird in which the client library is not thread-safe at the connection level (see the Overview of Firebird Client Library Thread-Safety for details). In environments where the author of KInterbasDB creates binaries and distributes them to client programmers, there is no way of knowing at compile time which Firebird client library configuration the KInterbasDB binaries will be used with. Level 1 protects client programmers who are not aware of the thread-safety properties of their version of the client library. For these reasons, Level 1 is the default, but Level 2 can be selected at runtime via the kinterbasdb.init function (see next section).
At Level 1, the Python GIL is released and reacquired around most database client library calls in order to avoid blocking the entire Python process for the duration of the call.
Level 2: Global Database API Lock (GDAL) is not active, but connection and disconnection are serialized via the GCDL
At Level 2, calls to the database client library are not serialized, except for calls to the connection attachment and detachment functions, which are serialized by a lock called the Global Connection and Disconnection Lock (GCDL). This limited form of serialization is necessary because the Firebird client library makes no guarantees about the thread-safety of connection and disconnection. Since most client programs written with high concurrency in mind use a connection pool that minimizes the need to physically connect and disconnect, the GCDL is not a serious impediment to concurrency.
Level 2, which can be activated at runtime by calling kinterbasdb.init(concurrency_level=2), is appropriate for client programmers who are aware of the thread-safety guarantees provided by their version of the Firebird client library, and have written the client program accordingly. For details about the thread-safety of various Firebird client library versions, see the Overview of Firebird Client Library Thread-Safety.
At Level 2, the Python GIL is released and reacquired around most database client library calls, just as it is at Level 1.
Level 1 is the default, so if you don't understand these subtleties, or are using a client library configuration that is not thread-safe, you do not need to take any action to achieve thread-safety.
Level 2 can greatly increase the throughput of a database-centric, multithreaded Python application, so you should use it if possible. Once you've determined that you're using an appropriate connection protocol with a capable client library, you can activate Level 2 at runtime with the following call:
kinterbasdb.init(concurrency_level=2)
The kinterbasdb.init function can only be called once during the life of a process. If it has not been called explicitly, the function will be called implicitly when the client program tries to perform any database operation. Therefore, the recommended place to call kinterbasdb.init is at the top level of one of the main modules of your program. The import infrastructure of the Python interpreter serializes all imports, so calling kinterbasdb.init at import time avoids the potential for multiple simultaneous calls, which could cause subtle problems.
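For instance, the main module of a client program might begin like this minimal sketch (the DSN and credentials are placeholders):

# main.py -- top-level module of a hypothetical client program.
import kinterbasdb

# Call init exactly once, at import time, before any database operation
# triggers the implicit call with default settings.
kinterbasdb.init(concurrency_level=2)

# The rest of the program (and any modules it imports) can now use
# kinterbasdb normally.
con = kinterbasdb.connect(dsn='localhost:/temp/test.db',
    user='sysdba', password='masterkey'
  )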
threadsafety versus concurrency_level

Make sure not to confuse KInterbasDB's concurrency_level with its threadsafety. threadsafety, a module-level property required by the Python DB API Specification 2.0, represents the highest level of granularity at which the DB API implementation remains thread-safe. KInterbasDB is always "thread-safe at the connection level" (DB API threadsafety 1), regardless of which concurrency_level is active.

Think of threadsafety as the level of thread-safety that KInterbasDB guarantees, and concurrency_level as the degree to which KInterbasDB's internals are able to exploit a client program's potential for concurrency.
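The distinction can be observed directly, as in this trivial sketch:

import kinterbasdb

# threadsafety is fixed by the implementation: KInterbasDB is always
# thread-safe at the connection level, so this prints 1 regardless of
# the concurrency level chosen below.
print kinterbasdb.threadsafety

# concurrency_level, by contrast, is selected at runtime:
kinterbasdb.init(concurrency_level=2)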
Use the Classic server architecture, but the SuperServer client library.
At the time of this writing (December 2005), the thread-centric Vulcan had not been released, so the multi-process Classic architecture was the only Firebird server architecture that could take advantage of multiple CPUs. This means that in most scenarios, Classic is far more concurrency-friendly than SuperServer.
The Windows version of Firebird--whether Classic or SuperServer--offers a single client library, so the following advice is not relevant to Windows.
The non-Windows versions of Firebird Classic include two client libraries:

fbclient (libfbclient.so) communicates with the server solely via the network protocol (possibly over an emulated network such as the local loopback). fbclient is thread-safe in recent versions of Firebird.

fbembed (libfbembed.so) uses an in-process Classic server to manipulate the database file directly. fbembed is not thread-safe in any version of Firebird; it should never be used with KInterbasDB concurrency level 2.
At present, the best way to achieve a concurrency-friendly KInterbasDB/Firebird configuration is to use a version of KInterbasDB linked against a thread-safe fbclient, running at concurrency level 2, and communicating with a Classic server.
On Linux, such a setup can be created by installing the Classic server, then compiling KInterbasDB with the database_lib_name option in setup.cfg set to fbclient (this is the default setting). A version of KInterbasDB that was linked against fbembed (by setting database_lib_name=fbembed) will not work in a multithreaded program if the concurrency level is higher than 1.
On Windows, use a Classic server in combination with one of the standard KInterbasDB Windows binaries for Firebird 1.5 or later, and be sure to set KInterbasDB's concurrency level to 2.
Does KInterbasDB require mx.DateTime? NO!
KInterbasDB uses mx.DateTime by default due to backward compatibility constraints, but mx.DateTime is definitely not required. The standard library datetime module can be used just as easily.
You can read this section for a detailed explanation, or simply replace the line

import kinterbasdb

with

import kinterbasdb; kinterbasdb.init(type_conv=200)

in your client program (if you're using a version of Python prior to 2.4, you'll need to install the decimal module manually).
Here is an example program that uses datetime instead of mx.DateTime.
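In the same spirit, the following minimal sketch round-trips a standard library datetime object. The DSN and credentials are placeholders, and the table T with its TIMESTAMP column TS is hypothetical:

import kinterbasdb; kinterbasdb.init(type_conv=200)
import datetime

con = kinterbasdb.connect(dsn='localhost:/temp/test.db',
    user='sysdba', password='masterkey'
  )
cur = con.cursor()

# With type_conv=200 active, TIMESTAMP parameters and results are
# datetime.datetime objects rather than mx.DateTime objects.
cur.execute("insert into t (ts) values (?)", (datetime.datetime.now(),))
con.commit()

cur.execute("select ts from t")
print cur.fetchone()[0]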
Fixed Point (NUMERIC/DECIMAL) Handling
KInterbasDB's dynamic type translation allows database fixed point types to be handled both precisely and conveniently, when combined with a full-featured fixed point data type such as that implemented by the decimal module that entered the standard library in Python 2.4. An official implementation of dynamic type translators for the decimal module is distributed with KInterbasDB in the kinterbasdb.typeconv_fixed_decimal module. It can be activated conveniently using the features discussed in this section, and demonstrated in this example program.
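The effect, once the decimal-based translators are loaded, is that NUMERIC/DECIMAL values cross the Python/database boundary as decimal.Decimal objects. A speculative sketch (placeholder DSN; hypothetical table PRODUCT with a NUMERIC(18, 2) column PRICE):

import kinterbasdb; kinterbasdb.init(type_conv=200)
import decimal

con = kinterbasdb.connect(dsn='localhost:/temp/test.db',
    user='sysdba', password='masterkey'
  )
cur = con.cursor()

# The value is stored and retrieved exactly; no precision is lost to
# binary floating point.
cur.execute("insert into product (price) values (?)",
    (decimal.Decimal('19.95'),)
  )
con.commit()

cur.execute("select price from product")
print repr(cur.fetchone()[0])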
Use the Cursor.fetch*map series of methods for traditional fetches, or the Cursor.itermap method to iterate over mappings rather than sequences. Example code appears in the Tutorial section entitled "Executing SQL Statements".
For the sake of backward-compatibility, KInterbasDB handles Unicode naively by default, leaving responsibility for encoding and decoding to the client programmer.
In KInterbasDB 3.1, a dynamic type translation slot named 'TEXT_UNICODE' was introduced. The 'TEXT_UNICODE' translators are invoked for all CHAR or VARCHAR fields except those with character sets NONE, OCTETS, or ASCII.
Due to flaws in the database engine's C API, KInterbasDB is not able to
automatically encode or decode Unicode blobs.
The most convenient way to handle Unicode with KInterbasDB is to combine the 'TEXT_UNICODE' slot with the official translator implementation in the kinterbasdb.typeconv_text_unicode module. This can be accomplished either manually, via [Connection|Cursor].set_type_trans_[in|out], or by loading a predefined set of translators via kinterbasdb.init(type_conv=...). The table of type_conv codes lists several that enable automatic Unicode handling. For more information, see the translator signature table and this example program.
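For contrast, here is what the default naive regime obliges the client programmer to do; loading the 'TEXT_UNICODE' translators eliminates this manual encoding and decoding. A speculative sketch (placeholder DSN; hypothetical table PERSON; UNICODE_FSS is assumed here to be UTF-8-compatible):

# -*- coding: utf-8 -*-
import kinterbasdb

con = kinterbasdb.connect(dsn='localhost:/temp/test.db',
    user='sysdba', password='masterkey',
    charset='UNICODE_FSS'
  )
cur = con.cursor()

name = u'Ren\u00e9'  # a unicode object

# Under naive handling, the program encodes before executing...
cur.execute("insert into person (name) values (?)",
    (name.encode('utf_8'),)
  )
con.commit()

# ...and decodes the raw string that comes back.
cur.execute("select name from person")
print cur.fetchone()[0].decode('utf_8')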
The Firebird 1.5 Release Notes (ReleaseNotes.pdf
, included with RC5
and later) describe Embedded Firebird as "a DLL that merges a single client
attachment with a Firebird Superserver for building very quick and efficient
stand-alone and briefcase applications."
Practically speaking, Embedded Firebird allows an application to use the
database engine without managing an external server process. This is ideal for
applications that will be distributed to end users, or that must be deployed
on operating systems that support background services poorly, such as Win9x.
The KInterbasDB distribution linked against the Firebird 1.5 client library fbclient.dll (kinterbasdb-V.V.win32-FB1.5-pyV.V) works fine with Embedded Firebird. Only local-protocol connections are supported, of course, and some of the standalone-server-oriented features of the Services API are not supported.
For generic Embedded Firebird configuration instructions, refer to the section
of the Firebird 1.5 Release Notes entitled
"Installing Embedded server from a zip kit"
(page 51 of the ReleaseNotes.pdf
accompanying Firebird 1.5.0).
Below are specific instructions for installing Embedded Firebird 1.5.2 for use with Python 2.3 and KInterbasDB 3.2.
Extract the file kinterbasdb-3.1.3.win32-all_binaries_pack.zip to a temporary directory. Copy the kinterbasdb directory from kinterbasdb-V.V-win32-all-binaries-pack\firebird-1.5\lib.win32-2.3 to a directory on the PYTHONPATH of the python.exe you intend to use Embedded Firebird with (for example, the-directory-of-python.exe\Lib\site-packages).
Extract Firebird-1.5.2.4731_embed_win32.zip to a temporary directory. In the directory where KInterbasDB resides (the-directory-of-python.exe\Lib\site-packages\kinterbasdb, in this example), create a subdirectory named embedded. Copy the files fbembed.dll, firebird.msg, and ib_util.dll to the-directory-of-python.exe\Lib\site-packages\kinterbasdb\embedded. If you intend to use character sets other than ASCII, also copy the intl subdirectory, which contains fbintl.dll. Rename fbembed.dll to fbclient.dll.
You should now have the following file structure:
the-directory-of-python.exe\
  [python.exe, and other Python-related files]
  Lib\
    site-packages\
      kinterbasdb\
        embedded\
          fbclient.dll
          firebird.msg
          ib_util.dll
          intl\          <-- only necessary if using character sets other than ASCII
            fbintl.dll   <--^
Run your KInterbasDB-based Python application. The database engine is "embedded" within the same process as your application; no external server process is necessary, and no code changes are required.
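For example, a program that formerly connected to a standalone server needs only a local-protocol DSN (the path and credentials below are placeholders):

import kinterbasdb

# With Embedded Firebird, the DSN is a plain local file path -- no
# 'host:' prefix, because there is no external server process.
con = kinterbasdb.connect(dsn=r'C:\temp\test.db',
    user='sysdba', password='masterkey'
  )
cur = con.cursor()

# Query a system table that exists in every Firebird database.
cur.execute("select rdb$relation_name from rdb$relations")
print cur.fetchone()[0]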
As of Firebird 1.5, the embedded engine supports all aspects of the Services API that it could be expected to, given that the embedded engine runs in the same process as the client and has no central "server" supervising all connections. For example, backup and restore functionality work, but user management (a server-level feature disabled in the embedded engine) does not.
There exist at least three Zope adapters based on KInterbasDB; see the links page.
DECIMAL/NUMERIC fields in Firebird. (This module has been largely superseded by the standard library decimal module, as of Python 2.4.)
Send feedback about this documentation or the KInterbasDB code to the author of the current versions of both, David S. Rushby.