README for DB-All.e Python bindings

The DB-All.e Python bindings provide two levels of access to a DB-All.e database: a complete API similar to the Fortran and C++ APIs, and a high-level API called volnd that automatically exports matrices of data out of the database.

The DB-All.e API

The 'dballe' module has a few module-level functions:

describe_level(ltype1=None, l1=None, ltype2=None, l2=None)
Return a string description for a level
describe_trange(pind=None, p1=None, p2=None)
Return a string description for a time range
var(varcode[, default])
Query the DB-All.e variable table returning a Var, optionally initialized with a value
varinfo(varcode)
Query the DB-All.e variable table returning a Varinfo

and several classes, documented in their own sections.

dballe.Var

A Var holds a measured value and all available information related to it.

To create a Var, use the function dballe.var.

Its members are:

code
variable code
info
Varinfo for this variable
isset
true if the value is set
enq()
get the value of the variable, as int, float or str according to the variable definition
enqc()
get the value of the variable, as a str
enqd()
get the value of the variable, as a float
enqi()
get the value of the variable, as an int
format(default='')
format the value of the variable to a string
get(default=None)
get the value of the variable, with a default if it is unset

Examples:

import dballe

v = dballe.var("B12101", 32.5)
# v.info returns detailed information about the variable in a Varinfo object.
print "%s: %s %s %s" % (v.code, str(v), v.info.unit, v.info.desc)

dballe.Varinfo

A Varinfo holds all possible information about a variable, such as its measurement unit, description and number of significant digits.

Its members are:

bit_len
number of bits used to encode the value in BUFR
bit_ref
reference value added after scaling, for BUFR decoding
desc
description
is_string
true if the value is a string
len
number of significant digits
ref
reference value added after scaling
scale
scale of the value as a power of 10
unit
measurement unit
var
variable code
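
As a rough illustration of how scale and ref relate a physical value to its encoded integer, the sketch below assumes the usual BUFR convention (encoded = value * 10^scale - ref). The names encode_value and decode_value are hypothetical helpers, not part of dballe, and the scale and ref numbers are example values rather than entries read from the variable table.

```python
def encode_value(value, scale, ref):
    """Turn a physical value into the integer stored in the message."""
    return int(round(value * 10 ** scale)) - ref


def decode_value(raw, scale, ref):
    """Add the reference value back, then undo the power-of-10 scaling."""
    return (raw + ref) / 10.0 ** scale


# Example values only: 5 decimal digits kept, reference value of -9000000.
raw = encode_value(44.05, scale=5, ref=-9000000)
print(raw)
print(decode_value(raw, scale=5, ref=-9000000))
```

A negative reference value lets negative quantities, such as southern latitudes, be stored as non-negative integers.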

dballe.Record

A Record holds one or more Var variables, together with a range of metadata key=value pairs. The available metadata pairs are documented in the Fortran API documentation.

A Record is used to make queries to the database, and read results.

Its members are:

key
return a var key from the record
clear()
remove all data from the record
clear_vars()
remove all variables from the record, leaving the keywords intact
copy()
return a copy of the Record
date_extremes()
get two datetime objects with the lower and upper bounds of the datetime period in this record
get(key, default=None)
look up a value, returning a fallback value (None by default) if unset
keys()
return a sequence with all the varcodes of the variables set on the Record. Note that this does not include keys.
set_from_string(str)
set values from a 'key=val' string
set_station_context()
set the date, level and time range values to match the station data context
update(**kwargs)
set many record keys/vars in a single shot, via kwargs
var(code=None)
return a variable from the record. If no varcode is given, use record['var']
vars()
return a sequence with all the variables set on the Record. Note that this does not include keys.

When creating a new record, keyword arguments can be passed and they are set as if Record.update(**kwargs) had been called.

There are 6 extra keys available in the Python API, which can be used as shortcuts to get and set many values in one shot:

date
a datetime.datetime()
datemin
a datetime.datetime()
datemax
a datetime.datetime()
level
a tuple of integers
trange
a tuple of integers
timerange
a tuple of integers

Examples:

rec = dballe.Record(lat=44.05, lon=11.03, B12101=22.1)

# Metadata and variables can be accessed via normal lookup
print rec["lat"], rec["B12101"]

# Iterating a record iterates on variable codes, but not metadata
for code in rec:
    print code, rec.get(code, "undefined"), rec.var(code).info.desc

dballe.DB

A DB is used to access the database.

Its members are:

query_summary
Query the summary of the results of a query; returns a Cursor
attr_insert(varcode, attrs, reference_id=None, replace=True)
Insert new attributes into the database
attr_remove(varcode, reference_id, attrs=None)
Remove attributes
connect(dsn, user='', password='')
Create a DB connecting to an ODBC source
connect_from_file(filename)
Create a DB connecting to a SQLite file
connect_from_url(url)
Create a DB as defined in a URL-like string
connect_test()
Create a DB for running the test suite, as configured in the test environment
disappear()
Remove all our traces from the database, if applicable.
export_to_file(query, format, filename, generic=False)
Export data matching a query as bulletins to a named file
insert(record, can_replace=False, can_add_stations=False)
Insert a record in the database
is_url(string)
Checks if a string looks like a DB url
query_attrs(varcode, reference_id, attrs=None)
Query attributes
query_data(query)
Query the variables in the database; returns a Cursor
query_stations(query)
Query the station archive in the database; returns a Cursor
remove(query)
Remove records from the database
reset([repinfo_filename])
Reset the database, removing all existing DB-All.e tables and re-creating them empty.
vacuum()
Perform database cleanup operations

Examples:

import datetime

import dballe

# Connect to a database and run a query
db = dballe.DB.connect_from_file("db.sqlite")
query = dballe.Record(latmin=44.0, latmax=45.0, lonmin=11.0, lonmax=12.0)

# The result is a dballe.Cursor, which can be iterated to get results as
# dballe.Record objects.
# The results always point to the same Record to avoid creating a new one
# for every iteration: if you need to store them, use Record.copy()
for rec in db.query_data(query):
    print rec["lat"], rec["lon"], rec["var"], rec.var().format("undefined")

# Insert 2 new variables in the database
rec = dballe.Record(
    lat=44.5, lon=11.4,
    level=(1,),
    trange=(254,),
    date=datetime.datetime(2013, 4, 25, 12, 0, 0),
    B11101=22.4,
    B12103=17.2,
)
db.insert(rec)
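
The comment in the example above about results always pointing to the same Record can be illustrated without a database. In this sketch, reusing_cursor is a hypothetical stand-in for a dballe.Cursor that reuses one mutable object, and plain dicts stand in for Record objects; it only shows why a copy is needed before storing results.

```python
def reusing_cursor(rows):
    """Yield every row through one shared dict, like a cursor reusing its record."""
    rec = {}
    for row in rows:
        rec.clear()
        rec.update(row)
        yield rec


rows = [{"lat": 44.0, "B12101": 280.1}, {"lat": 44.5, "B12101": 281.3}]

# Storing the yielded object directly keeps two references to the same dict,
# so both entries end up showing the last row.
stored = [rec for rec in reusing_cursor(rows)]

# Copying at each step keeps an independent snapshot of every row.
copied = [rec.copy() for rec in reusing_cursor(rows)]

print(stored[0] is stored[1])
print([rec["lat"] for rec in copied])
```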

dballe.Cursor

A Cursor is the result of a database query. It is generally not used explicitly and just iterated, but it does have a few members:

remaining
number of results still to be returned
next()
return the next result, or raise StopIteration
query_attrs(attrs=None)
Query attributes for the current variable

The volnd API

volnd is an easy way of extracting entire matrices of data out of a DB-All.e database.

This module extracts multidimensional matrices of data given a list of dimension definitions. Each dimension definition specifies what kind of data goes along that dimension.

Dimension definitions can be shared across different extracted matrices and multiple extractions, so that different matrices can have indexes with the same meaning.
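
To make the idea of a shared dimension concrete, here is a minimal sketch in plain Python. FirstSeenIndex is a hypothetical class, not the dballe.volnd implementation: it assigns positions in first-seen order, so two matrices built on the same instance agree on what each index position means.

```python
class FirstSeenIndex(object):
    """Map values to positions in the order they are first seen."""

    def __init__(self):
        self.values = []      # position -> value
        self.positions = {}   # value -> position

    def index_for(self, value):
        """Return the position of value, adding it at the end if it is new."""
        if value not in self.positions:
            self.positions[value] = len(self.values)
            self.values.append(value)
        return self.positions[value]


stations = FirstSeenIndex()

# Two variables indexed through the same object: station 12 gets position 0
# for both, whichever variable mentions it first.
temperature = {stations.index_for(s): t for s, t in [(12, 280.1), (7, 279.4)]}
pressure = {stations.index_for(s): p for s, p in [(7, 101200.0), (12, 101150.0)]}

print(stations.values)
print("%s %s" % (temperature[0], pressure[0]))
```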

This example code extracts temperatures in a station by datetime matrix:

import dballe
from dballe.volnd import read, AnaIndex, DateTimeIndex

# db is assumed to be an already connected dballe.DB
query = dballe.Record()
query["var"] = "B12001"
query["rep_memo"] = "synop"
query["level"] = (105, 2)
query["trange"] = (0,)
vars = read(db.query_data(query), (AnaIndex(), DateTimeIndex()))
data = vars["B12001"]
# data.vals is now a 2-dimensional masked array with the data
#
# Information about what values correspond to an index in the various
# directions can be accessed in data.dims, which contains one list per
# dimension with all the information corresponding to every index.
print "Ana dimension is", len(data.dims[0]), "items long"
print "Datetime dimension is", len(data.dims[1]), "items long"
print "First 10 stations along the Ana dimension:", data.dims[0][:10]
print "First 10 datetimes along the DateTime dimension:", data.dims[1][:10]

This is the list of dimensions supported by dballe.volnd:

AnaIndex

Index for stations, as they come out of the database.

The constructor syntax is: AnaIndex(shared=True, frozen=False, start=None).

The index saves all stations as AnaIndexEntry tuples, in the same order as they come out of the database.

NetworkIndex

Index for networks, as they come out of the database.

The constructor syntax is: NetworkIndex(shared=True, frozen=False, start=None).

The index saves all networks as NetworkIndexEntry tuples, in the same order as they come out of the database.

LevelIndex

Index for levels, as they come out of the database.

The constructor syntax is: LevelIndex(shared=True, frozen=False, start=None).

The index saves all levels as dballe.Level tuples, in the same order as they come out of the database.

TimeRangeIndex

Index for time ranges, as they come out of the database.

The constructor syntax is: TimeRangeIndex(shared=True, frozen=False, start=None).

The index saves all time ranges as dballe.TimeRange tuples, in the same order as they come out of the database.

DateTimeIndex

Index for datetimes, as they come out of the database.

The constructor syntax is: DateTimeIndex(shared=True, frozen=False, start=None).

The index saves all datetime values as datetime.datetime objects, in the same order as they come out of the database.

IntervalIndex

Index by fixed time intervals: index points are at fixed time intervals, and data is acquired in one point only if it is within a given tolerance from the interval.

The constructor syntax is: IntervalIndex(start, step, tolerance=0, end=None, shared=True, frozen=False).

start is a datetime.datetime object giving the starting time of the time interval of this index.

step is a datetime.timedelta object with the interval between sampling points.

tolerance is a datetime.timedelta object specifying the maximum allowed distance between the datetime of a datum and the nearest sampling point. If the distance is bigger than the tolerance, the datum is discarded.

end is an optional datetime.datetime object giving the ending time of the time interval of the index. If omitted, the index will end at the latest accepted datum coming out of the database.
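
The tolerance rule can be sketched in a few lines of standard-library Python. nearest_slot is a hypothetical helper written for this README, not the IntervalIndex implementation, and it assumes that a datum is assigned to the closest sampling point.

```python
import datetime


def nearest_slot(dt, start, step, tolerance=datetime.timedelta(0), end=None):
    """Return the position of the sampling point closest to dt, or None if
    dt falls outside the index or further than tolerance from that point."""
    # Round the distance from start to the nearest whole number of steps.
    pos = int(round((dt - start).total_seconds() / step.total_seconds()))
    if pos < 0:
        return None
    sample = start + pos * step
    if end is not None and sample > end:
        return None
    if abs(dt - sample) > tolerance:
        return None
    return pos


start = datetime.datetime(2013, 4, 25, 0, 0)
step = datetime.timedelta(hours=1)
tolerance = datetime.timedelta(minutes=5)

# 03:04 is 4 minutes from the 03:00 sampling point, so it is accepted.
print(nearest_slot(datetime.datetime(2013, 4, 25, 3, 4), start, step, tolerance))
# 03:20 is 20 minutes from the closest sampling point, so it is discarded.
print(nearest_slot(datetime.datetime(2013, 4, 25, 3, 20), start, step, tolerance))
```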

The data objects used by AnaIndex and NetworkIndex are:

AnaIndexEntry

AnaIndex entry, with various data about a single station.

It is a tuple of 4 values:
  • station id
  • latitude
  • longitude
  • mobile station identifier, or None

NetworkIndexEntry

NetworkIndex entry, with various data about a single network.

It is a tuple of 2 values:
  • network code
  • network name

The extraction is done using the dballe.volnd.read function:

read(cursor, dims, filter=None, checkConflicts=True, attributes=None)

cursor is a dballe.Cursor resulting from a dballe query

dims is the sequence of indexes to use for shaping the data matrices

filter is an optional filter function that can be used to discard values from the query: if filter is not None, it will be called for every output record and if it returns False, the record will be discarded

checkConflicts controls whether an exception is raised when two values from the database would fill the same position in the matrix

attributes tells if we should read attributes as well: if it is None, no attributes will be read; if it is True, all attributes will be read; if it is a sequence, then it is the sequence of attributes that should be read.
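
How filter and checkConflicts interact can be sketched without a database. fill_matrix below is a hypothetical, simplified stand-in for the filling step of read, working on plain (row, column, value) tuples instead of dballe records; the real function may differ in what it passes to filter and in the exception it raises.

```python
def fill_matrix(items, filter=None, check_conflicts=True):
    """Place (row, col, value) items into a dict keyed by (row, col)."""
    cells = {}
    for item in items:
        # A filter returning False discards the item before it is stored.
        if filter is not None and not filter(item):
            continue
        row, col, value = item
        if check_conflicts and (row, col) in cells:
            raise ValueError("two values for position %r" % ((row, col),))
        cells[(row, col)] = value
    return cells


items = [(0, 0, 280.1), (0, 1, 281.0), (0, 1, -999.0)]

# Without a filter, the last two items compete for the same position.
try:
    fill_matrix(items)
except ValueError as exc:
    print("conflict: %s" % exc)

# Discarding the placeholder value removes the conflict.
print(fill_matrix(items, filter=lambda item: item[2] > -900))
```

In this sketch, turning conflict checking off means the value seen last silently wins.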

The result of dballe.volnd.read is a dict mapping output variable names to dballe.volnd.Data objects with the results. All the Data objects share their indexes unless the index definitions (AnaIndex, DateTimeIndex and so on) have been created with shared=False.

This is the dballe.volnd.Data class documentation:

Data

Container for collecting variable data. It contains the variable data array and the dimension indexes.

If v is a Data object, you can access the tuple with the dimensions as v.dims, and the masked array with the values as v.vals.

The methods of dballe.volnd.Data are:

append

Collect a new value from the given dballe record.

You need to call finalise() before the values can be used.

appendAttrs

Collect attributes to append to the record.

You need to call finalise() before the values can be used.

finalise

Stop collecting values and create a masked array with all the values collected so far.
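
The collect-then-finalise pattern can be sketched as follows. SketchData is a hypothetical class, not dballe.volnd.Data: it takes explicit positions instead of dballe records, and it uses None where the real class would produce masked entries.

```python
class SketchData(object):
    """Collect (row, col, value) triples, then build a dense 2-D table."""

    def __init__(self):
        self._pending = []
        self.vals = None

    def append(self, row, col, value):
        """Remember one value; the table is not usable until finalise()."""
        self._pending.append((row, col, value))

    def finalise(self):
        """Size the table from the collected positions and fill it in.
        Positions that never received a value stay None."""
        if not self._pending:
            self.vals = []
            return self.vals
        nrows = 1 + max(row for row, _, _ in self._pending)
        ncols = 1 + max(col for _, col, _ in self._pending)
        self.vals = [[None] * ncols for _ in range(nrows)]
        for row, col, value in self._pending:
            self.vals[row][col] = value
        self._pending = []
        return self.vals


data = SketchData()
data.append(0, 0, 280.1)
data.append(1, 2, 279.4)
print(data.finalise())
```

In this sketch the table can only be sized once every position has been seen, which is why a separate finalising step is needed.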