Software Applications in Plant Collections
The Plant Collections project is intended to provide a mechanism for participants to share their garden accession information with other partners and the public. By combining information from multiple sources, the project aims to improve general knowledge about garden collections and their contents.
Source Databases
Botanic gardens and arboreta use a variety of software applications to maintain a list of their accessions along with various attributes describing each accession, or more generally, the group of organisms known by the same name (e.g. species, variety, genus, family). Different gardens may also choose to utilise different models for representing this information in their database. For example, the field containing the family name for an accession may be labelled "family" or "family_name". More importantly, the representation of content may also differ for the same attribute. For example, a date may be stored as three fields (year, month, day), as a "date" type in the database, or as a string.
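The differences described above can be illustrated with a minimal Python sketch. It is not taken from the project's code: the alias table, the common field name "accession_date", the source key "date", and the ISO string format "%Y-%m-%d" are all assumptions made for illustration.

```python
from datetime import date, datetime

# Hypothetical mapping from source field labels to one common field name.
FIELD_ALIASES = {"family": "family", "family_name": "family"}


def normalize_date(record):
    """Return a datetime.date from one of three assumed source representations."""
    # Case 1: the date is stored as three separate fields.
    if all(key in record for key in ("year", "month", "day")):
        return date(int(record["year"]), int(record["month"]), int(record["day"]))
    value = record.get("date")
    # Case 2: the database driver returned a native date or datetime value.
    # datetime is checked first because it is a subclass of date.
    if isinstance(value, datetime):
        return value.date()
    if isinstance(value, date):
        return value
    # Case 3: the date is a string (ISO format assumed here).
    if isinstance(value, str):
        return datetime.strptime(value, "%Y-%m-%d").date()
    return None


def normalize_record(record):
    """Rename known fields and attach a single normalized date."""
    out = {}
    for key, value in record.items():
        if key in FIELD_ALIASES:
            out[FIELD_ALIASES[key]] = value
    out["accession_date"] = normalize_date(record)
    return out
```

With this sketch, a record labelled "family_name" with three date fields and a record labelled "family" with a string date both produce the same normalized structure.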
Main Data Store
These differences in how institutions store data impede the sharing of information, so a mechanism is required to translate the contents to a common model. The common model for the Plant Collections project is described at http://plants.ecoforge.net/wiki/schema. For the initial implementation of the Plant Collections project, it was decided to utilize Google Base as a common data store. By adopting Google Base, the cost of initial implementation was dramatically reduced, and the open standards used for communicating with Google Base ensure that applications developed by the Plant Collections project can be easily ported to an alternative database system if necessary. All communication with Google Base in this project is performed using the Atom and JSON protocols.
A replacement for the Google Base data store is under development at the Biodiversity Research Center and is currently being tested as a prototype. It is expected that the new service will be a direct replacement for Google Base that can be adopted by the Plant Collections project with little change to existing software tools. It will nevertheless bring significant benefits to the project: complete independence from a commercial entity and the opportunity to fine-tune the data store as necessary for project requirements.
Synchronizer
The role of the synchronizer is to translate content from a source database (i.e., an institution's collection database) and upload that content to the main data store (currently Google Base). The synchronizer is a cross-platform Python application and should connect with any collection database. Whilst the synchronizer is cumbersome to configure, once that task is completed, the actual process of uploading new and changed content to the central store is essentially automatic and hands-off.
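One way to identify new and changed content between runs is to compare a fingerprint of each record against the fingerprint saved from the previous run. The sketch below illustrates that general idea only; it is not the project's synchronizer, and the function names and the record structure (a dictionary keyed by accession identifier) are assumptions.

```python
import hashlib
import json


def fingerprint(record):
    """Return a stable hash of a record's content."""
    # sort_keys makes the hash independent of field order;
    # default=str lets dates and similar values be serialized.
    payload = json.dumps(record, sort_keys=True, default=str)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def plan_upload(records, previous):
    """Decide which records need uploading.

    records:  dict mapping accession id -> record (already translated)
    previous: dict mapping accession id -> fingerprint from the last run
    Returns (new_ids, changed_ids, current_fingerprints).
    """
    current = {key: fingerprint(rec) for key, rec in records.items()}
    new_ids = sorted(key for key in current if key not in previous)
    changed_ids = sorted(
        key for key in current
        if key in previous and current[key] != previous[key]
    )
    return new_ids, changed_ids, current
```

The returned fingerprints would be saved after a successful upload and passed back in as "previous" on the next run, so that unchanged records are skipped.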
Portal
The portal application provides a user interface to the main data store used by the project. It is implemented in Python and is cross-platform, so it should run on any system that has an internet connection and supports Python. The look and feel of the portal is easily customized, and it is intended to be installed by any Plant Collections partner that would like to provide access to the project data from their institution.
The portal makes use of a number of open source libraries, including TurboGears, SQLite, and the gdata library.
The current implementation of the portal is a prototype, and is expected to change significantly as the requirements of the project and its participants become more tightly defined.
