Provides an API for storing, retrieving and deleting streams of bits in
a transactionally safe fashion. The main class is
BitstreamStorageManager.
Using the Bitstore API
An example use of the Bitstore API is shown below:
    // Create or obtain a context object
    Context context;
    // Stream to store
    InputStream stream;

    try
    {
        // Store the stream
        int id = BitstreamStorageManager.store(context, stream);
        // Retrieve it
        InputStream retrieved = BitstreamStorageManager.retrieve(context, id);
        // Delete it
        BitstreamStorageManager.delete(context, id);
        // Complete the context object so changes are written
        context.complete();
    }
    // Error with I/O operations
    catch (IOException ioe)
    {
    }
    // Database error
    catch (SQLException sqle)
    {
    }
Storage mechanism
The BitstreamStorageManager stores files in one or more asset store
directories. These can be configured in dspace.cfg. For
example:
assetstore.dir = /dspace/assetstore
The above example specifies a single asset store.
assetstore.dir = /dspace/assetstore_0
assetstore.dir.1 = /mnt/other_filesystem/assetstore_1
The above example specifies two asset stores. assetstore.dir
specifies asset store number 0 (zero); after that use
assetstore.dir.1, assetstore.dir.2 and so on. The number of the
asset store that holds each bitstream is recorded in the database, so
don't move bitstreams between asset stores, and don't renumber the stores.
By default, newly created bitstreams are put in asset store 0 (i.e. the one specified by the
assetstore.dir property). To change this, for example when asset
store 0 is getting full, add a line to dspace.cfg like:
assetstore.incoming = 1
Then restart DSpace (Tomcat). New bitstreams will be written to the asset
store specified by assetstore.dir.1, which is
/mnt/other_filesystem/assetstore_1 in the above example.
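The numbering rule above can be illustrated with a minimal sketch. It assumes the configuration has been loaded into a java.util.Properties object; the class and method names are hypothetical and are not part of the Bitstore API.

```java
import java.util.Properties;

class AssetStoreConfigSketch {
    // Asset store 0 uses the bare key; store N (N >= 1) uses "assetstore.dir.N".
    static String propertyKey(int storeNumber) {
        return storeNumber == 0 ? "assetstore.dir" : "assetstore.dir." + storeNumber;
    }

    // Directory that receives new bitstreams. When assetstore.incoming
    // is absent, asset store 0 is used.
    static String incomingDirectory(Properties config) {
        String incoming = config.getProperty("assetstore.incoming", "0").trim();
        return config.getProperty(propertyKey(Integer.parseInt(incoming)));
    }

    public static void main(String[] args) {
        Properties config = new Properties();
        config.setProperty("assetstore.dir", "/dspace/assetstore_0");
        config.setProperty("assetstore.dir.1", "/mnt/other_filesystem/assetstore_1");
        System.out.println(incomingDirectory(config));

        // Redirect new bitstreams to asset store 1
        config.setProperty("assetstore.incoming", "1");
        System.out.println(incomingDirectory(config));
    }
}
```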
Moving an Asset Store
You can move an asset store as a whole to a new location in the file system: stop DSpace
(Tomcat), move all of the contents to the new location, change the appropriate
line in dspace.cfg, and restart DSpace (Tomcat).
We will be providing administration tools for more sophisticated management
of these asset stores in the future.
When given a stream of bits to store, the BitstreamStorageManager
generates a unique key for the stream. The key takes the form of
a long sequence of digits, which is transformed into a file path.
The BitstreamStorageManager stores the contents of the stream in
this path, creating parent directories as necessary.
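One way such a key could be turned into a path is sketched below. The layout is an assumption made for illustration (three directory levels of two digits each, followed by the full key); the text above does not specify the actual transformation, and the class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class KeyToPathSketch {
    // Assumed layout: <store>/<d1d2>/<d3d4>/<d5d6>/<full key>
    static Path pathFor(String assetStoreDir, String key) {
        if (key.length() < 6 || !key.chars().allMatch(Character::isDigit)) {
            throw new IllegalArgumentException("key must be at least 6 digits: " + key);
        }
        return Paths.get(assetStoreDir,
                key.substring(0, 2), key.substring(2, 4), key.substring(4, 6), key);
    }

    // Write the stream to the derived path, creating parent directories as necessary.
    static Path write(String assetStoreDir, String key, InputStream in) throws IOException {
        Path target = pathFor(assetStoreDir, key);
        Files.createDirectories(target.getParent());
        Files.copy(in, target);
        return target;
    }

    public static void main(String[] args) {
        System.out.println(pathFor("/dspace/assetstore", "1234567890"));
    }
}
```

Spreading files over nested directories keeps any single directory from growing very large, which is a common reason for this kind of layout.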
The Bitstore and Transactions
The bitstore is carefully engineered to prevent data loss, using
transactional flags in the database. Before a bitstream is
actually stored, a metadata entry with the unique bitstream id and the
deleted flag set is committed to the database. If the storage operation fails or is
aborted, the deleted flag remains. The bitstore API then ensures
that the bitstream cannot be retrieved, and after an hour, the
bitstream is eligible for cleanup. The bitstream is accessible only
after all database operations have been successfully committed.
Similarly, bitstreams are deleted by simply setting the deleted
flag. If a deletion operation is rolled back, the bitstream is still
present in the asset store.
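The flag protocol described above can be modelled with a small in-memory sketch. The map stands in for the database table, and all names are hypothetical; this is not the DSpace schema or implementation, only an illustration of why a failed store never exposes a bitstream.

```java
import java.util.HashMap;
import java.util.Map;

class DeletedFlagSketch {
    // Stand-in for the database: bitstream id -> deleted flag
    private final Map<Integer, Boolean> deletedFlag = new HashMap<>();
    private int nextId = 1;

    // Step 1: record the bitstream as deleted before any bits are written.
    int register() {
        int id = nextId++;
        deletedFlag.put(id, true);
        return id;
    }

    // Step 2: clear the flag only once storage has succeeded.
    // If this step is never reached, the entry stays marked deleted.
    void markStored(int id) {
        deletedFlag.put(id, false);
    }

    // Deletion only sets the flag; the file itself is left for cleanup.
    void delete(int id) {
        deletedFlag.put(id, true);
    }

    // Only known bitstreams whose flag is cleared may be retrieved.
    boolean isRetrievable(int id) {
        return Boolean.FALSE.equals(deletedFlag.get(id));
    }

    public static void main(String[] args) {
        DeletedFlagSketch db = new DeletedFlagSketch();
        int id = db.register();
        System.out.println(db.isRetrievable(id)); // storage not yet confirmed
        db.markStored(id);
        System.out.println(db.isRetrievable(id));
    }
}
```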
Cleaning up the Asset Store
As noted above, sometimes files will be physically present in the
Asset Store even though they are marked deleted in the database.
You can use the command-line utility class
org.dspace.storage.bitstore.Cleanup (which is invoked via
/dspace/bin/cleanup) to remove the bitstreams which are marked
deleted from the Asset Store.
To prevent accidental deletion of bitstreams which are in the process
of being stored, cleanup only removes bitstreams which are more than
an hour old.
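The eligibility rule can be expressed as a short predicate. This is a sketch only: how the age of a bitstream is actually measured is not stated above, so the creation timestamp parameter is an assumption, and the names are hypothetical.

```java
import java.time.Duration;
import java.time.Instant;

class CleanupEligibilitySketch {
    // Bitstreams younger than this may still be in the process of being stored.
    static final Duration MIN_AGE = Duration.ofHours(1);

    // Eligible only if marked deleted AND more than an hour old.
    static boolean eligibleForCleanup(boolean deleted, Instant created, Instant now) {
        return deleted && Duration.between(created, now).compareTo(MIN_AGE) > 0;
    }

    public static void main(String[] args) {
        Instant now = Instant.now();
        System.out.println(eligibleForCleanup(true, now.minus(Duration.ofMinutes(10)), now));
        System.out.println(eligibleForCleanup(true, now.minus(Duration.ofHours(2)), now));
    }
}
```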