| |
Package Name | Comment | org.dspace.administer |
Provides classes and methods for administrative functions that fall outside
the regular use of other subsystems.
| org.dspace.app.checker |
org.dspace.app.checker Package Documentation
org.dspace.app.checker provides user interfaces to the org.dspace.checker
package. Command line options are detailed in the ChecksumChecker javadoc.
| org.dspace.app.dav | | org.dspace.app.dav.client | | org.dspace.app.didl | | org.dspace.app.itemexport | | org.dspace.app.itemimport | | org.dspace.app.mediafilter | | org.dspace.app.mets | | org.dspace.app.oai | | org.dspace.app.packager | | org.dspace.app.sitemap | | org.dspace.app.statistics | | org.dspace.app.util | | org.dspace.app.webui.components | | org.dspace.app.webui.filter | | org.dspace.app.webui.jsptag | | org.dspace.app.webui.servlet | | org.dspace.app.webui.servlet.admin | | org.dspace.app.webui.submit | | org.dspace.app.webui.submit.step | | org.dspace.app.webui.util | | org.dspace.app.xmlui.aspect.administrative | | org.dspace.app.xmlui.aspect.administrative.authorization | | org.dspace.app.xmlui.aspect.administrative.collection | | org.dspace.app.xmlui.aspect.administrative.community | | org.dspace.app.xmlui.aspect.administrative.eperson | | org.dspace.app.xmlui.aspect.administrative.group | | org.dspace.app.xmlui.aspect.administrative.item | | org.dspace.app.xmlui.aspect.administrative.mapper | | org.dspace.app.xmlui.aspect.administrative.registries | | org.dspace.app.xmlui.aspect.artifactbrowser | | org.dspace.app.xmlui.aspect.eperson | | org.dspace.app.xmlui.aspect.general | | org.dspace.app.xmlui.aspect.submission | | org.dspace.app.xmlui.aspect.submission.submit | | org.dspace.app.xmlui.aspect.submission.workflow | | org.dspace.app.xmlui.aspect.xmltest | | org.dspace.app.xmlui.cocoon | | org.dspace.app.xmlui.configuration | | org.dspace.app.xmlui.objectmanager | | org.dspace.app.xmlui.utils | | org.dspace.app.xmlui.wing | | org.dspace.app.xmlui.wing.element | | org.dspace.authenticate |
End-user authentication manager, interface and implementations.
| org.dspace.authorize |
Handles permissions for DSpace content.
Philosophy
DSpace's authorization system follows the classical "police state"
philosophy of security - the user can do nothing, unless it is
specifically allowed. Those permissions are spelled out with
ResourcePolicy objects, stored in the resourcepolicy table in the
database.
Policies are attached to Content
Resource Policies get assigned to all of the content objects in
DSpace - collections, communities, items, bundles, and bitstreams.
(Currently they are not attached to non-content objects such as EPerson
or Group. But they could be, hence the name ResourcePolicy instead of
ContentPolicy.)
Policies are tuples
Authorization is based on evaluating the tuple of (object, action, who),
such as (ITEM, READ, EPerson John Smith) to check if the EPerson "John Smith"
can read an item. ResourcePolicy objects are simple: each one describes a
single instance of (object, action, who). If several parties need the same
permission, for example Groups 10, 11, and 12 should all be allowed to READ
Item 13, you create one ResourcePolicy for each group.
Special Groups
The install process should create two special groups - group 0, for
anonymous/public access, and group 1 for administrators.
Group 0 (public/anonymous) allows anyone access, even if they are not
authenticated. Group 1's (admin) members have super-user rights, and
are allowed to do any action to any object.
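The default-deny evaluation of (object, action, who) tuples and the two special groups described above can be sketched as follows. This is a simplified, self-contained model and not the DSpace implementation: the class PolicySketch, its methods, and the use of plain strings for objects and actions are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Hypothetical, simplified model of default-deny policy evaluation. */
final class PolicySketch {
    static final int ANONYMOUS_GROUP = 0; // group 0: everyone, authenticated or not
    static final int ADMIN_GROUP = 1;     // group 1: administrators

    /** One (object, action, who) tuple; "who" is a group ID in this sketch. */
    static final class Policy {
        final String objectId;
        final String action;
        final int groupId;

        Policy(String objectId, String action, int groupId) {
            this.objectId = objectId;
            this.action = action;
            this.groupId = groupId;
        }
    }

    private final List<Policy> policies = new ArrayList<>();

    void addPolicy(String objectId, String action, int groupId) {
        policies.add(new Policy(objectId, action, groupId));
    }

    /** Returns true only if some policy explicitly allows the action. */
    boolean isAllowed(String objectId, String action, Set<Integer> userGroups) {
        if (userGroups.contains(ADMIN_GROUP)) {
            return true; // administrators may perform any action on any object
        }
        for (Policy p : policies) {
            boolean sameTarget = p.objectId.equals(objectId) && p.action.equals(action);
            boolean whoMatches = p.groupId == ANONYMOUS_GROUP || userGroups.contains(p.groupId);
            if (sameTarget && whoMatches) {
                return true;
            }
        }
        return false; // nothing matched, so the request is denied
    }
}
```

Granting READ on one item to Groups 10 and 11 means calling addPolicy twice, once per group, which mirrors the one-policy-per-group rule above.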
Unused ResourcePolicy attributes
ResourcePolicies have a few attributes that are currently unused,
but are included with the intent that they will be used someday.
One is start and end dates, for when policies will be active, so that
permissions for content can change over time. The other is the EPerson -
policies could apply to only a single EPerson, but for ease of
administration currently a Group is the recommended unit to use to
describe 'who'.
| org.dspace.browse |
Provides classes and methods for browsing Items in DSpace by whatever
is specified in the configuration. The standard method by which you
would perform a browse is as follows:
- Create a BrowserScope object. This object holds all of the parameters
of your browse request
- Pass the BrowserScope object into the BrowseEngine object.
This object should be invoked through either the browse() method or the browseMini() method
- The BrowseEngine will pass back a BrowseInfo object which contains all the relevant details of your request
Browses only return archived Items; other Items (e.g., those
in the workflow system) are ignored.
Using the Browse API
An example use of the Browse API is shown below:
// Create or obtain a context object
Context context = new Context();
// Create a BrowserScope object within the context
BrowserScope scope = new BrowserScope(context);
// The browse is limited to the test collection
Collection test = Collection.find(context, someID);
scope.setBrowseContainer(test);
// Set the focus
scope.setFocus("Test Title");
// A maximum of 30 items will be returned
scope.setResultsPerPage(30);
// set ordering to DESC
scope.setOrder("DESC");
// now execute the browse
BrowseEngine be = new BrowseEngine();
BrowseInfo results = be.browse(scope);
In this case, the results might be Items with titles like:
Tehran, City of the Ages
Ten Little Indians
Tenchi Universe
Tension
Tennessee Williams
Test Title (the focus)
Thematic Alignment
Thesis and Antithesis
...
Browse Indexes
The Browse API uses database tables to index Items based on the supplied
configuration. When an Item is added to DSpace, modified or removed via the
Content Management API, the
indexes are automatically updated.
To rebuild the database tables for the browse (on configuration change), or
to re-index just the contents of the existing tables, use the following
commands from IndexBrowse:
A complete rebuild of the database and the indices:
[dspace]/dsrun org.dspace.browse.IndexBrowse -f -r
A complete re-index of the archive contents:
[dspace]/dsrun org.dspace.browse.IndexBrowse -i
| org.dspace.checker |
org.dspace.checker Package Documentation
Provides content fixity checking (using checksums)
for bitstreams stored in DSpace software.
The main access point to org.dspace.checker is on the command line
via {@link org.dspace.app.checker.ChecksumChecker#main(String[])},
but it is also simple to get programmatic access to ChecksumChecker
if you wish, via a {@link org.dspace.checker.CheckerCommand} object.
CheckerCommand is a simple Command object. You initialize it with
a strategy for iterating through bitstreams to check (an implementation of
{@link org.dspace.checker.BitstreamDispatcher}) and an object to collect
the results (an implementation of {@link org.dspace.checker.ChecksumResultsCollector}),
and then call {@link org.dspace.checker.CheckerCommand#process()}
to begin the processing. CheckerCommand handles the calculation of bitstream
checksums and iteration between bitstreams.
BitstreamDispatcher
The order in which bitstreams are checked, and when a checking run terminates,
is controlled by implementations of BitstreamDispatcher. You can extend the
functionality of the package by writing your own implementation of this simple
interface, although the package includes several useful implementations that will
probably suffice in most cases:
Dispatchers that generate bitstream ordering:
- {@link org.dspace.checker.ListDispatcher}
- {@link org.dspace.checker.SimpleDispatcher}
Dispatchers that modify the behaviour of other Dispatchers:
- {@link org.dspace.checker.LimitedCountDispatcher}
- {@link org.dspace.checker.LimitedDurationDispatcher}
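The split between dispatchers that generate an ordering and dispatchers that modify another dispatcher can be sketched as a small decorator arrangement. The types below are hypothetical stand-ins written for this example and are not the DSpace classes; in particular, the next() method and the SENTINEL value of -1 are assumptions about how such an interface might look.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical stand-ins for the dispatcher idea; not the DSpace classes. */
final class DispatcherSketch {
    static final int SENTINEL = -1; // signals "no more bitstreams to check"

    interface Dispatcher {
        int next(); // next bitstream ID, or SENTINEL when the run should stop
    }

    /** Generates an ordering: hands out IDs from a fixed list. */
    static final class FixedListDispatcher implements Dispatcher {
        private final Iterator<Integer> ids;

        FixedListDispatcher(List<Integer> ids) {
            this.ids = ids.iterator();
        }

        public int next() {
            return ids.hasNext() ? ids.next() : SENTINEL;
        }
    }

    /** Modifies another dispatcher: stops after a maximum number of IDs. */
    static final class CountLimitDispatcher implements Dispatcher {
        private final Dispatcher delegate;
        private int remaining;

        CountLimitDispatcher(Dispatcher delegate, int limit) {
            this.delegate = delegate;
            this.remaining = limit;
        }

        public int next() {
            if (remaining <= 0) {
                return SENTINEL;
            }
            remaining--;
            return delegate.next();
        }
    }

    /** Drains a dispatcher, as a checking run would, and returns the IDs seen. */
    static List<Integer> drain(Dispatcher d) {
        List<Integer> seen = new ArrayList<>();
        for (int id = d.next(); id != SENTINEL; id = d.next()) {
            seen.add(id);
        }
        return seen;
    }
}
```

Because the limiting dispatcher only wraps another dispatcher, the same wrapper works with any ordering strategy, which is the appeal of keeping the interface this small.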
ChecksumResultsCollector
The default implementation of ChecksumResultsCollector
({@link org.dspace.checker.ResultsLogger}) logs checksum checking to the database,
but it would be simple to write your own implementation to log to Log4j logs,
text files, JMS queues, etc.
Results Pruner
The results pruner is responsible for trimming the archived Checksum logs,
which can grow large otherwise. The retention period of stored check results
can be configured per checksum result code. This allows you, for example, to
retain records for all failures for auditing purposes, whilst discarding the
storage of successful checks. The pruner uses a default configuration from
dspace.cfg, but can take in alternative configurations from other properties
files.
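Per-code retention, as described above, can be sketched as a filter over the stored results. This is a minimal model under stated assumptions: the class PrunerSketch, the CheckResult type, the result code strings used in the usage note, and the millisecond-based retention map are all hypothetical and do not reflect the pruner's real configuration format.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical model of pruning check results by per-code retention period. */
final class PrunerSketch {
    static final class CheckResult {
        final String resultCode;
        final long checkedAtMillis;

        CheckResult(String resultCode, long checkedAtMillis) {
            this.resultCode = resultCode;
            this.checkedAtMillis = checkedAtMillis;
        }
    }

    /**
     * Returns the results to keep. A result is dropped only when its code has
     * a retention period and the result is older than that period; codes with
     * no configured period are kept indefinitely (an assumption of this sketch).
     */
    static List<CheckResult> prune(List<CheckResult> history,
                                   Map<String, Long> retentionMillis,
                                   long nowMillis) {
        List<CheckResult> kept = new ArrayList<>();
        for (CheckResult r : history) {
            Long retention = retentionMillis.get(r.resultCode);
            boolean expired = retention != null && nowMillis - r.checkedAtMillis > retention;
            if (!expired) {
                kept.add(r);
            }
        }
        return kept;
    }
}
```

With a short retention period for a "match" code and no entry for a "no match" code, old successful checks are discarded while every failure is retained for auditing.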
Design notes
All interaction between the checker package and the database is abstracted
behind DataAccessObjects. Where practicable dependencies on DSpace code are
minimized, the rationale being that it may be errors in DSpace code that
have caused fixity problems.
| org.dspace.content |
Provides an API for reading and manipulating content in the DSpace system.
The DSpace Data Model
Data in DSpace is stored in the model below. Multiple inclusion is permitted
at every level; the documentation for each class describes the system's
behaviour for coping with this.
Community |
Communities correspond to organisational units within an
institution. |
Collection |
Collections are groupings of related content. Each collection may
have an associated workflow; this is the review process that
submissions go through before being included in the archive. |
|
Item |
Items are the basic archival units. An item corresponds to a single
logical piece of content and associated metadata. |
Bundle |
Bundles are groupings of Bitstreams that make no sense in isolation;
for example, the files making up an HTML document would all go in one
Bundle. A PDF version of the same Item, or a dataset stored with the Item,
would go in a separate Bundle. |
Bitstream |
Bitstreams are sequences of bits, typically files, that make up the
raw content of Items. |
Additionally, each Bitstream is associated with one Bitstream
Format; this describes information about the format and encoding of
the Bitstream, including a name (for example "Adobe PDF"), a MIME type and a
support level.
Submissions are created as Workspace Items. A Workspace Item
is an Item in progress. Once item assembly is complete, one of two things may
happen:
- If the Collection being submitted to has an associated workflow, it is
started. At this point the Workspace Item becomes a Workflow
Item.
- If the Collection has no associated workflow, the Workspace Item is
removed and the assembled Item is included in the Collection.
Workspace Items and Workflow Items may both be manipulated as In Progress
Submissions.
Using the Content Management API
The general paradigm for using DSpace is to create a Context; this is
akin to opening a connection to a database (which, coincidentally, is one of the
things that happens.)
The classes in this package are then used to create in-memory snapshots that
represent the corresponding logical objects stored in the system. When the
reading or manipulating is done, the Context may either be aborted, in
which case any changes made are discarded, or completed, in which case
any changes made are committed to main DSpace storage.
If any error occurs while you are making changes, you should abort the
current context, since the in-memory snapshots might be in an inconsistent
state.
Typically, when changing a particular object in the system, the changes will
not be written to main DSpace storage unless update is called on
the object prior to Context completion. Where this is not the case, it is
stated in the method documentation.
Instances of the classes in this package are tied to that Context; once the
Context has been completed or aborted, the objects essentially become invalid.
An example use of the Content Management API is shown below:
try
{
// Create a DSpace context
context = new org.dspace.core.Context();
// Set the current user
context.setCurrentUser(authenticatedUser);
// Create my new collection
Collection c = Collection.create(context);
c.setMetadata("name", "My New Collection");
c.update(); // Updates the metadata within the context
// Find an item
item = Item.find(context, 1234);
// Remove it from its old collections
Collection[] colls = item.getCollections();
colls[0].removeItem(item);
// Add it to my new collection
c.addItem(item);
// All went well; complete the context so changes are written
context.complete();
}
catch (SQLException se)
{
// Something went wrong with the database; abort the context so
// no changes are written
context.abort();
}
catch (AuthorizeException ae)
{
// authenticatedUser does not have permission to perform one of the
// above actions, so no changes should be made at all.
context.abort();
}
// The context will have been completed or aborted here, so it may
// no longer be used, nor any objects that were created with it (e.g. item)
@see org.dspace.authorize
@see org.dspace.core.Context
| org.dspace.content.crosswalk |
Provides an API and implementations of metadata crosswalks, which are
directional mappings from one schema to another, performed in the context of
Item ingestion or dissemination. Most crosswalks are driven by a mapping
file, which resides in config/crosswalks.
Crosswalk Interfaces
The principal interfaces are for the ingest and dissemination contexts.
The IngestionCrosswalk interface consists of the methods:
public void ingest(Context context, DSpaceObject dso, List metadata)
public void ingest(Context context, DSpaceObject dso, Element root)
The DisseminationCrosswalk interface has methods:
public Namespace[] getNamespaces()
public String getSchemaLocation()
public boolean canDisseminate(DSpaceObject dso)
public List disseminateList(DSpaceObject dso)
public Element disseminateElement(DSpaceObject dso)
Crosswalk Implementations
Crosswalks exist for many formats, including DC, QDC, METS, MODS, and PREMIS, and there is a general
implementation employing an XSLT stylesheet.
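A crosswalk driven by a mapping file can be pictured as a lookup from source field names to target field names. The sketch below is illustrative only: the class CrosswalkSketch, the flat key = value mapping syntax, and the field names in the usage note are assumptions and do not describe the actual file formats under config/crosswalks.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;

/** Hypothetical one-directional field mapping driven by a properties-style file. */
final class CrosswalkSketch {
    /** Parses mapping text of the form "sourceField = targetField", one per line. */
    static Properties loadMapping(String text) {
        Properties mapping = new Properties();
        try {
            mapping.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return mapping;
    }

    /** Renames source fields to target fields; unmapped fields are dropped. */
    static Map<String, String> crosswalk(Map<String, String> source, Properties mapping) {
        Map<String, String> target = new LinkedHashMap<>();
        for (Map.Entry<String, String> field : source.entrySet()) {
            String targetField = mapping.getProperty(field.getKey());
            if (targetField != null) {
                target.put(targetField, field.getValue());
            }
        }
        return target;
    }
}
```

For example, a mapping containing "dc.title = title" would carry an item's dc.title value into a target field named title, while a field with no mapping entry would simply not appear in the output. The mapping is directional, so the reverse conversion needs its own mapping.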
| org.dspace.content.dao | | org.dspace.content.packager |
Provides an API and implementations of content packages, used in the context of ingest (SIP) or dissemination (DIP).
Packaging Interfaces
The principal interfaces are for ingesters and disseminators.
The PackageIngester interface consists of the method:
WorkspaceItem ingest(Context context, Collection collection, InputStream in,
PackageParameters params, String license)
There is also a 'replace' method, but this is not implemented consistently.
The PackageDisseminator interface consists of the method:
void disseminate(Context context, DSpaceObject object,
PackageParameters params, OutputStream out)
Packaging Implementations
Ingester and disseminator implementations exist for METS and PDF packages, and the
classes are designed to be extended for different profiles.
@see org.dspace.content.packager.AbstractMETSIngester
@see org.dspace.content.packager.AbstractMETSDisseminator
| org.dspace.content.service | | org.dspace.core |
Provides some basic functionality required throughout the DSpace system.
| org.dspace.eperson |
Provides classes representing e-people and groups of e-people.
| org.dspace.event | | org.dspace.handle |
Provides classes and methods to interface with the
CNRI Handle System.
The HandleManager class acts as the
main entry point.
The HandlePlugin class is intended to be
loaded into the CNRI Handle Server. It acts as an adapter, translating
Handle Server API calls into DSpace ones.
Using the Handle API
An example use of the Handle API is shown below:
Item item;
// Create or obtain a context object
Context context;
// Create a Handle for an Item
String handle = HandleManager.createHandle(context, item);
// The canonical form, which can be used for citations
String canonical = HandleManager.getCanonicalForm(handle);
// A URL pointing to the Item
String url = HandleManager.resolveToURL(context, handle);
// Resolve the handle back to an object
Item resolvedItem = (Item) HandleManager.resolveToObject(context, handle);
// From the object, find its handle
String rhandle = HandleManager.findHandle(context, resolvedItem);
Using the HandlePlugin with CNRI Handle Server
In the CNRI Handle Server configuration file, set storage_type to
CUSTOM and storage_class to
org.dspace.handle.HandlePlugin.
| org.dspace.license | | org.dspace.plugin | | org.dspace.search |
Interface to the Lucene search engine, and the 'harvest' API for retrieving items modified within a given date range.
DSpace uses the Jakarta project's Lucene search engine.
Official Lucene Web Site
| org.dspace.sort | | org.dspace.storage.bitstore |
Provides an API for storing, retrieving and deleting streams of bits in
a transactionally safe fashion. The main class is
BitstreamStorageManager.
Using the Bitstore API
An example use of the Bitstore API is shown below:
// Create or obtain a context object
Context context;
// Stream to store
InputStream stream;
try
{
// Store the stream
int id = BitstreamStorageManager.store(context, stream);
// Retrieve it
InputStream retrieved = BitstreamStorageManager.retrieve(context, id);
// Delete it
BitstreamStorageManager.delete(context, id);
// Complete the context object so changes are written
context.complete();
}
// Error with I/O operations
catch (IOException ioe)
{
}
// Database error
catch (SQLException sqle)
{
}
Storage mechanism
The BitstreamStorageManager stores files in one or more asset store
directories. These can be configured in dspace.cfg. For
example:
assetstore.dir = /dspace/assetstore
The above example specifies a single asset store.
assetstore.dir = /dspace/assetstore_0
assetstore.dir.1 = /mnt/other_filesystem/assetstore_1
The above example specifies two asset stores. assetstore.dir
specifies asset store number 0 (zero); after that use
assetstore.dir.1, assetstore.dir.2 and so on. The
particular asset store a bitstream is stored in is held in the database, so
don't move bitstreams between asset stores, and don't renumber them.
By default, newly created bitstreams are put in asset store 0 (i.e. the one specified by the
assetstore.dir property). To change this, for example when asset
store 0 is getting full, add a line to dspace.cfg like:
assetstore.incoming = 1
Then restart DSpace (Tomcat). New bitstreams will be written to the asset
store specified by assetstore.dir.1, which is
/mnt/other_filesystem/assetstore_1 in the above example.
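The way the property names above relate to asset store numbers can be sketched with java.util.Properties. This is only an illustration of the naming rule: the class AssetStoreConfigSketch and its methods are hypothetical, and DSpace's own configuration loading is not shown.

```java
import java.util.Properties;

/** Hypothetical helper showing how asset store numbers map to property keys. */
final class AssetStoreConfigSketch {
    /** Store 0 uses the bare key; store N (N > 0) uses the key with ".N" appended. */
    static String keyFor(int storeNumber) {
        return storeNumber == 0 ? "assetstore.dir" : "assetstore.dir." + storeNumber;
    }

    /** Directory that newly created bitstreams would be written to. */
    static String incomingDirectory(Properties config) {
        // assetstore.incoming is optional; without it, store 0 is used
        int incoming = Integer.parseInt(config.getProperty("assetstore.incoming", "0").trim());
        String dir = config.getProperty(keyFor(incoming));
        if (dir == null) {
            throw new IllegalStateException("No directory configured for asset store " + incoming);
        }
        return dir;
    }
}
```

With the two-store configuration above and assetstore.incoming = 1, incomingDirectory would return /mnt/other_filesystem/assetstore_1; without the assetstore.incoming line it would return /dspace/assetstore_0.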
Moving an Asset Store
You can move an asset store as a whole to a new location in the file system: stop DSpace
(Tomcat), move all of the contents to the new location, change the appropriate
line in dspace.cfg, and restart DSpace (Tomcat).
We will be providing administration tools for more sophisticated management
of these asset stores in the future.
When given a stream of bits to store, the BitstreamStorageManager
generates a unique key for the stream. The key takes the form of
a long sequence of digits, which is transformed into a file path.
The BitstreamStorageManager stores the contents of the stream in
this path, creating parent directories as necessary.
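Turning a long numeric key into a nested file path can be sketched as follows. The grouping used in the usage note (three directory levels of two digits each, followed by the full key as the file name) is an assumption chosen for illustration, and the class KeyPathSketch is hypothetical.

```java
/** Hypothetical illustration of deriving a nested path from a digit-string key. */
final class KeyPathSketch {
    /**
     * Builds assetStoreDir/d1/d2/.../key, where each intermediate directory
     * is the next digitsPerLevel characters taken from the start of the key.
     */
    static String pathFor(String assetStoreDir, String key, int levels, int digitsPerLevel) {
        if (key.length() < levels * digitsPerLevel) {
            throw new IllegalArgumentException("Key is too short for the requested directory depth");
        }
        StringBuilder path = new StringBuilder(assetStoreDir);
        for (int level = 0; level < levels; level++) {
            int start = level * digitsPerLevel;
            path.append('/').append(key, start, start + digitsPerLevel);
        }
        return path.append('/').append(key).toString();
    }
}
```

Calling pathFor("/dspace/assetstore", "1234567890", 3, 2) yields /dspace/assetstore/12/34/56/1234567890. Spreading files across intermediate directories like this keeps any single directory from accumulating an unmanageable number of entries.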
The Bitstore and Transactions
The bitstore is carefully engineered to prevent data loss, using
transactional flags in the database. Before a bitstream is
actually stored, a metadata entry with the unique bitstream id is
committed to the database with its deleted flag set. If the storage
operation fails or is aborted, the deleted flag remains. The bitstore API then ensures
that the bitstream cannot be retrieved, and after an hour, the
bitstream is eligible for cleanup. The bitstream is accessible only
after all database operations have been successfully committed.
Similarly, bitstreams are deleted by simply setting the deleted
flag. If a deletion operation is rolled back, the bitstream is still
present in the asset store.
Cleaning up the Asset Store
As noted above, sometimes files will be physically present in the
Asset Store even though they are marked deleted in the database.
You can use the command-line utility class
org.dspace.storage.bitstore.Cleanup (which is invoked via
/dspace/bin/cleanup)
to remove the bitstreams which are marked deleted from the Asset Store.
To prevent accidental deletion of bitstreams which are in the process
of being stored, cleanup only removes bitstreams which are more than
an hour old.
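The role of the deleted flag during storage, and the one-hour rule that cleanup applies, can be modelled in memory as follows. This is a sketch under stated assumptions and not the bitstore's code: the class BitstoreFlagSketch and its methods are hypothetical, and the age check uses the time the entry was created, which may differ from what the real cleanup measures.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory model of the deleted flag and cleanup eligibility. */
final class BitstoreFlagSketch {
    static final long ONE_HOUR_MILLIS = 60L * 60L * 1000L;

    private static final class Entry {
        boolean deleted = true;      // entries start out flagged as deleted
        final long createdAtMillis;

        Entry(long createdAtMillis) {
            this.createdAtMillis = createdAtMillis;
        }
    }

    private final Map<Integer, Entry> rows = new HashMap<>();

    /** Step 1: record the bitstream, flagged deleted, before any bytes are written. */
    void beginStore(int id, long nowMillis) {
        rows.put(id, new Entry(nowMillis));
    }

    /** Step 2: clear the flag once storage and the transaction have succeeded. */
    void commitStore(int id) {
        rows.get(id).deleted = false;
    }

    /** Deleting a bitstream only sets the flag; the bytes stay until cleanup. */
    void markDeleted(int id) {
        rows.get(id).deleted = true;
    }

    boolean isRetrievable(int id) {
        Entry e = rows.get(id);
        return e != null && !e.deleted;
    }

    /** Cleanup may remove only flagged entries that are more than an hour old. */
    boolean eligibleForCleanup(int id, long nowMillis) {
        Entry e = rows.get(id);
        return e != null && e.deleted && nowMillis - e.createdAtMillis > ONE_HOUR_MILLIS;
    }
}
```

An entry whose store never reaches commitStore stays flagged, is never retrievable, and becomes eligible for cleanup after an hour, while an entry that is still being written is protected by the age check.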
| org.dspace.storage.rdbms |
Provides an API for accessing a relational database management system.
The main class is {@link org.dspace.storage.rdbms.DatabaseManager}, which
executes SQL queries and returns {@link org.dspace.storage.rdbms.TableRow}
or {@link org.dspace.storage.rdbms.TableRowIterator} objects.
The {@link org.dspace.storage.rdbms.InitializeDatabase} class is used to
load SQL into the RDBMS via JDBC.
Using the Database API
An example use of the Database API is shown below. Note that in most
cases, direct use of the Database API is discouraged; you should use
the Content API, which
encapsulates use of the database.
The query and querySingle methods each have two sets of
parameters. If you are merely reading data, and will not be changing any
values, you can use the forms without the table parameter. This
allows you to perform queries with results pulled from multiple tables, for
example:
TableRowIterator readOnlyRows = DatabaseManager.query(context,
"SELECT handle.handle, item.submitter_id FROM handle, item WHERE "
+ "handle.resource_id=item.item_id");
If you do wish to update the rows, you'll need to use the forms including the
table parameter, for example:
TableRow updateable = DatabaseManager.querySingle(context,
"item",
"SELECT * FROM item WHERE item_id=1234");
updateable.setColumn("submitter_id", 5678);
DatabaseManager.update(context, updateable);
More example usage:
// Create or obtain a context object
Context context;
try
{
// Run an arbitrary query
// Each object returned by the iterator is a TableRow,
// with values obtained from the results of the query
TableRowIterator iterator = DatabaseManager.query(context,
"community",
"SELECT * FROM Community WHERE name LIKE 'T%'");
// Find a single row, using an arbitrary query
// If no rows are found, then null is returned.
TableRow row = DatabaseManager.querySingle(context,
"SELECT * FROM EPerson WHERE email = 'pbreton@mit.edu'");
// Run an insert, update or delete SQL command
// Returns the number of rows affected.
int rowsAffected = DatabaseManager.updateQuery(context,
"DELETE FROM EPersonGroup WHERE name LIKE 'collection_100_%'");
// Find a row in a particular table
// This will return the row in the eperson table with id 1, or null
// if no such row exists
TableRow epersonrow = DatabaseManager.find(context, "eperson", 1);
// Create a new row, and assign a primary key
TableRow newrow = DatabaseManager.create(context, "Collection");
newrow.setColumn("name", "Test Collection for example code");
newrow.setColumn("provenance_description", "Created via test program");
// Save changes to the database
DatabaseManager.update(context, newrow);
// Delete the row
DatabaseManager.delete(context, newrow);
// Make sure all changes are committed
context.complete();
}
catch (SQLException sqle)
{
// Handle database error
}
Database IDs
All tables in the DSpace system have a single primary key, which is an
integer. The primary key column is named for the table; for example,
the EPerson table has eperson_id.
Assigning database IDs is done by invoking the SQL function
getnextid with the table name as a single parameter. The database
backend may implement this in any suitable way; it should be robust to access
via multiple simultaneous clients and transactions.
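The primary key naming convention and the requirement that ID assignment be safe under concurrent access can be illustrated with an in-memory stand-in. This sketch does not show the SQL function getnextid itself: the class IdSketch and its methods are hypothetical, and a real backend would typically rely on database sequences or an equivalent mechanism.

```java
import java.util.Locale;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical in-memory stand-in for per-table ID assignment. */
final class IdSketch {
    private final ConcurrentHashMap<String, AtomicInteger> counters = new ConcurrentHashMap<>();

    /** The primary key column is named for the table, e.g. EPerson has eperson_id. */
    static String primaryKeyColumn(String table) {
        return table.toLowerCase(Locale.ROOT) + "_id";
    }

    /** Returns the next unused ID for the table; safe to call from many threads. */
    int nextId(String table) {
        return counters
                .computeIfAbsent(table.toLowerCase(Locale.ROOT), t -> new AtomicInteger())
                .incrementAndGet();
    }
}
```

Each table gets its own counter, so IDs are unique within a table but not across tables, and the atomic increment ensures that two simultaneous callers never receive the same value.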
| org.dspace.submit | | org.dspace.submit.step | | org.dspace.sword | | org.dspace.text.filter | | org.dspace.workflow |
DSpace's workflow system
DSpace has a simple workflow system, which models the workflows
as 5 steps: SUBMIT, three intermediate steps (STEP1, STEP2, STEP3), and ARCHIVE.
When an item is submitted to DSpace, it is in the SUBMIT state. If there
are no intermediate states defined, then it proceeds directly to ARCHIVE and
is put into the main DSpace archive.
EPerson groups may be assigned to the three possible intermediate steps,
where they are expected to act on the item at those steps. For example,
if a Collection's owners desire a review step, they would create a Group
of reviewers, and assign that Group to step 1. The members of step 1's
Group will receive emails asking them to review the submission, and
will need to perform an action on the item before it can be rejected
back to the submitter or placed in the archive.
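The progression through the five states can be sketched as a small function that finds the next step with a Group assigned. This is an illustrative model only: the class WorkflowSketch is hypothetical, and the rule that an individual intermediate step with no Group is skipped is an assumption generalised from the statement that an item with no intermediate steps goes straight to ARCHIVE.

```java
/** Hypothetical model of advancing an item through the workflow states. */
final class WorkflowSketch {
    enum State { SUBMIT, STEP1, STEP2, STEP3, ARCHIVE }

    /**
     * Returns the state that follows current. hasGroup[0], hasGroup[1] and
     * hasGroup[2] say whether a Group is assigned to STEP1, STEP2 and STEP3.
     * Steps without a Group are passed over (an assumption of this sketch).
     */
    static State next(State current, boolean[] hasGroup) {
        for (int i = current.ordinal() + 1; i < State.ARCHIVE.ordinal(); i++) {
            if (hasGroup[i - 1]) {
                return State.values()[i];
            }
        }
        return State.ARCHIVE; // no further step has a Group assigned
    }
}
```

A Collection with reviewers assigned only to step 1 would move an item from SUBMIT to STEP1 and then to ARCHIVE, while a Collection with no Groups assigned would move it from SUBMIT directly to ARCHIVE.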
| org.purl.sword.base | | org.purl.sword.server | | org.purl.sword.test | | org.w3.atom | |
|