org.dspace.checker Package Documentation
Provides content fixity checking (using checksums)
for bitstreams stored in DSpace software.
The main access point to org.dspace.checker is on the command line
via {@link org.dspace.app.checker.ChecksumChecker#main(String[])},
but it is also simple to get programmatic access to ChecksumChecker
if you wish, via a {@link org.dspace.checker.CheckerCommand} object.
CheckerCommand is a simple Command object. You initialize it with
a strategy for iterating through bitstreams to check (an implementation of
{@link org.dspace.checker.BitstreamDispatcher}) and an object to collect
the results (an implementation of {@link org.dspace.checker.ChecksumResultsCollector}),
and then call {@link org.dspace.checker.CheckerCommand#process()}
to begin the processing. CheckerCommand handles the calculation of bitstream
checksums and iteration between bitstreams.
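The wiring described above can be sketched roughly as follows. This is a minimal, self-contained illustration of the Command pattern only: the nested Dispatcher and Collector interfaces, the SENTINEL marker, and the "CHECKED" result string are simplified stand-ins invented for this example, and the real org.dspace.checker types may have different method signatures.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Illustration only: simplified stand-ins, NOT the real org.dspace.checker API.
class CheckerCommandSketch {

    /** Stand-in for BitstreamDispatcher: hands out bitstream IDs until exhausted. */
    interface Dispatcher {
        int SENTINEL = -1; // assumed "no more bitstreams" marker
        int next();
    }

    /** Stand-in for ChecksumResultsCollector: receives one result per bitstream. */
    interface Collector {
        void collect(int bitstreamId, String result);
    }

    private final Dispatcher dispatcher;
    private final Collector collector;

    CheckerCommandSketch(Dispatcher dispatcher, Collector collector) {
        this.dispatcher = dispatcher;
        this.collector = collector;
    }

    /** Drains the dispatcher and reports a placeholder result for each ID. */
    void process() {
        for (int id = dispatcher.next(); id != Dispatcher.SENTINEL; id = dispatcher.next()) {
            // A real checker would recompute and compare the checksum here.
            collector.collect(id, "CHECKED");
        }
    }

    /** Dispatcher that walks a fixed list of IDs in order. */
    static Dispatcher listDispatcher(List<Integer> ids) {
        Iterator<Integer> it = ids.iterator();
        return () -> it.hasNext() ? it.next() : Dispatcher.SENTINEL;
    }

    /** Runs the command over the given IDs and returns the collected lines. */
    static List<String> run(List<Integer> ids) {
        List<String> out = new ArrayList<>();
        Collector collector = (id, result) -> out.add(id + ":" + result);
        new CheckerCommandSketch(listDispatcher(ids), collector).process();
        return out;
    }
}
```

The point of the split is that the command itself knows nothing about ordering or reporting; both are supplied by the caller.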
BitstreamDispatcher
The order in which bitstreams are checked, and the point at which a checking
run terminates, is controlled by implementations of BitstreamDispatcher. You can
extend the functionality of the package by writing your own implementation of this
simple interface, although the package includes several useful implementations that
will probably suffice in most cases: -
Dispatchers that generate bitstream ordering: -
- {@link org.dspace.checker.ListDispatcher}
- {@link org.dspace.checker.SimpleDispatcher}
Dispatchers that modify the behaviour of other Dispatchers: -
- {@link org.dspace.checker.LimitedCountDispatcher}
- {@link org.dspace.checker.LimitedDurationDispatcher}
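A dispatcher that modifies another dispatcher is essentially a decorator. The sketch below shows the idea for a count limit; it is not the source of LimitedCountDispatcher, and the Dispatcher interface and SENTINEL value are hypothetical stand-ins defined here so the example compiles on its own.

```java
// Illustration only: a stand-in interface, NOT the real BitstreamDispatcher.
class LimitedCountSketch {

    interface Dispatcher {
        int SENTINEL = -1; // assumed "no more bitstreams" marker
        int next();
    }

    /** Wraps a dispatcher so that at most max IDs are handed out. */
    static Dispatcher limit(Dispatcher delegate, int max) {
        return new Dispatcher() {
            private int remaining = max;

            @Override
            public int next() {
                if (remaining <= 0) {
                    return SENTINEL; // limit reached: stop without asking the delegate
                }
                remaining--;
                return delegate.next();
            }
        };
    }
}
```

Because the wrapper only talks to the interface, it can be stacked on any ordering dispatcher, and a duration limit could be written the same way by comparing the clock against a deadline instead of counting.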
ChecksumResultsCollector
The default implementation of ChecksumResultsCollector
({@link org.dspace.checker.ResultsLogger}) logs checksum checking to the database,
but it would be simple to write your own implementation to log to Log4j logs,
text files, JMS queues, etc.
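As a rough idea of what an alternative collector might look like, the sketch below writes one tab-separated line per result to a java.io.Writer (for example a FileWriter). The Collector interface, its collect(int, String) signature, and the result-code strings are assumptions made for this example; the real ChecksumResultsCollector interface may pass a richer result object.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.io.Writer;

// Illustration only: a stand-in interface, NOT the real ChecksumResultsCollector.
class TextCollectorSketch {

    interface Collector {
        void collect(int bitstreamId, String resultCode);
    }

    /** Returns a collector that appends "id<TAB>code" lines to the given writer. */
    static Collector toWriter(Writer out) {
        return (id, code) -> {
            try {
                out.write(id + "\t" + code + "\n");
                out.flush(); // keep the log current during long checking runs
            } catch (IOException e) {
                // The stand-in interface declares no checked exceptions.
                throw new UncheckedIOException(e);
            }
        };
    }
}
```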
Results Pruner
The results pruner is responsible for trimming the archived checksum logs,
which can otherwise grow large. The retention period of stored check results
can be configured per checksum result code. This allows you, for example, to
retain records of all failures for auditing purposes, whilst discarding
records of successful checks. The pruner uses a default configuration from
dspace.cfg, but can take in alternative configurations from other properties
files.
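The per-result-code retention rule can be pictured with a small sketch. It assumes retention periods have already been parsed into a map keyed by result code, with a fallback for codes that have no entry; the class name, method name, and the codes used in the usage note are hypothetical, and the actual property names and parsing in dspace.cfg are not shown.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Map;

// Illustration only: names here are hypothetical, not the pruner's real API.
class RetentionSketch {

    /**
     * Decides whether a stored check result is past its retention period.
     * Codes without their own entry use the fallback period.
     */
    static boolean shouldPrune(Map<String, Duration> retention, Duration fallback,
                               String resultCode, Instant checkedAt, Instant now) {
        Duration keep = retention.getOrDefault(resultCode, fallback);
        return checkedAt.plus(keep).isBefore(now);
    }
}
```

With a short period for a hypothetical "MATCH" code and a long one for "NO_MATCH", a month-old successful check would be pruned while a failure of the same age would be kept.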
Design notes
All interaction between the checker package and the database is abstracted
behind DataAccessObjects. Where practicable, dependencies on DSpace code are
minimized, the rationale being that errors in DSpace code may themselves
have caused fixity problems.