Contains utility classes and interfaces for sequences (lists), arrays, and trees.
The iteration and position framework
A position points between two elements in its "containing"
sequence, or it points before the first element (if any), or after the last.
Sometimes, we loosely say that the position refers to or points to
an element of the sequence; in that case we mean the position is
immediately before the element.
An iterator is an object that has a current position,
but that can be moved so it gets a new position.
What needs to be emphasized is that all
Sequence implementations.
use the same SeqPosition
class to implement positions and iteration.
In other "collections frameworks" each sequence class has its corresponding
iterator class, but in this framework you instead add the methods
that handle iteration to the sequence class itself, extending
the AbstractSequence
class.
A consequence of this is that if you have an unused
SeqPosition
object, you can
use it to iterate over any Sequence
you choose.
This potentially saves memory allocation, which gains you the most
when iterating over a nested sequence structure, like a tree of XML data.
We do this by requiring that any position be representable using a pair
of an int
and an Object
.
In other words, the state of an iterator is
restricted to be an (int
, Object
) pair.
Of course that is all you could need, since if you need more
state you can always create a helper object,
and store the extra state there. The assumption we make though is that for
most Sequence
s, an (int
, Object
)
would be enough, without creating any new objects (beyond,
sometimes, the SeqPosition
itself).
The int
part of a position-state is normally
named ipos
, and the Object
part of a
position-state is named xpos
.
Neither of these values have any meaning in themselves, they only have
meaning as interpreted by the specific Sequence
.
For example, ipos
might be the offset of the position in
the sequence, but it might also some "magic cookie".
If a Sequence
supports insertions and deletions,
and you want positions to be automatically adjusted, then a position
has to be a magic cookie rather than an offset.
(A typical implementation is for such a position to be an index
into a table of positions managed by the Sequence
itself.)
So a complete position is actually a (int
, Object
,
AbstractSequence
) triple, where the AbstractSequence
component is the object that interprets the (int
,
Object
) pair.
Normally, the AbstractSequence
part of a position triple is the
Sequence
"containing" the position, but it can be any
AbstractSequence
that implements the various methods that provide the semantics
for the (int
, Object
) pair.
The PositionContainer
interface is a more abstract version of SeqPosition
, in
that it can represents more than one position.
For example, instead of an array of separately allocated
SeqPosition
objects, you might use some data structure
that implements PositionContainer
,
which is abstractly a list of (int
, Object
,
Sequence
)-triples.
The consumer protocol
You typically use an iteration framework it process the elements
of a sequence in such a way that the data consumer is active
and in control, while the sequence itself (the data producer) is a
passive object. The Consumer
works the other way round: The data producer is active and in control,
while the data consumer is passive, consuming elements when requested
by the data producer. The Consumer
is a more abstract
variant of SAX's
ContentHandler
interface for processing "events";
however Consumer
is intended for general value sequences,
and is less tied to XML.
Per Bothner
Last modified: Sun Jun 17 10:32:21 CST 2001