Provides a (Replay)CharSequence view on recorded stream bytes (a prefix
buffer and overflow backing file).
Treats the byte stream as 8-bit.
Keeps in memory a wraparound rolling buffer of the last windowSize bytes
read from disk; as long as the 'random access' of a CharSequence
user stays within this window, access should remain fairly efficient.
(So design any regexps pointed at these CharSequences to work within
that range!)
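One way to keep a regexp within the window is to use bounded quantifiers, so that a single match attempt can only scan a limited distance from where it started. The sketch below is illustrative only: the class name, the pattern and the MAX_SPAN value are assumptions chosen for the example, not anything defined by this class.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical example; names and limits are assumptions for illustration. */
final class BoundedRegexExample {
    // Assumed upper bound on how far one match attempt may scan;
    // in practice this would be chosen well below windowSize.
    static final int MAX_SPAN = 256;

    // [^<]{0,256} is bounded, so a match attempt starting at "<title>"
    // never reads more than a fixed number of characters ahead.
    static final Pattern TITLE =
        Pattern.compile("<title>([^<]{0," + MAX_SPAN + "})</title>");

    /** Returns the first title found, or null if there is none. */
    static String firstTitle(CharSequence cs) {
        Matcher m = TITLE.matcher(cs);
        return m.find() ? m.group(1) : null;
    }
}
```

An unbounded construct such as `.*` in the same position could force the matcher to run far ahead and then backtrack, which is the access pattern the window handles worst.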
When rereading of a location is necessary, the whole window is
recentered around the location requested. (TODO: More research
into whether this is the best strategy.)
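A minimal sketch of the windowing and recentering idea follows. It is not the actual implementation: it uses a plain linear window rather than the wraparound buffer described above, and the class name, field names and the reloads counter are all hypothetical. Bytes are mapped to chars one-to-one (8-bit).

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;

/** Hypothetical sketch; not the real ReplayCharSequence code. */
final class WindowedCharSequence implements CharSequence {
    private final byte[] prefix;          // in-memory prefix buffer
    private final RandomAccessFile file;  // overflow backing file
    private final long fileLength;
    private final byte[] window;          // last bytes read from disk
    private long windowStart = 0;         // file offset of window[0]
    private int windowFill = 0;           // number of valid bytes in window
    int reloads = 0;                      // counts window reloads (for illustration)

    WindowedCharSequence(byte[] prefix, RandomAccessFile file, int windowSize)
            throws IOException {
        this.prefix = prefix;
        this.file = file;
        this.fileLength = file.length();
        this.window = new byte[windowSize];
    }

    @Override public int length() {
        return (int) Math.min(Integer.MAX_VALUE, prefix.length + fileLength);
    }

    @Override public char charAt(int index) {
        if (index < 0 || index >= length()) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        if (index < prefix.length) {
            return (char) (prefix[index] & 0xFF);   // treat each byte as 8-bit
        }
        long pos = (long) index - prefix.length;    // offset into backing file
        if (pos < windowStart || pos >= windowStart + windowFill) {
            recenter(pos);                          // miss: reload the window
        }
        return (char) (window[(int) (pos - windowStart)] & 0xFF);
    }

    /** Reloads the whole window so that pos sits near its middle. */
    private void recenter(long pos) {
        try {
            long start = Math.max(0, pos - window.length / 2);
            file.seek(start);
            int n = 0;
            while (n < window.length) {
                int r = file.read(window, n, window.length - n);
                if (r < 0) break;                   // end of file
                n += r;
            }
            windowStart = start;
            windowFill = n;
            reloads++;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override public CharSequence subSequence(int start, int end) {
        StringBuilder sb = new StringBuilder(end - start);
        for (int i = start; i < end; i++) sb.append(charAt(i));
        return sb.toString();
    }

    @Override public String toString() {
        return subSequence(0, length()).toString();
    }
}
```

With a 64-byte window, reading file offset 500 loads offsets 468 to 531; later reads at 490 or 510 are served from memory, while a read at 900 triggers another reload.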
An implementation of a ReplayCharSequence done with ByteBuffers -- one
to wrap the passed prefix buffer and a second, a memory-mapped
ByteBuffer view into the backing file -- was consistently about 10% slower.
My tests did the following: I made a buffer filled with regular content
and used it as the prefix buffer. The buffer content was then
written MULTIPLER times to a backing file. I then did accesses with the
following pattern: skip forward 32 bytes, then back 16 bytes, and then
read forward from byte 16 to byte 32; repeat. Though I varied the ratio
between the buffer size and the backing-file size from 3 to 10, the
difference of 10% or so seemed to persist. The same held when I tried to
favor get() over get(index).
I used a profiler, JMP, to study times taken (St.Ack wrote the above comment).
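The access pattern described above can be sketched roughly as below. This is one reading of the description, not the original test code, which is not shown here; the class and method names are hypothetical, and the checksum exists only so the reads have an observable result.

```java
/** Hypothetical reconstruction of the described access pattern. */
final class AccessPattern {
    /**
     * Repeatedly skips forward 32 chars, steps back 16, then reads 16 chars
     * forward, so each cycle reads the second half of a 32-char block.
     * Returns the sum of the chars read.
     */
    static long run(CharSequence cs) {
        long checksum = 0;
        int pos = 0;
        while (pos + 32 <= cs.length()) {
            pos += 32;                      // skip forward 32
            pos -= 16;                      // then back 16
            int end = pos + 16;
            for (; pos < end; pos++) {      // read forward 16 chars
                checksum += cs.charAt(pos);
            }
        }
        return checksum;
    }
}
```

Timing `run` over two CharSequence implementations backed by the same data would give a comparison of the kind described, though the figures depend heavily on JVM, file system and buffer sizes.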
TODO: Determine whether memory-mapped files are a better way to do this;
probably not -- they don't offer the level of control over
total memory used that this approach does.
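For comparison, a memory-mapped variant might look roughly like the sketch below. It is an assumption-laden illustration, not the ByteBuffer implementation that was measured: the class name is hypothetical, and it maps the whole backing file at once, which limits it to files under 2 GB and leaves paging decisions to the operating system.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical sketch of a ByteBuffer-based alternative. */
final class MappedCharSequence implements CharSequence {
    private final ByteBuffer prefix;   // wraps the in-memory prefix buffer
    private final ByteBuffer mapped;   // memory-mapped view of the backing file

    MappedCharSequence(byte[] prefixBytes, Path backingFile) throws IOException {
        this.prefix = ByteBuffer.wrap(prefixBytes);
        try (FileChannel ch = FileChannel.open(backingFile, StandardOpenOption.READ)) {
            // The mapping stays valid after the channel is closed.
            this.mapped = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
        }
    }

    @Override public int length() {
        return prefix.capacity() + mapped.capacity();
    }

    @Override public char charAt(int index) {
        // Absolute get(index) leaves the buffer position untouched.
        byte b = index < prefix.capacity()
            ? prefix.get(index)
            : mapped.get(index - prefix.capacity());
        return (char) (b & 0xFF);      // treat each byte as 8-bit
    }

    @Override public CharSequence subSequence(int start, int end) {
        StringBuilder sb = new StringBuilder(end - start);
        for (int i = start; i < end; i++) sb.append(charAt(i));
        return sb.toString();
    }

    @Override public String toString() {
        return subSequence(0, length()).toString();
    }
}
```

Here the amount of file data resident in memory is decided by the OS page cache rather than by a windowSize setting, which is the loss of control the TODO refers to.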
author: Gordon Mohr
version: $Revision: 5027 $, $Date: 2007-03-29 00:30:33 +0000 (Thu, 29 Mar 2007) $