| java.lang.Object org.apache.lucene.analysis.TokenStream
All known Subclasses: org.apache.lucene.index.RepeatingTokenStream, org.apache.lucene.analysis.Tokenizer, org.apache.lucene.analysis.TokenFilter
TokenStream | abstract public class TokenStream | | A TokenStream enumerates the sequence of tokens, either from
fields of a document or from query text.
This is an abstract class. Concrete subclasses are:
- Tokenizer, a TokenStream whose input is a Reader; and
- TokenFilter, a TokenStream whose input is another TokenStream.
NOTE: subclasses must override at least one of
TokenStream.next() or TokenStream.next(Token).
|
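The two kinds of subclass described above can be illustrated with a minimal, self-contained sketch. The classes below (`Token`, `TokenStream`, `WhitespaceSource`, `LowerCaseStage`) are hypothetical stand-ins written for this example, not the real Lucene classes; they only mirror the shape of the API documented on this page. The sketch also assumes that the two default `next` implementations delegate to each other, which would explain why a subclass has to override at least one of them.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-ins, NOT the real Lucene classes.
class TokenStreamSketch {

    /** Simplified token that carries only the term text. */
    static class Token {
        String text = "";
        void clear() { text = ""; }
    }

    /** Mirrors the shape of the documented abstract class. */
    static abstract class TokenStream {
        // ASSUMPTION: each default delegates to the other, so a subclass
        // that overrides neither would recurse forever.
        public Token next() throws IOException { return next(new Token()); }
        public Token next(Token result) throws IOException { return next(); }
        public void reset() throws IOException { }
        public void close() throws IOException { }
    }

    /** Tokenizer-like source: reads a Reader and splits on whitespace. */
    static class WhitespaceSource extends TokenStream {
        private final Reader input;
        WhitespaceSource(Reader input) { this.input = input; }

        @Override
        public Token next(Token result) throws IOException {
            StringBuilder sb = new StringBuilder();
            int c;
            while ((c = input.read()) != -1) {
                if (Character.isWhitespace(c)) {
                    if (sb.length() > 0) break;   // end of the current token
                } else {
                    sb.append((char) c);
                }
            }
            if (sb.length() == 0) return null;    // end of stream
            result.clear();                       // wipe stale state first
            result.text = sb.toString();
            return result;
        }

        @Override
        public void close() throws IOException { input.close(); }
    }

    /** TokenFilter-like stage: wraps another stream and lower-cases terms. */
    static class LowerCaseStage extends TokenStream {
        private final TokenStream input;
        LowerCaseStage(TokenStream input) { this.input = input; }

        @Override
        public Token next(Token result) throws IOException {
            Token t = input.next(result);
            if (t != null) t.text = t.text.toLowerCase(Locale.ROOT);
            return t;
        }

        @Override
        public void close() throws IOException { input.close(); }
    }

    /** Drains a source-plus-filter chain and returns the term texts. */
    static List<String> terms(String text) {
        try {
            TokenStream ts =
                new LowerCaseStage(new WhitespaceSource(new StringReader(text)));
            List<String> out = new ArrayList<>();
            Token reusable = new Token();
            for (Token t = ts.next(reusable); t != null; t = ts.next(reusable)) {
                out.add(t.text);   // copy out what is needed before the next call
            }
            ts.close();
            return out;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Both stand-in subclasses override only `next(Token)`; under the delegation assumption above, the inherited `next()` would then work as well by passing in a fresh Token.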
Method Summary | |
public void | close() Releases resources associated with this stream. |
public Token | next() Returns the next token in the stream, or null at EOS. The returned Token is a "full private copy" (not re-used across calls to next()) but will be slower than calling TokenStream.next(Token) instead. |
public Token | next(Token result) Returns the next token in the stream, or null at EOS. When possible, the input Token should be used as the returned Token (this gives fastest tokenization performance), but this is not required and a new Token may be returned. |
public void | reset() Resets this stream to the beginning. |
close | public void close() throws IOException | | Releases resources associated with this stream.
|
next | public Token next() throws IOException | | Returns the next token in the stream, or null at EOS.
The returned Token is a "full private copy" (not
re-used across calls to next()), but this method is slower
than calling TokenStream.next(Token).
|
next | public Token next(Token result) throws IOException | | Returns the next token in the stream, or null at EOS.
When possible, the input Token should be used as the
returned Token (this gives fastest tokenization
performance), but this is not required and a new Token
may be returned. Callers may re-use a single Token
instance for successive calls to this method.
This implicitly defines a "contract" between
consumers (callers of this method) and
producers (implementations of this method
that are the source for tokens):
- A consumer must fully consume the previously
returned Token before calling this method again.
- A producer must call Token.clear before setting the
fields in the Token and returning it.
Note that a TokenFilter is considered a consumer.
Parameters: result - a Token that may or may not be used to return the next token
Returns: the next token in the stream, or null if end-of-stream was hit |
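The consumer/producer contract above can be made concrete with a small sketch. It uses hypothetical stand-in classes (`Token`, `ArraySource`) rather than the real Lucene types, and the `positionIncrement` field is only an illustrative second field that `clear()` has to reset. The first consumer copies what it needs before asking for the next token; the second keeps references to the returned Token and ends up seeing only the last term, because the producer re-uses the same instance.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins, NOT the real Lucene classes.
class ReuseContractSketch {

    static class Token {
        String text = "";
        int positionIncrement = 1;   // illustrative extra field
        void clear() { text = ""; positionIncrement = 1; }
    }

    /** Producer: serves terms from an array, re-using the caller's Token. */
    static class ArraySource {
        private final String[] words;
        private int i = 0;
        ArraySource(String[] words) { this.words = words; }

        Token next(Token result) {
            if (i == words.length) return null;   // end of stream
            result.clear();                       // producer duty: clear first
            result.text = words[i++];
            return result;
        }
    }

    /** Correct consumer: fully consumes each Token before the next call. */
    static List<String> consumeCorrectly(String... words) {
        ArraySource source = new ArraySource(words);
        Token reusable = new Token();
        List<String> out = new ArrayList<>();
        for (Token t = source.next(reusable); t != null; t = source.next(reusable)) {
            out.add(t.text);                      // copy the value out now
        }
        return out;
    }

    /** Incorrect consumer: holds on to Token references across calls. */
    static List<String> consumeIncorrectly(String... words) {
        ArraySource source = new ArraySource(words);
        Token reusable = new Token();
        List<Token> held = new ArrayList<>();
        for (Token t = source.next(reusable); t != null; t = source.next(reusable)) {
            held.add(t);                          // same instance every time
        }
        List<String> out = new ArrayList<>();
        for (Token t : held) out.add(t.text);     // every entry shows the last term
        return out;
    }
}
```

A consumer that genuinely needs to keep tokens around would either copy them or call the no-argument `next()`, which this page describes as returning a full private copy.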
reset | public void reset() throws IOException | | Resets this stream to the beginning. This is an
optional operation, so subclasses may or may not
implement this method. reset() is not needed for
the standard indexing process. However, if the Tokens
of a TokenStream are intended to be consumed more than
once, it is necessary to implement reset().
|
|
|
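The reset() behaviour described above can be sketched as follows. The classes are again hypothetical stand-ins, not the real Lucene types: `ListStream` is an in-memory stream that implements reset() by rewinding a cursor, and `consumeTwice` shows that a second pass only yields tokens when reset() is called between the passes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins, NOT the real Lucene classes.
class ResetSketch {

    static class Token {
        String text = "";
        void clear() { text = ""; }
    }

    static abstract class TokenStream {
        abstract Token next(Token result);
        void reset() { }   // optional operation: default does nothing
    }

    /** In-memory stream that supports being consumed more than once. */
    static class ListStream extends TokenStream {
        private final List<String> terms;
        private int pos = 0;
        ListStream(List<String> terms) { this.terms = terms; }

        @Override
        Token next(Token result) {
            if (pos == terms.size()) return null;   // end of stream
            result.clear();
            result.text = terms.get(pos++);
            return result;
        }

        @Override
        void reset() { pos = 0; }                   // rewind to the beginning
    }

    /** Reads the stream to its end and returns the term texts seen. */
    static List<String> drain(TokenStream ts) {
        List<String> out = new ArrayList<>();
        Token reusable = new Token();
        for (Token t = ts.next(reusable); t != null; t = ts.next(reusable)) {
            out.add(t.text);
        }
        return out;
    }

    /** Drains the same stream twice, optionally calling reset() in between. */
    static List<String> consumeTwice(boolean callReset, String... terms) {
        TokenStream ts = new ListStream(Arrays.asList(terms));
        List<String> out = drain(ts);
        if (callReset) ts.reset();
        out.addAll(drain(ts));                      // empty unless reset() ran
        return out;
    }
}
```

With `callReset` set to false the second pass contributes nothing, which matches the note that reset() is only needed when tokens are consumed more than once.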