| java.lang.Object de.susebox.jtopas.AbstractTokenizerProperties de.susebox.jtopas.StandardTokenizerProperties
CHARFLAG_SEPARATOR | final public static int CHARFLAG_SEPARATOR(Code) | | character flag for separators
|
CHARFLAG_WHITESPACE | final public static int CHARFLAG_WHITESPACE(Code) | | character flag for whitespaces
|
MAX_NONFREE_MATCHLEN | final public static short MAX_NONFREE_MATCHLEN(Code) | | Maximum length of a non-free pattern match. Non-free patterns are those that do not
have the
TokenizerProperties.F_FREE_PATTERN flag set. A common
example is a number pattern.
|
_charFlags | protected int _charFlags(Code) | | array containing the flags for whitespaces and separators
|
_extCharFlags | protected HashMap _extCharFlags(Code) | | Map with flags for characters beyond 256.
|
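The split between _charFlags and _extCharFlags suggests a two-tier lookup: a plain array for low character codes and a map for everything else. The following is a minimal sketch of that idea, not the JTopas implementation. The class name CharFlagTable, the array size of 256, and the bit values of the two flags are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical two-tier flag table; not the JTopas implementation. */
final class CharFlagTable {
    // Assumed bit values; the real constants may differ.
    static final int CHARFLAG_WHITESPACE = 1;
    static final int CHARFLAG_SEPARATOR = 2;

    // Fast path: one slot per character code below 256 (assumed size).
    private final int[] charFlags = new int[256];
    // Slow path: sparse storage for all other characters.
    private final Map<Character, Integer> extCharFlags = new HashMap<>();

    /** Sets the given flag bit for the character. */
    void addFlag(char c, int flag) {
        if (c < charFlags.length) {
            charFlags[c] |= flag;
        } else {
            extCharFlags.merge(c, flag, (oldFlags, newFlag) -> oldFlags | newFlag);
        }
    }

    /** Returns true if the given flag bit is set for the character. */
    boolean hasFlag(char c, int flag) {
        int flags = (c < charFlags.length)
                ? charFlags[c]
                : extCharFlags.getOrDefault(c, 0);
        return (flags & flag) != 0;
    }
}
```

With such a table, a check like isWhitespace reduces to an array index and a bit test for most input, and only unusual characters pay for a map lookup.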
_patterns | protected ArrayList _patterns(Code) | | This array contains the patterns
|
_separatorsCase | protected String _separatorsCase(Code) | | current separator characters including character ranges.
|
_separatorsNoCase | protected String _separatorsNoCase(Code) | | current separator characters including character ranges. Case is ignored.
|
_whitespacesCase | protected String _whitespacesCase(Code) | | current whitespace characters including character ranges.
|
_whitespacesNoCase | protected String _whitespacesNoCase(Code) | | current whitespace characters including character ranges. Case is ignored.
|
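The four fields above hold character sets "including character ranges", once case-sensitive and once with case ignored. As a rough illustration of how such a set string could be evaluated, the sketch below tests membership against a specification like "a-z0-9". The class CharSetSpec and the exact range syntax (x-y, with a leading or trailing '-' treated as a literal) are assumptions; the syntax JTopas actually accepts may differ, for instance in how it escapes characters.

```java
/** Hypothetical helper; the range syntax handled here is an assumption. */
final class CharSetSpec {

    /** Returns true if c is covered by spec, e.g. " \t\r\n" or "a-z0-9". */
    static boolean contains(String spec, char c, boolean ignoreCase) {
        int i = 0;
        while (i < spec.length()) {
            char lo = spec.charAt(i);
            char hi = lo;
            // "x-y" denotes a range; a '-' with no right-hand side is a literal.
            if (i + 2 < spec.length() && spec.charAt(i + 1) == '-') {
                hi = spec.charAt(i + 2);
                i += 3;
            } else {
                i += 1;
            }
            if (inRange(lo, hi, c)) {
                return true;
            }
            // When case is ignored, try both case variants of the character.
            if (ignoreCase
                    && (inRange(lo, hi, Character.toLowerCase(c))
                        || inRange(lo, hi, Character.toUpperCase(c)))) {
                return true;
            }
        }
        return false;
    }

    private static boolean inRange(char lo, char hi, char c) {
        return lo <= c && c <= hi;
    }
}
```

A real implementation would more likely expand the set once into a flag table when the set is assigned, rather than scanning the string on every lookup.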
StandardTokenizerProperties | public StandardTokenizerProperties()(Code) | | Default constructor that initializes an instance with the default whitespace
and separator sets.
Tokenizer instances using this StandardTokenizerProperties
object split text between spaces, tabs and line ending sequences as well
as between punctuation characters.
|
addSpecialSequence | protected TokenizerProperty addSpecialSequence(TokenizerProperty property)(Code) | | This method adds or replaces strings, comments and ordinary special sequences.
The method assumes that the given special sequence property has already been checked
for not being null and for having non-empty images and normalized flags
(
AbstractTokenizerProperties.normalizeFlags ).
Parameters: property - the description of the new sequence Returns: the replaced special sequence property or null |
doGetProperty | protected TokenizerProperty doGetProperty(int type, String startImage)(Code) | | Retrieves a property by its type and image. See the method description
in
AbstractTokenizerProperties for details.
Parameters: type - the type the returned property should have Parameters: startImage - the (starting) image Returns: the token description for the image or null |
doSetSeparators | protected String doSetSeparators(String separators)(Code) | | Sets a new separator set. See the method description in
AbstractTokenizerProperties for details.
Parameters: separators - the set of separators including ranges Returns: the replaced separator set or null |
doSetWhitespaces | protected String doSetWhitespaces(String whitespaces)(Code) | | Sets a new whitespace set. See the method description in
AbstractTokenizerProperties for details.
Parameters: whitespaces - the set of whitespaces including ranges Returns: the replaced whitespace set or null |
getSequenceMaxLength | public int getSequenceMaxLength()(Code) | | This method returns the length of the longest special sequence, comment or
string prefix that is known to this SequenceHandler . When
calling
StandardTokenizerProperties.startsWithSequenceCommentOrString , the passed
DataProvider parameter will supply at least this number of characters (see
DataProvider.getLength ).
If fewer characters are provided, EOF has been reached.
Returns: the number of characters needed in the worst case to identify a special sequence |
hasKeywords | public boolean hasKeywords()(Code) | | This method can be used by a
de.susebox.jtopas.Tokenizer implementation
to detect quickly whether keyword matching must be performed at all. If the method
returns false , time-consuming preparations can be skipped.
Returns: true if there actually are keywords that can be tested for a match, false otherwise. |
hasPattern | public boolean hasPattern()(Code) | | This method can be used by a
de.susebox.jtopas.Tokenizer implementation
to detect quickly whether pattern matching must be performed at all. If the method
returns false , time-consuming preparations can be skipped.
Returns: true if there actually are patterns that can be tested for a match, false otherwise. |
hasSequenceCommentOrString | public boolean hasSequenceCommentOrString()(Code) | | This method can be used by a
de.susebox.jtopas.Tokenizer implementation
to detect quickly whether special sequence checking must be performed at all.
If the method returns false , time-consuming preparations can be
skipped.
Returns: true if there actually are special sequences, comments or strings that can be tested for a match, false otherwise. |
isSeparator | public boolean isSeparator(char testChar)(Code) | | This method checks whether the given character is a separator.
Parameters: testChar - check this character Returns: true if the given character is a separator, false otherwise |
isWhitespace | public boolean isWhitespace(char testChar)(Code) | | This method checks whether the given character is a whitespace. Implement your own
code for situations where this default implementation is not fast enough
or otherwise unsuitable.
Parameters: testChar - check this character Returns: true if the given character is a whitespace, false otherwise |
newlineIsWhitespace | public boolean newlineIsWhitespace()(Code) | | If a
Tokenizer performs line counting, it is often necessary to
know whether newline characters are considered to be whitespace. See
WhitespaceHandler for details.
Returns: true if newline characters are in the current whitespace set, false otherwise |
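To illustrate why a line-counting tokenizer cares about this answer: if newlines belong to the whitespace set, line endings are consumed while skipping whitespace and have to be counted there. The sketch below is a hypothetical stand-alone helper, not JTopas code. The class LineCounter, its method signatures, and the choice to require both '\n' and '\r' in the set are assumptions.

```java
/** Hypothetical helper showing line counting inside a whitespace run. */
final class LineCounter {

    /** Assumption: newlines count as whitespace only if both '\n' and '\r' are in the set. */
    static boolean newlineIsWhitespace(String whitespaces) {
        return whitespaces.indexOf('\n') >= 0 && whitespaces.indexOf('\r') >= 0;
    }

    /**
     * Skips the whitespace run that begins at start.
     * Returns a two-element array: { index after the run, line endings seen }.
     * "\r\n" is counted as a single line ending.
     */
    static int[] skipWhitespace(String text, int start, String whitespaces) {
        int pos = start;
        int newlines = 0;
        while (pos < text.length() && whitespaces.indexOf(text.charAt(pos)) >= 0) {
            char c = text.charAt(pos);
            if (c == '\n') {
                newlines++;
            } else if (c == '\r'
                    && (pos + 1 >= text.length() || text.charAt(pos + 1) != '\n')) {
                newlines++; // a lone '\r' is a line ending of its own
            }
            pos++;
        }
        return new int[] { pos, newlines };
    }
}
```

When newlineIsWhitespace returns false, line endings surface in other tokens instead, so the counting has to happen wherever those tokens are read.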
startsWithSequenceCommentOrString | public TokenizerProperty startsWithSequenceCommentOrString(DataProvider dataProvider) throws TokenizerException, NullPointerException(Code) | | This method checks if a given range of data starts with a special sequence,
a comment or a string. These three types of token are tested together since
both comment and string prefixes are ordinary special sequences. Only the
actions performed after a string or comment has been detected
are different.
The method returns null if no special sequence, comment or string
matches the leading part of the data range given through the
DataProvider .
In the case of strings or comments, the return value contains the description
for the introducing character sequence, NOT the whole
string or comment. The reading of the rest of the string or comment is done
by the calling
de.susebox.jtopas.Tokenizer .
Parameters: dataProvider - the source to get the data range from Returns: a de.susebox.jtopas.TokenizerProperty if a special sequence, comment or string could be detected, null otherwise throws: TokenizerException - failure while reading more data throws: NullPointerException - if no DataProvider is given |
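The relationship between getSequenceMaxLength and this prefix check can be sketched with a small stand-alone table. This is an illustration only: the class SequenceTable is hypothetical, it works on a plain CharSequence rather than a DataProvider, it returns the matched image rather than a TokenizerProperty, and preferring the longest matching prefix is an assumption the description above does not state.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical prefix table; a simplified stand-in for the real lookup. */
final class SequenceTable {
    private final List<String> sequences = new ArrayList<>();
    private int maxLength = 0;

    /** Registers a special sequence, comment prefix or string prefix. */
    void add(String image) {
        sequences.add(image);
        maxLength = Math.max(maxLength, image.length());
    }

    /** Worst-case number of characters needed to identify any registered image. */
    int getSequenceMaxLength() {
        return maxLength;
    }

    /** Returns the longest registered image that data starts with at offset, or null. */
    String startsWithSequence(CharSequence data, int offset) {
        String best = null;
        for (String image : sequences) {
            if (best != null && image.length() <= best.length()) {
                continue; // cannot improve on the current match
            }
            if (regionMatches(data, offset, image)) {
                best = image;
            }
        }
        return best;
    }

    private static boolean regionMatches(CharSequence data, int offset, String image) {
        if (offset + image.length() > data.length()) {
            return false; // not enough data left for this image
        }
        for (int i = 0; i < image.length(); i++) {
            if (data.charAt(offset + i) != image.charAt(i)) {
                return false;
            }
        }
        return true;
    }
}
```

A caller would make sure at least getSequenceMaxLength() characters are buffered before the check (or that the end of input has been reached) and, after a comment or string prefix is returned, read the remainder of the comment or string itself. The linear scan keeps the sketch short; a trie or a map keyed by prefix would scale better with many sequences.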
Fields inherited from de.susebox.jtopas.AbstractTokenizerProperties | protected int _flags(Code)(Java Doc)
|
Methods inherited from de.susebox.jtopas.AbstractTokenizerProperties | public void addBlockComment(String start, String end) throws IllegalArgumentException(Code)(Java Doc) public void addBlockComment(String start, String end, Object companion) throws IllegalArgumentException(Code)(Java Doc) public void addBlockComment(String start, String end, Object companion, int flags) throws IllegalArgumentException(Code)(Java Doc) public void addBlockComment(String start, String end, Object companion, int flags, int flagMask) throws IllegalArgumentException(Code)(Java Doc) public void addKeyword(String keyword) throws IllegalArgumentException(Code)(Java Doc) public void addKeyword(String keyword, Object companion) throws IllegalArgumentException(Code)(Java Doc) public void addKeyword(String keyword, Object companion, int flags) throws IllegalArgumentException(Code)(Java Doc) public void addKeyword(String keyword, Object companion, int flags, int flagMask) throws IllegalArgumentException(Code)(Java Doc) public void addLineComment(String lineComment) throws IllegalArgumentException(Code)(Java Doc) public void addLineComment(String lineComment, Object companion) throws IllegalArgumentException(Code)(Java Doc) public void addLineComment(String lineComment, Object companion, int flags) throws IllegalArgumentException(Code)(Java Doc) public void addLineComment(String lineComment, Object companion, int flags, int flagMask) throws IllegalArgumentException(Code)(Java Doc) public void addPattern(String pattern) throws IllegalArgumentException(Code)(Java Doc) public void addPattern(String pattern, Object companion) throws IllegalArgumentException(Code)(Java Doc) public void addPattern(String pattern, Object companion, int flags) throws IllegalArgumentException(Code)(Java Doc) public void addPattern(String pattern, Object companion, int flags, int flagMask) throws IllegalArgumentException(Code)(Java Doc) public void addProperty(TokenizerProperty property) throws 
IllegalArgumentException(Code)(Java Doc) public void addSeparators(String separators) throws IllegalArgumentException(Code)(Java Doc) public void addSpecialSequence(String specSeq) throws IllegalArgumentException(Code)(Java Doc) public void addSpecialSequence(String specSeq, Object companion) throws IllegalArgumentException(Code)(Java Doc) public void addSpecialSequence(String specSeq, Object companion, int flags) throws IllegalArgumentException(Code)(Java Doc) public void addSpecialSequence(String specSeq, Object companion, int flags, int flagMask) throws IllegalArgumentException(Code)(Java Doc) public void addString(String start, String end, String escape) throws IllegalArgumentException(Code)(Java Doc) public void addString(String start, String end, String escape, Object companion) throws IllegalArgumentException(Code)(Java Doc) public void addString(String start, String end, String escape, Object companion, int flags) throws IllegalArgumentException(Code)(Java Doc) public void addString(String start, String end, String escape, Object companion, int flags, int flagMask) throws IllegalArgumentException(Code)(Java Doc) public void addTokenizerPropertyListener(TokenizerPropertyListener listener)(Code)(Java Doc) public void addWhitespaces(String whitespaces) throws IllegalArgumentException(Code)(Java Doc) public boolean blockCommentExists(String start)(Code)(Java Doc) protected void checkArgument(String arg, String name) throws IllegalArgumentException(Code)(Java Doc) protected void checkPropertyArgument(TokenizerProperty property) throws IllegalArgumentException(Code)(Java Doc) abstract protected TokenizerProperty doAddProperty(TokenizerProperty property)(Code)(Java Doc) abstract protected TokenizerProperty doGetProperty(int type, String startImage)(Code)(Java Doc) abstract protected TokenizerProperty doRemoveProperty(TokenizerProperty property)(Code)(Java Doc) abstract protected String doSetSeparators(String separators)(Code)(Java Doc) abstract protected String 
doSetWhitespaces(String whitespaces)(Code)(Java Doc) public TokenizerProperty getBlockComment(String start) throws IllegalArgumentException(Code)(Java Doc) public Object getBlockCommentCompanion(String start) throws IllegalArgumentException(Code)(Java Doc) public TokenizerProperty getKeyword(String keyword) throws IllegalArgumentException(Code)(Java Doc) public Object getKeywordCompanion(String keyword) throws IllegalArgumentException(Code)(Java Doc) public TokenizerProperty getLineComment(String lineComment) throws IllegalArgumentException(Code)(Java Doc) public Object getLineCommentCompanion(String lineComment) throws IllegalArgumentException(Code)(Java Doc) public int getParseFlags()(Code)(Java Doc) public TokenizerProperty getPattern(String pattern) throws IllegalArgumentException(Code)(Java Doc) public Object getPatternCompanion(String pattern) throws IllegalArgumentException(Code)(Java Doc) public TokenizerProperty getSpecialSequence(String specSeq) throws IllegalArgumentException(Code)(Java Doc) public Object getSpecialSequenceCompanion(String specSeq) throws IllegalArgumentException(Code)(Java Doc) public TokenizerProperty getString(String start) throws IllegalArgumentException(Code)(Java Doc) public Object getStringCompanion(String start) throws IllegalArgumentException(Code)(Java Doc) protected void handleEvent(int type, String newValue, String oldValue)(Code)(Java Doc) public boolean isFlagSet(int flag)(Code)(Java Doc) public boolean isFlagSet(TokenizerProperty prop, int flag) throws NullPointerException(Code)(Java Doc) public boolean keywordExists(String keyword)(Code)(Java Doc) public boolean lineCommentExists(String lineComment)(Code)(Java Doc) protected void notifyListeners(TokenizerPropertyEvent event)(Code)(Java Doc) public boolean patternExists(String pattern)(Code)(Java Doc) public boolean propertyExists(TokenizerProperty property)(Code)(Java Doc) public void removeBlockComment(String start) throws IllegalArgumentException(Code)(Java Doc) public 
void removeKeyword(String keyword) throws IllegalArgumentException(Code)(Java Doc) public void removeLineComment(String lineComment) throws IllegalArgumentException(Code)(Java Doc) public void removePattern(String pattern) throws IllegalArgumentException(Code)(Java Doc) public void removeProperty(TokenizerProperty property) throws IllegalArgumentException(Code)(Java Doc) public void removeSeparators(String separators) throws IllegalArgumentException(Code)(Java Doc) public void removeSpecialSequence(String specSeq) throws IllegalArgumentException(Code)(Java Doc) public void removeString(String start) throws IllegalArgumentException(Code)(Java Doc) public void removeTokenizerPropertyListener(TokenizerPropertyListener listener)(Code)(Java Doc) public void removeWhitespaces(String whitespaces) throws IllegalArgumentException(Code)(Java Doc) public void setParseFlags(int flags)(Code)(Java Doc) public void setSeparators(String separators)(Code)(Java Doc) public void setWhitespaces(String whitespaces)(Code)(Java Doc) public boolean specialSequenceExists(String specSeq)(Code)(Java Doc) public boolean stringExists(String start)(Code)(Java Doc)
|