| org.archive.util.iterator.RegexpLineIterator
RegexpLineIterator | public class RegexpLineIterator extends TransformingIteratorWrapper | | Utility class providing an Iterator interface over line-oriented
text input. By providing regexps indicating lines to ignore
(such as pure whitespace or comments), lines to consider input, and
what to return from the input lines (such as a whitespace-trimmed
non-whitespace token with optional trailing comment), this can
be configured to handle a number of formats.
The public static members provide pattern configurations that will
be helpful in a wide variety of contexts.
author: gojomo |
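As a rough illustration of the kind of configuration described above, the following sketch uses only java.util.regex. The class name, the pattern strings, and the "$1" template are hypothetical stand-ins chosen for this example; they are not the values of this class's public static members.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LinePatternSketch {
    // Hypothetical "ignore" pattern: blank lines and lines holding only a '#' comment.
    static final String IGNORE = "\\s*(#.*)?";
    // Hypothetical "extract" pattern: one non-whitespace token, optional trailing comment.
    static final String EXTRACT = "^\\s*(\\S+)\\s*(#.*)?$";
    // Hypothetical replacement template: keep capture group 1 only.
    static final String TEMPLATE = "$1";

    /** Returns the extracted token, or null if the line is ignored or unmatched. */
    static String extract(String line) {
        if (Pattern.matches(IGNORE, line)) {
            return null;
        }
        Matcher m = Pattern.compile(EXTRACT).matcher(line);
        return m.matches() ? m.replaceFirst(TEMPLATE) : null;
    }

    public static void main(String[] args) {
        System.out.println(extract("  example.org   # seed host")); // prints "example.org"
        System.out.println(extract("# just a comment"));            // prints "null"
    }
}
```

Swapping in a different extract pattern and template is what would let the same loop handle other line formats.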
NONWHITESPACE_ENTRY_TRAILING_COMMENT | final public static String NONWHITESPACE_ENTRY_TRAILING_COMMENT | | |
TRIMMED_ENTRY_TRAILING_COMMENT | final public static String TRIMMED_ENTRY_TRAILING_COMMENT | | |
transform | protected String transform(String line) | | Transforms one input line into the item to return, if any. Skips
lines matching ignoreLine; extracts desired portion of
lines matching extractLine; informationally reports any
lines matching neither.
Returns: the extracted portion of the line, or null if the line was ignored or matched neither pattern |
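The three-way decision described for transform can be sketched as below. This is a minimal stand-alone approximation, not the class's actual source: the class name TransformSketch, the collect helper, the INFO log level, and the patterns used in main are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.logging.Logger;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TransformSketch {
    private static final Logger LOGGER = Logger.getLogger(TransformSketch.class.getName());

    private final Matcher ignoreLine;    // lines matching this are skipped
    private final Matcher extractLine;   // lines matching this yield output
    private final String outputTemplate; // replacement template, e.g. "$1"

    TransformSketch(String ignore, String extract, String template) {
        this.ignoreLine = Pattern.compile(ignore).matcher("");
        this.extractLine = Pattern.compile(extract).matcher("");
        this.outputTemplate = template;
    }

    /** Returns the extracted text, or null when the line yields no item. */
    String transform(String line) {
        if (ignoreLine.reset(line).matches()) {
            return null; // ignored silently
        }
        if (extractLine.reset(line).matches()) {
            StringBuffer out = new StringBuffer();
            // Expand the template against the whole-line match.
            extractLine.appendReplacement(out, outputTemplate);
            return out.toString();
        }
        // Neither pattern matched: report it, but produce nothing (assumed log level).
        LOGGER.info("line neither ignored nor extracted: " + line);
        return null;
    }

    /** Hypothetical helper: applies transform to each line, keeping non-null results. */
    List<String> collect(List<String> lines) {
        List<String> kept = new ArrayList<>();
        for (String line : lines) {
            String item = transform(line);
            if (item != null) {
                kept.add(item);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Assumed patterns: ignore blanks/comments, extract a run of digits.
        TransformSketch t = new TransformSketch("\\s*(#.*)?", "^\\s*(\\d+)\\s*(#.*)?$", "$1");
        System.out.println(t.collect(Arrays.asList("# header", "  42  # answer", "", "abc", "7")));
    }
}
```

In this sketch a null return plays the role of "no item for this line", so the surrounding iterator can simply move on to the next input line.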