| java.lang.Object org.apache.oro.text.regex.Perl5Matcher
Perl5Matcher | final public class Perl5Matcher implements PatternMatcher | | The Perl5Matcher class is used to match regular expressions
(conforming to the Perl5 regular expression syntax) generated by
Perl5Compiler.
Perl5Compiler and Perl5Matcher are designed with the intent that
you use a separate instance of each per thread to avoid the overhead
of both synchronization and concurrent access (e.g., a match that takes
a long time in one thread will block the progress of another thread with
a shorter match). If you want to use a single instance of each
in a concurrent program, you must appropriately protect access to
the instances with critical sections. If you want to share Perl5Pattern
instances between concurrently executing instances of Perl5Matcher, you
must compile the patterns with
Perl5Compiler.READ_ONLY_MASK.
version: @version@
since: 1.0
See Also: PatternMatcher
See Also: Perl5Compiler |
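The per-thread usage described above can be sketched as follows. ORO is not part of the JDK, so this sketch uses java.util.regex as a stand-in: the shared, immutable java.util.regex.Pattern plays the role of a Perl5Pattern compiled with READ_ONLY_MASK, and the per-thread java.util.regex.Matcher plays the role of a per-thread Perl5Matcher. The class and method names (PerThreadMatcherSketch, countWords, countInParallel) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PerThreadMatcherSketch {
    // Compiled once and shared by all threads. Stand-in for a read-only pattern.
    private static final Pattern WORD = Pattern.compile("\\w+");

    // One matcher per thread, because a matcher carries mutable match state.
    private static final ThreadLocal<Matcher> MATCHER =
            ThreadLocal.withInitial(() -> WORD.matcher(""));

    // Counts the words in the input using the calling thread's own matcher.
    static int countWords(String input) {
        Matcher m = MATCHER.get().reset(input);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    // Runs countWords on a small thread pool; no locking is needed because
    // no matcher instance is ever shared between threads.
    static List<Integer> countInParallel(List<String> inputs) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (String s : inputs) {
                Callable<Integer> task = () -> countWords(s);
                futures.add(pool.submit(task));
            }
            List<Integer> counts = new ArrayList<>();
            for (Future<Integer> f : futures) {
                counts.add(f.get());
            }
            return counts;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(countInParallel(Arrays.asList("a b c", "one", "")));
    }
}
```

With ORO itself, the same shape would apply: one Perl5Matcher per thread, and patterns shared only if compiled with Perl5Compiler.READ_ONLY_MASK.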
Method Summary | |
char[] | _toLower(char[] input) |
public boolean | contains(String input, Pattern pattern) | Determines if a string contains a pattern. |
public boolean | contains(char[] input, Pattern pattern) | Determines if a string (represented as a char[]) contains a pattern. If the pattern is matched by some substring of the input, a MatchResult instance representing the first such match is made accessible via Perl5Matcher.getMatch(). |
public boolean | contains(PatternMatcherInput input, Pattern pattern) | Determines if the contents of a PatternMatcherInput, starting from the current offset of the input, contain a pattern. If a pattern match is found, a MatchResult instance representing the first such match is made accessible via Perl5Matcher.getMatch(). |
public MatchResult | getMatch() | Fetches the last match found by a call to a matches() or contains() method. |
public boolean | isMultiline() |
public boolean | matches(char[] input, Pattern pattern) | Determines if a string (represented as a char[]) exactly matches a given pattern. |
public boolean | matches(String input, Pattern pattern) | Determines if a string exactly matches a given pattern. |
public boolean | matches(PatternMatcherInput input, Pattern pattern) | Determines if the contents of a PatternMatcherInput instance exactly match a given pattern. |
public boolean | matchesPrefix(char[] input, Pattern pattern, int offset) | Determines if a prefix of a string (represented as a char[]) matches a given pattern, starting from a given offset into the string. If a prefix of the string matches the pattern, a MatchResult instance representing the match is made accessible via Perl5Matcher.getMatch(). This method is useful for certain common token identification tasks that are made more difficult without this functionality. Parameters: input - The char[] to test for a prefix match. Parameters: pattern - The Pattern to be matched. Parameters: offset - The offset at which to start searching for the prefix. |
public boolean | matchesPrefix(char[] input, Pattern pattern) | Determines if a prefix of a string (represented as a char[]) matches a given pattern. If a prefix of the string matches the pattern, a MatchResult instance representing the match is made accessible via Perl5Matcher.getMatch(). This method is useful for certain common token identification tasks that are made more difficult without this functionality. Parameters: input - The char[] to test for a prefix match. Parameters: pattern - The Pattern to be matched. |
public boolean | matchesPrefix(String input, Pattern pattern) | Determines if a prefix of a string matches a given pattern. If a prefix of the string matches the pattern, a MatchResult instance representing the match is made accessible via Perl5Matcher.getMatch(). This method is useful for certain common token identification tasks that are made more difficult without this functionality. Parameters: input - The String to test for a prefix match. Parameters: pattern - The Pattern to be matched. |
public boolean | matchesPrefix(PatternMatcherInput input, Pattern pattern) | Determines if a prefix of a PatternMatcherInput instance matches a given pattern. |
public void | setMultiline(boolean multiline) | Set whether or not subsequent calls to matches() or contains() should treat the input as consisting of multiple lines. |
_toLower | char[] _toLower(char[] input) | | |
contains | public boolean contains(String input, Pattern pattern) | | Determines if a string contains a pattern. If the pattern is
matched by some substring of the input, a MatchResult instance
representing the first such match is made accessible via
Perl5Matcher.getMatch(). If you want to access
subsequent matches you should either use a PatternMatcherInput object
or use the offset information in the MatchResult to create a substring
representing the remaining input. Using the MatchResult offset
information is the recommended method of obtaining the parts of the
string preceding the match and following the match.
The pattern must be a Perl5Pattern instance, otherwise a
ClassCastException will be thrown. You are not required to, and
indeed should NOT try to (for performance reasons), catch a
ClassCastException because it will never be thrown as long as you use
a Perl5Pattern as the pattern parameter.
Parameters: input - The String to test for a match.
Parameters: pattern - The Perl5Pattern to be matched.
Returns: True if the input contains a pattern match, false otherwise.
exception: ClassCastException - If a Pattern instance other than a Perl5Pattern is passed as the pattern parameter. |
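The advice above about using match offsets to obtain the text before and after a match can be sketched as follows. ORO is not part of the JDK, so this sketch uses java.util.regex as a stand-in: Matcher.start() and Matcher.end() play the role of the begin and end offsets carried by an ORO MatchResult. The class and method names (MatchOffsetsSketch, splitAroundFirstMatch) are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class MatchOffsetsSketch {
    // Returns {before, match, after} for the first match, or null if the
    // pattern does not occur anywhere in the input.
    static String[] splitAroundFirstMatch(String input, Pattern pattern) {
        Matcher m = pattern.matcher(input);
        if (!m.find()) {
            return null;
        }
        return new String[] {
            input.substring(0, m.start()),  // text preceding the match
            m.group(),                      // the matched text itself
            input.substring(m.end())        // remaining input after the match
        };
    }

    public static void main(String[] args) {
        String[] parts = splitAroundFirstMatch("id=42;rest", Pattern.compile("\\d+"));
        System.out.println(String.join(" | ", parts));
    }
}
```

The third element is the "remaining input" that a subsequent search would be run against if a PatternMatcherInput is not used.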
contains | public boolean contains(char[] input, Pattern pattern) | | Determines if a string (represented as a char[]) contains a pattern.
If the pattern is
matched by some substring of the input, a MatchResult instance
representing the first such match is made accessible via
Perl5Matcher.getMatch(). If you want to access
subsequent matches you should either use a PatternMatcherInput object
or use the offset information in the MatchResult to create a substring
representing the remaining input. Using the MatchResult offset
information is the recommended method of obtaining the parts of the
string preceding the match and following the match.
The pattern must be a Perl5Pattern instance, otherwise a
ClassCastException will be thrown. You are not required to, and
indeed should NOT try to (for performance reasons), catch a
ClassCastException because it will never be thrown as long as you use
a Perl5Pattern as the pattern parameter.
Parameters: input - The char[] to test for a match.
Parameters: pattern - The Perl5Pattern to be matched.
Returns: True if the input contains a pattern match, false otherwise.
exception: ClassCastException - If a Pattern instance other than a Perl5Pattern is passed as the pattern parameter. |
contains | public boolean contains(PatternMatcherInput input, Pattern pattern) | | Determines if the contents of a PatternMatcherInput, starting from the
current offset of the input, contain a pattern.
If a pattern match is found, a MatchResult
instance representing the first such match is made accessible via
Perl5Matcher.getMatch(). The current offset of the
PatternMatcherInput is set to the offset corresponding to the end
of the match, so that a subsequent call to this method will continue
searching where the last call left off. You should remember that the
region between the begin and end offsets of the PatternMatcherInput is
considered the input to be searched, and that the current offset
of the PatternMatcherInput reflects where a search will start from.
Candidate matches extending beyond the end offset of the PatternMatcherInput
are not reported. In other words, a match must occur entirely
between the begin and end offsets of the input. See
PatternMatcherInput for more details.
As a side effect, if a match is found, the PatternMatcherInput match
offset information is updated. See the
PatternMatcherInput.setMatchOffsets(int, int) method for more details.
The pattern must be a Perl5Pattern instance, otherwise a
ClassCastException will be thrown. You are not required to, and
indeed should NOT try to (for performance reasons), catch a
ClassCastException because it will never be thrown as long as you use
a Perl5Pattern as the pattern parameter.
This method is usually used in a loop as follows:
    PatternMatcher matcher;
    PatternCompiler compiler;
    Pattern pattern;
    PatternMatcherInput input;
    MatchResult result;

    compiler = new Perl5Compiler();
    matcher = new Perl5Matcher();

    try {
        pattern = compiler.compile(somePatternString);
    } catch (MalformedPatternException e) {
        System.err.println("Bad pattern.");
        System.err.println(e.getMessage());
        return;
    }

    input = new PatternMatcherInput(someStringInput);

    while (matcher.contains(input, pattern)) {
        result = matcher.getMatch();
        // Perform whatever processing on the result you want.
    }
Parameters: input - The PatternMatcherInput to test for a match.
Parameters: pattern - The Pattern to be matched.
Returns: True if the input contains a pattern match, false otherwise.
exception: ClassCastException - If a Pattern instance other than a Perl5Pattern is passed as the pattern parameter. |
getMatch | public MatchResult getMatch() | | Fetches the last match found by a call to a matches() or contains()
method. If you plan on modifying the original search input, you
must call this method BEFORE you modify the original search input,
as a lazy evaluation technique is used to create the MatchResult.
This reduces the cost of pattern matching when you don't care about
the actual match and only care if the pattern occurs in the input.
Otherwise, a MatchResult would be created for every match found,
whether or not the MatchResult was later used by a call to getMatch().
Returns: A MatchResult instance containing the pattern match found by the last call to any one of the matches() or contains() methods. If no match was found by the last call, returns null. |
isMultiline | public boolean isMultiline() | | Returns: True if the matcher is treating input as consisting of multiple lines with respect to the ^ and $ metacharacters, false otherwise. |
matches | public boolean matches(char[] input, Pattern pattern) | | Determines if a string (represented as a char[]) exactly
matches a given pattern. If
there is an exact match, a MatchResult instance
representing the match is made accessible via
Perl5Matcher.getMatch(). The pattern must be
a Perl5Pattern instance, otherwise a ClassCastException will
be thrown. You are not required to, and indeed should NOT try to
(for performance reasons), catch a ClassCastException because it
will never be thrown as long as you use a Perl5Pattern as the pattern
parameter.
Note: matches() is not the same as sticking a ^ in front of
your expression and a $ at the end of your expression in Perl5
and using the =~ operator, even though in many cases it will be
equivalent. matches() literally looks for an exact match according
to the rules of Perl5 expression matching. Therefore, if you have
a pattern foo|foot and are matching the input foot,
it will not produce an exact match, because the first alternative, foo,
succeeds before foot is tried and does not cover the whole input. But foot|foo will
produce an exact match for either foot or foo.
Remember, Perl5 regular expressions do not match the longest
possible match. From the perlre manpage:
Alternatives are tried from left to right, so the first
alternative found for which the entire expression matches,
is the one that is chosen. This means that alternatives
are not necessarily greedy. For example: when matching
foo|foot against "barefoot", only the "foo" part will
match, as that is the first alternative tried, and it
successfully matches the target string.
Parameters: input - The char[] to test for an exact match.
Parameters: pattern - The Perl5Pattern to be matched.
Returns: True if input matches pattern, false otherwise.
exception: ClassCastException - If a Pattern instance other than a Perl5Pattern is passed as the pattern parameter. |
matches | public boolean matches(String input, Pattern pattern) | | Determines if a string exactly matches a given pattern. If
there is an exact match, a MatchResult instance
representing the match is made accessible via
Perl5Matcher.getMatch(). The pattern must be
a Perl5Pattern instance, otherwise a ClassCastException will
be thrown. You are not required to, and indeed should NOT try to
(for performance reasons), catch a ClassCastException because it
will never be thrown as long as you use a Perl5Pattern as the pattern
parameter.
Note: matches() is not the same as sticking a ^ in front of
your expression and a $ at the end of your expression in Perl5
and using the =~ operator, even though in many cases it will be
equivalent. matches() literally looks for an exact match according
to the rules of Perl5 expression matching. Therefore, if you have
a pattern foo|foot and are matching the input foot,
it will not produce an exact match, because the first alternative, foo,
succeeds before foot is tried and does not cover the whole input. But foot|foo will
produce an exact match for either foot or foo.
Remember, Perl5 regular expressions do not match the longest
possible match. From the perlre manpage:
Alternatives are tried from left to right, so the first
alternative found for which the entire expression matches,
is the one that is chosen. This means that alternatives
are not necessarily greedy. For example: when matching
foo|foot against "barefoot", only the "foo" part will
match, as that is the first alternative tried, and it
successfully matches the target string.
Parameters: input - The String to test for an exact match.
Parameters: pattern - The Perl5Pattern to be matched.
Returns: True if input matches pattern, false otherwise.
exception: ClassCastException - If a Pattern instance other than a Perl5Pattern is passed as the pattern parameter. |
matches | public boolean matches(PatternMatcherInput input, Pattern pattern) | | Determines if the contents of a PatternMatcherInput instance
exactly match a given pattern. If
there is an exact match, a MatchResult instance
representing the match is made accessible via
Perl5Matcher.getMatch(). Unlike the
Perl5Matcher.contains(PatternMatcherInput, Pattern) method, the current offset of the PatternMatcherInput argument
is not updated. You should remember that the region between
the begin (NOT the current) and end offsets of the PatternMatcherInput
will be tested for an exact match.
The pattern must be a Perl5Pattern instance, otherwise a
ClassCastException will be thrown. You are not required to, and
indeed should NOT try to (for performance reasons), catch a
ClassCastException because it will never be thrown as long as you use
a Perl5Pattern as the pattern parameter.
Note: matches() is not the same as sticking a ^ in front of
your expression and a $ at the end of your expression in Perl5
and using the =~ operator, even though in many cases it will be
equivalent. matches() literally looks for an exact match according
to the rules of Perl5 expression matching. Therefore, if you have
a pattern foo|foot and are matching the input foot,
it will not produce an exact match, because the first alternative, foo,
succeeds before foot is tried and does not cover the whole input. But foot|foo will
produce an exact match for either foot or foo.
Remember, Perl5 regular expressions do not match the longest
possible match. From the perlre manpage:
Alternatives are tried from left to right, so the first
alternative found for which the entire expression matches,
is the one that is chosen. This means that alternatives
are not necessarily greedy. For example: when matching
foo|foot against "barefoot", only the "foo" part will
match, as that is the first alternative tried, and it
successfully matches the target string.
Parameters: input - The PatternMatcherInput to test for a match.
Parameters: pattern - The Perl5Pattern to be matched.
Returns: True if input matches pattern, false otherwise.
exception: ClassCastException - If a Pattern instance other than a Perl5Pattern is passed as the pattern parameter. |
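The alternation note for matches() can be illustrated with a small emulation of the rule as documented. ORO is not part of the JDK, so this sketch uses java.util.regex as a stand-in; it is not ORO's implementation. It takes the first match anchored at the start of the input (lookingAt()) and then checks whether that match spans the whole input, which for the foo|foot example reproduces the documented result. Note that java.util.regex's own matches() backtracks into later alternatives and would accept foot here, so it is not a model of the behavior described above. The class and method names (ExactMatchSketch, firstMatchSpansInput, firstMatch) are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ExactMatchSketch {
    // Emulates the documented rule for the alternation example: accept only if
    // the first match found at the start of the input covers the entire input.
    static boolean firstMatchSpansInput(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.lookingAt() && m.end() == input.length();
    }

    // Returns the text of the first match anywhere in the input, or null.
    // Alternatives are tried left to right, as in the perlre excerpt.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(firstMatchSpansInput("foo|foot", "foot"));
        System.out.println(firstMatchSpansInput("foot|foo", "foot"));
        System.out.println(firstMatch("foo|foot", "barefoot"));
    }
}
```

Ordering the longer alternative first (foot|foo) is the practical fix when an exact match on either word is wanted.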
matchesPrefix | public boolean matchesPrefix(char[] input, Pattern pattern, int offset) | | Determines if a prefix of a string (represented as a char[])
matches a given pattern, starting from a given offset into the string.
If a prefix of the string matches the pattern, a MatchResult instance
representing the match is made accessible via
Perl5Matcher.getMatch().
This method is useful for certain common token identification tasks
that are made more difficult without this functionality.
Parameters: input - The char[] to test for a prefix match.
Parameters: pattern - The Pattern to be matched.
Parameters: offset - The offset at which to start searching for the prefix.
Returns: True if a prefix of the input matches the pattern, false otherwise. |
matchesPrefix | public boolean matchesPrefix(char[] input, Pattern pattern) | | Determines if a prefix of a string (represented as a char[])
matches a given pattern.
If a prefix of the string matches the pattern, a MatchResult instance
representing the match is made accessible via
Perl5Matcher.getMatch().
This method is useful for certain common token identification tasks
that are made more difficult without this functionality.
Parameters: input - The char[] to test for a prefix match.
Parameters: pattern - The Pattern to be matched.
Returns: True if a prefix of the input matches the pattern, false otherwise. |
matchesPrefix | public boolean matchesPrefix(String input, Pattern pattern) | | Determines if a prefix of a string matches a given pattern.
If a prefix of the string matches the pattern, a MatchResult instance
representing the match is made accessible via
Perl5Matcher.getMatch().
This method is useful for certain common token identification tasks
that are made more difficult without this functionality.
Parameters: input - The String to test for a prefix match.
Parameters: pattern - The Pattern to be matched.
Returns: True if a prefix of the input matches the pattern, false otherwise. |
matchesPrefix | public boolean matchesPrefix(PatternMatcherInput input, Pattern pattern) | | Determines if a prefix of a PatternMatcherInput instance
matches a given pattern. If there is a match, a MatchResult instance
representing the match is made accessible via
Perl5Matcher.getMatch(). Unlike the
Perl5Matcher.contains(PatternMatcherInput, Pattern) method, the current offset of the PatternMatcherInput argument
is not updated. However, unlike the
matches(PatternMatcherInput, Pattern) method,
matchesPrefix() will start its search from the current offset
rather than the begin offset of the PatternMatcherInput.
This method is useful for certain common token identification tasks
that are made more difficult without this functionality.
Parameters: input - The PatternMatcherInput to test for a prefix match.
Parameters: pattern - The Pattern to be matched.
Returns: True if a prefix of the input matches the pattern, false otherwise. |
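The token identification use case mentioned for the matchesPrefix() methods can be sketched as a tiny tokenizer. ORO is not part of the JDK, so this sketch uses java.util.regex as a stand-in: Matcher.region() plus Matcher.lookingAt() plays the role of a prefix match starting at a given offset. The token kinds, the output format, and the class name PrefixTokenizerSketch are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PrefixTokenizerSketch {
    private static final Pattern SPACE = Pattern.compile("\\s+");
    private static final Pattern NUMBER = Pattern.compile("\\d+");
    private static final Pattern IDENT = Pattern.compile("[A-Za-z_]\\w*");

    // Token patterns tried in order at each offset; a null name means "skip".
    private static final Pattern[] KINDS = {SPACE, NUMBER, IDENT};
    private static final String[] NAMES = {null, "NUM", "ID"};

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int offset = 0;
        while (offset < input.length()) {
            boolean matched = false;
            for (int i = 0; i < KINDS.length; i++) {
                Matcher m = KINDS[i].matcher(input);
                m.region(offset, input.length());
                // Prefix match: succeeds only if the pattern matches
                // starting exactly at the current offset.
                if (m.lookingAt()) {
                    if (NAMES[i] != null) {
                        tokens.add(NAMES[i] + ":" + m.group());
                    }
                    offset = m.end();
                    matched = true;
                    break;
                }
            }
            if (!matched) {
                // No pattern applies here: emit the single character as a token.
                tokens.add("CHAR:" + input.charAt(offset));
                offset++;
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("x1 = 42"));
    }
}
```

A contains()-style search would be the wrong tool here, because it would skip ahead to the next occurrence instead of failing when the token does not start at the current position.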
setMultiline | public void setMultiline(boolean multiline) | | Set whether or not subsequent calls to
matches() or contains() should treat the input as
consisting of multiple lines. The default behavior is for
input to be treated as consisting of multiple lines. This method
should only be called if the Perl5Pattern used for a match was
compiled without either of the Perl5Compiler.MULTILINE_MASK or
Perl5Compiler.SINGLELINE_MASK flags, and you want to alter the
behavior of how the ^, $, and . metacharacters are
interpreted on the fly. The compilation options used when compiling
a pattern ALWAYS override the behavior specified by setMultiline(). See
Perl5Compiler for more details.
Parameters: multiline - If set to true treats the input as consisting of multiple lines with respect to the ^ and $ metacharacters. If set to false treats the input as consisting of a single line with respect to the ^ and $ metacharacters. |
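The effect of multi-line versus single-line treatment on the ^ metacharacter can be seen in a short sketch. ORO is not part of the JDK, so this uses java.util.regex as a stand-in, where the equivalent choice is the Pattern.MULTILINE compile flag rather than a runtime switch on the matcher. It shows only how ^ behaves in the two modes and does not demonstrate setMultiline() itself. The class and method names (MultilineSketch, countLineStarts) are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class MultilineSketch {
    // Counts how many times a word is found at a position where ^ matches.
    static int countLineStarts(String input, boolean multiline) {
        int flags = multiline ? Pattern.MULTILINE : 0;
        Matcher m = Pattern.compile("^\\w+", flags).matcher(input);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String text = "one\ntwo\nthree";
        System.out.println(countLineStarts(text, true));
        System.out.println(countLineStarts(text, false));
    }
}
```

In multi-line mode ^ matches at the start of every line, so each of the three words is found; in single-line mode it matches only at the very start of the input.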