| java.lang.Object jregex.Pattern
All known Subclasses: jregex.util.io.PathPattern, jregex.WildcardPattern
Pattern | public class Pattern implements Serializable, REFlags | | A handle for a precompiled regular expression.
To match a regular expression myExpr against a text myText, first create a Pattern object:
Pattern p=new Pattern(myExpr);
then obtain a Matcher object:
Matcher matcher=p.matcher(myText);
The latter is an automaton that actually performs a search. It provides the following methods:
search for matching substrings : matcher.find() or matcher.findAll();
test whether the text matches the whole pattern : matcher.matches();
test whether the text matches the beginning of the pattern : matcher.matchesPrefix();
search with custom options : matcher.find(int options)
Flags
Flags (see the REFlags interface) change the meaning of some regular expression elements at compile time.
These flags may be passed either as a string (see Pattern(String,String)) or as a bitwise OR of:
REFlags.IGNORE_CASE - enables case insensitivity;
REFlags.MULTILINE - makes "^" and "$" match at the start and the end of each line, respectively;
REFlags.DOTALL - makes "." match EOLs ('\r' and '\n' in ASCII);
REFlags.IGNORE_SPACES - literal spaces in the expression are ignored for better readability;
REFlags.UNICODE - the predefined classes ('\w', '\d', etc.) refer to Unicode;
REFlags.XML_SCHEMA - permits XML Schema regular expression syntax extensions.
Multithreading
Pattern instances are thread-safe, i.e. the same Pattern object may be used
by any number of threads simultaneously. Matcher objects, on the other hand,
are NOT thread-safe, so, given a Pattern instance, each thread must obtain
and use its own Matcher.
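The sharing rule above can be sketched as follows. Since jregex is not assumed to be on the classpath here, this sketch uses the JDK's java.util.regex classes as a stand-in; they follow the same contract (shared Pattern, one Matcher per thread). The class and method names (SharedPatternSketch, countWords, countInParallel) are hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SharedPatternSketch {
    // One compiled pattern shared by every thread (stand-in: java.util.regex, not jregex).
    private static final Pattern WORD = Pattern.compile("\\w+");

    // The Matcher is created inside the call, so it is never shared between threads.
    static int countWords(String text) {
        Matcher m = WORD.matcher(text);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    // Runs countWords for each text on a small thread pool and collects the results in order.
    static List<Integer> countInParallel(List<String> texts) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (final String text : texts) {
                Callable<Integer> task = () -> countWords(text);
                futures.add(pool.submit(task));
            }
            List<Integer> counts = new ArrayList<>();
            for (Future<Integer> f : futures) {
                counts.add(f.get());
            }
            return counts;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // prints "[3, 4, 0]"
        System.out.println(countInParallel(Arrays.asList("a b c", "The quick brown fox", "")));
    }
}
```

With jregex the same shape should apply: keep the Pattern in a shared field and call its matcher(...) method inside each task.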
See Also: REFlags, Matcher, Matcher.setTarget(java.lang.String), Matcher.setTarget(java.lang.String, int, int), Matcher.setTarget(char[], int, int), Matcher.setTarget(java.io.Reader, int), MatchResult, MatchResult.group(int), MatchResult.start(int), MatchResult.end(int), MatchResult.length(int), MatchResult.charAt(int, int), MatchResult.prefix, MatchResult.suffix |
Constructor Summary | |
protected | Pattern()
public | Pattern(String regex) Compiles an expression with default flags.
public | Pattern(String regex, String flags) Compiles a regular expression using Perl5-style flags.
The flag string should consist of the letters 'i', 'm', 's', 'x', 'u', 'X' (case is significant) and a hyphen.
The meaning of the letters:
- i - case insensitivity, corresponds to REFlags.IGNORE_CASE;
- m - multiline treatment (BOLs and EOLs affect the '^' and '$'), corresponds to REFlags.MULTILINE;
- s - single line treatment ('.' matches \r's and \n's), corresponds to REFlags.DOTALL;
- x - extended whitespace comments (spaces and EOLs in the expression are ignored), corresponds to REFlags.IGNORE_SPACES;
- u - predefined classes are regarded as belonging to Unicode, corresponds to REFlags.UNICODE; this may incur some performance penalty;
- X - compatibility with XML Schema, corresponds to REFlags.XML_SCHEMA.
Parameters: regex - the Perl5-compatible regular expression string.
public | Pattern(String regex, int flags) Compiles a regular expression using REFlags.
The flags parameter is a bitwise OR of the following values:
- REFlags.IGNORE_CASE - case insensitivity, corresponds to the 'i' letter;
- REFlags.MULTILINE - multiline treatment (BOLs and EOLs affect the '^' and '$'), corresponds to 'm';
- REFlags.DOTALL - single line treatment ('.' matches \r's and \n's), corresponds to 's';
- REFlags.IGNORE_SPACES - extended whitespace comments (spaces and EOLs in the expression are ignored), corresponds to 'x';
- REFlags.UNICODE - predefined classes are regarded as belonging to Unicode, corresponds to 'u'; this may incur some performance penalty;
- REFlags.XML_SCHEMA - compatibility with XML Schema, corresponds to 'X'.
Parameters: regex - the Perl5-compatible regular expression string. |
Method Summary | |
protected void | compile(String regex, int flags)
public int | groupCount()
public Integer | groupId(String name) Gets the numeric id for a group name.
public Matcher | matcher() Returns a targetless matcher.
public Matcher | matcher(String s) Returns a matcher for a specified string.
public Matcher | matcher(char[] data, int start, int end) Returns a matcher for a specified region.
public Matcher | matcher(MatchResult res, int groupId) Returns a matcher for a match result (in a performance-friendly way).
public Matcher | matcher(MatchResult res, String groupName) Same as above, but with a symbolic group name.
public Matcher | matcher(Reader text, int length) Returns a matcher taking a text stream as target. Note that this is not true POSIX-style stream matching, i.e. the whole length of the text is read in advance and stored in a char array.
public boolean | matches(String s)
static int | parseFlags(String flags)
static int | parseFlags(char[] data, int start, int len)
public Replacer | replacer(String expr) Returns a replacer that substitutes pattern matches using the specified Perl-like expression.
public Replacer | replacer(Substitution model) Returns a replacer that will substitute all occurrences of a pattern by applying a user-defined substitution model.
public boolean | startsWith(String s)
public String | toString()
public String | toString_d() Returns a more or less readable representation of the bytecode for the pattern.
public RETokenizer | tokenizer(String text) Tokenizes a text by occurrences of the pattern.
public RETokenizer | tokenizer(char[] data, int off, int len) Tokenizes a specified region by occurrences of the pattern.
public RETokenizer | tokenizer(Reader in, int length) Tokenizes a specified region by occurrences of the pattern. |
lookaheads | int lookaheads | | |
Pattern | public Pattern(String regex, String flags) throws PatternSyntaxException | | Compiles a regular expression using Perl5-style flags.
The flag string should consist of the letters 'i', 'm', 's', 'x', 'u', 'X' (case is significant) and a hyphen.
The meaning of the letters:
- i - case insensitivity, corresponds to REFlags.IGNORE_CASE;
- m - multiline treatment (BOLs and EOLs affect the '^' and '$'), corresponds to REFlags.MULTILINE;
- s - single line treatment ('.' matches \r's and \n's), corresponds to REFlags.DOTALL;
- x - extended whitespace comments (spaces and EOLs in the expression are ignored), corresponds to REFlags.IGNORE_SPACES;
- u - predefined classes are regarded as belonging to Unicode, corresponds to REFlags.UNICODE; this may incur some performance penalty;
- X - compatibility with XML Schema, corresponds to REFlags.XML_SCHEMA.
Parameters: regex - the Perl5-compatible regular expression string. Parameters: flags - the Perl5-compatible flags. exception: PatternSyntaxException - if the argument doesn't correspond to Perl5 regex syntax. See Also: REFlags |
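To make the letter-to-flag mapping above concrete, here is a minimal sketch of turning a flag string into a bitmask. It is not the library's parseFlags implementation. The bit values are hypothetical placeholders (the real constants live in jregex.REFlags), and the treatment of the hyphen as "switch off the letters that follow" is an assumption borrowed from Perl5 conventions, since the text above only says that a hyphen is allowed.

```java
public class FlagStringSketch {
    // Hypothetical bit values for illustration only; use the REFlags constants in real code.
    static final int IGNORE_CASE = 1;
    static final int MULTILINE = 2;
    static final int DOTALL = 4;
    static final int IGNORE_SPACES = 8;
    static final int UNICODE = 16;
    static final int XML_SCHEMA = 32;

    // Converts a Perl5-style flag string such as "im" or "is-i" into a bitmask.
    static int parse(String flags) {
        int result = 0;
        boolean enable = true; // assumption: letters after '-' are switched off
        for (char c : flags.toCharArray()) {
            int bit;
            switch (c) {
                case '-': enable = false; continue; // moves on to the next character
                case 'i': bit = IGNORE_CASE; break;
                case 'm': bit = MULTILINE; break;
                case 's': bit = DOTALL; break;
                case 'x': bit = IGNORE_SPACES; break;
                case 'u': bit = UNICODE; break;
                case 'X': bit = XML_SCHEMA; break; // case is significant: 'X' differs from 'x'
                default: throw new IllegalArgumentException("unknown flag: " + c);
            }
            result = enable ? (result | bit) : (result & ~bit);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(parse("im") == (IGNORE_CASE | MULTILINE)); // prints "true"
    }
}
```

Under these assumptions, the string form "im" and the integer form IGNORE_CASE | MULTILINE describe the same compile-time options.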
Pattern | public Pattern(String regex, int flags) throws PatternSyntaxException | | Compiles a regular expression using REFlags.
The flags parameter is a bitwise OR of the following values:
- REFlags.IGNORE_CASE - case insensitivity, corresponds to the 'i' letter;
- REFlags.MULTILINE - multiline treatment (BOLs and EOLs affect the '^' and '$'), corresponds to 'm';
- REFlags.DOTALL - single line treatment ('.' matches \r's and \n's), corresponds to 's';
- REFlags.IGNORE_SPACES - extended whitespace comments (spaces and EOLs in the expression are ignored), corresponds to 'x';
- REFlags.UNICODE - predefined classes are regarded as belonging to Unicode, corresponds to 'u'; this may incur some performance penalty;
- REFlags.XML_SCHEMA - compatibility with XML Schema, corresponds to 'X'.
Parameters: regex - the Perl5-compatible regular expression string. Parameters: flags - a bitwise OR of the REFlags values listed above. exception: PatternSyntaxException - if the argument doesn't correspond to Perl5 regex syntax. See Also: REFlags |
groupCount | public int groupCount() | | Returns the number of capturing groups this expression includes.
|
matcher | public Matcher matcher() | | Returns a targetless matcher.
Don't forget to supply a target (see Matcher.setTarget) before matching.
|
matcher | public Matcher matcher(char[] data, int start, int end) | | Returns a matcher for a specified region.
|
matcher | public Matcher matcher(MatchResult res, int groupId) | | Returns a matcher for a match result (in a performance-friendly way).
The groupId parameter specifies which group is the target.
Parameters: groupId - which group is the target; either a positive integer (group id), or one of MatchResult.MATCH, MatchResult.PREFIX, MatchResult.SUFFIX, MatchResult.TARGET. |
matcher | public Matcher matcher(Reader text, int length) throws IOException | | Returns a matcher taking a text stream as target.
Note that this is not true POSIX-style stream matching, i.e. the whole length of the text is read in advance and stored in a char array.
Parameters: text - a text stream Parameters: length - the length to read from the stream; if length is -1, the whole stream is read in. exception: IOException - indicates an IO problem exception: OutOfMemoryError - if the stream is too long |
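The buffering behaviour described above (read up to the requested length, or everything when the length is -1, into a char array before matching) can be sketched with plain JDK classes. This is an illustration of the idea under those stated assumptions, not the library's actual code; ReaderBufferSketch, readTarget and readString are hypothetical names.

```java
import java.io.CharArrayWriter;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

public class ReaderBufferSketch {
    // Reads at most 'length' chars from the reader, or the whole stream if length is negative.
    static char[] readTarget(Reader in, int length) throws IOException {
        char[] chunk = new char[4096];
        CharArrayWriter out = new CharArrayWriter();
        int remaining = length;
        while (length < 0 || remaining > 0) {
            int want = length < 0 ? chunk.length : Math.min(chunk.length, remaining);
            int n = in.read(chunk, 0, want);
            if (n < 0) {
                break; // end of stream reached before the requested length
            }
            out.write(chunk, 0, n);
            if (length >= 0) {
                remaining -= n;
            }
        }
        return out.toCharArray();
    }

    // Convenience wrapper for in-memory text, used here only to demonstrate the helper.
    static String readString(String text, int length) {
        try {
            return new String(readTarget(new StringReader(text), length));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readString("hello world", 5));  // prints "hello"
        System.out.println(readString("hello world", -1)); // prints "hello world"
    }
}
```

Because the whole target ends up in memory, very long streams are better bounded with an explicit length than read with -1.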
replacer | public Replacer replacer(String expr) | | Returns a replacer that substitutes pattern matches using the specified Perl-like expression.
Such a replacer will substitute all occurrences of the pattern with the evaluated expression
("$&" and "$0" are replaced by the whole match, "$1" by group #1, etc.).
Example:
String text="The quick brown fox jumped over the lazy dog";
Pattern word=new Pattern("\\w+");
System.out.println(word.replacer("[$&]").replace(text));
//prints "[The] [quick] [brown] [fox] [jumped] [over] [the] [lazy] [dog]"
Pattern swap=new Pattern("(fox|dog)(.*?)(fox|dog)");
System.out.println(swap.replacer("$3$2$1").replace(text));
//prints "The quick brown dog jumped over the lazy fox"
Pattern scramble=new Pattern("(\\w+)(.*?)(\\w+)");
System.out.println(scramble.replacer("$3$2$1").replace(text));
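//prints "quick The fox brown over jumped lazy the gdo"
//(the unpaired last word "dog" still matches: backtracking splits it into "do" and "g", which are then swapped)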
Parameters: expr - a Perl-like expression, with "$&" and "${&}" standing for the whole match, "$N" and "${N}" standing for group #N, and "${Foo}" standing for the named group Foo. See Also: Replacer |
replacer | public Replacer replacer(Substitution model) | | Returns a replacer that will substitute all occurrences of a pattern
by applying a user-defined substitution model.
Parameters: model - a Substitution object which is in charge of match substitution See Also: Replacer |
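The idea of a user-defined substitution model, a callback that decides the replacement text for each match, can be sketched with the JDK's java.util.regex classes standing in for jregex. The jregex Substitution interface itself is not shown here; SubstitutionSketch, replaceEach and upperCaseLongWords are hypothetical names used only for this illustration.

```java
import java.util.Locale;
import java.util.function.Function;
import java.util.regex.MatchResult;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SubstitutionSketch {
    // Replaces every match of p in text with whatever the model callback returns for it.
    static String replaceEach(Pattern p, String text, Function<MatchResult, String> model) {
        Matcher m = p.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // quoteReplacement keeps '$' and '\' in the callback result literal.
            m.appendReplacement(out, Matcher.quoteReplacement(model.apply(m)));
        }
        m.appendTail(out);
        return out.toString();
    }

    // Example model: upper-case words longer than four letters, leave the rest untouched.
    static String upperCaseLongWords(String text) {
        return replaceEach(Pattern.compile("\\w+"), text,
                r -> r.group().length() > 4 ? r.group().toUpperCase(Locale.ROOT) : r.group());
    }

    public static void main(String[] args) {
        // prints "The QUICK BROWN fox"
        System.out.println(upperCaseLongWords("The quick brown fox"));
    }
}
```

A model like this is useful when the replacement depends on logic that a "$1"-style expression cannot express, such as lookups or case conversion.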
startsWith | public boolean startsWith(String s) | | A shorthand for Pattern.matcher(String).matchesPrefix().
Parameters: s - the target Returns: true if the entire target matches the beginning of the pattern See Also: Matcher.matchesPrefix |
toString_d | public String toString_d() | | Returns a more or less readable representation of the bytecode for the pattern.
|
|
|