Java Doc for Perl5Util.java (Library: jakarta-oro-2.0.8, package org.apache.oro.text.perl)



java.lang.Object
   org.apache.oro.text.perl.Perl5Util

Perl5Util
final public class Perl5Util implements MatchResult
This is a utility class implementing the 3 most common Perl5 operations involving regular expressions:
  • [m]/pattern/[i][m][s][x],
  • s/pattern/replacement/[g][i][m][o][s][x],
  • and split().
As with Perl, any non-alphanumeric character can be used in lieu of the slashes.
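
To make the expression format concrete, here is a minimal sketch of how a Perl-style match expression could be taken apart into a pattern and its options. It is not the Jakarta-ORO implementation: it uses the JDK's java.util.regex as a stand-in engine, and the names PerlMatchExpression and compile are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Jakarta-ORO: parses [m]/pattern/[i][m][s][x]
// and compiles it with java.util.regex as a stand-in regex engine.
final class PerlMatchExpression {
    private PerlMatchExpression() {}

    static Pattern compile(String expr) {
        // Skip the optional leading 'm' when it is followed by a delimiter.
        int open = (expr.length() > 1 && expr.charAt(0) == 'm'
                && !Character.isLetterOrDigit(expr.charAt(1))) ? 1 : 0;
        if (expr.length() < open + 2 || Character.isLetterOrDigit(expr.charAt(open))) {
            throw new IllegalArgumentException("Missing delimiter: " + expr);
        }
        char delimiter = expr.charAt(open);
        // The options are letters, so the last delimiter closes the pattern.
        int close = expr.lastIndexOf(delimiter);
        if (close <= open) {
            throw new IllegalArgumentException("Unterminated pattern: " + expr);
        }
        int flags = 0;
        for (char option : expr.substring(close + 1).toCharArray()) {
            switch (option) {
                case 'i': flags |= Pattern.CASE_INSENSITIVE; break;
                case 'm': flags |= Pattern.MULTILINE; break;
                case 's': flags |= Pattern.DOTALL; break;
                case 'x': flags |= Pattern.COMMENTS; break;
                default: throw new IllegalArgumentException("Unknown option: " + option);
            }
        }
        return Pattern.compile(expr.substring(open + 1, close), flags);
    }

    public static void main(String[] args) {
        // Hash marks as delimiters avoid escaping the slash inside the pattern.
        Pattern p = compile("m#foo/bar#i");
        System.out.println(p.matcher("see FOO/BAR here").find());
    }
}
```

The java.util.regex flags chosen for i, m, s and x are the closest JDK equivalents, which is an assumption of this sketch rather than a statement about how ORO maps them.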

The objective of the class is to minimize the amount of code a Java programmer using Jakarta-ORO has to write to achieve the same results as Perl by transparently handling regular expression compilation, caching, and matching. A second objective is to use the same Perl pattern matching syntax to ease the task of Perl programmers transitioning to Java (this also reduces the number of parameters to a method). All the state-affecting methods are synchronized to avoid the maintenance of explicit locks in multithreaded programs. This philosophy differs from the org.apache.oro.text.regex package, where you are expected either to maintain explicit locks or, preferably, to create separate compiler and matcher instances for each thread.

To use this class, first create an instance using the default constructor or initialize the instance with a PatternCache of your choosing using the alternate constructor. The default cache used by Perl5Util is a PatternCacheLRU of capacity GenericPatternCache.DEFAULT_CAPACITY. You may want to create a cache with a different capacity, a different cache replacement policy, or even devise your own PatternCache implementation. The PatternCacheLRU is probably the best general purpose pattern cache, but your specific application may be better served by a different cache replacement policy. You should remember that you can front-load a cache with all the patterns you will be using before initializing a Perl5Util instance, or you can just let Perl5Util fill the cache as you use it.
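
The idea of a least-recently-used pattern cache can be illustrated with the JDK alone. The sketch below is not PatternCacheLRU: the class LruPatternCacheSketch and its methods are hypothetical, and it caches java.util.regex.Pattern objects only to show the replacement policy and how front-loading might work.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical LRU cache of compiled patterns; illustrates the policy only.
final class LruPatternCacheSketch {
    private final Map<String, Pattern> cache;

    LruPatternCacheSketch(final int capacity) {
        // accessOrder = true keeps entries ordered from least to most recently used.
        this.cache = new LinkedHashMap<String, Pattern>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Pattern> eldest) {
                return size() > capacity; // evict the least recently used entry
            }
        };
    }

    // Returns the cached pattern, compiling and storing it on a miss.
    synchronized Pattern getPattern(String regex) {
        Pattern pattern = cache.get(regex);
        if (pattern == null) {
            pattern = Pattern.compile(regex);
            cache.put(regex, pattern);
        }
        return pattern;
    }

    // containsKey does not count as an access, so this does not reorder entries.
    synchronized boolean isCached(String regex) {
        return cache.containsKey(regex);
    }

    public static void main(String[] args) {
        LruPatternCacheSketch cache = new LruPatternCacheSketch(2);
        // Front-load the cache with the patterns we expect to use.
        for (String regex : new String[] {"a+", "b+"}) {
            cache.getPattern(regex);
        }
        cache.getPattern("a+"); // touch "a+" so that "b+" is now the eldest
        cache.getPattern("c+"); // exceeds capacity and evicts "b+"
        System.out.println(cache.isCached("b+"));
    }
}
```

A different replacement policy would only change which entry removeEldestEntry discards; the calling code stays the same, which is the point of passing the cache into the constructor.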

You might use the class as follows:

 Perl5Util util = new Perl5Util();
 String line;
 DataInputStream input;
 PrintStream output;
 // Initialization of input and output omitted
 while((line = input.readLine()) != null) {
   // First find the line with the string we want to substitute because
   // it is cheaper than blindly substituting each line.
   if(util.match("/HREF=\"description1.html\"/", line)) {
     line = util.substitute("s/description1\\.html/about1.html/", line);
   }
   output.println(line);
 }
 

A couple of things to remember when using this class: the Perl5Util match() methods have the same meaning as Perl5Matcher.contains() in org.apache.oro.text.regex and =~ m/pattern/ in Perl. The methods are named match to associate them more closely with Perl and to differentiate them from Perl5Matcher.matches(). A further thing to keep in mind is that the MalformedPerl5PatternException class is derived from RuntimeException, which means you DON'T have to catch it. The reasoning behind this is that you will detect your regular expression mistakes as you write and debug your program, when a MalformedPerl5PatternException is thrown during a test run. However, we STRONGLY recommend that you ALWAYS catch MalformedPerl5PatternException whenever you deal with a DYNAMICALLY created pattern. Relying on a fatal MalformedPerl5PatternException being thrown to detect errors while debugging is only useful for dealing with static patterns, that is, actual pregenerated strings present in your program. Patterns created from user input or some other dynamic method CANNOT be relied upon to be correct and MUST be handled by catching MalformedPerl5PatternException for your programs to be robust.
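
Both points have close analogues in the JDK, which the following sketch uses as a stand-in (it does not call Jakarta-ORO): Matcher.find() behaves like a "contains" search while Matcher.matches() requires the whole input to match, and PatternSyntaxException is likewise an unchecked exception that should still be caught for dynamically built patterns. The class and method names are hypothetical.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical illustration using java.util.regex instead of Jakarta-ORO.
final class DynamicPatternSketch {
    private DynamicPatternSketch() {}

    // "contains" semantics: the pattern may match anywhere in the input.
    static boolean containsMatch(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    // "matches" semantics: the pattern must match the entire input.
    static boolean matchesEntirely(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    // A pattern from user input cannot be trusted, so the unchecked
    // exception is caught and reported as null instead of propagating.
    static Pattern compileUserPattern(String userRegex) {
        try {
            return Pattern.compile(userRegex);
        } catch (PatternSyntaxException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(containsMatch("b+", "aabbcc"));
        System.out.println(matchesEntirely("b+", "aabbcc"));
        System.out.println(compileUserPattern("(unclosed") == null);
    }
}
```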

Finally, as a convenience Perl5Util implements the org.apache.oro.text.regex.MatchResult interface. The methods are merely wrappers which call the corresponding method of the last MatchResult found (which can be accessed with Perl5Util.getMatch()) by a match or substitution (or even a split, but this isn't particularly useful). At the moment, the MatchResult returned by Perl5Util.getMatch() is not stored in a thread-local variable. Therefore concurrent calls to Perl5Util.getMatch() will produce unpredictable results. So if your concurrent program requires the match results, you must protect the matching and the result retrieval in a critical section. If you do not need match results, you don't need to do anything special. If you feel the J2SE implementation of Perl5Util.getMatch() should use a thread-local variable and obviate the need for a critical section, please express your views on the oro-dev mailing list.
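
The critical-section advice can be sketched without ORO. In the hypothetical class below, SharedMatchState plays the role of an object whose individual methods are synchronized but whose "last match" is shared between threads, so the caller wraps the match and the result retrieval in a single synchronized block. All names are assumptions made for illustration.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for an object with a shared "last match" state.
final class SharedMatchState {
    private Matcher last; // shared between all threads using this instance

    synchronized boolean match(Pattern pattern, String input) {
        Matcher m = pattern.matcher(input);
        last = m.find() ? m : null;
        return last != null;
    }

    synchronized String group(int group) {
        return last == null ? null : last.group(group);
    }

    // Matching and result retrieval happen inside one critical section,
    // so no other thread can replace the last match in between.
    static String firstGroup(SharedMatchState shared, Pattern pattern, String input) {
        synchronized (shared) {
            return shared.match(pattern, input) ? shared.group(1) : null;
        }
    }

    // Runs two threads against one shared instance and reports whether each
    // thread always saw the result of its own match.
    static boolean consistentUnderContention(int iterations) {
        SharedMatchState shared = new SharedMatchState();
        Pattern pattern = Pattern.compile("id=(\\d+)");
        AtomicBoolean ok = new AtomicBoolean(true);
        Thread[] workers = new Thread[2];
        for (int t = 0; t < workers.length; t++) {
            final String expected = String.valueOf(t + 1);
            workers[t] = new Thread(() -> {
                for (int i = 0; i < iterations; i++) {
                    if (!expected.equals(firstGroup(shared, pattern, "id=" + expected))) {
                        ok.set(false);
                    }
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return ok.get();
    }

    public static void main(String[] args) {
        System.out.println(consistentUnderContention(10000));
    }
}
```

Calling match() and group() as two separate synchronized calls, without the outer block, would leave a window in which another thread's match could overwrite the shared result.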
version:
   @version@
since:
   1.0
See Also:   MalformedPerl5PatternException
See Also:   org.apache.oro.text.PatternCache
See Also:   org.apache.oro.text.PatternCacheLRU
See Also:   org.apache.oro.text.regex.MatchResult



Field Summary
final public static int SPLIT_ALL
     A constant passed to the split() methods indicating that all occurrences of a pattern should be used to split a string.

Constructor Summary
public  Perl5Util(PatternCache cache)
     A secondary constructor for Perl5Util.
public  Perl5Util()
     Default constructor for Perl5Util.

Method Summary
public synchronized int begin(int group)
     Returns the begin offset of the subgroup of the last match found relative to the beginning of the match.


Parameters:
  group - The pattern subgroup.

public synchronized int beginOffset(int group)
     Returns an offset marking the beginning of the last pattern match found relative to the beginning of the input from which the match was extracted.


Parameters:
  group - The pattern subgroup.

public synchronized int end(int group)
     Returns the end offset of the subgroup of the last match found relative to the beginning of the match.


Parameters:
  group - The pattern subgroup.

public synchronized int endOffset(int group)
     Returns an offset marking the end of the last pattern match found relative to the beginning of the input from which the match was extracted.


Parameters:
  group - The pattern subgroup.

public synchronized MatchResult getMatch()
     Returns the last match found by a call to a match(), substitute(), or split() method.
public synchronized String group(int group)
     Returns the contents of the parenthesized subgroups of the last match found according to the behavior dictated by the MatchResult interface.


Parameters:
  group - The pattern subgroup to return.

public synchronized int groups()
     The number of groups contained in the last match found. This number includes the 0th group.
public synchronized int length()
     Returns the length of the last match found.
public synchronized boolean match(String pattern, char[] input)
     Searches for the first pattern match somewhere in a character array, taking a pattern specified in Perl5 native format ([m]/pattern/[i][m][s][x]).

Parameters:
  pattern - The pattern to search for.
Parameters:
  input - The char[] input to search.

public synchronized boolean match(String pattern, String input)
     Searches for the first pattern match in a String, taking a pattern specified in Perl5 native format ([m]/pattern/[i][m][s][x]).

Parameters:
  pattern - The pattern to search for.
Parameters:
  input - The String input to search.

public synchronized boolean match(String pattern, PatternMatcherInput input)
     Searches for the next pattern match somewhere in an org.apache.oro.text.regex.PatternMatcherInput instance, taking a pattern specified in Perl5 native format ([m]/pattern/[i][m][s][x]).

Parameters:
  pattern - The pattern to search for.
Parameters:
  input - The PatternMatcherInput to search.

public synchronized String postMatch()
     Returns the part of the input following the last match found.
public synchronized char[] postMatchCharArray()
     Returns the part of the input following the last match found as a char array.
public synchronized String preMatch()
     Returns the part of the input preceding the last match found.
public synchronized char[] preMatchCharArray()
     Returns the part of the input preceding the last match found as a char array.
public synchronized void split(Collection results, String pattern, String input, int limit)
     Splits a String into strings that are appended to a Collection, but no more than a specified limit.
public synchronized void split(Collection results, String pattern, String input)
     Identical to calling split(results, pattern, input, SPLIT_ALL).
public synchronized void split(Collection results, String input)
     Splits input in the default Perl manner, splitting on all whitespace.
public synchronized Vector split(String pattern, String input, int limit)
     Splits a String into strings contained in a Vector of size no greater than a specified limit.
public synchronized Vector split(String pattern, String input)
     Identical to calling split(pattern, input, SPLIT_ALL).
public synchronized Vector split(String input)
     Splits input in the default Perl manner, splitting on all whitespace.
public synchronized int substitute(StringBuffer result, String expression, String input)
     Substitutes a pattern in a given input with a replacement string. The substitution expression is specified in Perl5 native format (s/pattern/replacement/[g][i][m][o][s][x]).
public synchronized String substitute(String expression, String input)
     Substitutes a pattern in a given input with a replacement string. The substitution expression is specified in Perl5 native format.
Calling this method is the same as:
 String result;
 StringBuffer buffer = new StringBuffer();
 perl.substitute(buffer, expression, input);
 result = buffer.toString();
 

Parameters:
  expression - The Perl5 substitution regular expression.
Parameters:
  input - The input on which to perform substitutions.
public synchronized String toString()
     Returns the same as group(0).

Field Detail
SPLIT_ALL
final public static int SPLIT_ALL
A constant passed to the split() methods indicating that all occurrences of a pattern should be used to split a string.




Constructor Detail
Perl5Util
public Perl5Util(PatternCache cache)
A secondary constructor for Perl5Util. It initializes the Perl5Matcher used by the class to perform matching operations, but requires the programmer to provide a PatternCache instance for the class to use to compile and store regular expressions. You would want to use this constructor if you want to change the capacity or policy of the cache used. Example uses might be:
 // We know we're going to use close to 50 expressions a whole lot, so
 // we create a cache of the proper size.
 util = new Perl5Util(new PatternCacheLRU(50));
 
or
 // We're only going to use a few expressions and know that second-chance
 // fifo is best suited to the order in which we are using the patterns.
 util = new Perl5Util(new PatternCacheFIFO2(10));
 



Perl5Util
public Perl5Util()
Default constructor for Perl5Util. This initializes the Perl5Matcher used by the class to perform matching operations and creates a default PatternCacheLRU instance to use to compile and cache regular expressions. The size of this cache is GenericPatternCache.DEFAULT_CAPACITY.




Method Detail
begin
public synchronized int begin(int group)
Returns the begin offset of the subgroup of the last match found relative to the beginning of the match.


Parameters:
  group - The pattern subgroup.
Returns:
  The offset into group 0 of the first token in the indicated pattern subgroup. If a group was never matched or does not exist, returns -1. Be aware that a group that matches the null string at the end of a match will have an offset equal to the length of the string, so you shouldn't blindly use the offset to index an array or String.




beginOffset
public synchronized int beginOffset(int group)
Returns an offset marking the beginning of the last pattern match found relative to the beginning of the input from which the match was extracted.


Parameters:
  group - The pattern subgroup.
Returns:
  The offset of the first token in the indicated pattern subgroup. If a group was never matched or does not exist, returns -1.




end
public synchronized int end(int group)
Returns the end offset of the subgroup of the last match found relative to the beginning of the match.


Parameters:
  group - The pattern subgroup.
Returns:
  One plus the offset into group 0 of the last token in the indicated pattern subgroup. If a group was never matched or does not exist, returns -1. A group matching the null string will return its start offset.




endOffset
public synchronized int endOffset(int group)
Returns an offset marking the end of the last pattern match found relative to the beginning of the input from which the match was extracted.


Parameters:
  group - The pattern subgroup.
Returns:
  One plus the offset of the last token in the indicated pattern subgroup. If a group was never matched or does not exist, returns -1. A group matching the null string will return its start offset.
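
The difference between the match-relative offsets (begin, end) and the input-relative offsets (beginOffset, endOffset) can be shown with a small sketch. It uses java.util.regex as a stand-in, where Matcher.start(group) and Matcher.end(group) are input-relative; the class OffsetSketch and its method are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of match-relative versus input-relative offsets.
final class OffsetSketch {
    private OffsetSketch() {}

    // Returns { begin, end, beginOffset, endOffset } for the given group of
    // the first match, using -1 for a group that did not take part in the match.
    static int[] offsets(String regex, String input, int group) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find()) {
            throw new IllegalStateException("no match");
        }
        int start = m.start(group); // relative to the whole input
        int end = m.end(group);     // one past the last character of the group
        if (start < 0) {
            return new int[] {-1, -1, -1, -1};
        }
        // Subtracting the start of group 0 gives offsets relative to the match.
        return new int[] {start - m.start(), end - m.start(), start, end};
    }

    public static void main(String[] args) {
        int[] o = offsets("(\\w+)=(\\w+)", "xx key=val", 2);
        System.out.println(o[0] + " " + o[1] + " " + o[2] + " " + o[3]);
    }
}
```

In this example the match "key=val" starts at input offset 3, so the group "val" begins 4 characters into the match but 7 characters into the input.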




getMatch
public synchronized MatchResult getMatch()
Returns the last match found by a call to a match(), substitute(), or split() method. This method is only intended for retrieving the last match found by a match() method. Use it when you want to save MatchResult instances. Otherwise, for simply accessing match information, it is more convenient to use the Perl5Util methods implementing the MatchResult interface.

Returns:
  The org.apache.oro.text.regex.MatchResult instance containing the last match found.




group
public synchronized String group(int group)
Returns the contents of the parenthesized subgroups of the last match found according to the behavior dictated by the MatchResult interface.


Parameters:
  group - The pattern subgroup to return.
Returns:
  A string containing the indicated pattern subgroup. Group 0 always refers to the entire match. If a group was never matched, it returns null. This is not to be confused with a group matching the null string, which will return a String of length 0.




groups
public synchronized int groups()
The number of groups contained in the last match found. This number includes the 0th group. In other words, the result refers to the number of parenthesized subgroups plus the entire match itself.
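
The distinction between a group that never matched (null) and a group that matched the empty string, and the convention of counting group 0, can be sketched as follows. The example uses java.util.regex, whose Matcher.group() has the same null-versus-empty behavior but whose groupCount() excludes group 0; GroupSketch is a hypothetical name.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of null versus empty groups.
final class GroupSketch {
    private GroupSketch() {}

    // Returns every group of the first match, including group 0,
    // or an empty array when the pattern is not found.
    static String[] allGroups(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find()) {
            return new String[0];
        }
        // groupCount() excludes group 0, so add one to count the whole match too.
        String[] groups = new String[m.groupCount() + 1];
        for (int g = 0; g < groups.length; g++) {
            groups[g] = m.group(g); // null when the group took no part in the match
        }
        return groups;
    }

    public static void main(String[] args) {
        String[] groups = allGroups("(a)?(b*)c", "c");
        System.out.println(groups.length);
        System.out.println(groups[1] == null);   // (a)? never matched
        System.out.println(groups[2].isEmpty()); // (b*) matched the empty string
    }
}
```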



length
public synchronized int length()
Returns the length of the last match found.

Returns:
  The length of the last match found.




match
public synchronized boolean match(String pattern, char[] input) throws MalformedPerl5PatternException
Searches for the first pattern match somewhere in a character array, taking a pattern specified in Perl5 native format:
 [m]/pattern/[i][m][s][x]
 
The m prefix is optional and the meanings of the optional trailing options are:
  i - case insensitive match
  m - treat the input as consisting of multiple lines
  s - treat the input as consisting of a single line
  x - enable extended expression syntax incorporating whitespace and comments
As with Perl, any non-alphanumeric character can be used in lieu of the slashes.

If the input contains the pattern, the org.apache.oro.text.regex.MatchResult can be obtained by calling Perl5Util.getMatch(). However, Perl5Util implements the MatchResult interface as a wrapper around the last MatchResult found, so you can call its methods to access match information.


Parameters:
  pattern - The pattern to search for.
  input - The char[] input to search.
Returns:
  True if the input contains the pattern, false otherwise.
exception:
  MalformedPerl5PatternException - If there is an error in the pattern. You are not forced to catch this exception because it is derived from RuntimeException.




match
public synchronized boolean match(String pattern, String input) throws MalformedPerl5PatternException
Searches for the first pattern match in a String, taking a pattern specified in Perl5 native format:
 [m]/pattern/[i][m][s][x]
 
The m prefix is optional and the meanings of the optional trailing options are:
  i - case insensitive match
  m - treat the input as consisting of multiple lines
  s - treat the input as consisting of a single line
  x - enable extended expression syntax incorporating whitespace and comments
As with Perl, any non-alphanumeric character can be used in lieu of the slashes.

If the input contains the pattern, the org.apache.oro.text.regex.MatchResult can be obtained by calling Perl5Util.getMatch(). However, Perl5Util implements the MatchResult interface as a wrapper around the last MatchResult found, so you can call its methods to access match information.


Parameters:
  pattern - The pattern to search for.
  input - The String input to search.
Returns:
  True if the input contains the pattern, false otherwise.
exception:
  MalformedPerl5PatternException - If there is an error in the pattern. You are not forced to catch this exception because it is derived from RuntimeException.




match
public synchronized boolean match(String pattern, PatternMatcherInput input) throws MalformedPerl5PatternException
Searches for the next pattern match somewhere in an org.apache.oro.text.regex.PatternMatcherInput instance, taking a pattern specified in Perl5 native format:
 [m]/pattern/[i][m][s][x]
 
The m prefix is optional and the meanings of the optional trailing options are:
  i - case insensitive match
  m - treat the input as consisting of multiple lines
  s - treat the input as consisting of a single line
  x - enable extended expression syntax incorporating whitespace and comments
As with Perl, any non-alphanumeric character can be used in lieu of the slashes.

If the input contains the pattern, the org.apache.oro.text.regex.MatchResult can be obtained by calling Perl5Util.getMatch(). However, Perl5Util implements the MatchResult interface as a wrapper around the last MatchResult found, so you can call its methods to access match information. After the call to this method, the PatternMatcherInput current offset is advanced to the end of the match, so you can use it to repeatedly search for expressions in the entire input using a while loop, as explained in the org.apache.oro.text.regex.PatternMatcherInput documentation.


Parameters:
  pattern - The pattern to search for.
  input - The PatternMatcherInput to search.
Returns:
  True if the input contains the pattern, false otherwise.
exception:
  MalformedPerl5PatternException - If there is an error in the pattern. You are not forced to catch this exception because it is derived from RuntimeException.
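
The while-loop idiom mentioned for PatternMatcherInput has a direct JDK analogue, shown here as a stand-in rather than as ORO code: each call to Matcher.find() resumes where the previous match ended. The class RepeatedMatchSketch and its method are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of repeatedly searching one input in a loop.
final class RepeatedMatchSketch {
    private RepeatedMatchSketch() {}

    // Collects the given group from every successive match in the input.
    static List<String> findAll(String regex, String input, int group) {
        List<String> found = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        // Each find() starts at the end of the previous match, much as the
        // current offset of a PatternMatcherInput advances after each match.
        while (m.find()) {
            found.add(m.group(group));
        }
        return found;
    }

    public static void main(String[] args) {
        String html = "<a HREF=\"a.html\">A</a> <a HREF=\"b.html\">B</a>";
        System.out.println(findAll("HREF=\"([^\"]*)\"", html, 1));
    }
}
```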




postMatch
public synchronized String postMatch()
Returns the part of the input following the last match found.

Returns:
  The part of the input following the last match found.




postMatchCharArray
public synchronized char[] postMatchCharArray()
Returns the part of the input following the last match found as a char array. This method eliminates the extra buffer copying caused by postMatch().toCharArray().

Returns:
  The part of the input following the last match found as a char[]. If the result is of zero length, returns null instead of a zero length array.




preMatch
public synchronized String preMatch()
Returns the part of the input preceding the last match found.

Returns:
  The part of the input preceding the last match found.




preMatchCharArray
public synchronized char[] preMatchCharArray()
Returns the part of the input preceding the last match found as a char array. This method eliminates the extra buffer copying caused by preMatch().toCharArray().

Returns:
  The part of the input preceding the last match found as a char[]. If the result is of zero length, returns null instead of a zero length array.
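
What "preceding" and "following" the match mean can be sketched with plain substring arithmetic on the match boundaries. The example uses java.util.regex as a stand-in, and PrePostMatchSketch is a hypothetical name.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of the text before and after a match.
final class PrePostMatchSketch {
    private PrePostMatchSketch() {}

    // Returns { preMatch, match, postMatch } for the first match,
    // or null when the pattern is not found.
    static String[] partition(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find()) {
            return null;
        }
        return new String[] {
            input.substring(0, m.start()), // everything before the match
            m.group(),                     // the match itself
            input.substring(m.end())       // everything after the match
        };
    }

    public static void main(String[] args) {
        String[] parts = partition("\\d+", "abc123def");
        System.out.println(parts[0] + " | " + parts[1] + " | " + parts[2]);
    }
}
```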




split
public synchronized void split(Collection results, String pattern, String input, int limit) throws MalformedPerl5PatternException
Splits a String into strings that are appended to a Collection, but no more than a specified limit. The String is split using a regular expression as the delimiter. The regular expression is a pattern specified in Perl5 native format:
 [m]/pattern/[i][m][s][x]
 
The m prefix is optional and the meanings of the optional trailing options are:
  i - case insensitive match
  m - treat the input as consisting of multiple lines
  s - treat the input as consisting of a single line
  x - enable extended expression syntax incorporating whitespace and comments
As with Perl, any non-alphanumeric character can be used in lieu of the slashes.

The limit parameter causes the string to be split on at most the first limit - 1 pattern occurrences.

Of special note is that this split method performs EXACTLY the same as the Perl split() function. In other words, if the split pattern contains parentheses, additional elements are created from each of the matching subgroups in the pattern. Using an example similar to the one from the Camel book:

 split(list, "/([,-])/", "8-12,15,18")
 
appends the following elements to list:
 { "8", "-", "12", ",", "15", ",", "18" }
 
Furthermore, the following Perl behavior is observed: "leading empty fields are preserved, and empty trailing ones are deleted." This has the effect that a split on a zero length string returns an empty list. The Util.split() method in org.apache.oro.text.regex does NOT implement these behaviors because it is intended to be a general, self-consistent and predictable split function usable with Pattern instances other than Perl5Pattern.


Parameters:
  results - A Collection to which the substrings of the input that occur between the regular expression delimiter occurrences are appended. The input will not be split into any more substrings than the specified limit. A way of thinking of this is that only the first limit - 1 matches of the delimiting regular expression will be used to split the input. The Collection must support the addAll(Collection) operation.
  pattern - The regular expression to use as a split delimiter.
  input - The String to split.
  limit - The limit on the number of substrings the input is split into. Values <= 0 produce the same behavior as the SPLIT_ALL constant, which causes the limit to be ignored and splits to be performed on all occurrences of the pattern. You should use the SPLIT_ALL constant to achieve this behavior instead of relying on the default behavior associated with non-positive limit values.
exception:
  MalformedPerl5PatternException - If there is an error in the expression. You are not forced to catch this exception because it is derived from RuntimeException.
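
The split rules described above (subgroups of the delimiter become extra elements, leading empty fields are kept, trailing empty fields are dropped, and only the first limit - 1 delimiters are used) can be sketched on top of java.util.regex. This is an illustration under stated simplifications, not the Perl5Util implementation: the class PerlSplitSketch is hypothetical, it takes a bare regex rather than a /pattern/ expression, it ignores zero-length delimiter matches, and it represents an unmatched subgroup as an empty string.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of the Perl-style split rules described above,
// built on java.util.regex rather than Jakarta-ORO.
final class PerlSplitSketch {
    private PerlSplitSketch() {}

    static List<String> split(String regex, String input, int limit) {
        List<String> fields = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        int pos = 0;
        int splits = 0;
        // A non-positive limit means "split on every occurrence".
        while ((limit <= 0 || splits < limit - 1) && m.find()) {
            if (m.end() == m.start()) {
                continue; // simplification: ignore zero-length delimiter matches
            }
            fields.add(input.substring(pos, m.start()));
            // Parenthesized subgroups of the delimiter become extra fields.
            for (int g = 1; g <= m.groupCount(); g++) {
                fields.add(m.group(g) == null ? "" : m.group(g));
            }
            pos = m.end();
            splits++;
        }
        fields.add(input.substring(pos));
        // Leading empty fields stay; trailing empty fields are dropped.
        while (!fields.isEmpty() && fields.get(fields.size() - 1).isEmpty()) {
            fields.remove(fields.size() - 1);
        }
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(split("([,-])", "8-12,15,18", 0));
        System.out.println(split(",", ",a,,b,,", 0));
    }
}
```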




split
public synchronized void split(Collection results, String pattern, String input) throws MalformedPerl5PatternException
This method is identical to calling:
 split(results, pattern, input, SPLIT_ALL);
 



split
public synchronized void split(Collection results, String input) throws MalformedPerl5PatternException
Splits input in the default Perl manner, splitting on all whitespace. This method is identical to calling:
 split(results, "/\\s+/", input);
 



split
public synchronized Vector split(String pattern, String input, int limit) throws MalformedPerl5PatternException
Splits a String into strings contained in a Vector of size no greater than a specified limit. The String is split using a regular expression as the delimiter. The regular expression is a pattern specified in Perl5 native format:
 [m]/pattern/[i][m][s][x]
 
The m prefix is optional and the meanings of the optional trailing options are:
  i - case insensitive match
  m - treat the input as consisting of multiple lines
  s - treat the input as consisting of a single line
  x - enable extended expression syntax incorporating whitespace and comments
As with Perl, any non-alphanumeric character can be used in lieu of the slashes.

The limit parameter causes the string to be split on at most the first limit - 1 pattern occurrences.

Of special note is that this split method performs EXACTLY the same as the Perl split() function. In other words, if the split pattern contains parentheses, additional Vector elements are created from each of the matching subgroups in the pattern. Using an example similar to the one from the Camel book:

 split("/([,-])/", "8-12,15,18")
 
produces the Vector containing:
 { "8", "-", "12", ",", "15", ",", "18" }
 
The Util.split() method in org.apache.oro.text.regex does NOT implement this particular behavior because it is intended to be usable with Pattern instances other than Perl5Pattern.


Parameters:
  pattern - The regular expression to use as a split delimiter.
  input - The String to split.
  limit - The limit on the size of the returned Vector. Values <= 0 produce the same behavior as the SPLIT_ALL constant, which causes the limit to be ignored and splits to be performed on all occurrences of the pattern. You should use the SPLIT_ALL constant to achieve this behavior instead of relying on the default behavior associated with non-positive limit values.
Returns:
  A Vector containing the substrings of the input that occur between the regular expression delimiter occurrences. The input will not be split into any more substrings than the specified limit. A way of thinking of this is that only the first limit - 1 matches of the delimiting regular expression will be used to split the input.
exception:
  MalformedPerl5PatternException - If there is an error in the expression. You are not forced to catch this exception because it is derived from RuntimeException.
See Also:   Perl5Util.split(Collection results, String pattern, String input, int limit)




split
public synchronized Vector split(String pattern, String input) throws MalformedPerl5PatternException
This method is identical to calling:
 split(pattern, input, SPLIT_ALL);
 
See Also:   Perl5Util.split(Collection results, String pattern, String input)



split
public synchronized Vector split(String input) throws MalformedPerl5PatternException
Splits input in the default Perl manner, splitting on all whitespace. This method is identical to calling:
 split("/\\s+/", input);
 
See Also:   Perl5Util.split(Collection results, String input)



substitute
public synchronized int substitute(StringBuffer result, String expression, String input) throws MalformedPerl5PatternException(Code)
Substitutes a pattern in a given input with a replacement string. The substitution expression is specified in Perl5 native format:
 s/pattern/replacement/[g][i][m][o][s][x]
 
The s prefix is mandatory and the meaning of the optional trailing options are:
g
Substitute all occurrences of pattern with replacement. The default is to replace only the first occurrence.
i
Perform a case-insensitive match.
m
Treat the input as consisting of multiple lines.
o
If variable interpolation is used, only evaluate the interpolation once (the first time). This is equivalent to using a numInterpolations argument of 1 in org.apache.oro.text.regex.Util.substitute(). The default is to compute each interpolation independently. See org.apache.oro.text.regex.Util.substitute() and org.apache.oro.text.regex.Perl5Substitution for more details on variable interpolation in substitutions.
s
Treat the input as consisting of a single line.
x
Enable extended expression syntax incorporating whitespace and comments.
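
For readers more familiar with java.util.regex, the matching options listed above have rough analogues among the Pattern flags. The sketch below is only an approximation built on the JDK, not ORO's own option handling; the class OptionFlagsSketch and its helpers are hypothetical names.

```java
import java.util.regex.Pattern;

final class OptionFlagsSketch {
    // Maps Perl-style trailing options to approximate java.util.regex flags.
    static int toRegexFlags(String options) {
        int flags = 0;
        for (char c : options.toCharArray()) {
            switch (c) {
                case 'i': flags |= Pattern.CASE_INSENSITIVE; break;
                case 'm': flags |= Pattern.MULTILINE; break;
                case 's': flags |= Pattern.DOTALL; break;
                case 'x': flags |= Pattern.COMMENTS; break;
                case 'g': // affects how many matches are replaced, not matching
                case 'o': // affects interpolation, not matching
                    break;
                default:
                    throw new IllegalArgumentException("Unknown option: " + c);
            }
        }
        return flags;
    }

    // True when the options request replacement of every occurrence.
    static boolean isGlobal(String options) {
        return options.indexOf('g') >= 0;
    }

    public static void main(String[] args) {
        String options = "gi";
        Pattern p = Pattern.compile("foo", toRegexFlags(options));
        String input = "Foo foo FOO";
        String out = isGlobal(options)
                ? p.matcher(input).replaceAll("bar")
                : p.matcher(input).replaceFirst("bar");
        System.out.println(out); // bar bar bar
    }
}
```

The g and o options do not change how the pattern matches, so they are deliberately left out of the flag mask.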
As with Perl, any non-alphanumeric character can be used in lieu of the slashes. This is helpful to avoid backslashing. For example, using slashes you would have to do:
 numSubs = util.substitute(result, "s/foo\\/bar/goo\\/\\/baz/", input);
 
when you could more easily write:
 numSubs = util.substitute(result, "s#foo/bar#goo//baz#", input);
 
where the hashmarks are used instead of slashes.
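
Part of the difficulty with the slash form is that the Java compiler consumes one level of backslashes before the expression is ever parsed. The following small sketch prints what each literal contains at run time; DelimiterChoiceSketch and countBackslashes are hypothetical names, and the last line uses String.replaceFirst from the JDK only as a stand-in for the intended effect of the substitution.

```java
final class DelimiterChoiceSketch {
    // The two expression literals quoted in the text above.
    static final String WITH_SLASHES = "s/foo\\/bar/goo\\/\\/baz/";
    static final String WITH_HASHES = "s#foo/bar#goo//baz#";

    // Counts the backslash characters actually present in the string at run time.
    static int countBackslashes(String s) {
        int n = 0;
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) == '\\') {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        System.out.println(WITH_SLASHES);                   // s/foo\/bar/goo\/\/baz/
        System.out.println(countBackslashes(WITH_SLASHES)); // 3
        System.out.println(WITH_HASHES);                    // s#foo/bar#goo//baz#
        System.out.println(countBackslashes(WITH_HASHES));  // 0
        // JDK stand-in for the intended effect: replace "foo/bar" with "goo//baz".
        System.out.println("a foo/bar b".replaceFirst("foo/bar", "goo//baz")); // a goo//baz b
    }
}
```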

There is a special case of backslashing that you need to pay attention to. As demonstrated above, to denote a delimiter in the substituted string it must be backslashed. However, this can be a problem when you want to denote a backslash at the end of the substituted string. As of PerlTools 1.3, a new means of handling this situation has been implemented. In previous versions, the behavior was that

"... a double backslash (quadrupled in the Java String) always represents two backslashes unless the second backslash is followed by the delimiter, in which case it represents a single backslash."

The new behavior is that a backslash is always a backslash in the substitution portion of the expression unless it is used to escape a delimiter. A backslash is considered to escape a delimiter if an even number of contiguous backslashes precede the backslash and the delimiter following the backslash is not the FINAL delimiter in the expression. Therefore, backslashes preceding final delimiters are never considered to escape the delimiter. The following, which used to be an invalid expression and required a special-case extra backslash, will now replace all instances of / with \:

 numSubs = util.substitute(result, "s#/#\\#g", input);
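
To make the expected outcome concrete, here is a minimal sketch of the same replacement written against java.util.regex instead of ORO. It shows what the expression literal contains at run time and what "replace all instances of / with \" produces; BackslashReplacementSketch and slashesToBackslashes are hypothetical names, and the JDK call is a stand-in, not the Perl5Util implementation.

```java
import java.util.regex.Matcher;

final class BackslashReplacementSketch {
    // Replaces every forward slash with a single backslash.
    // quoteReplacement keeps the backslash from being read as an escape
    // character in the replacement string.
    static String slashesToBackslashes(String input) {
        return input.replaceAll("/", Matcher.quoteReplacement("\\"));
    }

    public static void main(String[] args) {
        // The Java literal from the text: two source backslashes, one at run time.
        String expression = "s#/#\\#g";
        System.out.println(expression);                            // s#/#\#g
        System.out.println(slashesToBackslashes("usr/local/bin")); // usr\local\bin
    }
}
```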
 


Parameters:
  result - The StringBuffer in which to store the result of the substitutions. The buffer is only appended to.
Parameters:
  expression - The Perl5 substitution regular expression.
Parameters:
  input - The input on which to perform substitutions.
Returns:
  The number of substitutions made.
exception:
  MalformedPerl5PatternException - If there is an error in the expression. You are not forced to catch this exception because it is derived from RuntimeException.
since:
   2.0.6
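
The contract described above (append to a caller-supplied buffer, return the number of substitutions) can be sketched with java.util.regex. This is an illustration under stated assumptions, not ORO's code: CountingSubstituteSketch and both helper signatures are hypothetical, the pattern and replacement are passed separately instead of as one s/// expression, and the replacement is treated literally with no variable interpolation.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class CountingSubstituteSketch {
    // Appends the substituted text to result and returns the number of
    // substitutions made. Existing buffer content is left untouched.
    static int substitute(StringBuffer result, String regex, String replacement,
                          String input, boolean global) {
        Matcher m = Pattern.compile(regex).matcher(input);
        int count = 0;
        while (m.find()) {
            m.appendReplacement(result, Matcher.quoteReplacement(replacement));
            count++;
            if (!global) {
                break; // without the g option, only the first match is replaced
            }
        }
        m.appendTail(result); // copy whatever follows the last replaced match
        return count;
    }

    // Convenience form that returns the substituted text as a String.
    static String substitute(String regex, String replacement, String input, boolean global) {
        StringBuffer buffer = new StringBuffer();
        substitute(buffer, regex, replacement, input, global);
        return buffer.toString();
    }

    public static void main(String[] args) {
        StringBuffer sb = new StringBuffer("> ");
        int n = substitute(sb, "o", "0", "foo boo", true);
        System.out.println(n);  // 4
        System.out.println(sb); // > f00 b00
        System.out.println(substitute("o", "0", "foo boo", false)); // f0o boo
    }
}
```

Returning the count lets a caller detect whether anything changed without comparing strings.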




substitute
public synchronized String substitute(String expression, String input) throws MalformedPerl5PatternException
Substitutes a pattern in a given input with a replacement string. The substitution expression is specified in Perl5 native format.
Calling this method is the same as:
 String result;
 StringBuffer buffer = new StringBuffer();
 perl.substitute(buffer, expression, input);
 result = buffer.toString();
 

Parameters:
  expression - The Perl5 substitution regular expression.
Parameters:
  input - The input on which to perform substitutions.
Returns:
  The input as a String after substitutions have been performed.
exception:
  MalformedPerl5PatternException - If there is an error in the expression. You are not forced to catch this exception because it is derived from RuntimeException.
since:
   1.0
See Also:   Perl5Util.substitute



toString
public synchronized String toString()
Returns the same as group(0).

A string containing the entire match.




Methods inherited from java.lang.Object
native protected Object clone() throws CloneNotSupportedException
public boolean equals(Object obj)
protected void finalize() throws Throwable
final native public Class getClass()
native public int hashCode()
final native public void notify()
final native public void notifyAll()
public String toString()
final native public void wait(long timeout) throws InterruptedException
final public void wait(long timeout, int nanos) throws InterruptedException
final public void wait() throws InterruptedException
