| This is a test program demonstrating how to search an input stream
with the jakarta-oro awk package regular expression classes. It
performs a function similar to the Unix strings command,
but is intended to show how matching on a stream is affected by its
character encoding. The most important thing to remember is that
AwkMatcher only matches on 8-bit values. If your input contains
Java characters with values greater than 255, the pattern
matching process will throw an ArrayIndexOutOfBoundsException.
Therefore, if you want to search a binary file containing arbitrary
bytes, you have to make sure you use an 8-bit character encoding
like ISO-8859-1, so that the mapping between byte-values and character
values will be one-to-one. Otherwise, the file will be decoded with
the platform default encoding (often UTF-8), and you will probably
wind up with character values outside of the 8-bit range.
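
The effect of the encoding on character values can be checked without
the awk package at all. The following is a minimal sketch using only
the standard library; the class name EncodingRangeSketch and the helper
maxCharValue are hypothetical names chosen for illustration and are not
part of jakarta-oro. It decodes the same bytes twice and reports the
largest character value a matcher would see in each case.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only (not part of jakarta-oro).
public class EncodingRangeSketch {

    // Decodes the bytes through a Reader using the given charset and
    // returns the largest char value encountered (0 for empty input).
    static int maxCharValue(byte[] data, Charset charset) {
        int max = 0;
        try (Reader reader =
                 new InputStreamReader(new ByteArrayInputStream(data), charset)) {
            int c;
            while ((c = reader.read()) != -1) {
                max = Math.max(max, c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return max;
    }

    public static void main(String[] args) {
        // Three bytes that form one UTF-8 sequence (U+20AC), followed by 'A'.
        byte[] data = {(byte) 0xE2, (byte) 0x82, (byte) 0xAC, 'A'};

        // ISO-8859-1 maps each byte to the char with the same value,
        // so every char stays within 0..255.
        System.out.println(maxCharValue(data, StandardCharsets.ISO_8859_1)); // prints 226

        // UTF-8 combines the first three bytes into a single char
        // whose value is far outside the 8-bit range.
        System.out.println(maxCharValue(data, StandardCharsets.UTF_8)); // prints 8364
    }
}
```

Passing an explicit Charset to InputStreamReader, rather than relying
on the default, is what keeps every character value at or below 255.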
version: @version@ |