| java.lang.Object com.Ostermiller.util.CSVParser
CSVParser | public class CSVParser implements CSVParse | | Read files in comma separated value format.
More information about this class is available from ostermiller.org.
CSV is a file format used as a portable representation of a database.
Each line is one entry or record and the fields in a record are separated by commas.
Commas may be preceded or followed by arbitrary space and/or tab characters which are
ignored.
If a field includes a comma or a new line, the whole field must be surrounded with double quotes.
When the field is in quotes, any quote literals must be escaped as \". Backslash
literals must be escaped as \\. Otherwise a backslash and the character following it
will be treated as the following character, i.e. "\n" is equivalent to "n". Other escape
sequences may be set using the setEscapes() method. Text that comes after quotes that have
been closed but before the next comma will be ignored.
Empty fields are returned as a String of length zero: "". The following line has three empty
fields and three non-empty fields in it. There is an empty field on each end, and one in the
middle. One token is returned as a space.
,second,," ",fifth,
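The field rules above can be illustrated with a small standalone splitter. This is only a sketch under stated assumptions: the class FieldSplitSketch and its split method are hypothetical names, it handles a single line (no embedded new lines), and it is not the lexer that CSVParser actually uses.

```java
import java.util.ArrayList;
import java.util.List;

public class FieldSplitSketch {

    private static boolean isBlank(char c) {
        return c == ' ' || c == '\t';
    }

    /** Splits a single record into fields; a simplified illustration, not the library's lexer. */
    public static List<String> split(String line) {
        List<String> fields = new ArrayList<>();
        int n = line.length();
        int i = 0;
        while (true) {
            // Spaces and tabs before a field are ignored.
            while (i < n && isBlank(line.charAt(i))) i++;
            StringBuilder field = new StringBuilder();
            if (i < n && line.charAt(i) == '"') {
                i++; // skip the opening quote
                while (i < n && line.charAt(i) != '"') {
                    char c = line.charAt(i++);
                    // A backslash makes the following character literal.
                    if (c == '\\' && i < n) c = line.charAt(i++);
                    field.append(c);
                }
                i++; // skip the closing quote
                // Text between the closing quote and the next comma is ignored.
                while (i < n && line.charAt(i) != ',') i++;
            } else {
                while (i < n && line.charAt(i) != ',') field.append(line.charAt(i++));
                // Spaces and tabs after an unquoted field are ignored.
                int end = field.length();
                while (end > 0 && isBlank(field.charAt(end - 1))) end--;
                field.setLength(end);
            }
            fields.add(field.toString());
            if (i >= n) break;
            i++; // skip the comma
        }
        return fields;
    }

    public static void main(String[] args) {
        // The sample line from the text: six fields, three of them empty.
        for (String f : split(",second,,\" \",fifth,")) {
            System.out.println("[" + f + "]");
        }
    }
}
```

The sample line yields six fields here: an empty field at each end, one in the middle, and a quoted field holding a single space.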
Blank lines are always ignored. Other lines will be ignored if they start with a
comment character as set by the setCommentStart() method.
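The line-skipping rule can be sketched in a few lines of plain Java. The class CommentFilterSketch and the method keepDataLines are hypothetical names, and treating only zero-length lines as blank is an assumption; the parser itself may decide this differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CommentFilterSketch {

    /**
     * Returns the lines that are neither blank nor comments.
     * commentDelims lists every character a comment line may start with.
     */
    public static List<String> keepDataLines(List<String> lines, String commentDelims) {
        List<String> kept = new ArrayList<>();
        for (String line : lines) {
            if (line.isEmpty()) continue; // blank line (assumption: zero length only)
            if (commentDelims.indexOf(line.charAt(0)) >= 0) continue; // comment line
            kept.add(line);
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("# Comment", "a,b", "", "; Another Comment", "c,d");
        System.out.println(keepDataLines(input, "#;!"));
    }
}
```

With an empty commentDelims string no line is treated as a comment, which mirrors the documented default.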
An example of how CSVParser might be used:
CSVParser shredder = new CSVParser(System.in);
shredder.setCommentStart("#;!");
shredder.setEscapes("nrtf", "\n\r\t\f");
String t;
while ((t = shredder.nextValue()) != null) {
    System.out.println("" + shredder.lastLineNumber() + " " + t);
}
Some applications do not output CSV according to the generally accepted standards and this parser may
not be able to handle it. One such application is the Microsoft Excel spreadsheet. A
separate class must be used to read
Excel CSV.
See Also: com.Ostermiller.util.ExcelCSVParser author: Stephen Ostermiller http://ostermiller.org/contact.pl?regarding=Java+Utilities since: ostermillerutils 1.00.00 |
Constructor Summary | |
public | CSVParser(InputStream in) Create a parser to parse comma separated values from an InputStream. |
public | CSVParser(InputStream in, char delimiter) Create a parser to parse delimited values from an InputStream. |
public | CSVParser(Reader in) Create a parser to parse comma separated values from a Reader. |
public | CSVParser(Reader in, char delimiter) Create a parser to parse delimited values from a Reader. |
public | CSVParser(InputStream in, char delimiter, String escapes, String replacements, String commentDelims) Create a parser to parse delimited values from an InputStream. |
public | CSVParser(InputStream in, String escapes, String replacements, String commentDelims) Create a parser to parse comma separated values from an InputStream. |
public | CSVParser(Reader in, char delimiter, String escapes, String replacements, String commentDelims) Create a parser to parse delimited values from a Reader. |
public | CSVParser(Reader in, String escapes, String replacements, String commentDelims) Create a parser to parse comma separated values from a Reader. |
Method Summary | |
public void | changeDelimiter(char newDelim) Change this parser so that it uses a new delimiter. |
public void | changeQuote(char newQuote) Change this parser so that it uses a new character for quoting. |
public void | close() Close any stream upon which this parser is based. |
public String[][] | getAllValues() Get all the values from the file.
If the file has already been partially read, only the
values that have not already been read will be included.
Each line of the file that has at least one value will be
represented. |
public int | getLastLineNumber() Get the number of the line from which the last value was retrieved. |
public String[] | getLine() Get all the values from a line. |
public int | lastLineNumber() Get the line number that the last token came from. |
public String | nextValue() Get the next value. |
public static String[][] | parse(String s) Parse the comma delimited data from a string.
Only escaped backslashes and quotes will be recognized as escape sequences.
The data will be treated as having no comments.
Parameters: s - string with comma delimited data to parse. |
public static String[][] | parse(String s, char delimiter) Parse the delimited data from a string. |
public static String[][] | parse(String s, String escapes, String replacements, String commentDelims) Parse the comma delimited data from a string.
Escaped backslashes and quotes will always be recognized as escape sequences.
Parameters: s - string with comma delimited data to parse. Parameters: escapes - a list of additional characters that will represent escape sequences. Parameters: replacements - the list of replacement characters for those escape sequences. Parameters: commentDelims - list of characters a comment line may start with. |
public static String[][] | parse(String s, char delimiter, String escapes, String replacements, String commentDelims) Parse the delimited data from a string. |
public static String[][] | parse(Reader in, char delimiter) Parse the delimited data from a stream. |
public static String[][] | parse(Reader in) Parse the comma delimited data from a stream.
Only escaped backslashes and quotes will be recognized as escape sequences.
The data will be treated as having no comments.
Parameters: in - Reader with comma delimited data to parse. |
public static String[][] | parse(Reader in, char delimiter, String escapes, String replacements, String commentDelims) Parse the delimited data from a stream.
Escaped backslashes and quotes will always be recognized as escape sequences.
Parameters: in - Reader with delimited data to parse. Parameters: delimiter - record separator Parameters: escapes - a list of additional characters that will represent escape sequences. Parameters: replacements - the list of replacement characters for those escape sequences. Parameters: commentDelims - list of characters a comment line may start with. |
public static String[][] | parse(Reader in, String escapes, String replacements, String commentDelims) Parse the comma delimited data from a stream.
Escaped backslashes and quotes will always be recognized as escape sequences.
Parameters: in - Reader with comma delimited data to parse. Parameters: escapes - a list of additional characters that will represent escape sequences. Parameters: replacements - the list of replacement characters for those escape sequences. Parameters: commentDelims - list of characters a comment line may start with. |
public void | setCommentStart(String commentDelims) Set the characters that indicate a comment at the beginning of the line.
For example if the string "#;!" were passed in, all of the following lines
would be comments:
# Comment
; Another Comment
! Yet another comment
By default there are no comments in CSV files. |
public void | setEscapes(String escapes, String replacements) Specify escape sequences and their replacements.
Escape sequences set here are in addition to \\ and \".
\\ and \" are always valid escape sequences. |
CSVParser | public CSVParser(InputStream in) | | Create a parser to parse comma separated values from
an InputStream.
Byte to character conversion is done using the platform
default locale.
Parameters: in - stream that contains comma separated values. since: ostermillerutils 1.00.00 |
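When the input encoding is known, one way to avoid depending on the platform default is to wrap the stream in a Reader with an explicit charset and use the Reader-based constructor instead. The sketch below uses only JDK classes and assumes that "platform default locale" above refers to the JVM's default character encoding; CharsetReaderSketch, utf8Reader and readAll are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class CharsetReaderSketch {

    /** Wraps a byte stream in a Reader that always decodes UTF-8. */
    public static Reader utf8Reader(InputStream in) {
        return new InputStreamReader(in, StandardCharsets.UTF_8);
    }

    /** Reads everything from the reader; only here to show the decoded text. */
    public static String readAll(Reader reader) {
        StringBuilder text = new StringBuilder();
        try {
            int c;
            while ((c = reader.read()) != -1) {
                text.append((char) c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        // UTF-8 bytes containing a non-ASCII character.
        byte[] bytes = "caf\u00e9,ok".getBytes(StandardCharsets.UTF_8);
        Reader reader = utf8Reader(new ByteArrayInputStream(bytes));
        // This reader could be handed to the CSVParser(Reader in) constructor.
        System.out.println(readAll(reader));
    }
}
```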
CSVParser | public CSVParser(InputStream in, char delimiter) throws BadDelimiterException | | Create a parser to parse delimited values from
an InputStream.
Byte to character conversion is done using the platform
default locale.
Parameters: in - stream that contains comma separated values. Parameters: delimiter - record separator throws: BadDelimiterException - if the specified delimiter cannot be used since: ostermillerutils 1.02.24 |
CSVParser | public CSVParser(Reader in) | | Create a parser to parse comma separated values from
a Reader.
Parameters: in - reader that contains comma separated values. since: ostermillerutils 1.00.00 |
CSVParser | public CSVParser(Reader in, char delimiter) throws BadDelimiterException | | Create a parser to parse delimited values from
a Reader.
Parameters: in - reader that contains comma separated values. Parameters: delimiter - record separator throws: BadDelimiterException - if the specified delimiter cannot be used since: ostermillerutils 1.02.24 |
CSVParser | public CSVParser(InputStream in, char delimiter, String escapes, String replacements, String commentDelims) throws BadDelimiterException | | Create a parser to parse delimited values from
an InputStream.
Byte to character conversion is done using the platform
default locale.
Parameters: in - stream that contains comma separated values. Parameters: escapes - a list of characters that will represent escape sequences. Parameters: replacements - the list of replacement characters for those escape sequences. Parameters: commentDelims - list of characters a comment line may start with. Parameters: delimiter - record separator throws: BadDelimiterException - if the specified delimiter cannot be used since: ostermillerutils 1.02.24 |
CSVParser | public CSVParser(InputStream in, String escapes, String replacements, String commentDelims) | | Create a parser to parse comma separated values from
an InputStream.
Byte to character conversion is done using the platform
default locale.
Parameters: in - stream that contains comma separated values. Parameters: escapes - a list of characters that will represent escape sequences. Parameters: replacements - the list of replacement characters for those escape sequences. Parameters: commentDelims - list of characters a comment line may start with. since: ostermillerutils 1.00.00 |
CSVParser | public CSVParser(Reader in, char delimiter, String escapes, String replacements, String commentDelims) throws BadDelimiterException | | Create a parser to parse delimited values from
a Reader.
Parameters: in - reader that contains comma separated values. Parameters: escapes - a list of characters that will represent escape sequences. Parameters: replacements - the list of replacement characters for those escape sequences. Parameters: commentDelims - list of characters a comment line may start with. Parameters: delimiter - record separator throws: BadDelimiterException - if the specified delimiter cannot be used since: ostermillerutils 1.02.24 |
CSVParser | public CSVParser(Reader in, String escapes, String replacements, String commentDelims) | | Create a parser to parse comma separated values from
a Reader.
Parameters: in - reader that contains comma separated values. Parameters: escapes - a list of characters that will represent escape sequences. Parameters: replacements - the list of replacement characters for those escape sequences. Parameters: commentDelims - list of characters a comment line may start with. since: ostermillerutils 1.00.00 |
changeDelimiter | public void changeDelimiter(char newDelim) throws BadDelimiterException | | Change this parser so that it uses a new delimiter.
The initial delimiter is a comma. The delimiter cannot be changed
to a quote or other character that has special meaning in CSV.
Parameters: newDelim - delimiter to which to switch. throws: BadDelimiterException - if the character cannot be used as a delimiter. since: ostermillerutils 1.02.08 |
changeQuote | public void changeQuote(char newQuote) throws BadQuoteException | | Change this parser so that it uses a new character for quoting.
The initial quote character is a double quote ("). The quote character cannot be changed
to a comma or other character that has special meaning in CSV.
Parameters: newQuote - character to use for quoting. throws: BadQuoteException - if the character cannot be used as a quote. since: ostermillerutils 1.02.16 |
close | public void close() throws IOException | | Close any stream upon which this parser is based.
since: ostermillerutils 1.02.22 throws: IOException - if an error occurs while closing the stream. |
getAllValues | public String[][] getAllValues() throws IOException | | Get all the values from the file.
If the file has already been partially read, only the
values that have not already been read will be included.
Each line of the file that has at least one value will be
represented. Comments and empty lines are ignored.
The resulting two-dimensional array may be jagged.
Returns: all the values from the file or null if there are no more values. throws: IOException - if an error occurs while reading. since: ostermillerutils 1.00.00 |
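Because rows may differ in length and the result may be null, code that consumes such an array should not assume a rectangular shape. A minimal sketch with a hand-built array standing in for the parser's result; JaggedRowsSketch and widestRow are hypothetical names.

```java
public class JaggedRowsSketch {

    /** Returns the length of the longest row, or 0 for a null or empty result. */
    public static int widestRow(String[][] values) {
        if (values == null) {
            return 0; // no more values were available
        }
        int max = 0;
        for (String[] row : values) {
            max = Math.max(max, row.length);
        }
        return max;
    }

    public static void main(String[] args) {
        // Hand-built stand-in for a parsed file whose lines have 3, 1 and 2 values.
        String[][] values = {
            {"a", "b", "c"},
            {"d"},
            {"e", "f"}
        };
        for (String[] row : values) {
            // Index each row by its own length, never by the first row's length.
            System.out.println(row.length + " value(s), first: " + row[0]);
        }
        System.out.println("widest row: " + widestRow(values));
    }
}
```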
getLastLineNumber | public int getLastLineNumber() | | Get the number of the line from which the last value was retrieved.
Returns: line number or -1 if no tokens have been returned. since: ostermillerutils 1.00.00 |
getLine | public String[] getLine() throws IOException | | Get all the values from a line.
If the line has already been partially read, only the
values that have not already been read will be included.
Returns: all the values from the line or null if there are no more values. throws: IOException - if an error occurs while reading. since: ostermillerutils 1.00.00 |
lastLineNumber | public int lastLineNumber() | | Get the line number that the last token came from.
Line breaks that occur in the middle of a token are not
counted in the line number count.
Returns: line number or -1 if no tokens have been returned yet. since: ostermillerutils 1.00.00 |
nextValue | public String nextValue() throws IOException | | Get the next value.
Returns: the next value or null if there are no more values. throws: IOException - if an error occurs while reading. since: ostermillerutils 1.00.00 |
parse | public static String[][] parse(String s) | | Parse the comma delimited data from a string.
Only escaped backslashes and quotes will be recognized as escape sequences.
The data will be treated as having no comments.
Parameters: s - string with comma delimited data to parse. Returns: parsed data. since: ostermillerutils 1.02.03 |
parse | public static String[][] parse(String s, char delimiter) throws BadDelimiterException | | Parse the delimited data from a string.
Only escaped backslashes and quotes will be recognized as escape sequences.
The data will be treated as having no comments.
Parameters: s - string with delimited data to parse. Parameters: delimiter - record separator Returns: parsed data. throws: BadDelimiterException - if the character cannot be used as a delimiter. since: ostermillerutils 1.02.24 |
parse | public static String[][] parse(String s, String escapes, String replacements, String commentDelims) | | Parse the comma delimited data from a string.
Escaped backslashes and quotes will always be recognized as escape sequences.
Parameters: s - string with comma delimited data to parse. Parameters: escapes - a list of additional characters that will represent escape sequences. Parameters: replacements - the list of replacement characters for those escape sequences. Parameters: commentDelims - list of characters a comment line may start with. Returns: parsed data. since: ostermillerutils 1.02.03 |
parse | public static String[][] parse(String s, char delimiter, String escapes, String replacements, String commentDelims) throws BadDelimiterException | | Parse the delimited data from a string.
Escaped backslashes and quotes will always be recognized as escape sequences.
Parameters: s - string with delimited data to parse. Parameters: escapes - a list of additional characters that will represent escape sequences. Parameters: replacements - the list of replacement characters for those escape sequences. Parameters: commentDelims - list of characters a comment line may start with. Parameters: delimiter - record separator Returns: parsed data. throws: BadDelimiterException - if the character cannot be used as a delimiter. since: ostermillerutils 1.02.24 |
parse | public static String[][] parse(Reader in, char delimiter) throws IOException, BadDelimiterException | | Parse the delimited data from a stream.
Only escaped backslashes and quotes will be recognized as escape sequences.
The data will be treated as having no comments.
Parameters: in - Reader with delimited data to parse. Parameters: delimiter - record separator Returns: parsed data. throws: BadDelimiterException - if the character cannot be used as a delimiter. throws: IOException - if an error occurs while reading. since: ostermillerutils 1.02.24 |
parse | public static String[][] parse(Reader in) throws IOException | | Parse the comma delimited data from a stream.
Only escaped backslashes and quotes will be recognized as escape sequences.
The data will be treated as having no comments.
Parameters: in - Reader with comma delimited data to parse. Returns: parsed data. throws: IOException - if an error occurs while reading. since: ostermillerutils 1.02.03 |
parse | public static String[][] parse(Reader in, char delimiter, String escapes, String replacements, String commentDelims) throws IOException, BadDelimiterException | | Parse the delimited data from a stream.
Escaped backslashes and quotes will always be recognized as escape sequences.
Parameters: in - Reader with delimited data to parse. Parameters: delimiter - record separator Parameters: escapes - a list of additional characters that will represent escape sequences. Parameters: replacements - the list of replacement characters for those escape sequences. Parameters: commentDelims - list of characters a comment line may start with. Returns: parsed data. throws: BadDelimiterException - if the character cannot be used as a delimiter. throws: IOException - if an error occurs while reading. since: ostermillerutils 1.02.24 |
parse | public static String[][] parse(Reader in, String escapes, String replacements, String commentDelims) throws IOException | | Parse the comma delimited data from a stream.
Escaped backslashes and quotes will always be recognized as escape sequences.
Parameters: in - Reader with comma delimited data to parse. Parameters: escapes - a list of additional characters that will represent escape sequences. Parameters: replacements - the list of replacement characters for those escape sequences. Parameters: commentDelims - list of characters a comment line may start with. Returns: parsed data. throws: IOException - if an error occurs while reading. since: ostermillerutils 1.02.03 |
setCommentStart | public void setCommentStart(String commentDelims) | | Set the characters that indicate a comment at the beginning of the line.
For example if the string "#;!" were passed in, all of the following lines
would be comments:
# Comment
; Another Comment
! Yet another comment
By default there are no comments in CSV files. Commas and quotes may not be
used to indicate comment lines.
Parameters: commentDelims - list of characters a comment line may start with. since: ostermillerutils 1.00.00 |
setEscapes | public void setEscapes(String escapes, String replacements) | | Specify escape sequences and their replacements.
Escape sequences set here are in addition to \\ and \".
\\ and \" are always valid escape sequences. This method
allows standard escape sequences to be used. For example
"\n" can be set to be a newline rather than an 'n'.
A common way to call this method might be:
setEscapes("nrtf", "\n\r\t\f");
which would set the escape sequences to be the Java escape
sequences. Characters that follow a \ that are not escape
sequences will still be interpreted as that character.
The two arguments to this method must be the same length. If
they are not, the longer of the two will be truncated.
Parameters: escapes - a list of characters that will represent escape sequences. Parameters: replacements - the list of replacement characters for those escape sequences. since: ostermillerutils 1.00.00 |
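The pairing of escapes with replacements, including the truncation rule, can be sketched independently of the parser. EscapeMapSketch and unescape are hypothetical names; the sketch follows the behavior described above and is not taken from the library's source.

```java
public class EscapeMapSketch {

    /**
     * Replaces backslash sequences in raw. Character i of escapes maps to
     * character i of replacements; any other escaped character stands for itself.
     */
    public static String unescape(String raw, String escapes, String replacements) {
        // If the two arguments differ in length, the longer one is effectively truncated.
        int usable = Math.min(escapes.length(), replacements.length());
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            if (c == '\\' && i + 1 < raw.length()) {
                char next = raw.charAt(++i);
                int idx = escapes.indexOf(next);
                // Unmapped characters, including \ and ", are kept literally.
                out.append(idx >= 0 && idx < usable ? replacements.charAt(idx) : next);
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // With the Java-style mapping, backslash-n becomes a real newline.
        System.out.println(unescape("a\\nb", "nrtf", "\n\r\t\f"));
        // With no extra escapes, backslash-n is just the letter n.
        System.out.println(unescape("a\\nb", "", ""));
    }
}
```

In this sketch, passing "nt" with a one-character replacement string leaves \t unmapped, since only the first pair survives truncation.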
|
|