| java.lang.Object org.archive.io.ArchiveReader
All known Subclasses: org.archive.io.warc.v10.WARCReader, org.archive.io.warc.WARCReader, org.archive.io.arc.ARCReader
ArchiveReader | abstract public class ArchiveReader implements ArchiveFileConstants(Code) | | Reader for an Archive file made up of ArchiveRecords.
author: stack version: $Date: 2007-03-13 00:08:58 +0000 (Tue, 13 Mar 2007) $ $Version$ |
Inner Class :protected class RandomAccessBufferedInputStream extends BufferedInputStream implements RepositionableStream | |
MAX_ALLOWED_RECOVERABLES | final public static int MAX_ALLOWED_RECOVERABLES(Code) | | Maximum number of recoverable exceptions allowed in a row.
If more than this number occur in a row, the exception is let out rather
than going back in for yet another retry.
|
ArchiveReader | protected ArchiveReader()(Code) | | |
cleanupCurrentRecord | protected void cleanupCurrentRecord() throws IOException(Code) | | Clean out the current record if there is one.
throws: IOException - |
createArchiveRecord | abstract protected ArchiveRecord createArchiveRecord(InputStream is, long offset) throws IOException(Code) | | Return an ArchiveRecord positioned at offset in the stream is.
Parameters: is - Stream to read Record from. Parameters: offset - Offset to find Record at. Returns: ArchiveRecord instance. throws: IOException - |
get | public ArchiveRecord get(long offset) throws IOException(Code) | | Get the record at the passed offset.
Parameters: offset - Byte index into file at which a record starts. Returns: An ArchiveRecord reference. throws: IOException - |
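To illustrate what "byte index into file at which a record starts" means, here is a minimal sketch that uses only the standard library. It is not the ArchiveReader implementation: the class `OffsetReadSketch`, the method `firstLineAt`, and the toy newline-delimited "records" are all hypothetical stand-ins.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

// Hypothetical stand-in, not part of org.archive.io.
class OffsetReadSketch {
    /** Seeks to the given byte offset and reads the first line found there. */
    static String firstLineAt(File f, long offset) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(f, "r")) {
            raf.seek(offset);      // jump straight to where the record starts
            return raf.readLine(); // read the record's first line, without the newline
        }
    }

    public static void main(String[] args) throws IOException {
        File tmp = File.createTempFile("records", ".txt");
        tmp.deleteOnExit();
        String first = "record-one\n";
        Files.write(tmp.toPath(),
                (first + "record-two\n").getBytes(StandardCharsets.US_ASCII));
        // The second record starts right after the first one's bytes.
        System.out.println(firstLineAt(tmp, first.length())); // prints "record-two"
    }
}
```

An offset that does not fall on a record boundary would land mid-record, which is why the parameter is documented as the index at which a record starts.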
getDeleteFileOnCloseReader | abstract public ArchiveReader getDeleteFileOnCloseReader(File f)(Code) | | an ArchiveReader that will delete a local file on close. Used when we bring Archive files local and need to clean up afterward. |
getDotFileExtension | abstract public String getDotFileExtension()(Code) | | |
getFileExtension | abstract public String getFileExtension()(Code) | | |
getFileName | public String getFileName()(Code) | | short name of Archive file. |
getInputStream | protected InputStream getInputStream(File f, long offset) throws IOException(Code) | | Convenience method for constructors.
Parameters: f - File to read. Parameters: offset - Offset at which to start reading. Returns: InputStream to read from. throws: IOException - If the file cannot be opened or a memory-mapped byte buffer cannot be obtained on it. |
getOptions | protected static Options getOptions()(Code) | | Base Options object filled out with help, digest, strict, etc. options. |
getReaderIdentifier | public String getReaderIdentifier()(Code) | | |
getStrippedFileName | public String getStrippedFileName()(Code) | | short name of Archive file. |
getStrippedFileName | public static String getStrippedFileName(String name, String dotFileExtension)(Code) | | Parameters: name - Name of ARCFile. Parameters: dotFileExtension - '.arc' or '.warc', etc. Returns: short name of Archive file. |
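One plausible reading of "short name" is the file name with its extensions removed. The sketch below works under that assumption. It is not taken from the library source: the class `StrippedNameSketch`, the helper `stripExtension`, and the choice to drop a trailing `.gz` before the dot extension are all assumptions made for illustration.

```java
// Hypothetical stand-in, not part of org.archive.io.
class StrippedNameSketch {
    /** Removes ext from the end of name if present; otherwise returns name unchanged. */
    static String stripExtension(String name, String ext) {
        if (ext == null || ext.isEmpty() || !name.endsWith(ext)) {
            return name;
        }
        return name.substring(0, name.length() - ext.length());
    }

    /** Assumed behaviour: drop a compression suffix first, then the archive extension. */
    static String strippedFileName(String name, String dotFileExtension) {
        String withoutGz = stripExtension(name, ".gz"); // ".gz" is an assumption
        return stripExtension(withoutGz, dotFileExtension);
    }

    public static void main(String[] args) {
        System.out.println(strippedFileName("IAH-2007.arc.gz", ".arc")); // prints "IAH-2007"
    }
}
```

Names that do not end in the given extension pass through unchanged in this sketch.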
getTrueOrFalse | protected static boolean getTrueOrFalse(String value)(Code) | | Parameters: value - Value to test. Returns: True if value is 'true', else false. |
getVersion | public String getVersion()(Code) | | Version of this Archive file. |
gotoEOR | abstract protected void gotoEOR(ArchiveRecord record) throws IOException(Code) | | Skip over any trailing new lines at end of the record so we're lined up
ready to read the next.
Parameters: record - throws: IOException - |
initialize | protected void initialize(String i)(Code) | | Convenience method used by subclass constructors.
Parameters: i - Identifier for Archive file this reader goes against. |
isCompressed | public boolean isCompressed()(Code) | | |
isDigest | public boolean isDigest()(Code) | | True if we're digesting as we read. |
isStrict | public boolean isStrict()(Code) | | Returns the strict flag. |
isValid | public boolean isValid()(Code) | | Test Archive file is valid.
Assumes the stream is at the start of the file. Be aware that this
method makes a pass over the whole file.
True if file can be successfully parsed. |
iterator | public Iterator<ArchiveRecord> iterator()(Code) | | Returns an ArchiveRecord iterator.
Note that on an IOException (in particular a ZipException while reading compressed
ARCs), the iterator tries to move to the next record rather than fail the iteration.
If ArchiveReader.strict is not set, this will usually succeed.
An iterator over ARC records. |
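The skip-and-continue behaviour described for the iterator, together with the cap named by MAX_ALLOWED_RECOVERABLES, can be sketched roughly as follows. This is a sketch under stated assumptions, not the library's code: `LenientIterationSketch`, `RecordSource`, and the cap value of 3 are hypothetical, and it assumes that strict mode lets the first exception propagate.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in, not part of org.archive.io.
class LenientIterationSketch {
    // Assumed value; this page does not state the real MAX_ALLOWED_RECOVERABLES.
    static final int MAX_ALLOWED_RECOVERABLES = 3;

    /** Hypothetical stand-in for "read the next record". */
    interface RecordSource {
        String read() throws IOException;
    }

    static List<String> readAll(List<RecordSource> sources, boolean strict)
            throws IOException {
        List<String> records = new ArrayList<>();
        int failuresInARow = 0;
        for (RecordSource source : sources) {
            try {
                records.add(source.read());
                failuresInARow = 0; // a good record resets the streak
            } catch (IOException e) {
                failuresInARow++;
                // Strict mode (assumed) or too many consecutive failures: give up.
                if (strict || failuresInARow > MAX_ALLOWED_RECOVERABLES) {
                    throw e;
                }
                // Otherwise skip the bad record and move on to the next one.
            }
        }
        return records;
    }
}
```

With this shape, a single corrupt record in the middle of a file costs only that record in lenient mode, while a long run of failures still surfaces as an exception.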
logStdErr | public void logStdErr(Level level, String message)(Code) | | Log on stderr.
Logging should go via the logging system. This method
bypasses the logging system going direct to stderr.
Should not generally be used. It's used for the rare ERROR and WARNING
messages that come out of command-line usage of ARCReader.
Override if using ARCReader in a context where there is no stderr or
where you'd like to redirect stderr to somewhere other than System.err.
Parameters: level - Level to log message at. Parameters: message - Message to log. |
outputRecord | public boolean outputRecord(String format) throws IOException(Code) | | Output passed record using passed format specifier.
Parameters: format - What format to use outputting. Returns: True if handled. throws: IOException - |
outputRecord | protected static void outputRecord(ArchiveReader r, String format) throws IOException(Code) | | Output passed record using passed format specifier.
Parameters: r - ARCReader instance to output. Parameters: format - What format to use outputting. throws: IOException - |
rewind | protected void rewind() throws IOException(Code) | | Rewinds stream to start of the Archive file.
throws: IOException - if stream is not resettable. |
setCompressed | protected void setCompressed(boolean compressed)(Code) | | |
setDigest | public void setDigest(boolean d)(Code) | | Parameters: d - True if we're to digest. |
setReaderIdentifier | protected void setReaderIdentifier(String i)(Code) | | |
setStrict | public void setStrict(boolean s)(Code) | | Parameters: s - The strict value to set. |
validate | public List validate() throws IOException(Code) | | Validate the Archive file.
This method iterates over the file, throwing an exception if it fails
to successfully parse any record.
Assumes the stream is at the start of the file.
Returns: List of all read Archive Headers. throws: IOException - |
validate | public List validate(int noRecords) throws IOException(Code) | | Validate the Archive file.
This method iterates over the file, throwing an exception if it fails
to successfully parse.
We start validation from wherever we are in the stream.
Parameters: noRecords - Number of records expected. Pass -1 if number is unknown. Returns: List of all read metadata. As we validate records, we add a reference to the read metadata. throws: IOException - |
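The relationship between validate(int) and isValid() described above can be sketched as below. This is an illustration, not the library's implementation: `ValidateSketch` is hypothetical, records are modelled as plain strings, an empty string stands in for an unparseable record, and throwing on a count mismatch is an assumption about how the expected record count is used.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in, not part of org.archive.io.
class ValidateSketch {
    /** Walks all records, collecting their headers; -1 means "count unknown". */
    static List<String> validate(Iterator<String> headers, int noRecords)
            throws IOException {
        List<String> seen = new ArrayList<>();
        while (headers.hasNext()) {
            String header = headers.next();
            if (header == null || header.isEmpty()) { // stand-in for a parse failure
                throw new IOException("Unparseable record at index " + seen.size());
            }
            seen.add(header); // keep a reference to each record's metadata
        }
        // Assumed: a known expected count that does not match is an error.
        if (noRecords != -1 && seen.size() != noRecords) {
            throw new IOException(
                    "Expected " + noRecords + " records but read " + seen.size());
        }
        return seen;
    }

    /** Boolean wrapper: true if the whole pass completes without an exception. */
    static boolean isValid(Iterator<String> headers) {
        try {
            validate(headers, -1);
            return true;
        } catch (IOException e) {
            return false;
        }
    }
}
```

Both methods make a full pass over the input, which matches the warning on isValid() that it reads the whole file.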