Java Doc for ArchiveReader.java in Web Crawler » heritrix » org.archive.io


java.lang.Object
   org.archive.io.ArchiveReader

All known Subclasses: org.archive.io.warc.v10.WARCReader, org.archive.io.warc.WARCReader, org.archive.io.arc.ARCReader
ArchiveReader
abstract public class ArchiveReader implements ArchiveFileConstants
Reader for an Archive file of ArchiveRecords.
author:
   stack
version:
   $Date: 2007-03-13 00:08:58 +0000 (Tue, 13 Mar 2007) $ $Version$

Inner Class: protected class RandomAccessBufferedInputStream extends BufferedInputStream implements RepositionableStream
Inner Class: protected class ArchiveRecordIterator implements Iterator<ArchiveRecord>

Field Summary
final public static int MAX_ALLOWED_RECOVERABLES
     Maximum number of recoverable exceptions in a row.

Constructor Summary
protected ArchiveReader()

Method Summary
protected void cdxOutput(boolean toFile)
protected void cleanupCurrentRecord()
     Clean out the current record if there is one.
public void close()
abstract protected ArchiveRecord createArchiveRecord(InputStream is, long offset)
     Return an Archive Record homed on offset into is.
Parameters:
  is - Stream to read Record from.
Parameters:
  offset - Offset to find Record at.
protected ArchiveRecord currentRecord(ArchiveRecord currentRecord)
abstract public void dump(boolean compress)
public ArchiveRecord get(long offset)
     Get record at passed offset.
Parameters:
  offset - Byte index into file at which a record starts.
public ArchiveRecord get()
protected ArchiveRecord getCurrentRecord()
abstract public ArchiveReader getDeleteFileOnCloseReader(File f)
     Returns an ArchiveReader that will delete a local file on close.
abstract public String getDotFileExtension()
abstract public String getFileExtension()
public String getFileName()
protected InputStream getIn()
protected InputStream getInputStream(File f, long offset)
     Convenience method for constructors.
Parameters:
  f - File to read.
Parameters:
  offset - Offset at which to start reading.
protected InputStream getInputStream()
protected Logger getLogger()
protected static Options getOptions()
public String getReaderIdentifier()
public String getStrippedFileName()
public static String getStrippedFileName(String name, String dotFileExtension)
Parameters:
  name - Name of ARCFile.
Parameters:
  dotFileExtension - '.arc' or '.warc', etc.
protected static boolean getTrueOrFalse(String value)
Parameters:
  value - Value to test.
public String getVersion()
abstract protected void gotoEOR(ArchiveRecord record)
     Skip over any trailing newlines at the end of the record so we're lined up ready to read the next.
protected void initialize(String i)
     Convenience method used by subclass constructors.
public boolean isCompressed()
public boolean isDigest()
public boolean isStrict()
public boolean isValid()
     Test that the Archive file is valid. Assumes the stream is at the start of the file.
public Iterator<ArchiveRecord> iterator()
     Returns an ArchiveRecord iterator.
public void logStdErr(Level level, String message)
     Log on stderr. Logging should go via the logging system.
protected boolean output(String format)
public boolean outputRecord(String format)
     Output passed record using passed format specifier.
protected static void outputRecord(ArchiveReader r, String format)
     Output passed record using passed format specifier.
protected void rewind()
     Rewinds stream to start of the Archive file.
protected void setCompressed(boolean compressed)
public void setDigest(boolean d)
protected void setIn(InputStream in)
protected void setReaderIdentifier(String i)
public void setStrict(boolean s)
protected void setVersion(String version)
protected static String stripExtension(String name, String ext)
public List validate()
     Validate the Archive file.
public List validate(int noRecords)
     Validate the Archive file. This method iterates over the file, throwing an exception if it fails to successfully parse.

     We start validation from wherever we are in the stream.
Parameters:
  noRecords - Number of records expected.


Field Detail
MAX_ALLOWED_RECOVERABLES
final public static int MAX_ALLOWED_RECOVERABLES
Maximum number of recoverable exceptions in a row. If more than this number occur in a row, we let the exception out rather than go back in for yet another retry.
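The retry cap can be pictured with a minimal, self-contained sketch. Nothing here is Heritrix code: RecordSource, scripted and readAll are hypothetical names, and records are plain strings. It only illustrates counting recoverable exceptions in a row and letting the exception out once the count goes past the limit.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

final class RecoverableRetrySketch {
    /** Hypothetical stand-in for a record reader; not a Heritrix type. */
    interface RecordSource {
        /** Returns the next record, or null at end of input. */
        String next() throws IOException;
    }

    /** Reads every record, tolerating up to maxInARow consecutive failures. */
    static List<String> readAll(RecordSource source, int maxInARow) throws IOException {
        List<String> records = new ArrayList<>();
        int failuresInARow = 0;
        while (true) {
            try {
                String record = source.next();
                if (record == null) {
                    return records;
                }
                records.add(record);
                failuresInARow = 0; // a good record resets the streak
            } catch (IOException e) {
                failuresInARow++;
                if (failuresInARow > maxInARow) {
                    throw e; // more than the limit in a row: let it out
                }
                // otherwise go back in and try the next record
            }
        }
    }

    /** Demo helper: replays strings as records and throws any IOException step. */
    static RecordSource scripted(Object... steps) {
        int[] index = {0};
        return () -> {
            if (index[0] >= steps.length) {
                return null;
            }
            Object step = steps[index[0]++];
            if (step instanceof IOException) {
                throw (IOException) step;
            }
            return (String) step;
        };
    }

    public static void main(String[] args) throws IOException {
        IOException bad = new IOException("bad record");
        // Two failures in a row are tolerated when the limit is 2.
        System.out.println(readAll(scripted("a", bad, bad, "b"), 2));
    }
}
```

With a limit of 2, two bad records in a row are skipped, while a third consecutive failure is rethrown to the caller.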




Constructor Detail
ArchiveReader
protected ArchiveReader()




Method Detail
cdxOutput
protected void cdxOutput(boolean toFile) throws IOException



cleanupCurrentRecord
protected void cleanupCurrentRecord() throws IOException
Clean out the current record if there is one.
throws:
  IOException -



close
public void close() throws IOException



createArchiveRecord
abstract protected ArchiveRecord createArchiveRecord(InputStream is, long offset) throws IOException
Return an Archive Record homed on offset into is.
Parameters:
  is - Stream to read Record from.
Parameters:
  offset - Offset to find Record at.
Returns:
  ArchiveRecord instance.
throws:
  IOException -



currentRecord
protected ArchiveRecord currentRecord(ArchiveRecord currentRecord)



dump
abstract public void dump(boolean compress) throws IOException, java.text.ParseException
Dump this file on STDOUT.
Parameters:
  compress - True if dumped output is compressed.
throws:
  IOException -
throws:
  java.text.ParseException -



get
public ArchiveRecord get(long offset) throws IOException
Get record at passed offset.
Parameters:
  offset - Byte index into file at which a record starts.
Returns:
  An Archive Record reference.
throws:
  IOException -



get
public ArchiveRecord get() throws IOException
Return Archive Record created against current offset.
throws:
  IOException -



getCurrentRecord
protected ArchiveRecord getCurrentRecord()
Returns:
  The current ARC record, or null if none. After construction, this is the arcfile header record.
See Also:   ArchiveReader.get()



getDeleteFileOnCloseReader
abstract public ArchiveReader getDeleteFileOnCloseReader(File f)
Returns an ArchiveReader that will delete a local file on close. Used when we bring Archive files local and need to clean up afterward.
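The delete-on-close idea can be sketched with plain JDK classes. This is not the Heritrix implementation (the method is abstract and each subclass supplies its own); DeleteOnCloseSketch is a hypothetical wrapper that only shows the pattern of removing a local copy once the reader is closed.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical reader-like wrapper that removes its backing file on close. */
final class DeleteOnCloseSketch implements Closeable {
    private final Path file;
    private final InputStream in;

    DeleteOnCloseSketch(Path file) throws IOException {
        this.file = file;
        this.in = Files.newInputStream(file);
    }

    /** Reads one byte from the local file, or -1 at end of file. */
    int read() throws IOException {
        return in.read();
    }

    @Override
    public void close() throws IOException {
        try {
            in.close();
        } finally {
            // Clean up the local copy even if closing the stream failed.
            Files.deleteIfExists(file);
        }
    }

    public static void main(String[] args) throws IOException {
        Path local = Files.createTempFile("archive-copy", ".arc");
        Files.write(local, new byte[] {1, 2, 3});
        try (DeleteOnCloseSketch reader = new DeleteOnCloseSketch(local)) {
            System.out.println("first byte: " + reader.read());
        }
        System.out.println("still exists: " + Files.exists(local));
    }
}
```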



getDotFileExtension
abstract public String getDotFileExtension()



getFileExtension
abstract public String getFileExtension()



getFileName
public String getFileName()
Returns:
  Short name of Archive file.



getIn
protected InputStream getIn()



getInputStream
protected InputStream getInputStream(File f, long offset) throws IOException
Convenience method for constructors.
Parameters:
  f - File to read.
Parameters:
  offset - Offset at which to start reading.
Returns:
  InputStream to read from.
throws:
  IOException - If the file fails to open or a memory-mapped byte buffer on the file cannot be obtained.



getInputStream
protected InputStream getInputStream()



getLogger
protected Logger getLogger()



getOptions
protected static Options getOptions()
Returns:
  Base Options object filled out with help, digest, strict, etc. options.



getReaderIdentifier
public String getReaderIdentifier()



getStrippedFileName
public String getStrippedFileName()
Returns:
  Short name of Archive file.



getStrippedFileName
public static String getStrippedFileName(String name, String dotFileExtension)

Parameters:
  name - Name of ARCFile.
Parameters:
  dotFileExtension - '.arc' or '.warc', etc.
Returns:
  Short name of Archive file.
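A minimal sketch of what producing a "short name" might look like. The class and method names are hypothetical, and two behaviours are assumptions not stated on this page: that a trailing ".gz" compression suffix is dropped first, and that a name without the given extension is returned unchanged.

```java
final class StrippedNameSketch {
    /** Removes ext from the end of name if present; otherwise returns name as-is. */
    static String stripExtension(String name, String ext) {
        return name.endsWith(ext)
                ? name.substring(0, name.length() - ext.length())
                : name;
    }

    /** Drops an assumed ".gz" suffix, then the archive extension such as ".arc". */
    static String strippedFileName(String name, String dotFileExtension) {
        String withoutGz = stripExtension(name, ".gz"); // assumption: ".gz" marks compression
        return stripExtension(withoutGz, dotFileExtension);
    }

    public static void main(String[] args) {
        System.out.println(strippedFileName("crawl-0001.arc.gz", ".arc"));
        System.out.println(strippedFileName("crawl-0002.warc", ".warc"));
    }
}
```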



getTrueOrFalse
protected static boolean getTrueOrFalse(String value)

Parameters:
  value - Value to test.
Returns:
  True if value is 'true', else false.
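The documented contract (true only for 'true') can be sketched as follows. TrueOrFalseSketch is a hypothetical name; treating null as false and ignoring case are assumptions, since the page does not say how those inputs are handled.

```java
final class TrueOrFalseSketch {
    /** Returns true only when value spells "true"; null and anything else is false. */
    static boolean trueOrFalse(String value) {
        // Assumption: comparison is case-insensitive and null-safe.
        return value != null && value.equalsIgnoreCase("true");
    }

    public static void main(String[] args) {
        System.out.println(trueOrFalse("true"));
        System.out.println(trueOrFalse("yes"));
    }
}
```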



getVersion
public String getVersion()
Returns:
  Version of this Archive file.



gotoEOR
abstract protected void gotoEOR(ArchiveRecord record) throws IOException
Skip over any trailing newlines at the end of the record so we're lined up ready to read the next.
Parameters:
  record -
throws:
  IOException -
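Because gotoEOR is abstract, the real behaviour lives in the subclasses. The following is only a sketch of the general technique of skipping trailing newlines and stopping at the first byte of the next record, using java.io.PushbackInputStream. The class and method names are hypothetical, and treating '\r' as a newline character is an assumption.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.PushbackInputStream;
import java.nio.charset.StandardCharsets;

final class SkipNewlinesSketch {
    /** Consumes '\n' and '\r' bytes; returns how many were skipped. */
    static int skipTrailingNewlines(PushbackInputStream in) throws IOException {
        int skipped = 0;
        int c;
        while ((c = in.read()) != -1) {
            if (c != '\n' && c != '\r') {
                in.unread(c); // first byte of the next record: put it back
                break;
            }
            skipped++;
        }
        return skipped;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "\n\nnext record".getBytes(StandardCharsets.US_ASCII);
        PushbackInputStream in = new PushbackInputStream(new ByteArrayInputStream(data));
        System.out.println("skipped: " + skipTrailingNewlines(in));
        System.out.println("next byte: " + (char) in.read());
    }
}
```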



initialize
protected void initialize(String i)
Convenience method used by subclass constructors.
Parameters:
  i - Identifier for Archive file this reader goes against.



isCompressed
public boolean isCompressed()



isDigest
public boolean isDigest()
Returns:
  True if we're digesting as we read.



isStrict
public boolean isStrict()
Returns:
  The strict setting.



isValid
public boolean isValid()
Test that the Archive file is valid. Assumes the stream is at the start of the file. Be aware that this method makes a pass over the whole file.
Returns:
  True if file can be successfully parsed.



iterator
public Iterator<ArchiveRecord> iterator()
Returns an ArchiveRecord iterator. Of note: on IOException (especially a ZipException while reading compressed ARCs), the iterator tries moving to the next record rather than failing the iteration. If ArchiveReader.strict is not set, this will usually succeed.
Returns:
  An iterator over ARC records.



logStdErr
public void logStdErr(Level level, String message)
Log on stderr. Logging should go via the logging system. This method bypasses the logging system, going direct to stderr. It should not generally be used. It's used for the rare ERROR and WARNING messages that arise from command-line usage of ARCReader. Override if using ARCReader in a context where there is no stderr or where you'd like to redirect stderr to somewhere other than System.err.
Parameters:
  level - Level to log message at.
Parameters:
  message - Message to log.



output
protected boolean output(String format) throws IOException, java.text.ParseException

Parameters:
  format - Format to use outputting.
Returns:
  True if handled.
throws:
  IOException -
throws:
  java.text.ParseException -



outputRecord
public boolean outputRecord(String format) throws IOException
Output passed record using passed format specifier.
Parameters:
  format - What format to use outputting.
Returns:
  True if handled.
throws:
  IOException -



outputRecord
protected static void outputRecord(ArchiveReader r, String format) throws IOException
Output passed record using passed format specifier.
Parameters:
  r - ARCReader instance to output.
Parameters:
  format - What format to use outputting.
throws:
  IOException -



rewind
protected void rewind() throws IOException
Rewinds stream to start of the Archive file.
throws:
  IOException - if stream is not resettable.
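The "not resettable" failure mode can be illustrated with the standard mark/reset contract of java.io.InputStream. This is a swapped-in technique for illustration, not a description of how ArchiveReader repositions its stream; RewindSketch is a hypothetical name.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

final class RewindSketch {
    /** Moves the stream back to its marked position, or fails if it cannot. */
    static void rewind(InputStream in) throws IOException {
        if (!in.markSupported()) {
            throw new IOException("Stream is not resettable");
        }
        // With no explicit mark, a ByteArrayInputStream resets to its start.
        in.reset();
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream(new byte[] {10, 20, 30});
        in.read();
        in.read();
        rewind(in);
        System.out.println("after rewind: " + in.read());
    }
}
```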



setCompressed
protected void setCompressed(boolean compressed)



setDigest
public void setDigest(boolean d)

Parameters:
  d - True if we're to digest.



setIn
protected void setIn(InputStream in)



setReaderIdentifier
protected void setReaderIdentifier(String i)



setStrict
public void setStrict(boolean s)

Parameters:
  s - The strict to set.



setVersion
protected void setVersion(String version)



stripExtension
protected static String stripExtension(String name, String ext)



validate
public List validate() throws IOException
Validate the Archive file. This method iterates over the file, throwing an exception if it fails to successfully parse any record.

Assumes the stream is at the start of the file.
Returns:
  List of all read Archive Headers.
throws:
  IOException -




validate
public List validate(int noRecords) throws IOException
Validate the Archive file. This method iterates over the file, throwing an exception if it fails to successfully parse.

We start validation from wherever we are in the stream.
Parameters:
  noRecords - Number of records expected. Pass -1 if number is unknown.
Returns:
  List of all read metadatas. As we validate records, we add a reference to the read metadata.
throws:
  IOException -
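A minimal sketch of the validation loop described above, using plain strings in place of record metadata. ValidateSketch is a hypothetical name, and throwing when the count differs from noRecords is an assumption about what "expected" implies; only the -1 "unknown" convention comes from this page.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

final class ValidateSketch {
    /** Walks all remaining records and returns what was read. */
    static List<String> validate(Iterator<String> records, int noRecords) throws IOException {
        List<String> seen = new ArrayList<>();
        while (records.hasNext()) {
            seen.add(records.next()); // keep a reference to each record read
        }
        // -1 means the caller does not know how many records to expect.
        if (noRecords != -1 && seen.size() != noRecords) {
            throw new IOException(
                    "Expected " + noRecords + " records but read " + seen.size());
        }
        return seen;
    }

    public static void main(String[] args) throws IOException {
        List<String> input = Arrays.asList("header", "record-1", "record-2");
        System.out.println(validate(input.iterator(), -1).size());
    }
}
```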




Methods inherited from java.lang.Object
native protected Object clone() throws CloneNotSupportedException
public boolean equals(Object obj)
protected void finalize() throws Throwable
final native public Class getClass()
native public int hashCode()
final native public void notify()
final native public void notifyAll()
public String toString()
final native public void wait(long timeout) throws InterruptedException
final public void wait(long timeout, int nanos) throws InterruptedException
final public void wait() throws InterruptedException
