Java Doc for WARCConstants.java in Web Crawler » heritrix » org.archive.io.warc



org.archive.io.warc.WARCConstants

All known Subclasses: org.archive.io.warc.v10.WARCReaderFactory, org.archive.io.warc.v10.WARCReader, org.archive.io.warc.ExperimentalWARCWriterTest, org.archive.crawler.writer.ExperimentalWARCWriterProcessor, org.archive.io.warc.v10.ExperimentalWARCWriter, org.archive.io.warc.WARCReader, org.archive.crawler.writer.ExperimentalV10WARCWriterProcessor, org.archive.io.warc.ExperimentalWARCWriter, org.archive.io.warc.WARCReaderFactory, org.archive.io.warc.v10.ExperimentalWARCWriterTest, org.archive.io.warc.WARCRecord, org.archive.io.warc.v10.WARCRecord
WARCConstants
public interface WARCConstants extends ArchiveFileConstants
WARC constants used by WARC readers and writers. The constants below are used by versions 0.10 and 0.12 of the WARC Reader/Writer.
author:
   stack
version:
   $Revision: 4976 $ $Date: 2007-03-09 13:59:07 +0000 (Fri, 09 Mar 2007) $
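The readers, writers and records listed above pick up these constants by implementing the interface, which lets them refer to each constant by its simple name. A minimal sketch of that pattern follows; `DemoConstants`, `DemoWriter` and the value `"demo"` are hypothetical stand-ins and are not taken from the Heritrix source.

```java
// Hypothetical class that implements a constants interface so that it can
// use the constants without qualifying them.
final class DemoWriter implements DemoConstants {
    // DOT_DEMO_EXTENSION is inherited from DemoConstants.
    static String fileNameFor(String baseName) {
        return baseName + DOT_DEMO_EXTENSION;
    }

    public static void main(String[] args) {
        System.out.println(fileNameFor("crawl-0001"));
    }
}

// Hypothetical constants interface; names and values are illustrative only.
interface DemoConstants {
    // Interface fields are implicitly public, static and final.
    String DEMO_EXTENSION = "demo";
    String DOT_DEMO_EXTENSION = "." + DEMO_EXTENSION;
}
```

The same constants could also be reached as `DemoConstants.DOT_DEMO_EXTENSION` without implementing the interface; implementing it is only a naming convenience.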


Field Summary
final public static String COLON_SPACE

final public static String COMPRESSED_WARC_FILE_EXTENSION
     Compressed WARC file extension.
final public static String CONTENT_DESCRIPTION

final public static String CONTENT_ID

final public static String CONTENT_LENGTH

final public static String CONTENT_TYPE

final public static String CONTINUATION

final public static int CONTINUATION_INDEX

final public static String CONVERSION

final public static int CONVERSION_INDEX

final public static String DEFAULT_ENCODING
     Encoding to use when getting bytes from strings. Specify an encoding rather than leave it to chance (i.e. whatever the JVM's default encoding is).
final public static int DEFAULT_MAX_WARC_FILE_SIZE
     Default maximum WARC file size.
final public static String DOT_COMPRESSED_FILE_EXTENSION

final public static String DOT_COMPRESSED_WARC_FILE_EXTENSION
     Compressed dot WARC file extension.
final public static String DOT_WARC_FILE_EXTENSION
     Dot WARC file extension.
final public static String[] HEADER_FIELD_KEYS

final public static char HEADER_FIELD_SEPARATOR
     Header field separator character.
final public static String HEADER_KEY_CHECKSUM

final public static String HEADER_KEY_CONCURRENT_TO

final public static String HEADER_KEY_DATE

final public static String HEADER_KEY_ETAG

final public static String HEADER_KEY_FILENAME

final public static String HEADER_KEY_IP

final public static String HEADER_KEY_LAST_MODIFIED

final public static String HEADER_KEY_PROFILE

final public static String HEADER_KEY_TRUNCATED

final public static String HEADER_KEY_TYPE

final public static String HEADER_KEY_URI

final public static String HEADER_LINE_ENCODING

final public static String HTTP_REQUEST_MIMETYPE
     To be safe, let's use the application type rather than message.
final public static String HTTP_RESPONSE_MIMETYPE

final public static int MAX_LINE_LENGTH

final public static int MAX_WARC_HEADER_LINE_LENGTH
     Assumed maximum size of a header line.
final public static String METADATA

final public static int METADATA_INDEX

final public static String MIME_VERSION

final public static String NAMED_FIELD_CHECKSUM_LABEL

final public static String NAMED_FIELD_DESCRIPTION

final public static String NAMED_FIELD_FILEDESC

final public static String NAMED_FIELD_IP_LABEL

final public static String NAMED_FIELD_RELATED_LABEL

final public static String NAMED_FIELD_TRUNCATED

final public static String NAMED_FIELD_TRUNCATED_VALUE_HEAD

final public static String NAMED_FIELD_TRUNCATED_VALUE_LEN

final public static String NAMED_FIELD_TRUNCATED_VALUE_TIME

final public static String NAMED_FIELD_TRUNCATED_VALUE_UNSPECIFIED

final public static String NAMED_FIELD_WARCFILENAME

final public static String PLACEHOLDER_RECORD_LENGTH_STRING
     Placeholder for length in Header line. Placeholder is same size as the fixed field size allocated for length, 12 characters.
final public static String PROFILE_CONVERSION_SOFTWARE_COMMAND

final public static String PROFILE_REVISIT_IDENTICAL_DIGEST

final public static String PROFILE_REVISIT_NOT_MODIFIED

final public static String REQUEST

final public static int REQUEST_INDEX

final public static String RESOURCE

final public static int RESOURCE_INDEX

final public static String RESPONSE

final public static int RESPONSE_INDEX

final public static String REVISIT

final public static int REVISIT_INDEX

final public static String TRUNCATED_VALUE_UNSPECIFIED

final public static String TYPE

final public static String[] TYPES

final public static List TYPES_LIST

final public static String WARCINFO
     WARC Record Types.
final public static int WARCINFO_INDEX

final public static String WARC_010_ID

final public static String WARC_010_MAGIC

final public static String WARC_FILE_EXTENSION
     WARC file extension.
final public static String WARC_HEADER_ENCODING

final public static String WARC_ID

final public static String WARC_MAGIC
     WARC magic. WARC files and records begin with this sequence.
final public static String WARC_VERSION
     Hard-coded version for WARC files made with this code. Setting to 0.10 because differs from 0.9 spec.
final public static Character[] WSP
     WSP: one of a space or horizontal tab character. TODO: WSP undefined.



Field Detail
COLON_SPACE
final public static String COLON_SPACE



COMPRESSED_WARC_FILE_EXTENSION
final public static String COMPRESSED_WARC_FILE_EXTENSION
Compressed WARC file extension.



CONTENT_DESCRIPTION
final public static String CONTENT_DESCRIPTION



CONTENT_ID
final public static String CONTENT_ID



CONTENT_LENGTH
final public static String CONTENT_LENGTH



CONTENT_TYPE
final public static String CONTENT_TYPE



CONTINUATION
final public static String CONTINUATION



CONTINUATION_INDEX
final public static int CONTINUATION_INDEX



CONVERSION
final public static String CONVERSION



CONVERSION_INDEX
final public static int CONVERSION_INDEX



DEFAULT_ENCODING
final public static String DEFAULT_ENCODING
Encoding to use when getting bytes from strings. Specify an encoding rather than leave it to chance (i.e. whatever the JVM's default encoding is). Use an encoding that gets the stream as bytes, not chars.

TODO: ARC uses ISO-8859-1. In general, we should use UTF-8 but we probably need a single byte encoding if we're out for preserving the binary data as received over the net (We probably don't want to transform the supra-ASCII characters to UTF-8 before storing in ARC). For now, till we figure it, DEFAULT_ENCODING is single-byte charset -- same as ARCs.
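The preference for a single-byte charset can be illustrated with a round trip. The sketch below is not Heritrix code and its class and method names are made up; it decodes arbitrary bytes to a String and encodes them back, once with ISO-8859-1 and once with UTF-8. ISO-8859-1 maps every byte value to exactly one character, so the bytes survive unchanged, whereas byte sequences that are not valid UTF-8 are replaced during decoding and cannot be recovered.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class SingleByteEncodingDemo {
    // Decode with ISO-8859-1 and encode back; every byte maps to one char.
    static byte[] roundTripLatin1(byte[] raw) {
        String s = new String(raw, StandardCharsets.ISO_8859_1);
        return s.getBytes(StandardCharsets.ISO_8859_1);
    }

    // Decode with UTF-8 and encode back; malformed input is replaced with
    // U+FFFD while decoding, so the original bytes may be lost.
    static byte[] roundTripUtf8(byte[] raw) {
        String s = new String(raw, StandardCharsets.UTF_8);
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Sample bytes that are not a valid UTF-8 sequence.
        byte[] raw = {(byte) 0x48, (byte) 0xE9, (byte) 0xFF, (byte) 0x80};
        System.out.println("ISO-8859-1 preserved: " + Arrays.equals(raw, roundTripLatin1(raw)));
        System.out.println("UTF-8 preserved: " + Arrays.equals(raw, roundTripUtf8(raw)));
    }
}
```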




DEFAULT_MAX_WARC_FILE_SIZE
final public static int DEFAULT_MAX_WARC_FILE_SIZE
Default maximum WARC file size: 1 gigabyte.



DOT_COMPRESSED_FILE_EXTENSION
final public static String DOT_COMPRESSED_FILE_EXTENSION



DOT_COMPRESSED_WARC_FILE_EXTENSION
final public static String DOT_COMPRESSED_WARC_FILE_EXTENSION
Compressed dot WARC file extension.



DOT_WARC_FILE_EXTENSION
final public static String DOT_WARC_FILE_EXTENSION
Dot WARC file extension.
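How the plain, dotted and compressed extension constants might relate to one another can be sketched as follows. The literal values `"warc"` and `".gz"` are assumptions made for illustration, as are the class and the two helper methods; check the WARCConstants and ArchiveFileConstants sources for the real definitions.

```java
final class WarcExtensionDemo {
    // Assumed values; not confirmed by this page.
    static final String WARC_FILE_EXTENSION = "warc";
    static final String DOT_COMPRESSED_FILE_EXTENSION = ".gz";

    // Derived forms, built by concatenation so they cannot drift apart.
    static final String DOT_WARC_FILE_EXTENSION = "." + WARC_FILE_EXTENSION;
    static final String COMPRESSED_WARC_FILE_EXTENSION =
            WARC_FILE_EXTENSION + DOT_COMPRESSED_FILE_EXTENSION;
    static final String DOT_COMPRESSED_WARC_FILE_EXTENSION =
            "." + COMPRESSED_WARC_FILE_EXTENSION;

    // Hypothetical helper: true for names ending in the compressed form.
    static boolean isCompressedWarc(String fileName) {
        return fileName.endsWith(DOT_COMPRESSED_WARC_FILE_EXTENSION);
    }

    // Hypothetical helper: true for compressed or uncompressed WARC names.
    static boolean isWarc(String fileName) {
        return fileName.endsWith(DOT_WARC_FILE_EXTENSION) || isCompressedWarc(fileName);
    }
}
```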



HEADER_FIELD_KEYS
final public static String[] HEADER_FIELD_KEYS



HEADER_FIELD_SEPARATOR
final public static char HEADER_FIELD_SEPARATOR
Header field separator character.



HEADER_KEY_CHECKSUM
final public static String HEADER_KEY_CHECKSUM



HEADER_KEY_CONCURRENT_TO
final public static String HEADER_KEY_CONCURRENT_TO



HEADER_KEY_DATE
final public static String HEADER_KEY_DATE



HEADER_KEY_ETAG
final public static String HEADER_KEY_ETAG



HEADER_KEY_FILENAME
final public static String HEADER_KEY_FILENAME



HEADER_KEY_IP
final public static String HEADER_KEY_IP



HEADER_KEY_LAST_MODIFIED
final public static String HEADER_KEY_LAST_MODIFIED



HEADER_KEY_PROFILE
final public static String HEADER_KEY_PROFILE



HEADER_KEY_TRUNCATED
final public static String HEADER_KEY_TRUNCATED



HEADER_KEY_TYPE
final public static String HEADER_KEY_TYPE



HEADER_KEY_URI
final public static String HEADER_KEY_URI



HEADER_LINE_ENCODING
final public static String HEADER_LINE_ENCODING



HTTP_REQUEST_MIMETYPE
final public static String HTTP_REQUEST_MIMETYPE
To be safe, let's use the application type rather than message. Regarding 'message/http', the RFC says "...provided that it obeys the MIME restrictions for all 'message' types regarding line length and encodings." This usually means lines of at most 1000 octets (unless a 'Content-Transfer-Encoding: binary' MIME header is present).
See Also:    rfc2616 section 19.1
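The line-length restriction is the reason an arbitrary captured HTTP message cannot safely be labelled 'message/http'. The sketch below, which is not part of Heritrix, checks whether every line of a byte buffer stays within a given octet limit, counting the line terminator as part of the line. The class name, method name and the way the limit is passed in are all assumptions.

```java
final class MessageLineLimitDemo {
    // Limit quoted in the description above, terminator included.
    static final int MIME_MESSAGE_LINE_LIMIT = 1000;

    // Returns true if no line in the message (including its '\n') is
    // longer than maxOctets.
    static boolean fitsLineLimit(byte[] message, int maxOctets) {
        int current = 0;
        for (byte b : message) {
            current++;
            if (current > maxOctets) {
                return false;
            }
            if (b == '\n') {
                current = 0; // a new line starts after the terminator
            }
        }
        return true;
    }
}
```

A captured response with a single very long header or an unbroken binary body would fail this check, which is the case the application type sidesteps.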



HTTP_RESPONSE_MIMETYPE
final public static String HTTP_RESPONSE_MIMETYPE



MAX_LINE_LENGTH
final public static int MAX_LINE_LENGTH



MAX_WARC_HEADER_LINE_LENGTH
final public static int MAX_WARC_HEADER_LINE_LENGTH
Assumed maximum size of a header line. This is 100k, which seems massive, but it's the same as the LINE_LENGTH from alexa/include/a_arcio.h:
 #define LINE_LENGTH     (100*1024)
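A cap like this is typically used to stop a reader from buffering without bound when a header line is malformed. The following sketch shows one way such a bounded read could look; it is an illustration rather than the Heritrix reader, and the class name, method name and sample header text are hypothetical. Only the 100*1024 figure comes from the description above.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

final class BoundedLineReaderDemo {
    // Same arithmetic as the quoted #define: 100 * 1024 = 102400 bytes.
    static final int MAX_LINE_LENGTH = 100 * 1024;

    // Reads bytes up to and including the first '\n' (or end of stream).
    // Each byte becomes one char, i.e. a single-byte decoding.
    // Throws if the line would grow beyond maxLength bytes.
    static String readLine(InputStream in, int maxLength) throws IOException {
        StringBuilder line = new StringBuilder();
        int c;
        while ((c = in.read()) != -1) {
            if (line.length() >= maxLength) {
                throw new IOException("Header line longer than " + maxLength + " bytes");
            }
            line.append((char) c);
            if (c == '\n') {
                break;
            }
        }
        return line.toString();
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical header text, used only to exercise the method.
        byte[] data = "WARC/0.10 123\r\nrest".getBytes(StandardCharsets.ISO_8859_1);
        System.out.print(readLine(new ByteArrayInputStream(data), MAX_LINE_LENGTH));
    }
}
```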
 



METADATA
final public static String METADATA



METADATA_INDEX
final public static int METADATA_INDEX



MIME_VERSION
final public static String MIME_VERSION



NAMED_FIELD_CHECKSUM_LABEL
final public static String NAMED_FIELD_CHECKSUM_LABEL



NAMED_FIELD_DESCRIPTION
final public static String NAMED_FIELD_DESCRIPTION



NAMED_FIELD_FILEDESC
final public static String NAMED_FIELD_FILEDESC



NAMED_FIELD_IP_LABEL
final public static String NAMED_FIELD_IP_LABEL



NAMED_FIELD_RELATED_LABEL
final public static String NAMED_FIELD_RELATED_LABEL



NAMED_FIELD_TRUNCATED
final public static String NAMED_FIELD_TRUNCATED



NAMED_FIELD_TRUNCATED_VALUE_HEAD
final public static String NAMED_FIELD_TRUNCATED_VALUE_HEAD



NAMED_FIELD_TRUNCATED_VALUE_LEN
final public static String NAMED_FIELD_TRUNCATED_VALUE_LEN



NAMED_FIELD_TRUNCATED_VALUE_TIME
final public static String NAMED_FIELD_TRUNCATED_VALUE_TIME



NAMED_FIELD_TRUNCATED_VALUE_UNSPECIFIED
final public static String NAMED_FIELD_TRUNCATED_VALUE_UNSPECIFIED



NAMED_FIELD_WARCFILENAME
final public static String NAMED_FIELD_WARCFILENAME



PLACEHOLDER_RECORD_LENGTH_STRING
final public static String PLACEHOLDER_RECORD_LENGTH_STRING
Placeholder for length in Header line. Placeholder is same size as the fixed field size allocated for length, 12 characters. 12 characters allows records of size almost 1TB.
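The "almost 1TB" figure follows from the field width: the largest value that fits in 12 decimal digits is 999,999,999,999 bytes, one byte short of 10^12. The sketch below formats a length into a fixed 12-character field. Zero padding is an assumption made for illustration, since this page does not show the placeholder's actual contents, and the class and method names are hypothetical.

```java
final class RecordLengthFieldDemo {
    // Width stated in the description above.
    static final int LENGTH_FIELD_WIDTH = 12;

    // Largest value expressible in 12 decimal digits.
    static final long MAX_RECORD_LENGTH = 999_999_999_999L;

    // Formats the length as exactly 12 characters, padded on the left with
    // zeros (assumed padding), so a placeholder written first can later be
    // overwritten in place without shifting the rest of the header.
    static String formatLength(long length) {
        if (length < 0 || length > MAX_RECORD_LENGTH) {
            throw new IllegalArgumentException("Length does not fit in 12 digits: " + length);
        }
        return String.format("%012d", length);
    }

    public static void main(String[] args) {
        System.out.println(formatLength(1234L));
    }
}
```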



PROFILE_CONVERSION_SOFTWARE_COMMAND
final public static String PROFILE_CONVERSION_SOFTWARE_COMMAND



PROFILE_REVISIT_IDENTICAL_DIGEST
final public static String PROFILE_REVISIT_IDENTICAL_DIGEST



PROFILE_REVISIT_NOT_MODIFIED
final public static String PROFILE_REVISIT_NOT_MODIFIED



REQUEST
final public static String REQUEST



REQUEST_INDEX
final public static int REQUEST_INDEX



RESOURCE
final public static String RESOURCE



RESOURCE_INDEX
final public static int RESOURCE_INDEX



RESPONSE
final public static String RESPONSE



RESPONSE_INDEX
final public static int RESPONSE_INDEX



REVISIT
final public static String REVISIT



REVISIT_INDEX
final public static int REVISIT_INDEX



TRUNCATED_VALUE_UNSPECIFIED
final public static String TRUNCATED_VALUE_UNSPECIFIED



TYPE
final public static String TYPE



TYPES
final public static String[] TYPES



TYPES_LIST
final public static List TYPES_LIST



WARCINFO
final public static String WARCINFO
WARC Record Types.
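Each record type appears here as a String constant with a companion *_INDEX int, alongside the TYPES array and TYPES_LIST. One plausible arrangement, sketched below, is that each index gives the position of its type name in TYPES and that TYPES_LIST wraps the same array for lookups. The type names, their order and the subset shown are assumptions for illustration; this page does not list the actual values.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class RecordTypeLookupDemo {
    // Assumed names and ordering; only three types are shown.
    static final String WARCINFO = "warcinfo";
    static final String RESPONSE = "response";
    static final String REQUEST = "request";

    // Position of each type name inside TYPES.
    static final int WARCINFO_INDEX = 0;
    static final int RESPONSE_INDEX = 1;
    static final int REQUEST_INDEX = 2;

    static final String[] TYPES = {WARCINFO, RESPONSE, REQUEST};

    // Read-only list view, convenient for indexOf and contains.
    static final List<String> TYPES_LIST =
            Collections.unmodifiableList(Arrays.asList(TYPES));

    // Returns the index of a type name, or -1 if it is not a known type.
    static int indexOfType(String type) {
        return TYPES_LIST.indexOf(type);
    }
}
```

With this layout a reader can switch on `indexOfType(...)` against the int constants instead of comparing strings repeatedly.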



WARCINFO_INDEX
final public static int WARCINFO_INDEX



WARC_010_ID
final public static String WARC_010_ID



WARC_010_MAGIC
final public static String WARC_010_MAGIC



WARC_FILE_EXTENSION
final public static String WARC_FILE_EXTENSION
WARC file extension.



WARC_HEADER_ENCODING
final public static String WARC_HEADER_ENCODING



WARC_ID
final public static String WARC_ID
WARC-ID



WARC_MAGIC
final public static String WARC_MAGIC
WARC magic. WARC files and records begin with this sequence.



WARC_VERSION
final public static String WARC_VERSION
Hard-coded version for WARC files made with this code. Setting to 0.10 because differs from 0.9 spec. See accompanying package documentation.
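A reader can use the magic sequence to recognise the start of a record and then pick out the version that follows it. The sketch below assumes the magic is the string "WARC/" and that the version runs up to the next whitespace; both are assumptions, as are the class name, the method names and the sample header lines.

```java
final class WarcMagicDemo {
    // Assumed magic value; not confirmed by this page.
    static final String WARC_MAGIC = "WARC/";

    // True if the header line begins with the magic sequence.
    static boolean startsWithMagic(String headerLine) {
        return headerLine.startsWith(WARC_MAGIC);
    }

    // Returns the text between the magic and the next whitespace character
    // (or end of line), or null if the line does not start with the magic.
    static String versionOf(String headerLine) {
        if (!startsWithMagic(headerLine)) {
            return null;
        }
        int end = WARC_MAGIC.length();
        while (end < headerLine.length()
                && !Character.isWhitespace(headerLine.charAt(end))) {
            end++;
        }
        return headerLine.substring(WARC_MAGIC.length(), end);
    }

    public static void main(String[] args) {
        // Hypothetical header line, for illustration only.
        System.out.println(versionOf("WARC/0.10 1234 response"));
    }
}
```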



WSP
final public static Character[] WSP
WSP: one of a space or horizontal tab character. TODO: WSP undefined. Fix.
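Since the description only says that WSP is a space or a horizontal tab, the following is a sketch of how such an array might be defined and used, not the actual definition. The array contents follow the description; the class and the `isWsp` and `skipWsp` helpers are hypothetical.

```java
final class WspDemo {
    // Space and horizontal tab, as named in the description above.
    static final Character[] WSP = {' ', '\t'};

    // True if c is one of the WSP characters.
    static boolean isWsp(char c) {
        for (Character w : WSP) {
            if (w.charValue() == c) {
                return true;
            }
        }
        return false;
    }

    // Returns the index of the first non-WSP character at or after 'from',
    // or s.length() if only WSP characters remain.
    static int skipWsp(String s, int from) {
        int i = from;
        while (i < s.length() && isWsp(s.charAt(i))) {
            i++;
        }
        return i;
    }
}
```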




