Java Doc for CrawlOrder.java (Web Crawler » heritrix » org.archive.crawler.datamodel)



org.archive.crawler.settings.ModuleType
   org.archive.crawler.datamodel.CrawlOrder

CrawlOrder
public class CrawlOrder extends ModuleType implements Serializable
Represents the 'root' of the settings hierarchy. Contains those settings that do not belong to any specific module, but rather relate to the crawl as a whole (much of this is used by the CrawlController directly or indirectly).
See Also:   org.archive.crawler.settings.ModuleType


Field Summary
final public static String ATTR_BDB_CACHE_PERCENT
final public static String ATTR_CHECKPOINTS_PATH
final public static String ATTR_CHECKPOINT_COPY_BDBJE_LOGS
     When checkpointing, copy the bdb logs. Default is true.
final public static String ATTR_DISK_PATH
final public static String ATTR_EXTRACT_PROCESSORS
final public static String ATTR_FETCH_PROCESSORS
final public static String ATTR_FROM
final public static String ATTR_HTTP_HEADERS
final public static String ATTR_LOGGERS
final public static String ATTR_LOGS_PATH
final public static String ATTR_MAX_BYTES_DOWNLOAD
final public static String ATTR_MAX_DOCUMENT_DOWNLOAD
final public static String ATTR_MAX_TIME_SEC
final public static String ATTR_MAX_TOE_THREADS
final public static String ATTR_NAME
final public static String ATTR_POST_PROCESSORS
final public static String ATTR_PRE_FETCH_PROCESSORS
final public static String ATTR_RECORDER_IN_BUFFER
final public static String ATTR_RECORDER_OUT_BUFFER
final public static String ATTR_RECOVER_PATH
final public static String ATTR_RECOVER_RETAIN_FAILURES
final public static String ATTR_RULES
final public static String ATTR_SCRATCH_PATH
final public static String ATTR_SETTINGS_DIRECTORY
final public static String ATTR_STATE_PATH
final public static String ATTR_USER_AGENT
final public static String ATTR_WRITE_PROCESSORS
final public static Boolean DEFAULT_CHECKPOINT_COPY_BDBJE_LOGS
    

Constructor Summary
public CrawlOrder()
     Construct a CrawlOrder.

Method Summary
public void checkUserAgentAndFrom()
     Checks if the User Agent and From fields are set 'correctly' in the specified Crawl Order.
public File getCheckpointsDirectory()
public CrawlController getController()
public String getCrawlOrderName()
     Get the name of the order file.
public String getFrom(CrawlURI curi)
public MapType getLoggers()
     Returns the Map of the StatisticsTracking modules that are included in the configuration that the current instance of this class is representing.
public int getMaxToes()
     Returns the set number of maximum toe threads.
public RobotsHonoringPolicy getRobotsHonoringPolicy()
     This method gets the RobotsHonoringPolicy object from the orders file.
public File getSettingsDir(String key)
     Returns the full path to the directory named by key in settings. If the directory does not exist, it and all intermediary dirs will be created.
     Parameters:
       key - Key to use going to settings.
public String getUserAgent(CrawlURI curi)
public void setController(CrawlController controller)
    

Field Detail
ATTR_BDB_CACHE_PERCENT
final public static String ATTR_BDB_CACHE_PERCENT
Percentage of heap to allocate to bdb cache.
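
As a rough illustration of what a heap percentage amounts to in bytes, the sketch below derives the share from a heap size. The class and method names are hypothetical and are not part of Heritrix or bdbje; how the crawler actually applies this setting is not shown in this document.

```java
// Hypothetical helper, not part of Heritrix: turns a heap percentage into a byte count.
final class CachePercentSketch {

    /** Returns the number of bytes that {@code percent} of {@code heapBytes} represents. */
    static long cacheBytes(long heapBytes, int percent) {
        if (heapBytes < 0 || percent < 0 || percent > 100) {
            throw new IllegalArgumentException("bad input: " + heapBytes + ", " + percent);
        }
        // Split the multiplication so it cannot overflow, even when the heap size
        // is reported as Long.MAX_VALUE (no inherent limit).
        return heapBytes / 100 * percent + (heapBytes % 100) * percent / 100;
    }

    /** Same calculation against the maximum heap of the running JVM. */
    static long cacheBytesForThisJvm(int percent) {
        return cacheBytes(Runtime.getRuntime().maxMemory(), percent);
    }
}
```

For instance, `CachePercentSketch.cacheBytes(1000L, 60)` yields 600.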



ATTR_CHECKPOINTS_PATH
final public static String ATTR_CHECKPOINTS_PATH

ATTR_CHECKPOINT_COPY_BDBJE_LOGS
final public static String ATTR_CHECKPOINT_COPY_BDBJE_LOGS
When checkpointing, copy the bdb logs. Default is true. If false, logs are not copied on checkpoint and bdbje is told never to delete log files; instead it renames files-to-delete with a '.del' extension. The assumption is that when this setting is false, an external process manages the removal of bdbje log files, and that when it comes time to recover from a checkpoint, the files that comprise the checkpoint are assembled manually.
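
The two behaviours described above can be modelled with plain file operations. This is only a sketch: it does not call bdbje, the class and method names are hypothetical, and deleting an obsolete log when the setting is true is an assumption about the default behaviour rather than something this document states.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;

// Hypothetical model of the copy-logs setting; not the Heritrix or bdbje implementation.
final class CheckpointLogsSketch {

    /** With copyLogs == true, every log file is copied into the checkpoint directory. */
    static void onCheckpoint(boolean copyLogs, List<Path> logFiles, Path checkpointDir)
            throws IOException {
        if (!copyLogs) {
            return; // nothing is copied; an external process is assumed to manage the logs
        }
        Files.createDirectories(checkpointDir);
        for (Path log : logFiles) {
            Files.copy(log, checkpointDir.resolve(log.getFileName()),
                    StandardCopyOption.REPLACE_EXISTING);
        }
    }

    /**
     * With copyLogs == false, an obsolete log is renamed with a '.del' suffix
     * instead of being deleted. Returns the renamed path, or null if it was deleted.
     */
    static Path retireLog(boolean copyLogs, Path obsoleteLog) throws IOException {
        if (copyLogs) {
            Files.delete(obsoleteLog); // assumed default: obsolete logs are simply removed
            return null;
        }
        Path renamed = obsoleteLog.resolveSibling(obsoleteLog.getFileName() + ".del");
        return Files.move(obsoleteLog, renamed, StandardCopyOption.REPLACE_EXISTING);
    }
}
```

Keeping the '.del' files around is what lets an external process decide when a log is truly safe to remove.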



ATTR_DISK_PATH
final public static String ATTR_DISK_PATH

ATTR_EXTRACT_PROCESSORS
final public static String ATTR_EXTRACT_PROCESSORS

ATTR_FETCH_PROCESSORS
final public static String ATTR_FETCH_PROCESSORS

ATTR_FROM
final public static String ATTR_FROM

ATTR_HTTP_HEADERS
final public static String ATTR_HTTP_HEADERS

ATTR_LOGGERS
final public static String ATTR_LOGGERS

ATTR_LOGS_PATH
final public static String ATTR_LOGS_PATH

ATTR_MAX_BYTES_DOWNLOAD
final public static String ATTR_MAX_BYTES_DOWNLOAD

ATTR_MAX_DOCUMENT_DOWNLOAD
final public static String ATTR_MAX_DOCUMENT_DOWNLOAD

ATTR_MAX_TIME_SEC
final public static String ATTR_MAX_TIME_SEC

ATTR_MAX_TOE_THREADS
final public static String ATTR_MAX_TOE_THREADS

ATTR_NAME
final public static String ATTR_NAME

ATTR_POST_PROCESSORS
final public static String ATTR_POST_PROCESSORS

ATTR_PRE_FETCH_PROCESSORS
final public static String ATTR_PRE_FETCH_PROCESSORS

ATTR_RECORDER_IN_BUFFER
final public static String ATTR_RECORDER_IN_BUFFER

ATTR_RECORDER_OUT_BUFFER
final public static String ATTR_RECORDER_OUT_BUFFER

ATTR_RECOVER_PATH
final public static String ATTR_RECOVER_PATH

ATTR_RECOVER_RETAIN_FAILURES
final public static String ATTR_RECOVER_RETAIN_FAILURES

ATTR_RULES
final public static String ATTR_RULES

ATTR_SCRATCH_PATH
final public static String ATTR_SCRATCH_PATH

ATTR_SETTINGS_DIRECTORY
final public static String ATTR_SETTINGS_DIRECTORY

ATTR_STATE_PATH
final public static String ATTR_STATE_PATH

ATTR_USER_AGENT
final public static String ATTR_USER_AGENT

ATTR_WRITE_PROCESSORS
final public static String ATTR_WRITE_PROCESSORS

DEFAULT_CHECKPOINT_COPY_BDBJE_LOGS
final public static Boolean DEFAULT_CHECKPOINT_COPY_BDBJE_LOGS



Constructor Detail
CrawlOrder
public CrawlOrder()
Construct a CrawlOrder.




Method Detail
checkUserAgentAndFrom
public void checkUserAgentAndFrom() throws FatalConfigurationException
Checks if the User Agent and From fields are set 'correctly' in the specified Crawl Order.
throws:
  FatalConfigurationException -
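
This page does not say what 'correctly' means. The sketch below shows one plausible reading, stated purely as an assumption: the user-agent must contain a parenthesised contact URL prefixed with `+`, and the from value must look like an email address. The class name, both patterns, and the use of IllegalStateException in place of FatalConfigurationException are all hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical validation sketch; the rules here are assumptions, not the documented behaviour.
final class UserAgentFromCheckSketch {

    // Assumed rule: some product text followed by "(... +http://host.tld ...)".
    private static final Pattern USER_AGENT =
            Pattern.compile("\\S+.*\\(.*\\+https?://\\S+\\.\\S+.*\\).*");

    // Assumed rule: something@host.tld, with no whitespace.
    private static final Pattern FROM = Pattern.compile("\\S+@\\S+\\.\\S+");

    /** Throws if either value is missing or does not match its assumed pattern. */
    static void check(String userAgent, String from) {
        if (userAgent == null || !USER_AGENT.matcher(userAgent).matches()) {
            throw new IllegalStateException("unacceptable user-agent: " + userAgent);
        }
        if (from == null || !FROM.matcher(from).matches()) {
            throw new IllegalStateException("unacceptable from: " + from);
        }
    }
}
```

Under these assumed rules, a pair such as `"Mozilla/5.0 (compatible; heritrix +http://example.org/crawler)"` and `"crawler@example.org"` passes, while a bare `"heritrix"` user-agent is rejected.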



getCheckpointsDirectory
public File getCheckpointsDirectory()
Returns: Checkpoint directory.

getController
public CrawlController getController()
Returns: The crawl controller.

getCrawlOrderName
public String getCrawlOrderName()
Get the name of the order file.
Returns: the name of the order file.

getFrom
public String getFrom(CrawlURI curi)
Parameters:
  curi -
Returns: from header value to use

getLoggers
public MapType getLoggers()
Returns the Map of the StatisticsTracking modules that are included in the configuration that the current instance of this class is representing.
Returns: Map of the StatisticsTracking modules

getMaxToes
public int getMaxToes()
Returns the set number of maximum toe threads.
Returns: Number of maximum toe threads

getRobotsHonoringPolicy
public RobotsHonoringPolicy getRobotsHonoringPolicy()
This method gets the RobotsHonoringPolicy object from the orders file.
Returns: the new RobotsHonoringPolicy



getSettingsDir
public File getSettingsDir(String key) throws AttributeNotFoundException
Returns the full path to the directory named by key in settings. If the directory does not exist, it and all intermediary dirs will be created.
Parameters:
  key - Key to use going to settings.
Returns: Full path to directory named by key.
throws:
  AttributeNotFoundException -
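
The "create it and all intermediary dirs" behaviour can be sketched with `java.io.File` alone. The class name, the `base` parameter, and the rule that a relative value is resolved against `base` are assumptions made for illustration; the actual settings lookup by key is not reproduced here.

```java
import java.io.File;
import java.io.IOException;

// Hypothetical sketch of "return the full path, creating missing directories".
final class SettingsDirSketch {

    /**
     * Resolves {@code configured} against {@code base} when it is relative (an assumption),
     * creates the directory and any missing parents, and returns its absolute form.
     */
    static File ensureDir(File base, String configured) throws IOException {
        File dir = new File(configured);
        if (!dir.isAbsolute()) {
            dir = new File(base, configured);
        }
        // mkdirs() returns false when the directory already exists, so check before and after.
        if (!dir.isDirectory() && !dir.mkdirs() && !dir.isDirectory()) {
            throw new IOException("could not create " + dir.getAbsolutePath());
        }
        return dir.getAbsoluteFile();
    }
}
```

Calling `ensureDir` twice with the same arguments returns the same path; the second call finds the directory already present and creates nothing.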



getUserAgent
public String getUserAgent(CrawlURI curi)
Parameters:
  curi -
Returns: user-agent header value to use

setController
public void setController(CrawlController controller)
Parameters:
  controller -



Methods inherited from org.archive.crawler.settings.ModuleType
public Type addElement(CrawlerSettings settings, Type type) throws InvalidAttributeValueException
protected void listUsedFiles(List<String> list)
