Java Doc for UriUniqFilter.java (Web Crawler: heritrix, package org.archive.crawler.datamodel)



org.archive.crawler.datamodel.UriUniqFilter

All known Subclasses: org.archive.crawler.util.SetBasedUriUniqFilter, org.archive.crawler.util.BdbUriUniqFilterTest, org.archive.crawler.util.BloomUriUniqFilterTest, org.archive.crawler.util.BenchmarkUriUniqFilters, org.archive.crawler.util.FPUriUniqFilterTest, org.archive.crawler.util.FPMergeUriUniqFilter
UriUniqFilter
public interface UriUniqFilter
A UriUniqFilter passes URI objects to a destination (receiver) if the passed URI object has not been previously seen. If already seen, the passed URI object is dropped.

For efficiency when comparing against a large history of seen URIs, URI objects may not be passed on immediately unless addNow() is used or a flush() is forced.
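
The pass-or-drop behaviour described above can be illustrated with a minimal in-memory sketch. This is not the Heritrix implementation: `SketchUri` and `SketchReceiver` are hypothetical stand-ins for `CandidateURI` and `HasUriReceiver`, the `receive` method name is an assumption, and a plain `HashSet` stands in for whatever history store a real filter uses.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for CandidateURI; it only carries the URI string.
final class SketchUri {
    final String uri;
    SketchUri(String uri) { this.uri = uri; }
}

// Hypothetical stand-in for HasUriReceiver; the method name is an assumption.
interface SketchReceiver {
    void receive(SketchUri item);
}

// In-memory sketch: forwards an item only the first time its key is seen.
final class SetSketchFilter {
    private final Set<String> seen = new HashSet<>();
    private SketchReceiver receiver;

    void setDestination(SketchReceiver receiver) { this.receiver = receiver; }

    void add(String key, SketchUri value) {
        // Set.add returns true only when the key was not already present.
        if (seen.add(key)) {
            receiver.receive(value);
        }
    }

    long count() { return seen.size(); }
}

final class SetSketchDemo {
    // Adds three candidates, two of which share a key, and returns what was forwarded.
    static List<String> run() {
        List<String> received = new ArrayList<>();
        SetSketchFilter filter = new SetSketchFilter();
        filter.setDestination(item -> received.add(item.uri));
        filter.add("http://example.com/", new SketchUri("http://example.com/"));
        filter.add("http://example.com/", new SketchUri("http://EXAMPLE.com/"));
        filter.add("http://example.org/", new SketchUri("http://example.org/"));
        return received;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

The second candidate is dropped because its key matches one already seen, even though its raw URI string differs; this is why the key is usually a canonicalized form of the value.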
author:
   gojomo
version:
   $Date: 2005-12-16 03:10:54 +0000 (Fri, 16 Dec 2005) $, $Revision: 4036 $


Inner Class: public interface HasUriReceiver



Method Summary
public void add(String key, CandidateURI value)
     Add given uri, if not already present.
public void addForce(String key, CandidateURI value)
     Add given uri, all the way through to underlying destination, even if already present. (Sometimes a URI must be fetched, or refetched, for example when DNS or robots info expires or the operator forces a refetch.)
public void addNow(String key, CandidateURI value)
     Immediately add uri.
public void close()
     Close down any allocated resources.
public long count()
     Count of already seen URIs.
public void forget(String key, CandidateURI value)
     Forget that the item was seen.
public void note(String key)
     Note item as seen, without passing through to receiver.
public long pending()
     Count of items added, but not yet filtered in or out.
public long requestFlush()
     Request that any pending items be added/dropped.
public void setDestination(HasUriReceiver receiver)
     Receiver of uniq URIs. Items that have not been seen before are passed through to this object.
Parameters:
  receiver - Object that will be passed items.
public void setProfileLog(File logfile)
     Set a File to receive a log for replay profiling.



Method Detail
add
public void add(String key, CandidateURI value)
Add given uri, if not already present.
Parameters:
  key - Usually a canonicalized version of value. This is the key used for lookups, forgets and insertions on the already-included list.
  value - item to add.



addForce
public void addForce(String key, CandidateURI value)
Add given uri, all the way through to underlying destination, even if already present. (Sometimes a URI must be fetched, or refetched, for example when DNS or robots info expires or the operator forces a refetch. A normal add() or addNow() would drop the URI without forwarding it on once it is determined to already be in the filter.)
Parameters:
  key - Usually a canonicalized version of uri. This is the key used for lookups, forgets and insertions on the already-included list.
  value - item to add.
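
The difference between add() and addForce() can be sketched as follows. This is an illustrative in-memory model, not Heritrix code: plain strings stand in for `CandidateURI`, a `delivered` list stands in for the receiver, and the class name `ForceSketchFilter` is hypothetical. Whether a real implementation records the key during addForce() is also an assumption made here.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch contrasting add() with addForce().
final class ForceSketchFilter {
    private final Set<String> seen = new HashSet<>();
    // Stands in for the receiver: every forwarded value is appended here.
    final List<String> delivered = new ArrayList<>();

    // Forwards the value only if the key has not been seen before.
    void add(String key, String value) {
        if (seen.add(key)) {
            delivered.add(value);
        }
    }

    // Always forwards the value; the key is recorded as seen (an assumption).
    void addForce(String key, String value) {
        seen.add(key);
        delivered.add(value);
    }
}

final class ForceSketchDemo {
    public static void main(String[] args) {
        ForceSketchFilter filter = new ForceSketchFilter();
        filter.add("dns:example.com", "first lookup");
        filter.add("dns:example.com", "duplicate, dropped");
        filter.addForce("dns:example.com", "forced refetch");
        System.out.println(filter.delivered);
    }
}
```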



addNow
public void addNow(String key, CandidateURI value)
Immediately add uri.
Parameters:
  key - Usually a canonicalized version of uri. This is the key used for lookups, forgets and insertions on the already-included list.
  value - item to add.



close
public void close()
Close down any allocated resources. It makes sense to call this when checkpointing.



count
public long count()
Count of already seen URIs.



forget
public void forget(String key, CandidateURI value)
Forget that the item was seen.
Parameters:
  key - Usually a canonicalized version of a URI. This is the key used for lookups, forgets and insertions on the already-included list.
  value - item to forget.



note
public void note(String key)
Note item as seen, without passing through to receiver.
Parameters:
  key - Usually a canonicalized version of a URI. This is the key used for lookups, forgets and insertions on the already-included list.
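
How note() and forget() interact with add() can be sketched with a small in-memory model. Everything here is illustrative: `NoteForgetSketch` is a hypothetical class, strings stand in for `CandidateURI`, and the `delivered` list stands in for the receiver.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of note() and forget() on an in-memory seen-set.
final class NoteForgetSketch {
    private final Set<String> seen = new HashSet<>();
    // Stands in for the receiver: every forwarded value is appended here.
    final List<String> delivered = new ArrayList<>();

    // Forwards the value only if the key has not been seen before.
    void add(String key, String value) {
        if (seen.add(key)) {
            delivered.add(value);
        }
    }

    // Marks the key as seen without forwarding anything.
    void note(String key) {
        seen.add(key);
    }

    // Removes the key so that a later add() is forwarded again.
    // The value parameter mirrors the documented signature and is unused here.
    void forget(String key, String value) {
        seen.remove(key);
    }

    long count() { return seen.size(); }
}

final class NoteForgetDemo {
    public static void main(String[] args) {
        NoteForgetSketch filter = new NoteForgetSketch();
        filter.note("http://example.com/");
        filter.add("http://example.com/", "dropped: already noted");
        filter.forget("http://example.com/", "unused");
        filter.add("http://example.com/", "forwarded after forget");
        System.out.println(filter.delivered + " count=" + filter.count());
    }
}
```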



pending
public long pending()
Count of items added, but not yet filtered in or out. Some implementations may buffer up large numbers of pending items to be evaluated in a later large batch/scan/merge with disk files.
Returns:
  Count of items added but not yet evaluated.



requestFlush
public long requestFlush()
Request that any pending items be added/dropped. Implementors may ignore the request if a flush would be too expensive/too soon.
Returns:
  Number added.
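
The relationship between add(), addNow(), pending() and requestFlush() in a buffering implementation can be sketched as below. This is a simplified model under stated assumptions, not the Heritrix code: `BatchSketchFilter` is a hypothetical class, strings stand in for `CandidateURI`, an in-memory list stands in for the batch that a real implementation might merge with disk files, and this sketch never declines a flush request.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of a filter that defers evaluation until a flush.
final class BatchSketchFilter {
    private final Set<String> seen = new HashSet<>();
    // Buffered {key, value} pairs that have not yet been evaluated.
    private final List<String[]> buffer = new ArrayList<>();
    // Stands in for the receiver: every forwarded value is appended here.
    final List<String> delivered = new ArrayList<>();

    // Queues the item; nothing is forwarded until a flush happens.
    void add(String key, String value) {
        buffer.add(new String[] {key, value});
    }

    // Evaluates the item immediately, bypassing the buffer.
    void addNow(String key, String value) {
        if (seen.add(key)) {
            delivered.add(value);
        }
    }

    // Number of items queued but not yet filtered in or out.
    long pending() {
        return buffer.size();
    }

    // Evaluates every queued item and returns how many were forwarded.
    long requestFlush() {
        long added = 0;
        for (String[] entry : buffer) {
            if (seen.add(entry[0])) {
                delivered.add(entry[1]);
                added++;
            }
        }
        buffer.clear();
        return added;
    }
}

final class BatchSketchDemo {
    public static void main(String[] args) {
        BatchSketchFilter filter = new BatchSketchFilter();
        filter.add("a", "a");
        filter.add("a", "a-duplicate");
        filter.add("b", "b");
        System.out.println("pending=" + filter.pending());
        System.out.println("added=" + filter.requestFlush());
        System.out.println(filter.delivered);
    }
}
```

Note that pending() counts queued items before deduplication, so it can exceed the number that requestFlush() later reports as added.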



setDestination
public void setDestination(HasUriReceiver receiver)
Receiver of uniq URIs. Items that have not been seen before are passed through to this object.
Parameters:
  receiver - Object that will be passed items. Must implement the HasUriReceiver interface.



setProfileLog
public void setProfileLog(File logfile)
Set a File to receive a log for replay profiling.


