| java.lang.Object org.archive.crawler.frontier.WorkQueue
All known Subclasses: org.archive.crawler.frontier.BdbWorkQueue,
Method Summary | |
public void | clearHeld() |
final public int | compareTo(Object obj) |
abstract protected void | deleteItem(WorkQueueFrontier frontier, CrawlURI item) Removes the given item from the queue. |
public long | deleteMatching(WorkQueueFrontier frontier, String match) Delete URIs matching the given pattern from this queue. |
abstract protected long | deleteMatchingFromQueue(WorkQueueFrontier frontier, String match) Delete URIs matching the given pattern from this queue. |
public synchronized void | dequeue(WorkQueueFrontier frontier) Removes the peekItem from the queue and adjusts the count. |
public synchronized void | enqueue(WorkQueueFrontier frontier, CrawlURI curi) Add the given CrawlURI, noting its addition in running count. |
public int | expend(int amount) Decrease the internal running budget by the given amount. |
public String | getClassKey() |
public synchronized long | getCount() |
public String[] | getReports() |
public int | getSessionBalance() |
public CrawlSubstats | getSubstats() |
public long | getTotalExpenditure() |
public long | getWakeTime() |
public int | incrementSessionBalance(int amount) |
abstract protected void | insertItem(WorkQueueFrontier frontier, CrawlURI curi, boolean expectedPresent) Insert the given curi, whether it is already present or not. Hook for subclasses. |
public boolean | isHeld() |
public boolean | isOverBudget() Check whether queue has temporarily or permanently exceeded its budget. |
public boolean | isRetired() |
public void | noteError(int penalty) Note an error and assess an extra penalty. |
public CrawlURI | peek(WorkQueueFrontier frontier) Return the topmost queue item -- and remember it, such that even later higher-priority inserts don't change it. |
abstract protected CrawlURI | peekItem(WorkQueueFrontier frontier) |
public int | refund(int amount) |
public void | reportTo(PrintWriter writer) |
public void | reportTo(String name, PrintWriter writer) |
protected void | resume(WorkQueueFrontier frontier) Resumes this WorkQueue. |
public void | setActive(WorkQueueFrontier frontier, boolean b) |
public void | setHeld() |
public void | setRetired(boolean b) Set the retired status of this queue. |
public void | setSessionBalance(int balance) |
public void | setTotalBudget(long budget) Set the total expenditure level allowable before queue is considered inherently 'over-budget'. |
public void | setWakeTime(long l) |
public String | singleLineLegend() |
public String | singleLineReport() |
public void | singleLineReportTo(PrintWriter writer) |
protected void | suspend(WorkQueueFrontier frontier) Suspends this WorkQueue. |
public void | unpeek() Forgive the peek, allowing a subsequent peek to return a different item. |
public void | update(WorkQueueFrontier frontier, CrawlURI curi) Update the given CrawlURI, which should already be present. |
clearHeld | public void clearHeld()(Code) | | Clear isHeld to false
|
deleteItem | abstract protected void deleteItem(WorkQueueFrontier frontier, CrawlURI item) throws IOException(Code) | | Removes the given item from the queue.
This is only used to remove the first item in the queue,
so it is not necessary to implement a random-access queue.
Parameters: frontier - Work queues manager. throws: IOException - if there was a problem while deleting the item |
deleteMatching | public long deleteMatching(WorkQueueFrontier frontier, String match)(Code) | | Delete URIs matching the given pattern from this queue.
Parameters: frontier - Work queues manager. Parameters: match - the pattern to match. Returns: count of deleted URIs |
deleteMatchingFromQueue | abstract protected long deleteMatchingFromQueue(WorkQueueFrontier frontier, String match) throws IOException(Code) | | Delete URIs matching the given pattern from this queue.
Parameters: frontier - WorkQueues manager. Parameters: match - the pattern to match. Returns: count of deleted URIs. throws: IOException - if there was a problem while deleting |
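The two entries above read like a public wrapper around a storage-level hook. The following is a minimal sketch of that arrangement, not Heritrix code: `MatchDeleteSketch` is a hypothetical stand-in that keeps URIs as plain strings, and it assumes that the pattern is a Java regular expression matched against the whole URI and that the wrapper subtracts the deleted count from the running count. The reference text states neither.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical stand-in for a work queue; not the Heritrix WorkQueue class.
final class MatchDeleteSketch {
    private final List<String> uris = new ArrayList<>();
    private long count; // running count of queued URIs

    void enqueue(String uri) {
        uris.add(uri);
        count++;
    }

    // Storage-level hook: removes matching entries and reports how many were removed.
    // Assumption: 'match' is a regular expression applied to the whole URI.
    private long deleteMatchingFromQueue(String match) {
        Pattern pattern = Pattern.compile(match);
        int before = uris.size();
        uris.removeIf(uri -> pattern.matcher(uri).matches());
        return before - uris.size();
    }

    // Public wrapper: delegates to the hook, then adjusts the running count.
    long deleteMatching(String match) {
        long deleted = deleteMatchingFromQueue(match);
        count -= deleted;
        return deleted;
    }

    long getCount() {
        return count;
    }

    public static void main(String[] args) {
        MatchDeleteSketch queue = new MatchDeleteSketch();
        queue.enqueue("http://example.com/a.jpg");
        queue.enqueue("http://example.com/b.html");
        queue.enqueue("http://example.com/c.jpg");
        long deleted = queue.deleteMatching(".*\\.jpg");
        System.out.println(deleted + " deleted, " + queue.getCount() + " left");
    }
}
```

Keeping the count adjustment in the wrapper means a storage subclass only has to report how many entries it removed.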
dequeue | public synchronized void dequeue(WorkQueueFrontier frontier)(Code) | | Removes the peekItem from the queue and adjusts the count.
Parameters: frontier - Work queues manager. |
enqueue | public synchronized void enqueue(WorkQueueFrontier frontier, CrawlURI curi)(Code) | | Add the given CrawlURI, noting its addition in running count. (It
should not already be present.)
Parameters: frontier - Work queues manager. Parameters: curi - CrawlURI to insert. |
expend | public int expend(int amount)(Code) | | Decrease the internal running budget by the given amount.
Parameters: amount - amount to decrement. Returns: updated budget value |
getClassKey | public String getClassKey()(Code) | | Returns: classKey, the 'identifier', for this queue. |
getCount | public synchronized long getCount()(Code) | | Returns the count. |
getSessionBalance | public int getSessionBalance()(Code) | | Return the current session 'activity budget balance'.
Returns: session balance |
getTotalExpenditure | public long getTotalExpenditure()(Code) | | Return the tally of all expenditures on this queue.
Returns: total amount expended on this queue |
getWakeTime | public long getWakeTime()(Code) | | Returns: wakeTime |
incrementSessionBalance | public int incrementSessionBalance(int amount)(Code) | | Increase the internal running budget to be used before
deactivating the queue.
Parameters: amount - amount to increment. Returns: updated budget value |
insertItem | abstract protected void insertItem(WorkQueueFrontier frontier, CrawlURI curi, boolean expectedPresent) throws IOException(Code) | | Insert the given curi, whether it is already present or not.
Hook for subclasses.
Parameters: frontier - WorkQueueFrontier. Parameters: curi - CrawlURI to insert. throws: IOException - if there was a problem while inserting the item |
isHeld | public boolean isHeld()(Code) | | Whether the queue is already in a lifecycle stage --
such as ready, in-progress, snoozed -- and thus should
not be redundantly inserted to readyClassQueues.
Returns: isHeld |
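The held flag can be read as a guard against putting the same queue on the ready list twice. Below is a small sketch of that guard under stated assumptions: `HeldFlagSketch`, `SketchQueue`, and `readyIfNotHeld` are hypothetical names, and the ready list is a plain in-memory deque rather than the frontier's actual readyClassQueues structure.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical stand-ins; not the Heritrix frontier or WorkQueue classes.
final class HeldFlagSketch {
    static final class SketchQueue {
        private boolean held;

        boolean isHeld() { return held; }
        void setHeld() { held = true; }
        void clearHeld() { held = false; }
    }

    private final Queue<SketchQueue> readyQueues = new ArrayDeque<>();

    // Adds the queue to the ready list only if it is not already in a lifecycle stage.
    boolean readyIfNotHeld(SketchQueue queue) {
        if (queue.isHeld()) {
            return false; // already ready, in progress, or snoozed
        }
        queue.setHeld();
        readyQueues.add(queue);
        return true;
    }

    int readyCount() {
        return readyQueues.size();
    }

    public static void main(String[] args) {
        HeldFlagSketch frontier = new HeldFlagSketch();
        SketchQueue queue = new SketchQueue();
        System.out.println("first insert accepted: " + frontier.readyIfNotHeld(queue));
        System.out.println("second insert accepted: " + frontier.readyIfNotHeld(queue));
        System.out.println("ready queues: " + frontier.readyCount());
    }
}
```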
isOverBudget | public boolean isOverBudget()(Code) | | Check whether queue has temporarily or permanently exceeded
its budget.
Returns: true if queue is over its set budget(s) |
isRetired | public boolean isRetired()(Code) | | |
noteError | public void noteError(int penalty)(Code) | | Note an error and assess an extra penalty.
Parameters: penalty - additional amount to deduct |
peek | public CrawlURI peek(WorkQueueFrontier frontier)(Code) | | Return the topmost queue item -- and remember it,
such that even later higher-priority inserts don't
change it.
TODO: evaluate if this is really necessary
Parameters: frontier - Work queues manager. Returns: topmost queue item, or null |
refund | public int refund(int amount)(Code) | | A URI should not have been charged against the queue (e.g.
it was disregarded); return the amount expended.
Parameters: amount - amount to return. Returns: updated budget value |
setHeld | public void setHeld()(Code) | | Set isHeld to true
|
setRetired | public void setRetired(boolean b)(Code) | | Set the retired status of this queue.
Parameters: b - new value for retired status |
setSessionBalance | public void setSessionBalance(int balance)(Code) | | Set the session 'activity budget balance' to the given value
Parameters: balance - to use |
setTotalBudget | public void setTotalBudget(long budget)(Code) | | Set the total expenditure level allowable before queue is
considered inherently 'over-budget'.
Parameters: budget - |
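The budget-related methods (expend, refund, noteError, setSessionBalance, setTotalBudget, isOverBudget) can be pictured as simple arithmetic on two tallies. The sketch below is an illustration, not the Heritrix implementation: `BudgetSketch` is a hypothetical class, and it assumes that a depleted session balance is the "temporary" over-budget case, that reaching the total budget is the "permanent" one, that a negative total budget means no limit, and that an error penalty is charged to both tallies.

```java
// Hypothetical stand-in for the budget fields of a work queue; not Heritrix code.
final class BudgetSketch {
    private int sessionBalance;      // budget left in the current activity session
    private long totalExpenditure;   // tally of everything charged so far
    private long totalBudget = -1;   // assumption: negative means "no total limit"

    void setSessionBalance(int balance) { sessionBalance = balance; }
    void setTotalBudget(long budget) { totalBudget = budget; }
    long getTotalExpenditure() { return totalExpenditure; }

    // Charges the amount and returns the updated session balance.
    int expend(int amount) {
        sessionBalance -= amount;
        totalExpenditure += amount;
        return sessionBalance;
    }

    // Reverses an earlier charge and returns the updated session balance.
    int refund(int amount) {
        sessionBalance += amount;
        totalExpenditure -= amount;
        return sessionBalance;
    }

    // Assumption: an error penalty is charged like any other expenditure.
    void noteError(int penalty) {
        expend(penalty);
    }

    // Assumed thresholds: session depleted, or total budget reached.
    boolean isOverBudget() {
        boolean sessionDepleted = sessionBalance <= 0;
        boolean totalReached = totalBudget >= 0 && totalExpenditure >= totalBudget;
        return sessionDepleted || totalReached;
    }

    public static void main(String[] args) {
        BudgetSketch budget = new BudgetSketch();
        budget.setSessionBalance(3);
        budget.setTotalBudget(10);
        System.out.println("balance after expend(1): " + budget.expend(1));
        System.out.println("balance after expend(2): " + budget.expend(2));
        System.out.println("over budget: " + budget.isOverBudget());
        System.out.println("balance after refund(2): " + budget.refund(2));
        System.out.println("over budget: " + budget.isOverBudget());
    }
}
```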
setWakeTime | public void setWakeTime(long l)(Code) | | Parameters: l - |
unpeek | public void unpeek()(Code) | | Forgive the peek, allowing a subsequent peek to
return a different item.
|
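The interplay of peek, unpeek, and dequeue is easier to follow with a toy queue. The sketch below is not Heritrix code: `PeekSketch` is a hypothetical class that stores strings in a `java.util.PriorityQueue`, and it assumes that a smaller string sorts first (stands in for higher priority) and that dequeue requires a prior peek.

```java
import java.util.PriorityQueue;

// Hypothetical stand-in for a work queue; not the Heritrix WorkQueue class.
final class PeekSketch {
    private final PriorityQueue<String> items = new PriorityQueue<>();
    private String peekItem; // remembered result of the last peek, or null
    private long count;

    void enqueue(String item) {
        items.add(item);
        count++;
    }

    // Returns the topmost item and remembers it, so later inserts do not change it.
    String peek() {
        if (peekItem == null && count > 0) {
            peekItem = items.peek();
        }
        return peekItem;
    }

    // Forgets the remembered item so the next peek looks at the queue again.
    void unpeek() {
        peekItem = null;
    }

    // Removes the remembered item and adjusts the count.
    void dequeue() {
        if (peekItem == null) {
            throw new IllegalStateException("dequeue without a prior peek");
        }
        items.remove(peekItem);
        unpeek();
        count--;
    }

    long getCount() {
        return count;
    }

    public static void main(String[] args) {
        PeekSketch queue = new PeekSketch();
        queue.enqueue("m");
        System.out.println("peek: " + queue.peek());
        queue.enqueue("a"); // sorts ahead of "m"
        System.out.println("peek after insert: " + queue.peek());
        queue.unpeek();
        System.out.println("peek after unpeek: " + queue.peek());
        queue.dequeue();
        System.out.println("count after dequeue: " + queue.getCount());
    }
}
```

In this sketch the second peek still returns the remembered item even though a better-sorted item has arrived; only unpeek (or dequeue) lets the next peek pick a different one.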
update | public void update(WorkQueueFrontier frontier, CrawlURI curi)(Code) | | Update the given CrawlURI, which should already be present. (This
is not checked.) Equivalent to an enqueue without affecting the count.
Parameters: frontier - Work queues manager. Parameters: curi - CrawlURI to update. |
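The difference between enqueue and update comes down to whether the running count changes. A minimal sketch of that difference follows; it is not the Heritrix source. `UpdateSketch` is a hypothetical class that maps a URI string to a state string, and the idea that enqueue passes `expectedPresent = false` while update passes `true` to the insert hook is an assumption made for illustration.

```java
import java.util.TreeMap;

// Hypothetical stand-in for a work queue; not the Heritrix WorkQueue class.
final class UpdateSketch {
    private final TreeMap<String, String> store = new TreeMap<>(); // URI -> state
    private long count;

    // Storage hook: writes the entry whether or not the key is already present.
    // The expectedPresent flag is only a hint; this sketch ignores it.
    private void insertItem(String uri, String state, boolean expectedPresent) {
        store.put(uri, state);
    }

    // Adds a new URI and notes the addition in the running count.
    void enqueue(String uri, String state) {
        insertItem(uri, state, false);
        count++;
    }

    // Overwrites an existing URI; presence is not checked and the count is untouched.
    void update(String uri, String state) {
        insertItem(uri, state, true);
    }

    long getCount() {
        return count;
    }

    String stateOf(String uri) {
        return store.get(uri);
    }

    public static void main(String[] args) {
        UpdateSketch queue = new UpdateSketch();
        queue.enqueue("http://example.com/", "scheduled");
        queue.update("http://example.com/", "retrying");
        System.out.println("count: " + queue.getCount());
        System.out.println("state: " + queue.stateOf("http://example.com/"));
    }
}
```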