websphinx.Classifier
All known Subclasses: websphinx.StandardClassifier,
Classifier
public interface Classifier extends java.io.Serializable

Classifier interface. A classifier is a helper object that annotates
pages and links with labels (using Page.setLabel() and Link.setLabel()).
When a page is retrieved by a crawler, it is passed to the classify()
method of every Classifier registered with the crawler. Here are some
typical uses for classifiers:
- classifying links into categories like child or parent (see
websphinx.StandardClassifier);
- classifying pages into categories like biology or computers;
- recognizing and parsing pages formatted in a particular style, such as
AltaVista, Yahoo, or latex2html (e.g., the search engine classifiers
in websphinx.searchengine).
Method Summary
abstract public void | classify(Page page) | Classify a page.
public float | getPriority() | Get priority of this classifier.
classify
abstract public void classify(Page page)

Classify a page. Typically, the classifier calls page.setLabel() and
page.setField() to mark up the page. The classifier may also look
through the page's links and call link.setLabel() to mark them up.
Parameters: page - Page to classify
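
As an illustration of the classify() contract described above, here is a minimal sketch of a classifier that labels links and then the page. The Page, Link, and Classifier types in this block are simplified stand-ins written for the example (the real websphinx classes have a richer API and may differ in signatures), and PdfLinkClassifier and its label names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Simplified stand-in for websphinx.Link (assumption, not the real class).
class Link {
    private final String url;
    private final Set<String> labels = new HashSet<>();
    Link(String url) { this.url = url; }
    String getURL() { return url; }
    void setLabel(String name) { labels.add(name); }
    boolean hasLabel(String name) { return labels.contains(name); }
}

// Simplified stand-in for websphinx.Page (assumption, not the real class).
class Page {
    private final List<Link> links = new ArrayList<>();
    private final Set<String> labels = new HashSet<>();
    List<Link> getLinks() { return links; }
    void addLink(Link link) { links.add(link); }
    void setLabel(String name) { labels.add(name); }
    boolean hasLabel(String name) { return labels.contains(name); }
}

// Stand-in for the interface documented on this page.
interface Classifier extends java.io.Serializable {
    void classify(Page page);
    float getPriority();
}

// Hypothetical classifier: marks links that point at PDF files, and marks
// the page itself when it contains at least one such link.
class PdfLinkClassifier implements Classifier {
    public static final float priority = 0.0F;

    @Override
    public void classify(Page page) {
        for (Link link : page.getLinks()) {
            if (link.getURL().toLowerCase().endsWith(".pdf")) {
                link.setLabel("pdf");
                page.setLabel("has-pdf");
            }
        }
    }

    @Override
    public float getPriority() { return priority; }
}
```

The classifier only reads the page and sets labels. It keeps no per-page state, which keeps it trivially serializable.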
getPriority
public float getPriority()

Get the priority of this classifier. Classifiers with lower priority
values execute first. A classifier should also define a public constant
named priority so that classifiers that depend on it can compute their
priorities statically. For example, if your classifier depends on
FooClassifier and BarClassifier, you might set your priority as:
public static final float priority = Math.max(FooClassifier.priority, BarClassifier.priority) + 1;
public float getPriority() { return priority; }
Returns: priority of this classifier
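
The priority convention can be sketched end to end as follows. This block assumes stand-in Page and Classifier types, hypothetical FooClassifier, BarClassifier, and FooBarClassifier classes with made-up priority values, and a hypothetical classifyAll() helper that runs classifiers in ascending priority order. It is not the crawler's actual dispatch code, only an illustration of why a dependent classifier takes the maximum of its dependencies' priorities plus one.

```java
import java.util.ArrayList;
import java.util.List;

// Minimal stand-in for websphinx.Page: records labels in the order set.
class Page {
    final List<String> labels = new ArrayList<>();
    void setLabel(String name) { labels.add(name); }
    boolean hasLabel(String name) { return labels.contains(name); }
}

// Stand-in for the interface documented on this page.
interface Classifier extends java.io.Serializable {
    void classify(Page page);
    float getPriority();
}

class FooClassifier implements Classifier {
    public static final float priority = 0.0F;
    public void classify(Page page) { page.setLabel("foo"); }
    public float getPriority() { return priority; }
}

class BarClassifier implements Classifier {
    public static final float priority = 2.0F;
    public void classify(Page page) { page.setLabel("bar"); }
    public float getPriority() { return priority; }
}

// Depends on the labels set by FooClassifier and BarClassifier, so its
// priority is computed statically to be higher than both.
class FooBarClassifier implements Classifier {
    public static final float priority =
        Math.max(FooClassifier.priority, BarClassifier.priority) + 1;
    public void classify(Page page) {
        if (page.hasLabel("foo") && page.hasLabel("bar")) {
            page.setLabel("foobar");
        }
    }
    public float getPriority() { return priority; }
}

class PriorityDemo {
    // Hypothetical dispatch: lower priority values execute first.
    static void classifyAll(List<Classifier> classifiers, Page page) {
        List<Classifier> ordered = new ArrayList<>(classifiers);
        ordered.sort((a, b) -> Float.compare(a.getPriority(), b.getPriority()));
        for (Classifier c : ordered) {
            c.classify(page);
        }
    }
}
```

Because the ordering depends only on getPriority(), the registration order of the classifiers does not matter in this sketch: FooBarClassifier always sees the labels from its two dependencies.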