it.unimi.dsi.mg4j.index.TermProcessor
All known subclasses: it.unimi.dsi.mg4j.index.DowncaseTermProcessor, it.unimi.dsi.mg4j.index.snowball.AbstractSnowballTermProcessor, it.unimi.dsi.mg4j.index.NullTermProcessor
TermProcessor | public interface TermProcessor extends Serializable, FlyweightPrototype<TermProcessor>
A term processor, implementing term/prefix transformation and possibly term/prefix filtering.
Index construction sometimes requires modifying
the given terms: downcasing, stemming, and so on. The same
transformation must be applied to terms in a query. This
interface provides a uniform way to perform arbitrary term
transformations.
Index construction also requires term filtering:
TermProcessor.processTerm(MutableString) may
return false, indicating that the term should not
be processed at all (e.g., because it is a stopword).
Additionally, the method
TermProcessor.processPrefix(MutableString) may
process a prefix analogously (used for prefix queries).
Implementations are encouraged to expose a singleton, when
possible, by means of the static factory method getInstance().
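The contract described so far (in-place transformation, filtering through the boolean result, and a getInstance() singleton) can be illustrated with a minimal sketch. It does not use the MG4J classes: SketchTermProcessor is a hypothetical stand-in for this interface, StringBuilder replaces MutableString so that the code compiles without MG4J, and the stopword list is invented for the example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-in for TermProcessor; StringBuilder replaces MutableString.
interface SketchTermProcessor {
    boolean processTerm(StringBuilder term);
    boolean processPrefix(StringBuilder prefix);
    SketchTermProcessor copy();
}

// Downcases terms in place and rejects a few (made-up) stopwords.
final class DowncaseStopwordProcessor implements SketchTermProcessor {
    private static final DowncaseStopwordProcessor INSTANCE = new DowncaseStopwordProcessor();
    // Assumption: an illustrative stopword list, not one shipped with any library.
    private static final Set<String> STOPWORDS = new HashSet<>(Arrays.asList("the", "a", "of"));

    private DowncaseStopwordProcessor() {}

    // Static factory exposing the singleton, as the documentation encourages.
    static DowncaseStopwordProcessor getInstance() {
        return INSTANCE;
    }

    // Replaces the content of the buffer with its lowercase form.
    private static void downcase(StringBuilder s) {
        String lower = s.toString().toLowerCase(Locale.ROOT);
        s.setLength(0);
        s.append(lower);
    }

    @Override
    public boolean processTerm(StringBuilder term) {
        if (term == null) return false;
        downcase(term);
        // false tells the caller that the term should be skipped.
        return !STOPWORDS.contains(term.toString());
    }

    @Override
    public boolean processPrefix(StringBuilder prefix) {
        if (prefix == null) return false;
        downcase(prefix);
        // Any non-null prefix might match some indexed word.
        return true;
    }

    @Override
    public DowncaseStopwordProcessor copy() {
        // The processor holds no mutable state, so the singleton can serve as its own copy.
        return this;
    }
}
```

Because this sketch is stateless, returning the singleton from copy() is harmless; a processor with internal buffers would have to return a fresh instance instead.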
Warning: implementations of this interface are not required
to be thread-safe, but they provide flyweight copies
(see it.unimi.dsi.lang.FlyweightPrototype).
The TermProcessor.copy() method is strengthened so as to return an instance of this interface.
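One way to read the warning above is that each thread should work on its own copy of a processor. The following sketch shows that pattern under stated assumptions: CopyableProcessor and CountingDowncaser are hypothetical stand-ins (not MG4J classes), StringBuilder replaces MutableString, and the per-instance counter exists only to make the processor deliberately unsafe to share.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the copy() part of the TermProcessor contract.
interface CopyableProcessor {
    boolean processTerm(StringBuilder term);
    CopyableProcessor copy();
}

// Deliberately not thread-safe: it keeps mutable per-instance state.
final class CountingDowncaser implements CopyableProcessor {
    private int processed; // unsynchronized counter, unsafe to share between threads

    @Override
    public boolean processTerm(StringBuilder term) {
        if (term == null) return false;
        for (int i = 0; i < term.length(); i++) {
            term.setCharAt(i, Character.toLowerCase(term.charAt(i)));
        }
        processed++;
        return true;
    }

    @Override
    public CountingDowncaser copy() {
        // A fresh instance, so that no mutable state is shared with the prototype.
        return new CountingDowncaser();
    }

    int processed() {
        return processed;
    }
}

final class PerThreadCopies {
    // Runs the given number of threads, each using its own copy of the prototype,
    // and returns how many terms each copy processed.
    static int[] run(CountingDowncaser prototype, int threads, int termsPerThread) {
        int[] counts = new int[threads];
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            final int id = t;
            final CountingDowncaser own = prototype.copy(); // one copy per thread
            Thread worker = new Thread(() -> {
                for (int i = 0; i < termsPerThread; i++) {
                    own.processTerm(new StringBuilder("Term" + i));
                }
                counts[id] = own.processed();
            });
            workers.add(worker);
            worker.start();
        }
        try {
            for (Thread worker : workers) worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counts;
    }
}
```

Since no copy is shared, the counter needs no synchronization; sharing the prototype itself across the threads would make the increments race.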
This interface was originally suggested by Fabien Campagne.
Method Summary
public TermProcessor | copy()
public boolean | processPrefix(MutableString prefix) | Processes the given prefix, leaving the result in the same mutable string. This method is not used during the indexing phase, but rather at query time.
public boolean | processTerm(MutableString term) | Processes the given term, leaving the result in the same mutable string. Parameters: term - a mutable string containing the term to be processed, or null.
processPrefix | public boolean processPrefix(MutableString prefix)
Processes the given prefix, leaving the result in the same mutable string.
This method is not used during the indexing phase, but rather at query
time. If the user wants to specify a prefix query, it is sometimes necessary
to transform the prefix (e.g., by downcasing it).
This method is unlikely to return false, as it is usually not
possible to foresee the prefixes of indexable words. If no natural
transformation applies, this method should leave its argument unchanged.
Parameters: prefix - a mutable string containing a prefix to be processed, or null.
Returns: true if the prefix is not null and there might be an indexed word starting with prefix, false otherwise.
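As a rough illustration of why a prefix must be transformed at query time, the sketch below normalizes a user-supplied prefix and then scans a list of already downcased index terms. Everything here is an assumption made for the example: PrefixQuerySketch, its processPrefix stand-in, and the linear scan are not how MG4J resolves prefix queries, and StringBuilder replaces MutableString.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class PrefixQuerySketch {
    // Stand-in for processPrefix: downcases in place and returns false only for null.
    static boolean processPrefix(StringBuilder prefix) {
        if (prefix == null) return false;
        String lower = prefix.toString().toLowerCase(Locale.ROOT);
        prefix.setLength(0);
        prefix.append(lower);
        return true;
    }

    // Returns the indexed terms that start with the processed form of userPrefix.
    static List<String> expand(String userPrefix, List<String> indexedTerms) {
        List<String> matches = new ArrayList<>();
        StringBuilder prefix = userPrefix == null ? null : new StringBuilder(userPrefix);
        // A false result means that no indexed word can start with this prefix.
        if (!processPrefix(prefix)) return matches;
        String processed = prefix.toString();
        for (String term : indexedTerms) {
            if (term.startsWith(processed)) matches.add(term);
        }
        return matches;
    }
}
```

Without the downcasing step, a query prefix such as "Ind" would match nothing in an index whose terms were downcased during construction.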
processTerm | public boolean processTerm(MutableString term)
Processes the given term, leaving the result in the same mutable string.
Parameters: term - a mutable string containing the term to be processed, or null.
Returns: true if the term is not null and should be indexed, false otherwise.
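From the caller's side, the contract of processTerm can be sketched as a loop that reads the transformed term back from the same buffer and skips any term for which the result is false. IndexingLoopSketch, its processTerm stand-in, the single stopword, and the whitespace tokenization are all assumptions made for illustration; StringBuilder replaces MutableString.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class IndexingLoopSketch {
    // Stand-in for processTerm: downcases in place and rejects one hypothetical stopword.
    static boolean processTerm(StringBuilder term) {
        if (term == null) return false;
        String lower = term.toString().toLowerCase(Locale.ROOT);
        term.setLength(0);
        term.append(lower);
        return !lower.equals("the");
    }

    // Splits text on whitespace and returns the processed terms that should be indexed.
    static List<String> termsToIndex(String text) {
        List<String> kept = new ArrayList<>();
        for (String token : text.trim().split("\\s+")) {
            if (token.isEmpty()) continue;
            StringBuilder term = new StringBuilder(token);
            // The result is left in the same buffer; false means "do not index".
            if (processTerm(term)) kept.add(term.toString());
        }
        return kept;
    }
}
```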