java.lang.Object
   it.unimi.dsi.mg4j.index.Index
      it.unimi.dsi.mg4j.index.cluster.IndexCluster
All known subclasses: it.unimi.dsi.mg4j.index.cluster.DocumentalCluster, it.unimi.dsi.mg4j.index.cluster.LexicalCluster
IndexCluster | abstract public class IndexCluster extends Index | | An abstract index cluster. An index cluster is an index
exposing transparently a list of local indices as a single
global index. A cluster is usually generated by
partitioning an index lexically or documentally, but nothing
prevents the creation of hand-made clusters.
Note that, upon creation of an instance, the main index key
of all local indices is set to that instance.
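As an illustration of how several local indices can appear as one global index, here is a minimal sketch of one possible document mapping. It assumes a contiguous documental partition described by strictly increasing cut points. The class `ContiguousMapping` and its methods are hypothetical and not part of MG4J; in a real cluster the mapping is presumably determined by the serialised ClusteringStrategy.

```java
import java.util.Arrays;

/** Hypothetical illustration (not an MG4J class): maps global document pointers to local ones. */
class ContiguousMapping {
    /** cutPoints[i] is the first global document of local index i; the last entry is the total number of documents. */
    private final int[] cutPoints;

    ContiguousMapping(int... cutPoints) {
        this.cutPoints = cutPoints.clone();
    }

    /** Returns the number of the local index that holds the given global document. */
    int localIndex(int globalDocument) {
        if (globalDocument < 0 || globalDocument >= cutPoints[cutPoints.length - 1])
            throw new IndexOutOfBoundsException("No such document: " + globalDocument);
        int pos = Arrays.binarySearch(cutPoints, globalDocument);
        // An exact hit means the document opens block pos; otherwise it lies in the block before the insertion point.
        return pos >= 0 ? pos : -pos - 2;
    }

    /** Returns the pointer of the given global document inside its local index. */
    int localDocument(int globalDocument) {
        return globalDocument - cutPoints[localIndex(globalDocument)];
    }

    /** Returns the global pointer of a document of a local index. */
    int globalDocument(int localIndex, int localDocument) {
        return cutPoints[localIndex] + localDocument;
    }

    public static void main(String[] args) {
        // Three local indices holding documents 0..99, 100..249 and 250..299.
        ContiguousMapping mapping = new ContiguousMapping(0, 100, 250, 300);
        System.out.println(mapping.localIndex(260));    // prints 2
        System.out.println(mapping.localDocument(260)); // prints 10
    }
}
```

A lexical partition would need no such mapping for documents, since every local index would then cover the whole collection and only the terms would be split.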
An index cluster is defined by a property file. The only properties common
to all index clusters are localindex, which can be specified multiple
times (order is relevant) and contains the URIs of the local indices of the cluster,
and strategy, which contains the filename of a serialised
it.unimi.dsi.mg4j.index.cluster.ClusteringStrategy.
The indices will be loaded using
it.unimi.dsi.mg4j.index.Index.getInstance(CharSequence, boolean, boolean),
so there is no restriction on the URIs that can be used (e.g., you can cluster
a set of remote indices).
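The layout of such a property file can be sketched as follows. The sample content (the basenames `example-0` and `example-1`, the strategy filename, and the use of a fully qualified class name as the value of indexclass) is an assumption made for illustration only. The parser is also hypothetical: it is written by hand, using only the standard library, because `java.util.Properties` keeps a single value per key and would lose repeated localindex entries.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration (not an MG4J class): reads a property file with repeated keys. */
class ClusterPropertiesSketch {
    /** Parses "key = value" lines, keeping every value of a repeated key in file order. */
    static Map<String, List<String>> parse(String text) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        for (String line : text.split("\\R")) {
            line = line.trim();
            if (line.isEmpty() || line.startsWith("#")) continue; // Skip blank lines and comments.
            int eq = line.indexOf('=');
            if (eq < 0) continue; // Skip lines that are not key/value pairs.
            String key = line.substring(0, eq).trim();
            String value = line.substring(eq + 1).trim();
            result.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumed sample content; all values are placeholders.
        String sample = String.join("\n",
            "# A hand-made cluster of two local indices",
            "indexclass = it.unimi.dsi.mg4j.index.cluster.DocumentalCluster",
            "localindex = example-0",
            "localindex = example-1",
            "strategy = example.strategy");
        // The order of the localindex entries is preserved, as it is relevant.
        System.out.println(parse(sample).get("localindex")); // prints [example-0, example-1]
    }
}
```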
If you plan to use global document sizes (e.g., for scoring), you will need
to load them explicitly using the property
it.unimi.dsi.mg4j.index.Index.UriKeys.SIZES, which must specify
a size file for the whole collection. If you are clustering a partitioned index,
this is usually the original size file.
Optionally, an index cluster may provide Bloom filters
to reduce useless access to local indices that do not contain a term. The filters
have the standard extension
IndexCluster.BLOOM_EXTENSION.
This class exposes a static factory method
that uses the indexclass property to load the appropriate implementing subclass;
Bloom filters are loaded automatically.
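The effect of per-index term filters can be sketched with a small, self-contained Bloom filter. This is only an illustration of the technique: the class `TermFilterSketch`, its hash functions and the `candidates` helper are hypothetical and do not reproduce the BloomFilter class used by MG4J. A Bloom filter never gives false negatives, so a local index can be skipped safely whenever its filter rejects a term, while a false positive merely causes one useless access.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

/** Hypothetical illustration (not the MG4J BloomFilter): a tiny Bloom filter over terms. */
class TermFilterSketch {
    private final BitSet bits;
    private final int size;
    private final int hashes;

    TermFilterSketch(int size, int hashes) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashes = hashes;
    }

    /** Returns the i-th bit position for a term, using double hashing. */
    private int position(CharSequence term, int i) {
        int h1 = term.toString().hashCode();
        int h2 = 0x811C9DC5; // FNV-1a style second hash.
        for (int k = 0; k < term.length(); k++) {
            h2 ^= term.charAt(k);
            h2 *= 16777619;
        }
        h2 |= 1; // Force an odd step so that successive positions differ.
        return Math.floorMod(h1 + i * h2, size);
    }

    void add(CharSequence term) {
        for (int i = 0; i < hashes; i++) bits.set(position(term, i));
    }

    /** Returns false only if the term was certainly never added. */
    boolean contains(CharSequence term) {
        for (int i = 0; i < hashes; i++)
            if (!bits.get(position(term, i))) return false;
        return true;
    }

    /** Returns the local indices that may contain the term; a null filter array disables filtering. */
    static List<Integer> candidates(TermFilterSketch[] termFilter, int numberOfLocalIndices, CharSequence term) {
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < numberOfLocalIndices; i++)
            if (termFilter == null || termFilter[i].contains(term)) result.add(i);
        return result;
    }

    public static void main(String[] args) {
        TermFilterSketch[] termFilter = { new TermFilterSketch(1024, 3), new TermFilterSketch(1024, 3) };
        termFilter[0].add("alpha"); // Local index 0 contains "alpha".
        termFilter[1].add("beta");  // Local index 1 contains "beta".
        // Local index 0 is always listed for "alpha"; index 1 appears only on a false positive.
        System.out.println(candidates(termFilter, 2, "alpha"));
    }
}
```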
|
Inner Class: public static enum PropertyKeys | |
Constructor Summary | |
protected | IndexCluster(Index[] localIndex, BloomFilter[] termFilter, int numberOfDocuments, int numberOfTerms, long numberOfPostings, long numberOfOccurrences, int maxCount, Payload payload, boolean hasCounts, boolean hasPositions, TermProcessor termProcessor, String field, IntList sizes, Properties properties) |
Method Summary | |
public static Index | getInstance(CharSequence basename, boolean randomAccess, boolean documentSizes, EnumMap<UriKeys, String> queryProperties) | Returns a new index cluster. |
public void | keyIndex(Index newKeyIndex) |
BLOOM_EXTENSION | final public static String BLOOM_EXTENSION | | The default extension for Bloom term filters.
|
STRATEGY_DEFAULT_EXTENSION | final public static String STRATEGY_DEFAULT_EXTENSION | | The default extension of a strategy.
|
localIndex | final protected Index[] localIndex | | The local indices of this cluster.
|
termFilter | final protected BloomFilter[] termFilter | | An array of Bloom filters to reduce index access, or null.
|
IndexCluster | protected IndexCluster(Index[] localIndex, BloomFilter[] termFilter, int numberOfDocuments, int numberOfTerms, long numberOfPostings, long numberOfOccurrences, int maxCount, Payload payload, boolean hasCounts, boolean hasPositions, TermProcessor termProcessor, String field, IntList sizes, Properties properties) | | |
getInstance | public static Index getInstance(CharSequence basename, boolean randomAccess, boolean documentSizes, EnumMap<UriKeys, String> queryProperties) throws ConfigurationException, IOException, ClassNotFoundException, SecurityException, URISyntaxException, InstantiationException, IllegalAccessException, InvocationTargetException, NoSuchMethodException | | Returns a new index cluster.
This method uses the LOCALINDEX property to locate the local indices,
loads them (passing on randomAccess) and
builds a new index cluster using the appropriate implementing subclass.
Note that documentSizes is just passed to the local indices. This can be useful
in documental clusters, as it allows local scoring, but it is useless in
lexical clusters, as scoring is necessarily centralised. In the
latter case, the property
it.unimi.dsi.mg4j.index.Index.UriKeys.SIZES can be used to specify a global sizes file (which
usually comes from an original global index).
Parameters:
basename - the basename.
randomAccess - whether the index should be accessible randomly.
documentSizes - if true, document sizes will be loaded (note that sometimes document sizes might be loaded anyway because the compression method for positions requires it).
queryProperties - a map containing associations between it.unimi.dsi.mg4j.index.Index.UriKeys and values, or null. |
Methods inherited from it.unimi.dsi.mg4j.index.Index |
public IndexIterator documents(int term) throws IOException
public IndexIterator documents(CharSequence term) throws IOException
abstract public IndexIterator documents(CharSequence prefix, int limit) throws IOException, TooManyTermsException
public static Index getInstance(CharSequence uri, boolean randomAccess, boolean documentSizes, boolean maps) throws IOException, ConfigurationException, URISyntaxException, ClassNotFoundException, SecurityException, InstantiationException, IllegalAccessException, InvocationTargetException, NoSuchMethodException
public static Index getInstance(CharSequence uri, boolean randomAccess, boolean documentSizes) throws IOException, ConfigurationException, URISyntaxException, ClassNotFoundException, SecurityException, InstantiationException, IllegalAccessException, InvocationTargetException, NoSuchMethodException
public static Index getInstance(CharSequence uri, boolean randomAccess) throws ConfigurationException, IOException, URISyntaxException, ClassNotFoundException, SecurityException, InstantiationException, IllegalAccessException, InvocationTargetException, NoSuchMethodException
public static Index getInstance(CharSequence uri) throws ConfigurationException, IOException, URISyntaxException, ClassNotFoundException, SecurityException, InstantiationException, IllegalAccessException, InvocationTargetException, NoSuchMethodException
public IndexReader getReader() throws IOException
abstract public IndexReader getReader(int bufferSize) throws IOException
protected static TermProcessor getTermProcessor(Properties properties)
public void keyIndex(Index newKeyIndex)
|