Java Doc for Index.java in mg4j (it.unimi.dsi.mg4j.index)



java.lang.Object
   it.unimi.dsi.mg4j.index.Index

All known Subclasses: it.unimi.dsi.mg4j.index.BitStreamIndex, it.unimi.dsi.mg4j.index.cluster.IndexCluster, it.unimi.dsi.mg4j.index.remote.RemoteIndex
Index
abstract public class Index implements Serializable
An abstract representation of an index.

Concrete subclasses of this class represent abstract index access information: for instance, the basename or IP address/port, flags, etc. This class makes it easy to build index readers over the index; index readers, in turn, provide access to the index contents.

In principle, this class should contain just method declarations and attributes for all data that is common to any form of index. Note that we use an abstract class, rather than an interface, because interfaces do not allow attributes to be declared.

This class provides static factory methods (e.g., Index.getInstance(CharSequence)) that return an index given a suitable URI string. If the scheme part is mg4j, then the URI is assumed to point to a remote index. Otherwise, it is assumed to be the basename of a local index. In both cases, a query part introduced by ? can specify additional parameters (key=value pairs separated by ;). For instance, the URI example?inmemory=1 will load the index with basename example, caching its content in core memory. Please have a look at the constants in Index.UriKeys (and analogous enums in subclasses) for additional parameters.
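
The URI convention described above can be illustrated with a small, self-contained sketch. It is not MG4J's parsing code: the class IndexUriSketch and its fields are hypothetical names introduced here, and the sketch only assumes what the text states (an mg4j scheme marks a remote index, and the query part holds key=value pairs separated by ;).

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper, not part of MG4J: splits an index URI into its parts. */
final class IndexUriSketch {
    /** True if the scheme part is "mg4j", that is, the URI denotes a remote index. */
    final boolean remote;
    /** Everything before '?': the basename for a local index. */
    final String location;
    /** The key=value pairs found in the query part, in order of appearance. */
    final Map<String, String> parameters = new LinkedHashMap<>();

    IndexUriSketch(String uri) {
        int q = uri.indexOf('?');
        location = q < 0 ? uri : uri.substring(0, q);
        remote = location.startsWith("mg4j:");
        if (q >= 0 && q + 1 < uri.length()) {
            for (String pair : uri.substring(q + 1).split(";")) {
                if (pair.isEmpty()) continue;
                int eq = pair.indexOf('=');
                // Assumption of this sketch: a key without '=' gets an empty value.
                if (eq < 0) parameters.put(pair, "");
                else parameters.put(pair.substring(0, eq), pair.substring(eq + 1));
            }
        }
    }

    public static void main(String[] args) {
        IndexUriSketch u = new IndexUriSketch("example?inmemory=1");
        System.out.println(u.remote + " " + u.location + " " + u.parameters);
    }
}
```

With the URI from the text, the sketch reports a local index with basename example and a single parameter inmemory set to 1. Which keys are actually recognised is defined by Index.UriKeys, not by this sketch.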

Thread safety

Indices are a natural candidate for multithreaded access. An instance of this class must be thread safe as long as external data structures provided to its constructors are. For instance, the tool it.unimi.dsi.mg4j.tool.IndexBuilder generates an ImmutableExternalPrefixMap so that by default the resulting index is thread safe.

For instance, an it.unimi.dsi.mg4j.index.DiskBasedIndex requires a list of term offsets, term maps, etc. As long as all these data structures are thread safe, the same is true of the index. Data structures created by static factory methods such as it.unimi.dsi.mg4j.index.DiskBasedIndex.getInstance(CharSequence) are thread safe.

Note that the it.unimi.dsi.mg4j.index.IndexReader instances returned by Index.getReader() are not thread safe (even though the method Index.getReader() is). The logic behind this arrangement is that you create as many readers as you need, and then close them (see java.io.Closeable.close()). In a multithreaded environment, a pool of index readers can be created, and a custom it.unimi.dsi.mg4j.query.nodes.QueryBuilderVisitor can be used to build it.unimi.dsi.mg4j.search.DocumentIterator instances using the given pool of readers. In this case readers are not closed, but rather reused.
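
The pooling arrangement mentioned above might look roughly like the following sketch. It is generic and does not use MG4J types: ReaderPoolSketch is a hypothetical name, and the type parameter R merely stands in for any reader that is not thread safe. How the pool is wired into a custom QueryBuilderVisitor is not shown.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Function;
import java.util.function.Supplier;

/** Hypothetical generic pool; R stands in for a reader type that is not thread safe. */
final class ReaderPoolSketch<R> {
    private final BlockingQueue<R> idle;

    ReaderPoolSketch(int size, Supplier<R> factory) {
        idle = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) idle.add(factory.get());
    }

    /** Borrows a reader, runs the task on it, and always puts the reader back. */
    <T> T withReader(Function<R, T> task) {
        final R reader;
        try {
            reader = idle.take(); // blocks until some reader is free
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for a reader", e);
        }
        try {
            return task.apply(reader);
        } finally {
            idle.add(reader); // the reader is reused rather than closed
        }
    }

    /** Number of readers currently available. */
    int idleCount() {
        return idle.size();
    }

    public static void main(String[] args) {
        // StringBuilder is used only as a convenient non-thread-safe stand-in.
        ReaderPoolSketch<StringBuilder> pool = new ReaderPoolSketch<>(2, StringBuilder::new);
        int length = pool.withReader(r -> r.append("term").length());
        System.out.println(length + " " + pool.idleCount());
    }
}
```

Each reader is used by at most one thread at a time, which is all that a non-thread-safe reader requires; a real pool would also close its readers when the application shuts down.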

Read-once load

Implementations of this class are strongly encouraged to offer read-once constructors and factory methods: property files and other data related to the index (but not to an it.unimi.dsi.mg4j.index.IndexReader) should be read exactly once, and sequentially. This feature is very useful when .
author:
   Paolo Boldi
author:
   Sebastiano Vigna
since:
   0.9


Inner Class: public static enum PropertyKeys
Inner Class: public static enum UriKeys
Inner Class: protected class EmptyIndexIterator extends IntIterators.EmptyIterator implements IndexIterator, Serializable

Field Summary
final public EmptyIndexIterator emptyIndexIterator
     A singleton for an iterator returning no documents based on this index.
final public String field
     The field indexed by this index, or null.
final public boolean hasCounts
     Whether this index contains counts.
final public boolean hasPayloads
     Whether this index contains payloads; if true, Index.payload is non-null.
final public boolean hasPositions
     Whether this index contains positions.
public Index keyIndex
     The index used as a key to retrieve intervals.
final public int maxCount
     The maximum number of positions in a position list, or -1 if it is unknown.
final public int numberOfDocuments
     The number of documents of the collection.
final public long numberOfOccurrences
     The number of occurrences of the collection.
final public long numberOfPostings
     The number of postings (pairs term/document) of the collection.
final public int numberOfTerms
     The number of terms of the collection.
final public Payload payload
     The payload for this index, or null.
final public Properties properties
     The properties of this index.
public ReferenceSet<Index> singletonSet
     An immutable singleton set containing just Index.keyIndex.
final public IntList sizes
     The size of each document, or null if sizes are not necessary or not loaded in this index.
final public TermProcessor termProcessor
     The term processor used to build this index.

Constructor Summary
protected  Index(int numberOfDocuments, int numberOfTerms, long numberOfPostings, long numberOfOccurrences, int maxCount, Payload payload, boolean hasCounts, boolean hasPositions, TermProcessor termProcessor, String field, IntList sizes, Properties properties)
     Creates a new instance, initialising all fields.

Method Summary
public IndexIterator documents(int term)
     Creates a new IndexReader for this index and uses it to return an index iterator over the documents containing a term.

Since the reader is created from scratch, it is essential to dispose of the returned iterator after usage.

public IndexIterator documents(CharSequence term)
     Creates a new IndexReader for this index and uses it to return an index iterator over the documents containing a term; the term is given explicitly, and the index term map is used, if present.

Since the reader is created from scratch, it is essential to dispose of the returned iterator after usage.

abstract public IndexIterator documents(CharSequence prefix, int limit)
     Creates a number of instances of IndexReader for this index and uses them to return a document iterator over the documents containing a set of terms defined by a prefix; the prefix is given explicitly, and unless the index has a prefix map, an UnsupportedOperationException will be thrown.
public static Index getInstance(CharSequence uri, boolean randomAccess, boolean documentSizes, boolean maps)
     Returns a new index using the given URI.

If uri has scheme mg4j, the index is considered to be remote and index creation is delegated to IndexServer.getIndex(String, int, boolean, boolean).

public static Index getInstance(CharSequence uri, boolean randomAccess, boolean documentSizes)
     Returns a new index using the given URI, searching dynamically for term and prefix maps.
public static Index getInstance(CharSequence uri, boolean randomAccess)
     Returns a new index using the given URI, searching dynamically for term and prefix maps and loading document sizes only if it is necessary.
public static Index getInstance(CharSequence uri)
     Returns a new index using the given URI, searching dynamically for term and prefix maps, loading offsets but loading document sizes only if it is necessary.
public IndexReader getReader()
     Creates and returns a new IndexReader based on this index, using the default buffer size.
abstract public IndexReader getReader(int bufferSize)
     Creates and returns a new IndexReader based on this index.
protected static TermProcessor getTermProcessor(Properties properties)
    
public void keyIndex(Index newKeyIndex)
     Set the index used as a key to retrieve intervals from iterators generated from this index.

This setter is a compromise between clarity of design and efficiency. Each index iterator is based on an index, and when that index is passed to DocumentIterator.intervalIterator(Index), intervals corresponding to the positions of the term in the current document are returned.


Field Detail
emptyIndexIterator
final public EmptyIndexIterator emptyIndexIterator
A singleton for an iterator returning no documents based on this index.

field
final public String field
The field indexed by this index, or null.

hasCounts
final public boolean hasCounts
Whether this index contains counts.

hasPayloads
final public boolean hasPayloads
Whether this index contains payloads; if true, Index.payload is non-null.

hasPositions
final public boolean hasPositions
Whether this index contains positions.

keyIndex
public Index keyIndex
The index used as a key to retrieve intervals. Usually equal to this, but it can be set to a different index with keyIndex(Index).



maxCount
final public int maxCount
The maximum number of positions in a position list, or -1 if it is unknown.

numberOfDocuments
final public int numberOfDocuments
The number of documents of the collection.

numberOfOccurrences
final public long numberOfOccurrences
The number of occurrences of the collection.

numberOfPostings
final public long numberOfPostings
The number of postings (pairs term/document) of the collection.

numberOfTerms
final public int numberOfTerms
The number of terms of the collection. This field might be set to -1 in some cases (for instance, in certain documental clusters).

payload
final public Payload payload
The payload for this index, or null.

properties
final public Properties properties
The properties of this index. It is stored here for convenience (for instance, if custom keys are added to the property file), but it may be null.

singletonSet
public ReferenceSet<Index> singletonSet
An immutable singleton set containing just Index.keyIndex.

sizes
final public IntList sizes
The size of each document, or null if sizes are not necessary or not loaded in this index.

termProcessor
final public TermProcessor termProcessor
The term processor used to build this index.




Constructor Detail
Index
protected Index(int numberOfDocuments, int numberOfTerms, long numberOfPostings, long numberOfOccurrences, int maxCount, Payload payload, boolean hasCounts, boolean hasPositions, TermProcessor termProcessor, String field, IntList sizes, Properties properties)
Creates a new instance, initialising all fields.




Method Detail
documents
public IndexIterator documents(int term) throws IOException
Creates a new IndexReader for this index and uses it to return an index iterator over the documents containing a term.

Since the reader is created from scratch, it is essential to dispose of the returned iterator after usage. See IndexReader.documents(int) for a method with the same semantics, but making reader reuse possible.
Parameters:
  term - a term.
throws:
  IOException - if an exception occurred while accessing the index.
throws:
  UnsupportedOperationException - if this index is not accessible by term number.
See Also:   IndexReader.documents(int)




documents
public IndexIterator documents(CharSequence term) throws IOException
Creates a new IndexReader for this index and uses it to return an index iterator over the documents containing a term; the term is given explicitly, and the index term map is used, if present.

Since the reader is created from scratch, it is essential to dispose of the returned iterator after usage. See IndexReader.documents(CharSequence) for a method with the same semantics, but making reader reuse possible.

Unless the term processor of this index is null, words coming from a query will have to be processed before being used with this method.
Parameters:
  term - a term.
throws:
  IOException - if an exception occurred while accessing the index.
throws:
  UnsupportedOperationException - if the term map is not available for this index.
See Also:   IndexReader.documents(CharSequence)




documents
abstract public IndexIterator documents(CharSequence prefix, int limit) throws IOException, TooManyTermsException
Creates a number of instances of IndexReader for this index and uses them to return a document iterator over the documents containing a set of terms defined by a prefix; the prefix is given explicitly, and unless the index has a prefix map, an UnsupportedOperationException will be thrown.

This method is not provided by IndexReader because it requires the creation of several index readers at the same time. These readers must be closed afterwards.
Parameters:
  prefix - a prefix.
Parameters:
  limit - a limit on the number of terms that will be used to resolve the prefix query; if the terms starting with prefix are more than limit, a TooManyTermsException will be thrown.
throws:
  IOException - if an exception occurred while accessing the index.
throws:
  UnsupportedOperationException - if this index cannot resolve prefixes.
throws:
  TooManyTermsException - if there are more than limit terms starting with prefix.
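
The role of limit can be illustrated with a self-contained sketch of prefix resolution over a sorted set of terms. This is not how MG4J resolves prefixes (that is the job of the index's prefix map); PrefixResolutionSketch and TooManyTermsSketchException are hypothetical stand-ins, and the exception is unchecked here only to keep the example short.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

/** Hypothetical stand-in for TooManyTermsException (unchecked here for brevity). */
class TooManyTermsSketchException extends RuntimeException {
    TooManyTermsSketchException(String message) {
        super(message);
    }
}

final class PrefixResolutionSketch {
    /** Returns the terms starting with prefix, failing if there are more than limit. */
    static List<String> resolve(SortedSet<String> terms, String prefix, int limit) {
        List<String> matches = new ArrayList<>();
        // tailSet(prefix) starts at the first term that is >= prefix; in a
        // lexicographically sorted set the matching terms are contiguous from there.
        for (String term : terms.tailSet(prefix)) {
            if (!term.startsWith(prefix)) break; // past the block of matching terms
            if (matches.size() == limit)
                throw new TooManyTermsSketchException(
                        "More than " + limit + " terms start with " + prefix);
            matches.add(term);
        }
        return matches;
    }

    public static void main(String[] args) {
        SortedSet<String> terms =
                new TreeSet<>(Arrays.asList("car", "card", "care", "cat", "dog"));
        System.out.println(resolve(terms, "car", 3));
    }
}
```

In this sketch the prefix car resolves to three terms and is accepted with a limit of 3, whereas the prefix ca matches four terms and would raise the exception with the same limit.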




getInstance
public static Index getInstance(CharSequence uri, boolean randomAccess, boolean documentSizes, boolean maps) throws IOException, ConfigurationException, URISyntaxException, ClassNotFoundException, SecurityException, InstantiationException, IllegalAccessException, InvocationTargetException, NoSuchMethodException
Returns a new index using the given URI.

If uri has scheme mg4j, the index is considered to be remote and index creation is delegated to IndexServer.getIndex(String, int, boolean, boolean). Otherwise, we delegate to DiskBasedIndex.getInstance(CharSequence, boolean, boolean, boolean, EnumMap).
Parameters:
  uri - the URI defining the index.
Parameters:
  randomAccess - whether the index should be accessible randomly.
Parameters:
  documentSizes - if true, document sizes will be loaded (note that sometimes document sizes might be loaded anyway because the compression method for positions requires it).
Parameters:
  maps - if true, term and prefix maps will be guessed and loaded (this feature might not be available with some kinds of index).




getInstance
public static Index getInstance(CharSequence uri, boolean randomAccess, boolean documentSizes) throws IOException, ConfigurationException, URISyntaxException, ClassNotFoundException, SecurityException, InstantiationException, IllegalAccessException, InvocationTargetException, NoSuchMethodException
Returns a new index using the given URI, searching dynamically for term and prefix maps.
Parameters:
  uri - the URI defining the index.
Parameters:
  randomAccess - whether the index should be accessible randomly.
Parameters:
  documentSizes - if true, document sizes will be loaded (note that sometimes document sizes might be loaded anyway because the compression method for positions requires it).
See Also:   Index.getInstance(CharSequence,boolean,boolean,boolean)



getInstance
public static Index getInstance(CharSequence uri, boolean randomAccess) throws ConfigurationException, IOException, URISyntaxException, ClassNotFoundException, SecurityException, InstantiationException, IllegalAccessException, InvocationTargetException, NoSuchMethodException
Returns a new index using the given URI, searching dynamically for term and prefix maps and loading document sizes only if it is necessary.
Parameters:
  uri - the URI defining the index.
Parameters:
  randomAccess - whether the index should be accessible randomly.
See Also:   Index.getInstance(CharSequence,boolean,boolean)



getInstance
public static Index getInstance(CharSequence uri) throws ConfigurationException, IOException, URISyntaxException, ClassNotFoundException, SecurityException, InstantiationException, IllegalAccessException, InvocationTargetException, NoSuchMethodException
Returns a new index using the given URI, searching dynamically for term and prefix maps, loading offsets but loading document sizes only if it is necessary.
Parameters:
  uri - the URI defining the index.
See Also:   Index.getInstance(CharSequence,boolean)
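
The four getInstance overloads form a chain in which each shorter signature supplies a default for the next parameter. The sketch below mirrors that chain with a hypothetical class, LoadOptionsSketch, that simply records the resulting flags. The default values (randomAccess true, documentSizes false, maps true) are an assumption inferred from the method descriptions above, not taken from the MG4J source.

```java
/** Hypothetical holder that records the flags; the real factories return an Index. */
final class LoadOptionsSketch {
    final String uri;
    final boolean randomAccess;
    final boolean documentSizes;
    final boolean maps;

    private LoadOptionsSketch(String uri, boolean randomAccess, boolean documentSizes, boolean maps) {
        this.uri = uri;
        this.randomAccess = randomAccess;
        this.documentSizes = documentSizes;
        this.maps = maps;
    }

    /** Full form: every option is explicit. */
    static LoadOptionsSketch getInstance(String uri, boolean randomAccess, boolean documentSizes, boolean maps) {
        return new LoadOptionsSketch(uri, randomAccess, documentSizes, maps);
    }

    /** Assumed default: term and prefix maps are searched for dynamically. */
    static LoadOptionsSketch getInstance(String uri, boolean randomAccess, boolean documentSizes) {
        return getInstance(uri, randomAccess, documentSizes, true);
    }

    /** Assumed default: document sizes are not requested explicitly. */
    static LoadOptionsSketch getInstance(String uri, boolean randomAccess) {
        return getInstance(uri, randomAccess, false);
    }

    /** Assumed default: offsets are loaded, so the index is randomly accessible. */
    static LoadOptionsSketch getInstance(String uri) {
        return getInstance(uri, true);
    }

    public static void main(String[] args) {
        LoadOptionsSketch o = getInstance("example");
        System.out.println(o.randomAccess + " " + o.documentSizes + " " + o.maps);
    }
}
```

Under these assumed defaults, the one-argument call is equivalent to the full call with the flags true, false, true.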



getReader
public IndexReader getReader() throws IOException
Creates and returns a new IndexReader based on this index, using the default buffer size. After that, you can use the reader to read this index.
Returns:
  a new IndexReader to read this index.



getReader
abstract public IndexReader getReader(int bufferSize) throws IOException
Creates and returns a new IndexReader based on this index. After that, you can use the reader to read this index.
Parameters:
  bufferSize - the size of the buffer to be used accessing the reader, or -1 for a default buffer size.
Returns:
  a new IndexReader to read this index.



getTermProcessor
protected static TermProcessor getTermProcessor(Properties properties)



keyIndex
public void keyIndex(Index newKeyIndex)
Set the index used as a key to retrieve intervals from iterators generated from this index.

This setter is a compromise between clarity of design and efficiency. Each index iterator is based on an index, and when that index is passed to DocumentIterator.intervalIterator(Index), intervals corresponding to the positions of the term in the current document are returned. Analogously, it.unimi.dsi.mg4j.search.DocumentIterator.indices returns a singleton set containing the index. However, when composing indices into clusters, iterators generated by a local index must often act as if they really belong to the global index. This method allows setting the index that is used as a key to return intervals, and that is contained in Index.singletonSet.

Note that setting this value will only influence index iterators created afterwards.
Parameters:
  newKeyIndex - the new index to be used as a key for interval retrieval.
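
The key-index mechanism can be pictured with a small sketch in which iterators capture the key index at creation time. All names here (KeyedIndexSketch, IteratorSketch, answersFor) are hypothetical and only loosely modelled on the description above; this is not MG4J's implementation.

```java
import java.util.Collections;
import java.util.Set;

/** Hypothetical miniature of an index with a replaceable key index. */
final class KeyedIndexSketch {
    final String name;
    KeyedIndexSketch keyIndex = this; // usually the index itself
    Set<KeyedIndexSketch> singletonSet = Collections.singleton(this);

    KeyedIndexSketch(String name) {
        this.name = name;
    }

    /** Sets the key index; only iterators created afterwards see the change. */
    void keyIndex(KeyedIndexSketch newKeyIndex) {
        keyIndex = newKeyIndex;
        singletonSet = Collections.singleton(newKeyIndex);
    }

    /** Creates an iterator that remembers the key index current at creation time. */
    IteratorSketch documents() {
        return new IteratorSketch(keyIndex);
    }

    static final class IteratorSketch {
        private final KeyedIndexSketch key;

        IteratorSketch(KeyedIndexSketch key) {
            this.key = key;
        }

        /** True if this iterator would return intervals when asked about the given index. */
        boolean answersFor(KeyedIndexSketch index) {
            return index == key; // identity comparison, as in a reference set
        }
    }

    public static void main(String[] args) {
        KeyedIndexSketch local = new KeyedIndexSketch("local");
        KeyedIndexSketch global = new KeyedIndexSketch("global");
        IteratorSketch before = local.documents();
        local.keyIndex(global);
        IteratorSketch after = local.documents();
        System.out.println(before.answersFor(local) + " " + after.answersFor(global));
    }
}
```

An iterator created before the call to keyIndex(global) still answers for the local index, while one created afterwards behaves as if it belonged to the global index, which is the behaviour a cluster needs from its local indices.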




Methods inherited from java.lang.Object
native protected Object clone() throws CloneNotSupportedException
public boolean equals(Object obj)
protected void finalize() throws Throwable
final native public Class getClass()
native public int hashCode()
final native public void notify()
final native public void notifyAll()
public String toString()
final native public void wait(long timeout) throws InterruptedException
final public void wait(long timeout, int nanos) throws InterruptedException
final public void wait() throws InterruptedException
