Java Doc for Paste.java in it.unimi.dsi.mg4j.tool (MG4J, Search Engine)


java.lang.Object
   it.unimi.dsi.mg4j.tool.Combine
      it.unimi.dsi.mg4j.tool.Paste

Paste
final public class Paste extends Combine
Pastes several indices.

Pasting is a very slow way of combining indices: we assume that not only documents, but also document occurrences might be scattered throughout several indices. When a document appears in several indices, its occurrences in a given index are combined by renumbering them starting from the sum of the sizes for the document in the previous indices.
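
The renumbering rule above can be illustrated with a minimal sketch. It does not use MG4J; the class name PositionRenumbering and the array layout are assumptions made for the example. For a single document and a single term, positions[i] holds the positions found in index i, and sizes[i] is the size of the document in index i.

```java
import java.util.Arrays;

/** Toy illustration (not MG4J code) of renumbering the occurrences of one document. */
final class PositionRenumbering {
    /**
     * Combines the positions of a term in one document across several indices.
     * positions[i] are the positions in index i; sizes[i] is the document size in index i.
     */
    static int[] paste(int[][] positions, int[] sizes) {
        int total = 0;
        for (int[] p : positions) total += p.length;
        int[] result = new int[total];
        int offset = 0; // sum of the sizes of the document in the previous indices
        int k = 0;
        for (int i = 0; i < positions.length; i++) {
            for (int p : positions[i]) result[k++] = offset + p;
            offset += sizes[i];
        }
        return result;
    }

    public static void main(String[] args) {
        int[][] positions = { { 0, 2 }, {}, { 1 } };
        int[] sizes = { 3, 4, 2 };
        // Offsets are 0, 3 and 7, so this prints [0, 2, 8].
        System.out.println(Arrays.toString(paste(positions, sizes)));
    }
}
```

Note that the document contributes to the offset even in an index where the term does not occur, which is why document sizes must be available for every input index.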

Conceptually, this operation is equivalent to splitting a collection vertically: each document is divided into a fixed number n of consecutive segments (possibly of length 0), and a set of n indices is created using the k-th segment of all documents. Pasting the resulting indices will produce an index that is identical to the index generated by the original collection. The behaviour is analogous to that of the UN*X paste command if documents are single-line lists of words.
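
The vertical-split equivalence can be checked on a toy positional index. The following is a sketch under stated assumptions, not the MG4J implementation: the index is modelled as a nested map (term, then document, then positions), and the names VerticalSplitSketch, index, segment, paste and roundTrip are hypothetical.

```java
import java.util.*;

/** Toy model (not MG4J code): split documents vertically, index each segment, paste back. */
final class VerticalSplitSketch {
    /** Builds a toy positional index: term -> (document -> positions). */
    static Map<String, Map<Integer, List<Integer>>> index(List<List<String>> docs) {
        Map<String, Map<Integer, List<Integer>>> idx = new TreeMap<>();
        for (int d = 0; d < docs.size(); d++)
            for (int p = 0; p < docs.get(d).size(); p++)
                idx.computeIfAbsent(docs.get(d).get(p), t -> new TreeMap<>())
                   .computeIfAbsent(d, x -> new ArrayList<>())
                   .add(p);
        return idx;
    }

    /** Returns the k-th of n consecutive segments of every document (possibly empty). */
    static List<List<String>> segment(List<List<String>> docs, int k, int n) {
        List<List<String>> out = new ArrayList<>();
        for (List<String> doc : docs)
            out.add(doc.subList(doc.size() * k / n, doc.size() * (k + 1) / n));
        return out;
    }

    /** Pastes toy indices; sizes.get(i)[d] is the size of document d in index i. */
    static Map<String, Map<Integer, List<Integer>>> paste(
            List<Map<String, Map<Integer, List<Integer>>>> indices,
            List<int[]> sizes, int numberOfDocuments) {
        Map<String, Map<Integer, List<Integer>>> out = new TreeMap<>();
        int[] offset = new int[numberOfDocuments]; // per-document sum of previous sizes
        for (int i = 0; i < indices.size(); i++) {
            for (Map.Entry<String, Map<Integer, List<Integer>>> term : indices.get(i).entrySet())
                for (Map.Entry<Integer, List<Integer>> posting : term.getValue().entrySet())
                    for (int p : posting.getValue())
                        out.computeIfAbsent(term.getKey(), t -> new TreeMap<>())
                           .computeIfAbsent(posting.getKey(), x -> new ArrayList<>())
                           .add(offset[posting.getKey()] + p);
            for (int d = 0; d < numberOfDocuments; d++) offset[d] += sizes.get(i)[d];
        }
        return out;
    }

    /** Splits into n segments, indexes each, pastes, and compares with the direct index. */
    static boolean roundTrip(List<List<String>> docs, int n) {
        List<Map<String, Map<Integer, List<Integer>>>> indices = new ArrayList<>();
        List<int[]> sizes = new ArrayList<>();
        for (int k = 0; k < n; k++) {
            List<List<String>> part = segment(docs, k, n);
            indices.add(index(part));
            int[] size = new int[docs.size()];
            for (int d = 0; d < size.length; d++) size[d] = part.get(d).size();
            sizes.add(size);
        }
        return paste(indices, sizes, docs.size()).equals(index(docs));
    }

    public static void main(String[] args) {
        List<List<String>> docs = Arrays.asList(
            Arrays.asList("a", "b", "a", "c"),
            Arrays.asList("b"),
            Collections.<String>emptyList());
        System.out.println(roundTrip(docs, 3));
    }
}
```

The short and empty documents in main produce zero-length segments, the case the description explicitly allows.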

In practice, pasting is usually applied to indices obtained from a (e.g., indices containing anchor text fragments).

Note that if every document appears in at most one index, pasting is equivalent to . It is, however, significantly slower: because the same document may be present in several lists, the inverted lists to be pasted must be scanned completely to compute the frequency.
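
The cost difference can be sketched as follows, taking the frequency of a term to be the number of documents in which it appears. This is an illustration only, not the MG4J code: posting lists are modelled as sorted arrays of non-negative document pointers, and the class and method names are hypothetical. When the lists are known to be disjoint, the combined frequency is just the sum of the input frequencies; when a document may occur in several lists, the distinct documents have to be counted by scanning every list.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

/** Toy illustration (not MG4J code) of why overlapping lists force a full scan. */
final class FrequencySketch {
    /** Frequency when no document occurs in more than one list: no list is read. */
    static int disjointFrequency(int[][] lists) {
        int frequency = 0;
        for (int[] list : lists) frequency += list.length;
        return frequency;
    }

    /** Frequency when a document may occur in several lists: every list is scanned. */
    static int pastedFrequency(int[][] lists) {
        // Each cursor is {list index, position in list}, ordered by current document pointer.
        PriorityQueue<int[]> queue = new PriorityQueue<>(
            Comparator.comparingInt((int[] c) -> lists[c[0]][c[1]]));
        for (int i = 0; i < lists.length; i++)
            if (lists[i].length > 0) queue.add(new int[] { i, 0 });
        int frequency = 0;
        int last = -1; // document pointers are assumed to be non-negative
        while (!queue.isEmpty()) {
            int[] cursor = queue.poll();
            int doc = lists[cursor[0]][cursor[1]];
            if (doc != last) { // count each document once, however many lists contain it
                frequency++;
                last = doc;
            }
            if (++cursor[1] < lists[cursor[0]].length) queue.add(cursor);
        }
        return frequency;
    }

    public static void main(String[] args) {
        int[][] overlapping = { { 0, 3, 5 }, { 3, 4 }, { 5 } };
        System.out.println(disjointFrequency(overlapping)); // 6, an overcount here
        System.out.println(pastedFrequency(overlapping));   // 4 distinct documents
    }
}
```

On disjoint inputs the two methods agree, which matches the equivalence stated above.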
author:
   Sebastiano Vigna
since:
   1.0



Field Summary
final public static  int DEFAULT_MEMORY_BUFFER_SIZE
     The default size of the temporary bit stream buffer used while pasting.
protected  int[] doc
     The reference array of the document queue.
protected  IntHeapPriorityQueue documentQueue
     The queue containing document pointers (for remapped indices).

Constructor Summary
public  Paste(String outputBasename, String[] inputBasename, boolean metadataOnly, int bufferSize, File tempFileDir, int tempBufferSize, Map<Component, Coding> writerFlags, boolean interleaved, boolean skips, int quantum, int height, int skipBufferSize, long logInterval)
    

Method Summary
protected  int combine(int numUsedIndices)
    
protected  int combineNumberOfDocuments()
    
protected  int combineSizes()
    
protected  BitStreamIndex getIndex(CharSequence basename)
     Returns an index with given basename, loading document sizes.
Parameters:
  basename - an index basename.
public static  void main(String[] arg)
    
public  void run()
    

Field Detail
DEFAULT_MEMORY_BUFFER_SIZE
final public static int DEFAULT_MEMORY_BUFFER_SIZE
The default size of the temporary bit stream buffer used while pasting. Posting lists larger than this size will be precomputed on disk and then added to the index.



doc
protected int[] doc
The reference array of the document queue.



documentQueue
protected IntHeapPriorityQueue documentQueue
The queue containing document pointers (for remapped indices).




Constructor Detail
Paste
public Paste(String outputBasename, String[] inputBasename, boolean metadataOnly, int bufferSize, File tempFileDir, int tempBufferSize, Map<Component, Coding> writerFlags, boolean interleaved, boolean skips, int quantum, int height, int skipBufferSize, long logInterval) throws IOException, ConfigurationException, URISyntaxException, ClassNotFoundException, SecurityException, InstantiationException, IllegalAccessException, InvocationTargetException, NoSuchMethodException




Method Detail
combine
protected int combine(int numUsedIndices) throws IOException



combineNumberOfDocuments
protected int combineNumberOfDocuments()



combineSizes
protected int combineSizes() throws IOException



getIndex
protected BitStreamIndex getIndex(CharSequence basename) throws ConfigurationException, IOException, URISyntaxException, ClassNotFoundException, SecurityException, InstantiationException, IllegalAccessException, InvocationTargetException, NoSuchMethodException
Returns an index with given basename, loading document sizes.
Parameters:
  basename - an index basename.
Returns:
  an index loaded with document sizes.



main
public static void main(String[] arg) throws ConfigurationException, SecurityException, JSAPException, IOException, URISyntaxException, ClassNotFoundException, InstantiationException, IllegalAccessException, InvocationTargetException, NoSuchMethodException



run
public void run() throws ConfigurationException, IOException



Fields inherited from it.unimi.dsi.mg4j.tool.Combine
final public static int DEFAULT_BUFFER_SIZE
final protected int[] frequency
final protected boolean hasCounts
final protected boolean hasPayloads
final protected boolean hasPositions
final protected BitStreamIndex[] index
final protected IndexIterator[] indexIterator
final protected IndexReader[] indexReader
protected IndexWriter indexWriter
final protected String[] inputBasename
protected int maxCount
final protected int numIndices
final protected int numberOfDocuments
protected long numberOfOccurrences
protected int[] position
protected int[] size
protected ObjectHeapSemiIndirectPriorityQueue<MutableString> termQueue
protected int[] usedIndex

Methods inherited from it.unimi.dsi.mg4j.tool.Combine
abstract protected int combine(int numUsedIndices) throws IOException
abstract protected int combineNumberOfDocuments()
abstract protected int combineSizes() throws IOException
protected BitStreamIndex getIndex(CharSequence basename) throws ConfigurationException, IOException, URISyntaxException, ClassNotFoundException, SecurityException, InstantiationException, IllegalAccessException, InvocationTargetException, NoSuchMethodException
public static void main(String[] arg) throws JSAPException, ConfigurationException, IOException, URISyntaxException, ClassNotFoundException, SecurityException, InstantiationException, IllegalAccessException, InvocationTargetException, NoSuchMethodException
public static void main(String[] arg, Class<? extends Combine> combineClass) throws JSAPException, ConfigurationException, IOException, URISyntaxException, ClassNotFoundException, SecurityException, InstantiationException, IllegalAccessException, InvocationTargetException, NoSuchMethodException
public void run() throws ConfigurationException, IOException
protected IntIterator sizes(int numIndex) throws FileNotFoundException

Methods inherited from java.lang.Object
native protected Object clone() throws CloneNotSupportedException
public boolean equals(Object obj)
protected void finalize() throws Throwable
final native public Class getClass()
native public int hashCode()
final native public void notify()
final native public void notifyAll()
public String toString()
final native public void wait(long timeout) throws InterruptedException
final public void wait(long timeout, int nanos) throws InterruptedException
final public void wait() throws InterruptedException
