org.apache.lucene.analysis |
|
Java Source File Name | Type | Comment |
Analyzer.java | Class | An Analyzer builds TokenStreams, which analyze text. |
CachingTokenFilter.java | Class | This class can be used if the Tokens of a TokenStream
are intended to be consumed more than once. |
CharArraySet.java | Class | A simple class that stores Strings as char[] arrays in a
hash table. |
CharTokenizer.java | Class | An abstract base class for simple, character-oriented tokenizers. |
ISOLatin1AccentFilter.java | Class | A filter that replaces accented characters in the ISO Latin 1 character set
(ISO-8859-1) by their unaccented equivalent. |
KeywordAnalyzer.java | Class | "Tokenizes" the entire stream as a single token. |
KeywordTokenizer.java | Class | Emits the entire input as a single token. |
LengthFilter.java | Class | Removes words that are too long or too short from the stream. |
LetterTokenizer.java | Class | A LetterTokenizer is a tokenizer that divides text at non-letters. |
LowerCaseFilter.java | Class | Normalizes token text to lower case. |
LowerCaseTokenizer.java | Class | LowerCaseTokenizer performs the function of LetterTokenizer
and LowerCaseFilter together. |
PerFieldAnalyzerWrapper.java | Class | This analyzer is used to facilitate scenarios where different
fields require different analysis techniques. |
PorterStemFilter.java | Class | Transforms the token stream as per the Porter stemming algorithm. |
PorterStemmer.java | Class | Stemmer, implementing the Porter Stemming Algorithm.
The Stemmer class transforms a word into its root form. |
SimpleAnalyzer.java | Class | An Analyzer that filters LetterTokenizer with LowerCaseFilter. |
SinkTokenizer.java | Class | |
StopAnalyzer.java | Class | Filters LetterTokenizer with LowerCaseFilter and StopFilter. |
StopFilter.java | Class | Removes stop words from a token stream. |
TeeSinkTokenTest.java | Class | |
TeeTokenFilter.java | Class | Works in conjunction with the SinkTokenizer to provide the ability to set aside tokens
that have already been analyzed. |
TestAnalyzers.java | Class | |
TestCachingTokenFilter.java | Class | |
TestCharArraySet.java | Class | |
TestISOLatin1AccentFilter.java | Class | |
TestKeywordAnalyzer.java | Class | |
TestLengthFilter.java | Class | |
TestPerFieldAnalzyerWrapper.java | Class | |
TestStandardAnalyzer.java | Class | |
TestStopAnalyzer.java | Class | |
TestStopFilter.java | Class | |
TestToken.java | Class | |
Token.java | Class | A Token is an occurrence of a term from the text of a field. |
TokenFilter.java | Class | A TokenFilter is a TokenStream whose input is another token stream. |
Tokenizer.java | Class | A Tokenizer is a TokenStream whose input is a Reader. |
TokenStream.java | Class | A TokenStream enumerates the sequence of tokens, either from
fields of a document or from query text.
This is an abstract class. |
WhitespaceAnalyzer.java | Class | An Analyzer that uses WhitespaceTokenizer. |
WhitespaceTokenizer.java | Class | A WhitespaceTokenizer is a tokenizer that divides text at whitespace.
Adjacent sequences of non-Whitespace characters form tokens. |
WordlistLoader.java | Class | Loader for text files that represent a list of stopwords. |
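
The StopAnalyzer row above describes a three-stage chain: split text at non-letters, lower-case each token, then drop stop words. The sketch below illustrates that chain in plain Java, using only the standard library. It does not use the Lucene classes. The class name AnalysisChainSketch and all of its method names are hypothetical, chosen for illustration, and the real classes may differ in details such as Unicode handling and token metadata.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-in for the chain described in the table; not Lucene API.
class AnalysisChainSketch {

    // Stage 1: divide text at non-letters (the behaviour the LetterTokenizer row describes).
    static List<String> letterTokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetter(c)) {
                current.append(c);
            } else if (current.length() > 0) {
                // A non-letter ends the current token.
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    // Stage 2: normalize token text to lower case (the LowerCaseFilter row).
    static List<String> lowerCase(List<String> tokens) {
        List<String> out = new ArrayList<>();
        for (String token : tokens) {
            out.add(token.toLowerCase(Locale.ROOT));
        }
        return out;
    }

    // Stage 3: remove stop words from the token list (the StopFilter row).
    static List<String> removeStopWords(List<String> tokens, Set<String> stopWords) {
        List<String> out = new ArrayList<>();
        for (String token : tokens) {
            if (!stopWords.contains(token)) {
                out.add(token);
            }
        }
        return out;
    }

    // The full chain, in the order the StopAnalyzer row lists its stages.
    static List<String> analyze(String text, Set<String> stopWords) {
        return removeStopWords(lowerCase(letterTokenize(text)), stopWords);
    }

    public static void main(String[] args) {
        Set<String> stopWords = new HashSet<>(Arrays.asList("the", "and"));
        // Prints [quick, brown, fox, dog]
        System.out.println(analyze("The Quick-Brown fox, and the dog", stopWords));
    }
}
```

Lower-casing runs before stop-word removal so that a lower-case stop list also matches capitalized input such as "The". Dropping the third stage gives the two-stage chain that the SimpleAnalyzer row describes.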