| java.lang.Object org.apache.lucene.analysis.Analyzer org.apache.lucene.analysis.cz.CzechAnalyzer
CzechAnalyzer | final public class CzechAnalyzer extends Analyzer | | Analyzer for the Czech language. Supports an external list of stopwords (words that
will not be indexed at all).
A default set of stopwords is used unless an alternative list is specified; the
exclusion list is empty by default.
author: Lukas Zapletal [lzap@root.cz] |
CZECH_STOP_WORDS | final public static String[] CZECH_STOP_WORDS | | List of typical stopwords.
|
CzechAnalyzer | public CzechAnalyzer(String[] stopwords) | | Builds an analyzer with the given stop words.
|
CzechAnalyzer | public CzechAnalyzer(File stopwords) throws IOException | | Builds an analyzer with the stop words read from the given file.
|
loadStopWords | public void loadStopWords(InputStream wordfile, String encoding) | | Loads the stopword table from a resource stream (file, database, ...).
Parameters: wordfile - Stream containing the word list
Parameters: encoding - Encoding used (win-1250, iso-8859-2, ...), null for the default system encoding |
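The following is a minimal plain-JDK sketch of what loading a word list from a stream with an explicit encoding can look like. It is not the Lucene implementation: the class name `StopWordLoaderSketch`, the method `readStopWords`, and the one-word-per-line file format are assumptions made for illustration. Only the two parameters and the "null means default system encoding" rule come from the description above.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper for illustration only; not part of Lucene.
final class StopWordLoaderSketch {

    // Reads stopwords from the stream, assuming one word per line.
    // A null encoding falls back to the platform default charset.
    static Set<String> readStopWords(InputStream wordfile, String encoding) throws IOException {
        Charset charset = (encoding == null)
                ? Charset.defaultCharset()
                : Charset.forName(encoding);
        Set<String> words = new LinkedHashSet<>();
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(wordfile, charset))) {
            String line;
            while ((line = reader.readLine()) != null) {
                String word = line.trim();
                if (!word.isEmpty()) {   // skip blank lines
                    words.add(word);
                }
            }
        }
        return words;
    }

    public static void main(String[] args) throws IOException {
        // "\u017ee" is the Czech word "ze" with a caron on the z, encoded here as ISO-8859-2 bytes.
        byte[] data = "\u017ee\nnebo\n\naby\n".getBytes("ISO-8859-2");
        Set<String> words = readStopWords(new ByteArrayInputStream(data), "ISO-8859-2");
        System.out.println(words.size() + " stopwords loaded: " + words);
    }
}
```

Passing the wrong encoding does not fail; it silently produces different characters, so stopwords containing diacritics would no longer match indexed tokens.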
tokenStream | final public TokenStream tokenStream(String fieldName, Reader reader) | | Creates a TokenStream which tokenizes all the text in the provided Reader.
A TokenStream built from a StandardTokenizer filtered with StandardFilter, LowerCaseFilter, and StopFilter |
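To make the order of the chain concrete, here is a rough plain-JDK sketch of tokenize, lowercase, then drop stopwords. It deliberately does not use Lucene classes. Splitting on non-letter, non-digit characters is a simplified stand-in for StandardTokenizer, the StandardFilter step is omitted, and the names `AnalysisChainSketch` and `analyze` are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration of the filter order; not the Lucene implementation.
final class AnalysisChainSketch {

    static List<String> analyze(String text, Set<String> stopWords) {
        List<String> tokens = new ArrayList<>();
        // Step 1 (stand-in for tokenization): split on runs of characters
        // that are neither Unicode letters nor decimal digits.
        for (String raw : text.split("[^\\p{L}\\p{Nd}]+")) {
            if (raw.isEmpty()) {
                continue; // a leading delimiter yields one empty string
            }
            // Step 2: lowercase before the stopword check, so "Nebo" matches "nebo".
            String lower = raw.toLowerCase(Locale.ROOT);
            // Step 3: drop stopwords; they never reach the index.
            if (!stopWords.contains(lower)) {
                tokens.add(lower);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        Set<String> stopWords = new HashSet<>(Arrays.asList("a", "nebo"));
        System.out.println(analyze("Praha a Brno, nebo Ostrava!", stopWords));
        // prints [praha, brno, ostrava]
    }
}
```

The point of the ordering is that lowercasing happens before stopword removal, so a stopword list written in lowercase also matches capitalized words at the start of a sentence.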