public class EnglishSnowballAnalyzer
extends org.apache.lucene.analysis.StopwordAnalyzerBase
StandardTokenizer with StandardFilter, LowerCaseFilter, StopFilter, SnowballFilter for English and ASCIIFoldingFilter.

| Constructor and Description |
|---|
| EnglishSnowballAnalyzer(org.apache.lucene.util.Version matchVersion): Builds an analyzer with the default stop words: getDefaultStopSet(). |
| EnglishSnowballAnalyzer(org.apache.lucene.util.Version matchVersion, Set<?> stopwords): Builds an analyzer with the given stop words. |
| EnglishSnowballAnalyzer(org.apache.lucene.util.Version matchVersion, Set<?> stopwords, Set<?> stemExclusionSet): Builds an analyzer with the given stop words. |

| Modifier and Type | Method and Description |
|---|---|
| protected org.apache.lucene.analysis.ReusableAnalyzerBase.TokenStreamComponents | createComponents(String fieldName, Reader reader): Creates a ReusableAnalyzerBase.TokenStreamComponents which tokenizes all the text in the provided Reader. |
| static Set<?> | getDefaultStopSet(): Returns an unmodifiable instance of the default stop words set. |
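
The order of the filters in the documented chain matters: stop words are removed after lowercasing, excluded terms are protected before stemming, and ASCII folding runs last. The following is a minimal plain-Java sketch of that ordering only. It does not use Lucene; the class `SnowballChainSketch`, the tiny `STOP_WORDS` set, the `toyStem` suffix stripper and the `foldToAscii` helper are all hypothetical stand-ins, and the real tokenizer, stop set and Snowball stemmer behave differently in detail.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Plain-Java stand-in for the documented filter order; it does not call Lucene. */
public class SnowballChainSketch {

    // Hypothetical stop set for illustration; the real default is whatever
    // getDefaultStopSet() returns. Unmodifiable, like the documented default.
    static final Set<String> STOP_WORDS = Collections.unmodifiableSet(
            new HashSet<>(Arrays.asList("the", "a", "and", "in")));

    // Toy suffix stripper standing in for SnowballFilter.
    // This is NOT the Snowball English algorithm.
    static String toyStem(String term) {
        if (term.length() > 4 && term.endsWith("ing")) {
            return term.substring(0, term.length() - 3);
        }
        if (term.length() > 3 && term.endsWith("s")) {
            return term.substring(0, term.length() - 1);
        }
        return term;
    }

    // Removes combining accents; a rough approximation of ASCIIFoldingFilter.
    static String foldToAscii(String term) {
        return Normalizer.normalize(term, Normalizer.Form.NFD).replaceAll("\\p{M}", "");
    }

    static List<String> analyze(String text, Set<String> stopWords, Set<String> stemExclusions) {
        List<String> terms = new ArrayList<>();
        // Crude tokenizer: split on anything that is not a letter or digit.
        for (String token : text.split("[^\\p{L}\\p{N}]+")) {
            if (token.isEmpty()) {
                continue;
            }
            String term = token.toLowerCase(Locale.ROOT);      // LowerCaseFilter step
            if (stopWords.contains(term)) {                    // StopFilter step
                continue;
            }
            boolean keyword = stemExclusions.contains(term);   // KeywordMarkerFilter step
            if (!keyword) {
                term = toyStem(term);                          // SnowballFilter step
            }
            terms.add(foldToAscii(term));                      // ASCIIFoldingFilter step
        }
        return terms;
    }

    public static void main(String[] args) {
        String text = "The caf\u00e9s and jumping dogs";
        // No stem exclusions: every surviving term is stemmed.
        System.out.println(analyze(text, STOP_WORDS, Collections.<String>emptySet()));
        // prints [cafe, jump, dog]
        // "dogs" is excluded from stemming, so it passes through unchanged.
        System.out.println(analyze(text, STOP_WORDS, Collections.singleton("dogs")));
        // prints [cafe, jump, dogs]
    }
}
```

The second call mirrors what the three-argument constructor's stemExclusionSet is documented to do: terms in that set skip the stemming step but still pass through the other filters.
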
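
Methods inherited from class org.apache.lucene.analysis.StopwordAnalyzerBase: getStopwordSet, loadStopwordSet, loadStopwordSet, loadStopwordSet

Methods inherited from class org.apache.lucene.analysis.ReusableAnalyzerBase: initReader, reusableTokenStream, tokenStream

public EnglishSnowballAnalyzer(org.apache.lucene.util.Version matchVersion)
Builds an analyzer with the default stop words: getDefaultStopSet().

public EnglishSnowballAnalyzer(org.apache.lucene.util.Version matchVersion,
Set<?> stopwords)
Builds an analyzer with the given stop words.
Parameters:
matchVersion - lucene compatibility version
stopwords - a stopword set

public EnglishSnowballAnalyzer(org.apache.lucene.util.Version matchVersion,
Set<?> stopwords,
Set<?> stemExclusionSet)
Builds an analyzer with the given stop words. If a stem exclusion set is provided, a KeywordMarkerFilter is applied before stemming.
Parameters:
matchVersion - lucene compatibility version
stopwords - a stopword set
stemExclusionSet - a set of terms not to be stemmed

public static Set<?> getDefaultStopSet()
Returns an unmodifiable instance of the default stop words set.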

protected org.apache.lucene.analysis.ReusableAnalyzerBase.TokenStreamComponents createComponents(String fieldName, Reader reader)
Creates a ReusableAnalyzerBase.TokenStreamComponents which tokenizes all the text in the provided Reader.
Specified by:
createComponents in class org.apache.lucene.analysis.ReusableAnalyzerBase
Returns:
ReusableAnalyzerBase.TokenStreamComponents built from a StandardTokenizer filtered with StandardFilter, LowerCaseFilter, StopFilter, KeywordMarkerFilter if a stem exclusion set is provided, SnowballFilter and ASCIIFoldingFilter.

Copyright © 2004–2020 Jahia Solutions Group SA. All rights reserved.