public final class JahiaFrenchAnalyzer
extends org.apache.lucene.analysis.StopwordAnalyzerBase
Analyzer for French language.
Supports an external list of stopwords (words that will not be indexed at all) and an external list of exclusions (words that will not be stemmed, but are still indexed). A default set of stopwords is used unless an alternative list is specified, but the exclusion list is empty by default.
You must specify the required Version compatibility when creating FrenchAnalyzer.
NOTE: This class uses the same Version-dependent settings as StandardAnalyzer.
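The difference between the two lists can be illustrated with a minimal, library-free sketch. It does not use Lucene or the real French stemmer: the class `StopwordVsExclusionSketch`, the method `indexTerms`, and the one-rule `toyStem` helper are all hypothetical names introduced only for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class StopwordVsExclusionSketch {

    // Hypothetical stand-in for a real French stemmer: strips one trailing "s".
    static String toyStem(String term) {
        return term.length() > 3 && term.endsWith("s")
                ? term.substring(0, term.length() - 1)
                : term;
    }

    // Stopwords are dropped entirely; stem exclusions are kept but not stemmed.
    static List<String> indexTerms(List<String> tokens,
                                   Set<String> stopwords,
                                   Set<String> stemExclusions) {
        List<String> terms = new ArrayList<>();
        for (String token : tokens) {
            if (stopwords.contains(token)) {
                continue; // never reaches the index
            }
            terms.add(stemExclusions.contains(token) ? token : toyStem(token));
        }
        return terms;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("les", "maisons", "de", "paris");
        Set<String> stopwords = new HashSet<String>(Arrays.asList("les", "de"));

        // "paris" is protected from stemming by the exclusion set.
        Set<String> exclusions = new HashSet<String>(Arrays.asList("paris"));
        System.out.println(indexTerms(tokens, stopwords, exclusions)); // prints [maison, paris]

        // With the default (empty) exclusion set, the toy stemmer also trims "paris".
        System.out.println(indexTerms(tokens, stopwords, new HashSet<String>())); // prints [maison, pari]
    }
}
```

In this toy model an empty exclusion set means every non-stopword is stemmed, which mirrors the default described above.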
| Constructor and Description |
|---|
| JahiaFrenchAnalyzer(org.apache.lucene.util.Version matchVersion): Builds an analyzer with the default stop words. |
| JahiaFrenchAnalyzer(org.apache.lucene.util.Version matchVersion, Set<?> stopwords): Builds an analyzer with the given stop words. |
| JahiaFrenchAnalyzer(org.apache.lucene.util.Version matchVersion, Set<?> stopwords, Set<?> stemExclutionSet): Builds an analyzer with the given stop words and stemming exclusion set. |
| Modifier and Type | Method and Description |
|---|---|
| protected org.apache.lucene.analysis.ReusableAnalyzerBase.TokenStreamComponents | createComponents(String fieldName, Reader reader): Creates ReusableAnalyzerBase.TokenStreamComponents used to tokenize all the text in the provided Reader. |
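The components that createComponents builds are documented as a fixed chain: StandardTokenizer, then StandardFilter, ElisionFilter, LowerCaseFilter, StopFilter, KeywordMarkerFilter (only when a stem exclusion set is provided), and FrenchLightStemFilter. The order matters, because elision and lower-casing run before the stop filter sees a token. The sketch below imitates that order with plain Java and no Lucene dependency. Every name in it (`FrenchChainSketch`, `stripElision`, `lightStem`, `analyze`) is hypothetical, the whitespace split and one-rule stemmer are crude stand-ins for the real tokenizer and stemmer, and StandardFilter is omitted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

final class FrenchChainSketch {

    // Toy elision step: drops a short leading prefix such as "l'" or "qu'".
    static String stripElision(String token) {
        int apostrophe = token.indexOf('\'');
        return apostrophe >= 0 && apostrophe <= 2
                ? token.substring(apostrophe + 1)
                : token;
    }

    // Toy stand-in for a light French stemmer: strips one trailing "s".
    static String lightStem(String term) {
        return term.length() > 3 && term.endsWith("s")
                ? term.substring(0, term.length() - 1)
                : term;
    }

    static List<String> analyze(String text,
                                Set<String> stopwords,
                                Set<String> stemExclusions) {
        List<String> terms = new ArrayList<>();
        for (String raw : text.trim().split("\\s+")) {       // tokenizer stand-in
            if (raw.isEmpty()) {
                continue;
            }
            String term = stripElision(raw)                   // elision first
                    .toLowerCase(Locale.ROOT);                // then lower-casing
            if (stopwords.contains(term)) {                   // stop filter
                continue;
            }
            boolean keyword = stemExclusions.contains(term);  // keyword marking
            terms.add(keyword ? term : lightStem(term));      // stemming last
        }
        return terms;
    }

    public static void main(String[] args) {
        Set<String> stopwords = new HashSet<String>(Arrays.asList("les", "et"));
        String text = "Les Trains et l'Avion";

        System.out.println(analyze(text, stopwords, new HashSet<String>()));
        // prints [train, avion]

        Set<String> exclusions = new HashSet<String>(Arrays.asList("trains"));
        System.out.println(analyze(text, stopwords, exclusions));
        // prints [trains, avion]
    }
}
```

Because lower-casing precedes the stop filter in this sketch, a lower-case stopword set is enough to remove "Les" at the start of a sentence.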
Methods inherited from class org.apache.lucene.analysis.StopwordAnalyzerBase: getStopwordSet, loadStopwordSet, loadStopwordSet, loadStopwordSet
Methods inherited from class org.apache.lucene.analysis.ReusableAnalyzerBase: initReader, reusableTokenStream, tokenStream

public JahiaFrenchAnalyzer(org.apache.lucene.util.Version matchVersion)
Parameters:
matchVersion - lucene compatibility version

public JahiaFrenchAnalyzer(org.apache.lucene.util.Version matchVersion, Set<?> stopwords)
Parameters:
matchVersion - lucene compatibility version
stopwords - a stopword set

public JahiaFrenchAnalyzer(org.apache.lucene.util.Version matchVersion, Set<?> stopwords, Set<?> stemExclutionSet)
Parameters:
matchVersion - lucene compatibility version
stopwords - a stopword set
stemExclutionSet - a stemming exclusion set

protected org.apache.lucene.analysis.ReusableAnalyzerBase.TokenStreamComponents createComponents(String fieldName, Reader reader)
Creates ReusableAnalyzerBase.TokenStreamComponents used to tokenize all the text in the provided Reader.
Specified by:
createComponents in class org.apache.lucene.analysis.ReusableAnalyzerBase
Returns:
ReusableAnalyzerBase.TokenStreamComponents built from a StandardTokenizer filtered with StandardFilter, ElisionFilter, LowerCaseFilter, StopFilter, KeywordMarkerFilter if a stem exclusion set is provided, and FrenchLightStemFilter

Copyright © 2004–2020 Jahia Solutions Group SA. All rights reserved.