Package com.yahoo.language.simple
Class SimpleLinguistics
java.lang.Object
com.yahoo.language.simple.SimpleLinguistics
- All Implemented Interfaces:
Linguistics
Factory of simple linguistic processor implementations.
Useful for testing and English-only use cases.
- Author:
- bratseth, bjorncs
-
Nested Class Summary
Nested classes/interfaces inherited from interface com.yahoo.language.Linguistics
Linguistics.Component
Constructor Summary
Constructors
SimpleLinguistics()
Method Summary
Modifier and Type, Method, Description
boolean equals(Linguistics other): Check if another instance is equivalent to this one.
getCharacterClasses(): Returns a thread-unsafe character classes instance.
getDetector(): Returns a thread-unsafe detector.
getGramSplitter(): Returns a thread-unsafe gram splitter.
getNormalizer(): Returns a thread-unsafe normalizer.
getSegmenter(): Returns a thread-unsafe segmenter.
getStemmer(): Returns a thread-unsafe stemmer or lemmatizer.
getTokenizer(): Returns a thread-unsafe tokenizer.
getTransformer(): Returns a thread-unsafe transformer.
toString()
-
Constructor Details
-
SimpleLinguistics
@Inject public SimpleLinguistics()
-
-
Method Details
-
getStemmer
Description copied from interface: Linguistics
Returns a thread-unsafe stemmer or lemmatizer. This is used at query time to stem search terms for indexes that contain text tokenized with stemming turned on.
- Specified by:
getStemmer in interface Linguistics
-
getTokenizer
Description copied from interface: Linguistics
Returns a thread-unsafe tokenizer. This is used at indexing time to produce an optionally stemmed and transformed (accent normalized) stream of indexable tokens.
- Specified by:
getTokenizer in interface Linguistics
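To make "a stream of indexable tokens" concrete, here is a minimal, self-contained sketch of tokenization. It does not use the Vespa API: the class `TokenizerSketch` and its `tokenize` method are hypothetical names, and the rule it applies (lowercased runs of letters and digits) is an assumption for illustration, not the behavior of the tokenizer returned by getTokenizer.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the Vespa tokenizer.
class TokenizerSketch {

    /** Splits text into lowercased runs of letters and digits. */
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (Character.isLetterOrDigit(cp)) {
                // Part of a token: keep it, lowercased
                current.appendCodePoint(Character.toLowerCase(cp));
            } else if (current.length() > 0) {
                // A separator ends the current token
                tokens.add(current.toString());
                current.setLength(0);
            }
            i += Character.charCount(cp);
        }
        if (current.length() > 0) tokens.add(current.toString());
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("Hello, World! 42")); // prints [hello, world, 42]
    }
}
```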
-
getNormalizer
Description copied from interface: Linguistics
Returns a thread-unsafe normalizer. This is used at query time to CJK-normalize query text.
- Specified by:
getNormalizer in interface Linguistics
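One common form of CJK normalization is Unicode NFKC, which folds fullwidth and halfwidth variants into their canonical characters. The sketch below uses only the JDK's java.text.Normalizer. Whether the normalizer returned by getNormalizer applies this exact form is an assumption, and `NfkcSketch` is a hypothetical name.

```java
import java.text.Normalizer;

// Hypothetical illustration only; assumes NFKC as the normalization form.
class NfkcSketch {

    /** Applies Unicode compatibility composition (NFKC) to the input. */
    static String normalize(String input) {
        return Normalizer.normalize(input, Normalizer.Form.NFKC);
    }

    public static void main(String[] args) {
        // Fullwidth Latin letters U+FF21..U+FF23 fold to plain ASCII
        System.out.println(normalize("\uFF21\uFF22\uFF23")); // prints ABC
    }
}
```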
-
getTransformer
Description copied from interface: Linguistics
Returns a thread-unsafe transformer. This is used at query time to accent-normalize search terms for indexes that contain text tokenized with accent normalization turned on.
- Specified by:
getTransformer in interface Linguistics
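Accent normalization can be sketched with the JDK alone: decompose each character (NFD), then drop the combining marks. This is an illustration of the general technique under that assumption, not the transformer SimpleLinguistics returns. `AccentSketch` and `dropAccents` are hypothetical names.

```java
import java.text.Normalizer;

// Hypothetical illustration only; not the Vespa transformer.
class AccentSketch {

    /** Removes accents by decomposing to NFD and stripping combining marks. */
    static String dropAccents(String input) {
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        // \p{M} matches Unicode combining marks left over after decomposition
        return decomposed.replaceAll("\\p{M}+", "");
    }

    public static void main(String[] args) {
        System.out.println(dropAccents("caf\u00e9")); // prints cafe
    }
}
```

A query term and an indexed token only match if both sides went through the same transformation, which is why the transformer is applied at query time when the index was built with accent normalization on.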
-
getSegmenter
Description copied from interface: Linguistics
Returns a thread-unsafe segmenter. This is used at query time to find the individual semantic components of search terms for indexes tokenized with segmentation.
- Specified by:
getSegmenter in interface Linguistics
-
getDetector
Description copied from interface: Linguistics
Returns a thread-unsafe detector. The language of the text is a parameter to other linguistic operations. This is used to determine the language of a query or document field when not specified explicitly.
- Specified by:
getDetector in interface Linguistics
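As a rough illustration of language detection when no language is given, the sketch below guesses a language code from the Unicode script of the characters it sees. The heuristic, the class name `ScriptGuessSketch`, and the returned codes are all assumptions made for this example. The detector returned by getDetector may work differently.

```java
import java.lang.Character.UnicodeScript;

// Hypothetical illustration only; a script-based guess, not the Vespa detector.
class ScriptGuessSketch {

    /** Returns a language code for the first language-specific script found, else "unknown". */
    static String guess(String text) {
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            UnicodeScript script = UnicodeScript.of(cp);
            if (script == UnicodeScript.HIRAGANA || script == UnicodeScript.KATAKANA) return "ja";
            if (script == UnicodeScript.HANGUL) return "ko";
            if (script == UnicodeScript.THAI) return "th";
            i += Character.charCount(cp);
        }
        // Latin and other shared scripts do not identify a single language
        return "unknown";
    }

    public static void main(String[] args) {
        System.out.println(guess("\u3053\u3093\u306B\u3061\u306F")); // prints ja
    }
}
```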
-
getGramSplitter
Description copied from interface: Linguistics
Returns a thread-unsafe gram splitter. This is used to split query or document text into fixed-length grams, which allows matching without segmented tokens.
- Specified by:
getGramSplitter in interface Linguistics
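Splitting into fixed-length grams can be sketched as a sliding window over the text. The version below is deliberately simplified: it treats the whole input as one word and works on Java chars, which are assumptions for illustration. `GramSketch` and `grams` are hypothetical names, not the API of the splitter returned by getGramSplitter.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the Vespa gram splitter.
class GramSketch {

    /** Returns all overlapping substrings of length n; shorter input yields itself. */
    static List<String> grams(String text, int n) {
        if (n < 1) throw new IllegalArgumentException("n must be at least 1");
        List<String> result = new ArrayList<>();
        if (text.length() <= n) {
            // Too short to split: keep the text as a single gram
            if (!text.isEmpty()) result.add(text);
            return result;
        }
        for (int i = 0; i + n <= text.length(); i++) {
            result.add(text.substring(i, i + n));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(grams("hello", 2)); // prints [he, el, ll, lo]
    }
}
```

Because every gram overlaps its neighbor, a query split the same way matches any indexed text containing the same character sequence, with no need to know where word boundaries fall.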
-
getCharacterClasses
Description copied from interface: Linguistics
Returns a thread-unsafe character classes instance.
- Specified by:
getCharacterClasses in interface Linguistics
-
equals
Description copied from interface: Linguistics
Check if another instance is equivalent to this one.
- Specified by:
equals in interface Linguistics
-
toString
-