Uses of Package
com.yahoo.language.process
Packages that use com.yahoo.language.process
Package
ai.vespa.language.chunker
com.yahoo.language
com.yahoo.language.process
com.yahoo.language.simple
-
Classes in com.yahoo.language.process used by ai.vespa.language.chunker
Description
A chunker splits a text string into multiple smaller strings (chunks).
-
Classes in com.yahoo.language.process used by com.yahoo.language
Description
Determines the class of a given character.
A class which splits consecutive word character sequences into overlapping character n-grams.
This interface provides NFKC normalization of Strings through the underlying linguistics library.
A segmenter splits a string into separate segments (such as words) without applying any further processing (such as stemming) on each segment.
Interface providing stemming of single words.
Language-sensitive tokenization of a text string.
Interface for providers of text transformations such as accent removal.
-
Classes in com.yahoo.language.process used by com.yahoo.language.process
Description
Determines the class of a given character.
A chunker splits a text string into multiple smaller strings (chunks).
An embedder converts a text string to a tensor.
Runtime that is injectable through Embedder constructor.
Generates field values given an input text.
An immutable start index and length pair.
Context of an invocation of a component carrying out a processing task.
Parameters to a linguistics operation.
A segmenter splits a string into separate segments (such as words) without applying any further processing (such as stemming) on each segment.
An immutable list of special tokens - strings which should override the normal tokenizer semantics and be tokenized into a single token.
An immutable special token.
A list of strings which does not allow for duplicate elements.
Interface providing stemming of single words.
An enum of the stemming modes which can be requested.
A single token produced by the tokenizer.
Language-sensitive tokenization of a text string.
List of token scripts (e.g. latin, japanese, chinese, etc.) which may warrant different linguistics treatment.
An enumeration of token types.
-
Classes in com.yahoo.language.process used by com.yahoo.language.simple
Description
Determines the class of a given character.
A class which splits consecutive word character sequences into overlapping character n-grams.
Parameters to a linguistics operation.
This interface provides NFKC normalization of Strings through the underlying linguistics library.
A segmenter splits a string into separate segments (such as words) without applying any further processing (such as stemming) on each segment.
Immutable named lists of "special tokens" - strings which should override the normal tokenizer semantics and be tokenized into a single token.
Interface providing stemming of single words.
A single token produced by the tokenizer.
Language-sensitive tokenization of a text string.
List of token scripts (e.g. latin, japanese, chinese, etc.) which may warrant different linguistics treatment.
An enumeration of token types.
Interface for providers of text transformations such as accent removal.
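
Several listings above mention splitting consecutive word character sequences into overlapping character n-grams. The following is a minimal standalone sketch of that idea, not the implementation in com.yahoo.language.process. The class NGramSketch and the method overlappingGrams are hypothetical names, and two behaviors are assumptions made for illustration: word characters are taken to be whatever Character.isLetterOrDigit accepts, and a word no longer than n is emitted whole.

```java
import java.util.ArrayList;
import java.util.List;

public class NGramSketch {

    // Hypothetical helper for illustration; not part of com.yahoo.language.process.
    // Splits each run of word characters in text into overlapping grams of size n.
    static List<String> overlappingGrams(String text, int n) {
        if (n < 1) throw new IllegalArgumentException("n must be positive");
        List<String> grams = new ArrayList<>();
        int i = 0;
        while (i < text.length()) {
            // Assumption: a "word character" is a letter or digit (UTF-16 unit based).
            if (!Character.isLetterOrDigit(text.charAt(i))) {
                i++;
                continue;
            }
            // Find the end of the current run of word characters.
            int start = i;
            while (i < text.length() && Character.isLetterOrDigit(text.charAt(i))) {
                i++;
            }
            String word = text.substring(start, i);
            if (word.length() <= n) {
                // Assumption: words no longer than n are emitted as a single gram.
                grams.add(word);
            } else {
                // Slide a window of size n one character at a time.
                for (int j = 0; j + n <= word.length(); j++) {
                    grams.add(word.substring(j, j + n));
                }
            }
        }
        return grams;
    }

    public static void main(String[] args) {
        // prints [hel, ell, llo, wor, orl, rld]
        System.out.println(overlappingGrams("hello world", 3));
    }
}
```

Grams never span a word boundary in this sketch, because each run of word characters is windowed separately.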