Package com.yahoo.language.simple
Class SimpleToken
java.lang.Object
com.yahoo.language.simple.SimpleToken
- All Implemented Interfaces: Token
- Author: Mathias Mølster Lidal
-
Constructor Summary
Constructors
SimpleToken(String original)
SimpleToken(String original, String tokenString)
-
Method Summary
addComponent(Token token)
addStem
boolean equals
static SimpleToken fromStems
getComponent(int i): Returns a component token of this token.
int getNumComponents(): Returns the number of components, if this token is a compound word (e.g. the German "kommunikationsfehler").
int getNumStems(): Returns the number of stem forms available for this token.
long getOffset(): Returns the offset position of this token.
getOrig(): Returns the original form of this token.
getScript(): Returns the script of this token.
getStem(int i): Returns a stem (or more generally, an alternative form) of this token.
getTokenString(): Returns the token string in a form suitable for indexing: the most lowercased variant of the most processed token form available. If called on a compound token, this returns a lowercased form of the entire word.
getType(): Returns the type of this token: word, space, punctuation, etc.
int hashCode()
boolean isIndexable(): Whether this token should be indexed.
boolean isSpecialToken(): Returns whether this is an instance of a declared special token (e.g. c++).
setOffset(long offset)
setScript(TokenScript script)
setSpecialToken(boolean specialToken)
setTokenString(String string)
setType
toDetailString
toString()
-
Constructor Details
-
SimpleToken
public SimpleToken(String original)
-
SimpleToken
public SimpleToken(String original, String tokenString)
-
Method Details
-
getOrig
Description copied from interface: Token
Returns the original form of this token.
-
getNumStems
public int getNumStems()
Description copied from interface: Token
Returns the number of stem forms available for this token.
Specified by: getNumStems in interface Token
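The count returned by getNumStems is the natural bound for index-based access through getStem(int i), which this page also documents. The following is a minimal sketch of that iteration pattern, not the actual SimpleToken implementation. `SketchToken` is a hypothetical stand-in that only mirrors the accessor names listed here, and the `addStem(String)` signature is an assumption, since this page does not show the parameters of addStem.

```java
import java.util.ArrayList;
import java.util.List;

class StemFormsSketch {

    /** Hypothetical stand-in; only the method names come from this page. */
    static class SketchToken {
        private final String orig;
        private final List<String> stems = new ArrayList<>();

        SketchToken(String orig) { this.orig = orig; }

        String getOrig() { return orig; }
        int getNumStems() { return stems.size(); }
        String getStem(int i) { return stems.get(i); }

        // Assumption: a stem is added as a plain String.
        SketchToken addStem(String stem) {
            stems.add(stem);
            return this;
        }
    }

    /** Collects every stem by walking the indices 0 .. getNumStems() - 1. */
    static List<String> stemsOf(SketchToken token) {
        List<String> result = new ArrayList<>();
        for (int i = 0; i < token.getNumStems(); i++) {
            result.add(token.getStem(i));
        }
        return result;
    }

    public static void main(String[] args) {
        SketchToken token = new SketchToken("running").addStem("run").addStem("running");
        System.out.println(token.getOrig() + " -> " + stemsOf(token));
    }
}
```

A token without any added stems yields an empty list here, because the loop bound is zero.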
-
getStem
Returns a stem (or more generally, an alternative form) of this token.
-
getNumComponents
public int getNumComponents()
Description copied from interface: Token
Returns the number of components, if this token is a compound word (e.g. the German "kommunikationsfehler"). Otherwise, returns 0.
Specified by: getNumComponents in interface Token
Returns: number of components, or 0 if none
-
getComponent
Description copied from interface: Token
Returns a component token of this token.
Specified by: getComponent in interface Token
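Because getNumComponents returns 0 for a token that is not a compound word, a recursive walk over getComponent(int i) can use that value as its stopping condition. The sketch below illustrates this under stated assumptions: `SketchToken` is a hypothetical stand-in rather than the real class, and the split of "kommunikationsfehler" into two parts is chosen by hand for illustration, not produced by any linguistics library.

```java
import java.util.ArrayList;
import java.util.List;

class CompoundWalkSketch {

    /** Hypothetical stand-in; only the method names come from this page. */
    static class SketchToken {
        private final String orig;
        private final List<SketchToken> components = new ArrayList<>();

        SketchToken(String orig) { this.orig = orig; }

        String getOrig() { return orig; }
        int getNumComponents() { return components.size(); }
        SketchToken getComponent(int i) { return components.get(i); }

        SketchToken addComponent(SketchToken component) {
            components.add(component);
            return this;
        }
    }

    /** Returns the original forms of all tokens that have no components of their own. */
    static List<String> leafParts(SketchToken token) {
        List<String> parts = new ArrayList<>();
        collect(token, parts);
        return parts;
    }

    private static void collect(SketchToken token, List<String> out) {
        if (token.getNumComponents() == 0) { // not a compound: the token itself is a leaf
            out.add(token.getOrig());
            return;
        }
        for (int i = 0; i < token.getNumComponents(); i++) {
            collect(token.getComponent(i), out);
        }
    }

    public static void main(String[] args) {
        // Hand-made split, for illustration only.
        SketchToken compound = new SketchToken("kommunikationsfehler")
                .addComponent(new SketchToken("kommunikation"))
                .addComponent(new SketchToken("fehler"));
        System.out.println(leafParts(compound)); // prints [kommunikation, fehler]
    }
}
```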
-
addStem
-
addComponent
-
getTokenString
Description copied from interface: Token
Returns the token string in a form suitable for indexing: the most lowercased variant of the most processed token form available. If called on a compound token, this returns a lowercased form of the entire word. If this is a special token with a configured replacement, this returns the replacement token.
Specified by: getTokenString in interface Token
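The contract described above can be illustrated with a small helper. This is only a sketch of the described behavior, not how SimpleToken computes its token string: the helper name `indexForm`, the replacement map, the choice to look up replacements by lowercased form, and the "c++" to "cpp" mapping are all assumptions made for the example.

```java
import java.util.Locale;
import java.util.Map;

class TokenStringSketch {

    /**
     * Hypothetical helper: picks the string to index for a processed token form.
     * The map stands in for "a special token with a configured replacement".
     */
    static String indexForm(String processedForm, Map<String, String> specialTokenReplacements) {
        // Locale.ROOT keeps lowercasing independent of the default locale.
        String lowercased = processedForm.toLowerCase(Locale.ROOT);
        // A configured replacement takes precedence over the plain lowercased form.
        return specialTokenReplacements.getOrDefault(lowercased, lowercased);
    }

    public static void main(String[] args) {
        Map<String, String> replacements = Map.of("c++", "cpp"); // assumed configuration
        System.out.println(indexForm("Kommunikationsfehler", replacements)); // prints kommunikationsfehler
        System.out.println(indexForm("C++", replacements));                  // prints cpp
    }
}
```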
-
setTokenString
-
getType
Description copied from interface: Token
Returns the type of this token: word, space, punctuation, etc.
-
setType
-
getScript
Description copied from interface: Token
Returns the script of this token.
-
setScript
-
isSpecialToken
public boolean isSpecialToken()
Description copied from interface: Token
Returns whether this is an instance of a declared special token (e.g. c++).
Specified by: isSpecialToken in interface Token
-
setSpecialToken
-
getOffset
public long getOffset()
Description copied from interface: Token
Returns the offset position of this token.
-
setOffset
-
equals
-
hashCode
public int hashCode()
-
toString
-
toDetailString
-
isIndexable
public boolean isIndexable()
Description copied from interface: Token
Whether this token should be indexed.
Specified by: isIndexable in interface Token
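A typical use of this flag is to keep only indexable tokens when turning a token sequence into index terms. The sketch below shows that filtering step with a hypothetical `SketchToken` stand-in. How the real class decides indexability is not stated on this page, so the flag is simply passed in by the caller here.

```java
import java.util.ArrayList;
import java.util.List;

class IndexableFilterSketch {

    /** Hypothetical stand-in exposing only the two accessors used below. */
    static class SketchToken {
        private final String tokenString;
        private final boolean indexable;

        SketchToken(String tokenString, boolean indexable) {
            this.tokenString = tokenString;
            this.indexable = indexable;
        }

        String getTokenString() { return tokenString; }
        boolean isIndexable() { return indexable; }
    }

    /** Keeps the token string of every token that reports itself as indexable. */
    static List<String> indexTerms(List<SketchToken> tokens) {
        List<String> terms = new ArrayList<>();
        for (SketchToken token : tokens) {
            if (token.isIndexable()) {
                terms.add(token.getTokenString());
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        List<SketchToken> tokens = List.of(
                new SketchToken("hello", true),
                new SketchToken(" ", false),   // space, assumed not indexable
                new SketchToken("world", true),
                new SketchToken("!", false));  // punctuation, assumed not indexable
        System.out.println(indexTerms(tokens)); // prints [hello, world]
    }
}
```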
-
fromStems
-