public final class FTLexer extends FTIterator implements IndexToken
Nested classes/interfaces inherited from interface IndexToken: IndexToken.IndexType

| Constructor | Description |
|---|---|
| FTLexer() | Constructor, using the default full-text options. |
| FTLexer(FTOpt opt) | Default constructor. |

| Modifier and Type | Method | Description |
|---|---|---|
| int | count() | Returns total number of tokens. |
| FTOpt | ftOpt() | Returns the full-text options. |
| byte[] | get() | Returns the original token. |
| boolean | hasNext() | |
| int[][] | info() | Gets full-text info for the specified token; needed for visualizations. |
| void | init() | Initializes the iterator. |
| FTLexer | init(byte[] txt) | Initializes the iterator. |
| static StringList | languages() | Lists all languages for which tokenizers and stemmers are available. |
| FTSpan | next() | |
| byte[] | nextToken() | Returns the next token. |
| boolean | paragraph() | Is paragraph? Does not have to be implemented by all tokenizers. |
| int | pos(int w, FTUnit u) | Calculates a position value, dependent on the specified unit. |
| FTLexer | sc() | Sets the special character flag. |
| byte[] | text() | Returns the text to be processed. |
| IndexToken.IndexType | type() | Returns the index type. |

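Methods inherited from class FTIterator: remove

public FTLexer()
Constructor, using the default full-text options. Called by the XMLSerializer, FTFilter, and the map visualizations.

public FTLexer(FTOpt opt)
Default constructor.
Parameters: opt - full-text options

public FTLexer sc()
Sets the special character flag.

public void init()
Initializes the iterator.

public FTLexer init(byte[] txt)
Description copied from class: FTIterator
Initializes the iterator.
Overrides: init in class FTIterator
Parameters: txt - text

public boolean hasNext()
Specified by: hasNext in interface java.util.Iterator<FTSpan>

public byte[] nextToken()
Description copied from class: FTIterator
Returns the next token. Can be used as an alternative to Iterator.next() to avoid the creation of new FTSpan instances.
Specified by: nextToken in class FTIterator

public int count()
Returns total number of tokens.

public IndexToken.IndexType type()
Description copied from interface: IndexToken
Returns the index type.
Specified by: type in interface IndexToken

public byte[] get()
Returns the original token. Inherited from IndexToken; use next() or nextToken() if not using this interface.
Specified by: get in interface IndexToken

public FTOpt ftOpt()
Returns the full-text options. The result can be null.

public byte[] text()
Returns the text to be processed.

public boolean paragraph()
Is paragraph? Does not have to be implemented by all tokenizers.

public int pos(int w, FTUnit u)
Calculates a position value, dependent on the specified unit.
Parameters: w - word position, u - unit

public int[][] info()
Gets full-text info for the specified token; needed for visualizations. See Tokenizer.info() for more info.

public static StringList languages()
Lists all languages for which tokenizers and stemmers are available.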
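
The calling pattern suggested by init(byte[] txt), hasNext(), nextToken() and count() can be sketched with a small stand-in class. This is only an illustration under stated assumptions: SimpleLexer and LexerDemo are hypothetical names, the class is not the BaseX implementation, and it splits on anything that is not an ASCII letter or digit, which the real tokenizers may not do.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.NoSuchElementException;

/** Hypothetical stand-in that mimics the iterator-style API; NOT the BaseX FTLexer. */
final class SimpleLexer {
  private byte[] text = {};
  private int pos;    // offset from which the next token is searched
  private int start;  // start offset of the token found by hasNext()
  private int end;    // end offset (exclusive) of that token

  /** Sets the text to be processed and resets the position; returns this for chaining. */
  SimpleLexer init(final byte[] txt) {
    text = txt;
    pos = 0;
    start = 0;
    end = 0;
    return this;
  }

  /** Checks whether another token exists; does not consume it. */
  boolean hasNext() {
    int p = pos;
    while (p < text.length && !isTokenByte(text[p])) p++;  // skip separators
    if (p == text.length) return false;
    start = p;
    while (p < text.length && isTokenByte(text[p])) p++;   // scan the token
    end = p;
    return true;
  }

  /** Returns the next token as raw bytes, without creating a wrapper object. */
  byte[] nextToken() {
    if (!hasNext()) throw new NoSuchElementException();
    pos = end;
    return Arrays.copyOfRange(text, start, end);
  }

  /** Returns the total number of tokens in the text, leaving the position unchanged. */
  int count() {
    final int saved = pos;
    pos = 0;
    int c = 0;
    while (hasNext()) {
      pos = end;
      c++;
    }
    pos = saved;
    return c;
  }

  /** Assumption for this sketch: a token consists of ASCII letters and digits only. */
  private static boolean isTokenByte(final byte b) {
    return (b >= 'a' && b <= 'z') || (b >= 'A' && b <= 'Z') || (b >= '0' && b <= '9');
  }
}

final class LexerDemo {
  public static void main(final String[] args) {
    final byte[] input = "Hello, full-text world".getBytes(StandardCharsets.UTF_8);
    final SimpleLexer lexer = new SimpleLexer().init(input);
    System.out.println("tokens: " + lexer.count());
    while (lexer.hasNext()) {
      System.out.println(new String(lexer.nextToken(), StandardCharsets.UTF_8));
    }
  }
}
```

Returning byte[] from nextToken() rather than a wrapper object reflects the motivation stated above for preferring it over Iterator.next(); whether the real class behaves identically in edge cases is not covered by this sketch.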