-
batch.internal.support.DelimitedLineTokenizer.tokenize()
Yields the tokens resulting from splitting the supplied line along the configured delimiter.
Does not include the delimiter in the returned token array.
Empty tokens are returned as empty strings, never null.
@param line the line to be tokenized (can be null)
@return the resulting tokens; an empty String[] if no delimiter was found or if the supplied line is null or zero length
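The contract above (delimiter excluded from the tokens, empty tokens kept as empty strings, empty array for null or delimiter-free input) can be sketched in plain Java. This is an illustrative stand-in for the documented behavior, not the Spring Batch implementation; the class name is hypothetical:

```java
import java.util.regex.Pattern;

public class DelimitedTokenizerSketch {
    // Splits a line on the configured delimiter; a sketch of the
    // documented contract, not the actual library code.
    public static String[] tokenize(String line, String delimiter) {
        // Null or zero-length input, or input without the delimiter,
        // yields an empty array, never null.
        if (line == null || line.isEmpty() || !line.contains(delimiter)) {
            return new String[0];
        }
        // The -1 limit keeps trailing empty tokens as empty strings.
        return line.split(Pattern.quote(delimiter), -1);
    }
}
```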
-
client.net.sf.saxon.ce.regex.ARegularExpression.tokenize()
Use this regular expression to tokenize an input string.
@param input the string to be tokenized
@return a SequenceIterator containing the resulting tokens, as objects of type StringValue
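Regex-driven tokenization of this kind, where the expression acts as the separator, can be illustrated with java.util.regex. Saxon returns the tokens as StringValue objects via a SequenceIterator; a plain String[] stands in for that here, and the class name is hypothetical:

```java
import java.util.regex.Pattern;

public class RegexTokenizeSketch {
    // Uses the regular expression as a separator between tokens,
    // as the tokenize() contract describes; returns plain Strings
    // where Saxon would return StringValue items.
    public static String[] tokenize(Pattern regex, String input) {
        return regex.split(input);
    }
}
```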
-
com.aliasi.tokenizer.Tokenizer.tokenize()
-
com.atilika.kuromoji.AbstractTokenizer.tokenize()
Tokenize the input text.
@param text the text to be tokenized
@return the list of Token objects
-
com.github.pmerienne.trident.ml.preprocessing.EnglishTokenizer.tokenize()
-
com.github.pmerienne.trident.ml.preprocessing.TextTokenizer.tokenize()
-
com.github.pmerienne.trident.ml.preprocessing.TwitterTokenizer.tokenize()
-
com.google.dart.engine.html.scanner.AbstractScanner.tokenize()
Scan the source code to produce a list of tokens representing the source.
@return the first token in the list of tokens that were produced
-
com.google.dart.engine.html.scanner.StringScanner.tokenize()
-
com.google.dart.engine.scanner.Scanner.tokenize()
Scan the source code to produce a list of tokens representing the source.
@return the first token in the list of tokens that were produced
-
com.openkm.kea.filter.KEAPhraseFilter.tokenize()
-
com.totsp.gwittir.client.util.HistoryTokenizer.tokenize()
-
cx.fbn.nevernote.oauth.OAuthTokenizer.tokenize()
-
edu.buffalo.cse.ir.wikiindexer.tokenizer.Tokenizer.tokenize()
Main method used to tokenize. It simply calls the rules one by one.
@param stream the TokenStream to be worked upon
@throws TokenizerException if any tokenization exception occurs
-
edu.harvard.wcfia.yoshikoder.document.tokenizer.TokenizationService.tokenize()
-
edu.stanford.nlp.process.PTBTokenizer.tokenize()
-
edu.udo.cs.wvtool.generic.tokenizer.WVTTokenizer.tokenize()
Tokenize a character stream.
@param source the Reader from which to get the character stream
@param d the WVTDocumentInfo value describing the document being processed
@return a TokenEnumeration
@exception Exception if an error occurs
-
net.sf.saxon.expr.Tokenizer.tokenize()
Prepare a string for tokenization. The actual tokens are obtained by calls on next().
@param input the string to be tokenized
@param start start point within the string
@param end end point within the string (last character not read); -1 means end of string
@param lineNumber the line number in the source where the expression appears
@throws XPathException if a lexical error occurs, e.g. unmatched string quotes
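The prepare-then-next() pattern this entry describes, where tokenize() only sets up state and each call on next() yields one token, might be sketched as follows. The class, the whitespace-based token rule, and the null end-of-input sentinel are all illustrative assumptions, not Saxon's actual design:

```java
public class StreamingTokenizerSketch {
    private String[] tokens;
    private int position;

    // Prepare a string for tokenization; the actual tokens are then
    // obtained by repeated calls on next(). -1 means end of string.
    public void tokenize(String input, int start, int end) {
        int limit = (end == -1) ? input.length() : end;
        this.tokens = input.substring(start, limit).trim().split("\\s+");
        this.position = 0;
    }

    // Returns the next token, or null when the input is exhausted.
    public String next() {
        return position < tokens.length ? tokens[position++] : null;
    }
}
```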
-
net.sf.saxon.regex.RegularExpression.tokenize()
Use this regular expression to tokenize an input string.
@param input the string to be tokenized
@return a SequenceIterator containing the resulting tokens, as objects of type StringValue
-
opennlp.ccg.lexicon.DefaultTokenizer.tokenize()
Parses an input string into a list of words, including any explicitly given factors, and the semantic class of special tokens. Tokens are parsed into words using parseToken with the strictFactors flag set to false.
-
opennlp.ccg.lexicon.Tokenizer.tokenize()
Parses an input string into a list of words, including any explicitly given factors, and the semantic class of special tokens. Tokens are parsed into words using parseToken.
-
opennlp.tools.tokenize.SimpleTokenizer.tokenize()
-
opennlp.tools.tokenize.Tokenizer.tokenize()
Splits a string into its atomic parts.
@param s The string to be tokenized.
@return The String[] with the individual tokens as the array elements.
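A minimal whitespace-based tokenizer satisfying this String-in, String[]-out contract can look like the sketch below. It is a toy stand-in: OpenNLP's real tokenizers (e.g. the trained TokenizerME listed next) use more sophisticated rules or a statistical model:

```java
public class WhitespaceTokenizerSketch {
    // Splits a string into whitespace-delimited parts, returning
    // the individual tokens as the array elements.
    public static String[] tokenize(String s) {
        String trimmed = s.trim();
        if (trimmed.isEmpty()) {
            return new String[0];
        }
        return trimmed.split("\\s+");
    }
}
```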
-
opennlp.tools.tokenize.TokenizerME.tokenize()
Tokenize a String.
@param s The string to be tokenized.
@return A string array containing individual tokens as elements.
-
org.antlr.v4.runtime.tree.pattern.ParseTreePatternMatcher.tokenize()
-
org.antlr.works.grammar.syntax.GrammarSyntaxLexer.tokenize()
-
org.apache.qpid.framing.AMQShortString.tokenize()
-
org.apache.stanbol.enhancer.engines.entitylinking.LabelTokenizer.tokenize()
Tokenizes the parsed label in the parsed language.
@param label the label
@param language the language of the label, or null if not known
@return the tokenized label
-
org.eclipse.assemblyformatter.ir.Formatter.tokenize()
-
org.folg.places.standardize.Normalizer.tokenize()
Tokenize a name by removing diacritics, lowercasing, and splitting on non-alphanumeric characters.
@param text string to tokenize
@return tokenized place levels
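The three documented steps (strip diacritics, lowercase, split on non-alphanumeric characters) can be sketched with java.text.Normalizer: NFD decomposition separates base letters from combining marks, which a regex then removes. This mimics the documented behavior only; it is not the actual org.folg implementation, and the class name is hypothetical:

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.List;

public class PlaceTokenizerSketch {
    // Removes diacritics (NFD decomposition, then stripping the
    // combining marks), lowercases, and splits on runs of
    // non-alphanumeric characters, dropping empty tokens.
    public static List<String> tokenize(String text) {
        String folded = Normalizer.normalize(text, Normalizer.Form.NFD)
                .replaceAll("\\p{M}", "")
                .toLowerCase();
        List<String> tokens = new ArrayList<>();
        for (String t : folded.split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }
}
```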
-
org.galagosearch.core.parse.TagTokenizer.tokenize()
Parses the text in the document.text attribute and fills in the document.terms and document.tags arrays.
@param document the document to be parsed
@throws java.io.IOException if an I/O error occurs
-
org.jasen.core.token.SimpleWordTokenizer.tokenize()
Tokenizes (splits) the text.
@throws IOException if an I/O error occurs
-
org.jboss.common.beans.property.token.ArrayTokenizer.tokenize()
Implementation of this method breaks the passed string down into tokens.
@param value the string to be broken down
@return the resulting tokens
-
org.jitterbit.integration.data.script.Transform.tokenize()
-
org.languagetool.tokenizers.SentenceTokenizer.tokenize()
Tokenize the given string into sentences.
-
org.languagetool.tokenizers.Tokenizer.tokenize()
-
org.languagetool.tokenizers.WordTokenizer.tokenize()
-
org.pdf4j.saxon.regex.RegularExpression.tokenize()
Use this regular expression to tokenize an input string.
@param input the string to be tokenized
@return a SequenceIterator containing the resulting tokens, as objects of type StringValue
-
org.springframework.batch.item.file.transform.DelimitedLineTokenizer.tokenize()
-
org.springframework.batch.item.file.transform.LineTokenizer.tokenize()
Yields the tokens resulting from the splitting of the supplied line
.
@param line the line to be tokenized (can be null
)
@return the resulting tokens
-
org.zkoss.selector.lang.Tokenizer.tokenize()