List of tokenize() Examples

Examples of tokenize()

batch.internal.support.DelimitedLineTokenizer.tokenize()
Yields the tokens resulting from the splitting of the supplied line along the configured delimiter.
Does not include the delimiter in the returned token array.
Empty tokens are returned as empty strings, never null. @param line the line to be tokenised (can be null) @return the resulting tokens; an empty String[] if no delimiter was found or if the supplied line is null or zero length
client.net.sf.saxon.ce.regex.ARegularExpression.tokenize()
Use this regular expression to tokenize an input string. @param input the string to be tokenized @return a SequenceIterator containing the resulting tokens, as objects of type StringValue
com.aliasi.tokenizer.Tokenizer.tokenize()
com.atilika.kuromoji.AbstractTokenizer.tokenize()
Tokenize input text @param text @return list of Token
com.github.pmerienne.trident.ml.preprocessing.EnglishTokenizer.tokenize()
com.github.pmerienne.trident.ml.preprocessing.TextTokenizer.tokenize()
com.github.pmerienne.trident.ml.preprocessing.TwitterTokenizer.tokenize()
com.google.dart.engine.html.scanner.AbstractScanner.tokenize()
Scan the source code to produce a list of tokens representing the source. @return the first token in the list of tokens that were produced
com.google.dart.engine.html.scanner.StringScanner.tokenize()
com.google.dart.engine.scanner.Scanner.tokenize()
Scan the source code to produce a list of tokens representing the source. @return the first token in the list of tokens that were produced
com.openkm.kea.filter.KEAPhraseFilter.tokenize()
com.totsp.gwittir.client.util.HistoryTokenizer.tokenize()
cx.fbn.nevernote.oauth.OAuthTokenizer.tokenize()
edu.buffalo.cse.ir.wikiindexer.tokenizer.Tokenizer.tokenize()
Main method used to tokenize. It simply calls the rules one by one. @param stream the TokenStream to be worked upon @throws TokenizerException if any tokenization exception occurs
edu.harvard.wcfia.yoshikoder.document.tokenizer.TokenizationService.tokenize()
edu.stanford.nlp.process.PTBTokenizer.tokenize()
edu.udo.cs.wvtool.generic.tokenizer.WVTTokenizer.tokenize()
Tokenize a character stream. @param source the Reader from which to get the character stream @param d the WVTDocumentInfo value, describing the document being processed @return a TokenEnumeration @exception Exception if an error occurs
net.sf.saxon.expr.Tokenizer.tokenize()
Prepare a string for tokenization. The actual tokens are obtained by calls on next() @param input the string to be tokenized @param start start point within the string @param end end point within the string (last character not read): -1 means end of string @param lineNumber the line number in the source where the expression appears @throws XPathException if a lexical error occurs, e.g. unmatched string quotes
net.sf.saxon.regex.RegularExpression.tokenize()
Use this regular expression to tokenize an input string. @param input the string to be tokenized @return a SequenceIterator containing the resulting tokens, as objects of type StringValue
opennlp.ccg.lexicon.DefaultTokenizer.tokenize()
Parses an input string into a list of words, including any explicitly given factors, and the semantic class of special tokens. Tokens are parsed into words using parseToken with the strictFactors flag set to false.
opennlp.ccg.lexicon.Tokenizer.tokenize()
Parses an input string into a list of words, including any explicitly given factors, and the semantic class of special tokens. Tokens are parsed into words using parseToken.
opennlp.tools.tokenize.SimpleTokenizer.tokenize()
opennlp.tools.tokenize.Tokenizer.tokenize()
Splits a string into its atomic parts. @param s The string to be tokenized. @return The String[] with the individual tokens as the array elements.
opennlp.tools.tokenize.TokenizerME.tokenize()
Tokenize a String. @param s The string to be tokenized. @return A string array containing individual tokens as elements.
org.antlr.v4.runtime.tree.pattern.ParseTreePatternMatcher.tokenize()
org.antlr.works.grammar.syntax.GrammarSyntaxLexer.tokenize()
org.apache.qpid.framing.AMQShortString.tokenize()
org.apache.stanbol.enhancer.engines.entitylinking.LabelTokenizer.tokenize()
Tokenizes the parsed label in the parsed language @param label the label @param language the language of the label or null if not known @return the tokenized label
org.eclipse.assemblyformatter.ir.Formatter.tokenize()
org.folg.places.standardize.Normalizer.tokenize()
Tokenize name by removing diacritics, lowercasing, and splitting on non-alphanumeric characters @param text string to tokenize @return tokenized place levels
org.galagosearch.core.parse.TagTokenizer.tokenize()
Parses the text in the document.text attribute and fills in the document.terms and document.tags arrays. @param document @throws java.io.IOException
org.jasen.core.token.SimpleWordTokenizer.tokenize()
Tokenizes (splits) the text @throws IOException
org.jboss.common.beans.property.token.ArrayTokenizer.tokenize()
Implementation of this method breaks down passed string into tokens. @param value @return
org.jitterbit.integration.data.script.Transform.tokenize()
org.languagetool.tokenizers.SentenceTokenizer.tokenize()
Tokenize the given string to sentences.
org.languagetool.tokenizers.Tokenizer.tokenize()
org.languagetool.tokenizers.WordTokenizer.tokenize()
org.pdf4j.saxon.regex.RegularExpression.tokenize()
Use this regular expression to tokenize an input string. @param input the string to be tokenized @return a SequenceIterator containing the resulting tokens, as objects of type StringValue
org.springframework.batch.item.file.transform.DelimitedLineTokenizer.tokenize()
org.springframework.batch.item.file.transform.LineTokenizer.tokenize()
Yields the tokens resulting from the splitting of the supplied line. @param line the line to be tokenized (can be null) @return the resulting tokens
org.zkoss.selector.lang.Tokenizer.tokenize()
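The delimiter-splitting contract quoted above for DelimitedLineTokenizer.tokenize() can be sketched in plain Java. This is a minimal illustration of the documented behaviour only, not the Spring Batch implementation: the class name DelimitedLineSketch is hypothetical, and quoting and escaping of delimiters are not handled.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical stand-in that follows the Javadoc contract quoted in the list above.
final class DelimitedLineSketch {
    private final String delimiter;

    DelimitedLineSketch(String delimiter) {
        if (delimiter == null || delimiter.isEmpty()) {
            throw new IllegalArgumentException("delimiter must not be empty");
        }
        this.delimiter = delimiter;
    }

    String[] tokenize(String line) {
        // Per the quoted Javadoc: null, zero-length, or delimiter-free input
        // yields an empty array rather than null.
        if (line == null || line.isEmpty() || !line.contains(delimiter)) {
            return new String[0];
        }
        // Pattern.quote treats the delimiter literally; the -1 limit keeps
        // trailing empty tokens, so empty fields come back as "".
        return line.split(Pattern.quote(delimiter), -1);
    }

    public static void main(String[] args) {
        DelimitedLineSketch tokenizer = new DelimitedLineSketch(",");
        System.out.println(Arrays.toString(tokenizer.tokenize("a,,b,")));
    }
}
```

The delimiter itself never appears in the result, and every empty field is represented by an empty string.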

Examples of opennlp.tools.tokenize.Tokenizer.tokenize()

  public SentencesToTree(String text, TokenizerModel model){
    /* Configure the tokenizer with the preloaded model */
    Tokenizer tokenizer = new TokenizerME(model);
    /* tokenize() returns an array of strings, one per token */
    String s = spaces(tokenizer.tokenize(text));
    this.text = this.upperCase(s);
  }


Examples of opennlp.tools.tokenize.Tokenizer.tokenize()

  /**
   * Categorizes the given text. The text is tokenized with the SimpleTokenizer before it
   * is passed to the feature generation.
   */
  public double[] categorize(String documentText) {
    Tokenizer tokenizer = SimpleTokenizer.INSTANCE;
    return categorize(tokenizer.tokenize(documentText));
  }


  public String getBestCategory(double[] outcome) {
    return model.getBestOutcome(outcome);
  }
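For readers without OpenNLP on the classpath, the kind of output a character-class tokenizer such as SimpleTokenizer hands to categorize() can be approximated in plain Java. The sketch below is an assumption-laden simplification, not OpenNLP code: the class CharClassTokenizerSketch is hypothetical, and it emits every punctuation character as its own token, which may differ from the library in edge cases.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical approximation of character-class tokenization.
final class CharClassTokenizerSketch {
    private static final int WHITESPACE = 0, LETTER = 1, DIGIT = 2, OTHER = 3;

    private static int classify(char c) {
        if (Character.isWhitespace(c)) return WHITESPACE;
        if (Character.isLetter(c)) return LETTER;
        if (Character.isDigit(c)) return DIGIT;
        return OTHER;
    }

    static String[] tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int currentClass = WHITESPACE;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            int cls = classify(c);
            // Close the running token when the character class changes;
            // punctuation always starts a fresh token (a simplification).
            if (current.length() > 0 && (cls != currentClass || cls == OTHER)) {
                tokens.add(current.toString());
                current.setLength(0);
            }
            if (cls != WHITESPACE) {
                current.append(c); // whitespace separates tokens but is never emitted
            }
            currentClass = cls;
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens.toArray(new String[0]);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(tokenize("Hello, world 42!")));
    }
}
```

Because the split is driven purely by character classes, no model is needed, which is why the excerpt above can use a shared SimpleTokenizer.INSTANCE.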


Examples of opennlp.tools.tokenize.TokenizerME.tokenize()

  public SentencesToTree(String text, TokenizerModel model){
    /* Configure the tokenizer with the preloaded model */
    Tokenizer tokenizer = new TokenizerME(model);
    /* tokenize() returns an array of strings, one per token */
    String s = spaces(tokenizer.tokenize(text));
    this.text = this.upperCase(s);
  }


Examples of org.antlr.v4.runtime.tree.pattern.ParseTreePatternMatcher.tokenize()

      rawGenerateAndBuildRecognizer("X1.g4", grammar, "X1Parser", "X1Lexer", false);
    assertTrue(ok);


    ParseTreePatternMatcher m = getPatternMatcher("X1");


    List<? extends Token> tokens = m.tokenize("<ID> = <expr> ;");
    String results = tokens.toString();
    String expected = "[ID:3, [@-1,1:1='=',<1>,1:1], expr:7, [@-1,1:1=';',<2>,1:1]]";
    assertEquals(expected, results);
  }
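The test above expects four tokens from the pattern "<ID> = <expr> ;": two tags and two pieces of literal text. A rough sketch of that first chunking step, written in plain Java, might look as follows. It is not ANTLR code: the class PatternChunkSketch and its TAG(...)/TEXT(...) output format are hypothetical, and the real matcher additionally lexes the text chunks with the grammar's lexer and supports labels and escapes, which are omitted here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: separates <tag> references from literal text in a tree pattern.
final class PatternChunkSketch {
    private static final Pattern TAG = Pattern.compile("<(\\w+)>");

    static List<String> split(String pattern) {
        List<String> chunks = new ArrayList<>();
        Matcher m = TAG.matcher(pattern);
        int last = 0;
        while (m.find()) {
            addText(chunks, pattern.substring(last, m.start())); // literal text before the tag
            chunks.add("TAG(" + m.group(1) + ")");
            last = m.end();
        }
        addText(chunks, pattern.substring(last)); // literal text after the final tag
        return chunks;
    }

    private static void addText(List<String> chunks, String text) {
        String trimmed = text.trim();
        if (!trimmed.isEmpty()) {
            chunks.add("TEXT(" + trimmed + ")");
        }
    }

    public static void main(String[] args) {
        System.out.println(split("<ID> = <expr> ;"));
    }
}
```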


Examples of org.antlr.works.grammar.syntax.GrammarSyntaxLexer.tokenize()

            }


        }


        GrammarSyntaxLexer lexer = new GrammarSyntaxLexer();
        lexer.tokenize(content);


        ParseProperties parser = new ParseProperties();
        parser.parse(lexer.getTokens());
        return parser.propertiesTokens;
    }


Examples of org.apache.qpid.framing.AMQShortString.tokenize()

                List<AMQQueue> queueList2 = _wildCardBindingKey2queues.putIfAbsent(routingKey, new CopyOnWriteArrayList<AMQQueue>());


                if(queueList2 == null)
                {
                    queueList2 = _wildCardBindingKey2queues.get(routingKey);
                    AMQShortStringTokenizer keyTok = routingKey.tokenize(TOPIC_SEPARATOR);


                    ArrayList<AMQShortString> keyTokList = new ArrayList<AMQShortString>(keyTok.countTokens());


                    while (keyTok.hasMoreTokens())
                    {
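The excerpt stops inside the loop, but the countTokens()/hasMoreTokens() idiom it uses mirrors java.util.StringTokenizer from the standard library. The following sketch shows the same pattern with plain strings. It is an illustration only: RoutingKeySketch is a hypothetical class, and the value "." for TOPIC_SEPARATOR is an assumption, since the excerpt does not show how that constant is defined.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical illustration of splitting a routing key into its words.
final class RoutingKeySketch {
    // Assumption: the separator is a dot; the excerpt does not show its definition.
    static final String TOPIC_SEPARATOR = ".";

    static List<String> words(String routingKey) {
        StringTokenizer keyTok = new StringTokenizer(routingKey, TOPIC_SEPARATOR);
        // Pre-size the list from countTokens(), as the excerpt does.
        List<String> keyTokList = new ArrayList<>(keyTok.countTokens());
        while (keyTok.hasMoreTokens()) {
            keyTokList.add(keyTok.nextToken());
        }
        return keyTokList;
    }

    public static void main(String[] args) {
        System.out.println(words("stocks.nyse.ibm"));
    }
}
```

Note that StringTokenizer skips empty segments, so "a..b" yields two tokens here; whether AMQShortStringTokenizer behaves the same way is not shown in the excerpt.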


Examples of org.apache.stanbol.enhancer.engines.entitylinking.LabelTokenizer.tokenize()

    public String[] tokenize(String label,String language){
        for(ServiceReference ref : getTokenizers(language)){
            LabelTokenizer tokenizer = (LabelTokenizer)labelTokenizerTracker.getService(ref);
            if(tokenizer != null){
                log.trace(" > use Tokenizer {} for language {}",tokenizer.getClass(),language);
                String[] tokens = tokenizer.tokenize(label, language);
                if(tokens != null){
                    if(log.isTraceEnabled()){
                        log.trace("   - tokenized {} -> {}",label, Arrays.toString(tokens));
                    }
                    return tokens;
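The loop above tries each registered tokenizer in turn and returns the first non-null result. Stripped of the OSGi service tracking and logging, that fallback pattern could be sketched as below. The names FallbackTokenizerSketch and LabelTokenizerLike are hypothetical, and returning null when no tokenizer succeeds is an assumption, because the excerpt is cut off before that branch.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of "first tokenizer that can handle it wins".
final class FallbackTokenizerSketch {
    // Stand-in for a tokenizer that may decline a label by returning null.
    interface LabelTokenizerLike {
        String[] tokenize(String label, String language);
    }

    static String[] tokenize(List<LabelTokenizerLike> candidates, String label, String language) {
        for (LabelTokenizerLike tokenizer : candidates) {
            String[] tokens = tokenizer.tokenize(label, language);
            if (tokens != null) {
                return tokens; // first tokenizer that produces a result is used
            }
        }
        return null; // assumption: nothing could tokenize this label
    }

    public static void main(String[] args) {
        LabelTokenizerLike englishOnly =
                (label, lang) -> "en".equals(lang) ? label.split("\\s+") : null;
        LabelTokenizerLike whitespace = (label, lang) -> label.split("\\s+");
        String[] tokens = tokenize(Arrays.asList(englishOnly, whitespace), "New York", null);
        System.out.println(Arrays.toString(tokens));
    }
}
```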


Examples of org.eclipse.assemblyformatter.ir.Formatter.tokenize()

        MessageDialog.openError(shell, e.getClass().getCanonicalName(),
            e.getMessage());
      }


      final String directory = "E:\\assembly-formatter\\debug\\";
      formatter.tokenize();
      try {
        formatter.writeSectionList(directory
            + "section-list tokenize().xml");
      } catch (ParserConfigurationException e) {
        MessageDialog.openError(shell, e.getClass().getCanonicalName(),


Examples of org.folg.places.standardize.Normalizer.tokenize()

            StringBuilder reversedWord = new StringBuilder(nextLine);
            reversedWordsWriter.println(reversedWord.reverse());
         }


         if ( (useTokenizer) && (lineCount % TOKENIZE_EVERY_N == 0) ){
            List<List<String>> levels = normalizer.tokenize(nextLine);
            for (List<String> levelWords : levels) {
               tokenizerPlacesCountCC.addAll(levelWords);
               totalTokenizerPlacesCount += levelWords.size();
            }
         }
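The list at the top of the page describes Normalizer.tokenize() as removing diacritics, lowercasing, and splitting on non-alphanumeric characters. Those three steps can be sketched with the standard library alone. This is not the org.folg implementation: PlaceNameTokenizerSketch is a hypothetical class, and it returns a single flat list rather than the per-level List<List<String>> used in the excerpt.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration of the three normalization steps described above.
final class PlaceNameTokenizerSketch {
    static List<String> tokenize(String text) {
        // 1. Decompose accented characters (NFD) and drop the combining marks.
        String decomposed = Normalizer.normalize(text, Normalizer.Form.NFD);
        String stripped = decomposed.replaceAll("\\p{M}+", "");
        // 2. Lowercase in a locale-independent way.
        String lower = stripped.toLowerCase(Locale.ROOT);
        // 3. Split on runs of non-alphanumeric characters, skipping empty pieces.
        List<String> tokens = new ArrayList<>();
        for (String token : lower.split("[^a-z0-9]+")) {
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("S\u00e3o Paulo, Brasil"));
    }
}
```

Letters that have no canonical decomposition, such as "ø", survive step 1 unchanged and are then treated as separators by the ASCII-only split, so a production normalizer would likely need an explicit mapping for them.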


