Package org.apache.lucene.wikipedia.analysis

Examples of org.apache.lucene.wikipedia.analysis.WikipediaTokenizer.addAttribute()


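  /**
   * Collects the distinct, non-empty tokens found in an article's text.
   *
   * @throws IOException if the tokenizer fails while reading the text
   */
  static Set<String> getTokens(Article article) throws IOException {
    Set<String> tokenList = new HashSet<String>();
    WikipediaTokenizer tok = new WikipediaTokenizer(new StringReader(article.getText()));
    // addAttribute registers (or returns the existing) TermAttribute; the
    // tokenizer updates this same instance on every incrementToken() call.
    TermAttribute term = tok.addAttribute(TermAttribute.class);
    try {
      while (tok.incrementToken()) {
        String token = term.term();
        if (!StringUtils.isEmpty(token))
          tokenList.add(token);
      }
    } finally {
      // Assumed completion: the original listing is cut off inside the loop.
      tok.close();
    }
    return tokenList;
  }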