Package spiderman.plugin.util

Examples of spiderman.plugin.util.DefaultLinkNormalizer


   
    // Resolve each extracted URL against the site's host.
    String hostUrl = new StringBuilder("http://").append(new URL(task.site.getUrl()).getHost()).append("/").toString();
    List<String> newUrls = new ArrayList<String>(urls.size());
    for (String url : urls) {
      LinkNormalizer ln = new DefaultLinkNormalizer(hostUrl);
      String newUrl = ln.normalize(url);
      // String newUrl = URLCanonicalizer.getCanonicalURL(ln.normalize(url));
      // Skip mailto: links.
      if (newUrl.startsWith("mailto:"))
        continue;
      // Deduplicate.
      if (newUrls.contains(newUrl))
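
The excerpt above is cut off and depends on spiderman classes that are not shown here. As a rough, self-contained sketch of the same pattern (resolve each link against a base URL, skip mailto: links, drop duplicates), the following uses only the JDK. It swaps in java.net.URI.resolve for DefaultLinkNormalizer, so its output may differ from the library's; the class name LinkResolveSketch and the method resolveAll are hypothetical names chosen for illustration.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class LinkResolveSketch {

    // Hypothetical helper, not part of spiderman: resolves links against
    // baseUrl, skips mailto: links, and removes duplicates while keeping
    // first-seen order. Assumes baseUrl is a valid absolute URI.
    static List<String> resolveAll(String baseUrl, List<String> urls) {
        URI base = URI.create(baseUrl);
        // A LinkedHashSet gives a constant-time duplicate check instead of
        // the linear List.contains scan, and preserves insertion order.
        Set<String> seen = new LinkedHashSet<String>();
        for (String url : urls) {
            String trimmed = url.trim();
            // Case-insensitive, locale-independent prefix check.
            if (trimmed.regionMatches(true, 0, "mailto:", 0, 7)) {
                continue;
            }
            try {
                // resolve() handles relative links; normalize() removes
                // "." and ".." path segments.
                seen.add(base.resolve(trimmed).normalize().toString());
            } catch (IllegalArgumentException e) {
                // Skip links that are not syntactically valid URIs.
            }
        }
        return new ArrayList<String>(seen);
    }

    public static void main(String[] args) {
        List<String> out = resolveAll("http://example.com/",
                Arrays.asList("/a", "b/../a", "mailto:x@example.com", "http://example.com/a"));
        System.out.println(out); // prints [http://example.com/a]
    }
}
```

Unlike the excerpt, this sketch filters mailto: links before resolving rather than after, and it builds the resolver once instead of once per URL.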