Java Web Crawler Using Jsoup

In this tutorial, I share what I learned about building a web crawler with the jsoup library. To begin, I wrote a simple program that connects to a website and collects every link in its HTML.
Here is my code:

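import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class Launcher {
	// Start page to crawl; left empty here, set it before running.
	public static String url = "";
	public static void main(String[] args) throws Exception {
		// Download and parse the start page, then select every anchor element.
		Document doc = Jsoup.connect(url).get();
		Elements links = doc.select("a");
		int count = 0;
		for (Element link : links) {
			// Skip the first anchor on the page.
			if (count > 0) {
				String href = link.attr("href");
				// Follow only links that contain "http", which skips relative links.
				if (href.contains("http")) {
					try {
						// Fetch the linked page with GET to check that it is reachable.
						Document tempDoc = Jsoup.connect(href).get();
					} catch (Exception e) {
						System.out.println(href + " : Error");
					}
				}
			}
			// Assumed step: the listing is cut off before the counter is advanced.
			count++;
		}
	}
}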
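The `contains("http")` check in the listing is a loose filter: it drops relative links such as `page2.html`, and it accepts any href that merely has "http" somewhere in it, for example `mailto:http@example.com`. One possible way to tighten it, sketched below with only the standard `java.net.URI` class, is to resolve each href against the page URL and then look at the scheme. The class `LinkFilter`, the method `toHttpUrl`, and the `example.com` addresses are hypothetical names used for illustration; they are not part of jsoup or of the code above.

```java
import java.net.URI;
import java.util.Optional;

// Hypothetical helper for illustration; not part of jsoup or the listing above.
class LinkFilter {

    // Resolves href against baseUrl and returns the result only when it is an
    // absolute http or https URL. Malformed or non-web links yield an empty Optional.
    static Optional<String> toHttpUrl(String baseUrl, String href) {
        try {
            URI resolved = URI.create(baseUrl).resolve(href.trim());
            String scheme = resolved.getScheme();
            if (scheme != null
                    && (scheme.equalsIgnoreCase("http") || scheme.equalsIgnoreCase("https"))) {
                return Optional.of(resolved.toString());
            }
        } catch (IllegalArgumentException e) {
            // URI.create and resolve(String) throw this for malformed input.
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Assumed example page URL.
        String base = "https://example.com/docs/index.html";
        System.out.println(toHttpUrl(base, "page2.html"));              // Optional[https://example.com/docs/page2.html]
        System.out.println(toHttpUrl(base, "mailto:http@example.com")); // Optional.empty
    }
}
```

Inside the crawler loop, a call like `LinkFilter.toHttpUrl(url, link.attr("href"))` could stand in for the `contains("http")` test, so that relative links are followed as well and non-web links are skipped.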

