An Android Web Crawler Based on Jsoup

Author: System  Date: August 10, 2024  Categories: All, Crawlers  Word count: 1239





import java.io.IOException;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class JsoupExample {
    public static void main(String[] args) {
        String url = "http://example.com"; // Replace with the target site
        try {
            // Fetch the page and parse it into a DOM tree
            Document document = Jsoup.connect(url).get();
            // Adjust the selectors to match the actual page structure
            Elements elements = document.select("div.product-info");

            for (Element element : elements) {
                Elements titleElements = element.select("h3.title");
                Elements priceElements = element.select("p.price");

                // Skip product blocks that lack a title or a price
                if (!titleElements.isEmpty() && !priceElements.isEmpty()) {
                    String title = titleElements.get(0).text();
                    String price = priceElements.get(0).text();
                    System.out.println("Title: " + title);
                    System.out.println("Price: " + price);
                }
            }
        } catch (IOException e) {
            // Connection failures, timeouts, and HTTP error statuses end up here
            e.printStackTrace();
        }
    }
}

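This code uses the Jsoup library to fetch and parse a web page. It first connects to the given URL, then uses a CSS selector to pick out every div element with the class "product-info". For each of these product divs, it tries to extract the title from an h3 element with the class "title" and the price from a p element with the class "price"; a div that lacks either one is skipped. Finally, it prints the title and price of each product. The example shows how to use Jsoup for basic page scraping and data extraction.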
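The listing above runs from a plain `main` method. In an Android app, network requests are not allowed on the main (UI) thread, so the crawl has to run on a worker thread. The following is a minimal sketch of one way to do that using only `java.util.concurrent`. The `BackgroundCrawler` class and its `crawl` and `shutdown` methods are hypothetical names introduced here for illustration, not part of Jsoup or the Android SDK, and the stub task stands in for the real Jsoup logic.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical helper for illustration; not part of Jsoup or the Android SDK.
class BackgroundCrawler {
    // A single worker thread keeps crawl tasks off the calling thread.
    private final ExecutorService executor = Executors.newSingleThreadExecutor();

    // Runs crawlTask on the worker thread and reports the outcome through the
    // callbacks. The callbacks also run on the worker thread, so an Android app
    // would still need to switch back to the UI thread before touching views.
    Future<?> crawl(Callable<List<String>> crawlTask,
                    Consumer<List<String>> onResult,
                    Consumer<Exception> onError) {
        return executor.submit(() -> {
            try {
                onResult.accept(crawlTask.call());
            } catch (Exception e) {
                onError.accept(e);
            }
        });
    }

    // Lets already submitted tasks finish, then stops the worker thread.
    void shutdown() {
        executor.shutdown();
    }

    public static void main(String[] args) throws Exception {
        BackgroundCrawler crawler = new BackgroundCrawler();

        // Stand-in task (assumption): a real task would run the
        // Jsoup.connect(url).get() logic and return the formatted lines.
        Callable<List<String>> stubTask =
                () -> Arrays.asList("Title: Sample item", "Price: 9.99");

        Future<?> done = crawler.crawl(
                stubTask,
                lines -> lines.forEach(System.out::println),
                error -> System.err.println("Crawl failed: " + error.getMessage()));

        // Waiting here is only for the demo; UI code should not block like this.
        done.get(5, TimeUnit.SECONDS);
        crawler.shutdown();
    }
}
```

Passing the crawl step in as a `Callable` keeps the threading code separate from the page-specific selectors, so the same helper can serve different pages. Errors thrown by the task go to `onError` instead of being lost on the worker thread.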
