Scraping HTML with jsoup in Spring Boot

Author: System  Date: August 17, 2024  Category: All, html  Word count: 841





import java.io.IOException;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class JsoupController {

    @GetMapping("/parseHtml")
    public String parseHtml() {
        try {
            // URL of the target page
            String url = "http://example.com";
            // Fetch the page over HTTP and parse it into a jsoup Document
            Document doc = Jsoup.connect(url).get();
            // Extract the part you are interested in, here the <title> text
            String title = doc.title();

            return title;
        } catch (IOException e) {
            // Thrown on network failures, timeouts, and non-2xx HTTP responses
            e.printStackTrace();
            return "Error parsing HTML";
        }
    }
}

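This code shows how to use the jsoup library to fetch and parse HTML in a Spring Boot application. When the /parseHtml endpoint is requested, the controller connects to the configured URL, downloads the HTML, and returns the page title. This is a minimal example; a real application will usually need to extract other parts of the HTML document, depending on its requirements.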
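To illustrate extracting other parts of a document, here is a minimal sketch that uses jsoup's CSS selectors to read the first heading and collect all link targets. The class name LinkExtractor, its method names, and the sample HTML are assumptions made for illustration and are not part of the original article. The sketch parses an in-memory string with Jsoup.parse so it runs without network access; in the controller above, the same selector calls could be applied to the Document returned by Jsoup.connect(url).get().

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not from the original article.
public class LinkExtractor {

    // Returns the absolute URL of every <a> element that has an href attribute.
    // baseUri is used to resolve relative links such as "/about".
    public static List<String> extractLinks(String html, String baseUri) {
        Document doc = Jsoup.parse(html, baseUri);
        List<String> links = new ArrayList<>();
        for (Element anchor : doc.select("a[href]")) {
            links.add(anchor.absUrl("href"));
        }
        return links;
    }

    // Returns the text of the first <h1>, or an empty string if there is none.
    public static String firstHeading(String html) {
        Document doc = Jsoup.parse(html);
        Element h1 = doc.selectFirst("h1");
        return h1 == null ? "" : h1.text();
    }

    public static void main(String[] args) {
        // Sample markup, assumed for illustration only.
        String html = "<html><head><title>Demo</title></head><body>"
                + "<h1>Hello</h1>"
                + "<a href=\"/about\">About</a>"
                + "<a href=\"http://example.com/contact\">Contact</a>"
                + "</body></html>";

        System.out.println(firstHeading(html));
        System.out.println(extractLinks(html, "http://example.com"));
    }
}
```

Keeping the parsing logic in plain static methods, separate from the controller, makes it possible to test the selectors against fixed HTML strings without making real HTTP requests.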


