Building a simple crawler with Node.js

Author: System | Date: August 16, 2024 | Categories: All, Crawler | Word count: 800


To build a simple crawler in Node.js, you can use axios to send HTTP requests and cheerio to parse the returned HTML. The following example shows how to collect all the links on a web page.

First, install the required packages:




npm install axios cheerio

Then create the crawler with the following code:




const axios = require('axios');
const cheerio = require('cheerio');
 
async function fetchLinks(url) {
  try {
    const { data } = await axios.get(url);
    const $ = cheerio.load(data);
    const links = [];
 
    $('a').each((i, link) => {
      const href = $(link).attr('href');
      if (href) {
        links.push(href);
      }
    });
 
    console.log(links);
  } catch (error) {
    console.error('An error occurred:', error);
  }
}
 
// Usage example
const url = 'https://example.com'; // Replace with the URL you want to crawl
fetchLinks(url);

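This code collects the href attribute of every <a> tag on the given page and prints the resulting array. The values are printed exactly as they appear in the HTML, so relative links stay relative. You can change the selector and the handling logic to extract other content.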
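
One common adjustment to the handling logic is cleaning up the collected hrefs before using them. The sketch below is only an illustration, not part of the original example: the helper name normalizeLinks and the sample values are hypothetical. It uses Node's built-in URL class to resolve relative links against the page URL, keeps only http and https links, strips in-page fragments, and removes duplicates.

```javascript
// Hypothetical helper (not part of the original example): resolve each href
// against the page URL, keep only http(s) links, and drop duplicates.
function normalizeLinks(hrefs, baseUrl) {
  const seen = new Set();

  for (const href of hrefs) {
    let resolved;
    try {
      // Relative hrefs such as '/about' are resolved against baseUrl.
      resolved = new URL(href, baseUrl);
    } catch {
      continue; // Skip hrefs that cannot be parsed as URLs.
    }

    if (resolved.protocol !== 'http:' && resolved.protocol !== 'https:') {
      continue; // Skip mailto:, javascript:, tel: and similar schemes.
    }

    resolved.hash = ''; // Treat links that differ only by fragment as the same page.
    seen.add(resolved.href);
  }

  return [...seen];
}

// Sample input is made up for illustration.
const sample = [
  '/about',
  'https://example.com/about#team',
  'mailto:hi@example.com',
  'docs/intro',
];
console.log(normalizeLinks(sample, 'https://example.com/blog/'));
// Prints: [ 'https://example.com/about', 'https://example.com/blog/docs/intro' ]
```

Inside fetchLinks, one way to apply this would be to pass the links array and the url argument to the helper before printing the result.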
