Python Web Scraping (Selenium): Fetching Information from a Website and Storing It in a Database (MySQL)

Author: System | Date: August 23, 2024 | Categories: All, mysql | Word count: 1373


Below is example code that uses Python's Selenium and PyMySQL libraries to scrape information from a website and store it in a MySQL database.

First, make sure the Selenium and PyMySQL libraries are installed.
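One way to confirm the installation from Python itself is to look up the installed distribution versions. This is a minimal sketch, assuming Python 3.8 or later; the helper name `installed_version` is hypothetical and not part of either library.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a distribution, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Report the status of the two libraries this article relies on.
for name in ("selenium", "PyMySQL"):
    found = installed_version(name)
    if found is None:
        print(f"{name}: not installed (try: pip install {name})")
    else:
        print(f"{name}: {found}")
```

If either package is reported as missing, installing it with pip before running the example should be enough.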




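from selenium import webdriver
from selenium.webdriver.common.by import By
import pymysql

# Connect to the database
connection = pymysql.connect(host='localhost',
                             user='your_username',
                             password='your_password',
                             database='your_database',
                             charset='utf8mb4',
                             cursorclass=pymysql.cursors.DictCursor)

# Defined before the try block so the finally clause can always check it
driver = None

try:
    with connection.cursor() as cursor:
        # Create the table if it does not exist yet
        sql = """
        CREATE TABLE IF NOT EXISTS example_table (
            id INT AUTO_INCREMENT PRIMARY KEY,
            data_from_website VARCHAR(255)
        ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
        """
        cursor.execute(sql)

    # Start the browser
    driver = webdriver.Chrome()
    driver.get('http://example.com')

    # Read the data from the page
    # (find_element_by_id was removed in Selenium 4.3; use find_element with By.ID)
    data_on_website = driver.find_element(By.ID, 'element_id').text

    # Store the data in the database
    with connection.cursor() as cursor:
        sql = "INSERT INTO example_table(data_from_website) VALUES (%s)"
        # Parameters are passed as a one-element tuple (note the trailing comma)
        cursor.execute(sql, (data_on_website,))
    connection.commit()

finally:
    # Always release the database connection, and the browser if it was started
    connection.close()
    if driver is not None:
        driver.quit()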

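In this example, we first connect to the MySQL database and create a table if it does not already exist. We then use Selenium to start a browser and navigate to the website we want to scrape. We locate the element with Selenium and read its text. Next, we insert that data into the table we created earlier. Finally, we close the browser and the database connection.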
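The same flow can be split into small functions when more than one element needs to be stored. The following is a minimal sketch under stated assumptions, not a definitive implementation: the function names `clean_texts`, `save_texts`, and `scrape_texts` are hypothetical, the CSS selector is a placeholder, and the table is assumed to be the `example_table` created above. It uses `find_elements` to collect several elements and `executemany` to insert them with one parameterized statement.

```python
INSERT_SQL = "INSERT INTO example_table (data_from_website) VALUES (%s)"


def clean_texts(raw_texts):
    """Strip whitespace, drop empty strings, and truncate to fit VARCHAR(255)."""
    cleaned = []
    for text in raw_texts:
        text = text.strip()
        if text:
            cleaned.append(text[:255])
    return cleaned


def save_texts(connection, texts):
    """Insert one row per text with a parameterized query; return the row count."""
    rows = [(text,) for text in texts]
    if not rows:
        return 0
    with connection.cursor() as cursor:
        cursor.executemany(INSERT_SQL, rows)
    connection.commit()
    return len(rows)


def scrape_texts(url, css_selector):
    """Return the text of every element matching css_selector on the page."""
    # Imported here so the helpers above can be used without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        elements = driver.find_elements(By.CSS_SELECTOR, css_selector)
        return [element.text for element in elements]
    finally:
        # Close the browser even if loading or locating fails.
        driver.quit()
```

A call such as `save_texts(connection, clean_texts(scrape_texts('http://example.com', 'p')))` would tie the three steps together, where `connection` is a PyMySQL connection opened as in the example above. Keeping the cleaning and storage steps separate from the browser code makes them easy to check without launching Chrome.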
