Scraping Web Page Images with a Python Crawler

Author: System | Date: August 17, 2024 | Categories: All, Crawlers | Word count: 1221


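To scrape images from a web page with Python, you can use the requests library to fetch the page and BeautifulSoup to parse the HTML and find the image links. Here is a simple example: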




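import os
from urllib.parse import urljoin, urlparse

import requests
from bs4 import BeautifulSoup

# Target page URL
url = 'http://example.com'

# Send the HTTP request (the timeout keeps a stalled server from hanging the script)
response = requests.get(url, timeout=10)

# Check whether the request succeeded
if response.status_code == 200:
    # Parse the page content
    soup = BeautifulSoup(response.text, 'html.parser')

    # Find all img tags
    images = soup.find_all('img')

    # Create a folder to store the images
    os.makedirs('images', exist_ok=True)

    # Iterate over the image links, downloading and saving each one
    for img in images:
        # Get the image address; skip tags that have no src attribute
        src = img.get('src')
        if not src:
            continue

        # Resolve relative paths against the page URL and skip
        # anything that is not http(s), such as inline data: URIs
        img_url = urljoin(url, src)
        if urlparse(img_url).scheme not in ('http', 'https'):
            continue

        # Derive the file name from the URL path, ignoring any query string
        img_name = os.path.basename(urlparse(img_url).path)
        if not img_name:
            continue

        # Download the image
        response_img = requests.get(img_url, timeout=10)
        if response_img.status_code == 200:
            with open(os.path.join('images', img_name), 'wb') as f:
                f.write(response_img.content)
            print(f'Image {img_name} downloaded successfully.')
        else:
            print(f'Failed to download {img_url}.')
else:
    print('Failed to retrieve the webpage.')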
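
The src value of an img tag is often a relative path, a protocol-relative URL, or an inline data: URI, so it helps to normalize it before downloading. The following is a minimal sketch that uses only the standard library. The helper names resolve_image_url and image_filename, the fallback name and the sample URLs are hypothetical and chosen for illustration only.

```python
import os
from urllib.parse import urljoin, urlparse


def resolve_image_url(page_url, src):
    """Return an absolute http(s) URL for src, or None if it cannot be downloaded."""
    if not src:
        return None
    # urljoin handles relative paths, root-relative paths and //host/... forms
    absolute = urljoin(page_url, src.strip())
    if urlparse(absolute).scheme not in ('http', 'https'):
        return None  # for example, inline data: URIs
    return absolute


def image_filename(img_url, fallback='image'):
    """Derive a file name from the URL path, ignoring the query string."""
    name = os.path.basename(urlparse(img_url).path)
    return name or fallback


# Hypothetical page URL used only to demonstrate the helpers
page = 'http://example.com/gallery/index.html'
print(resolve_image_url(page, 'img/cat.png'))
print(resolve_image_url(page, '/static/logo.svg?v=3'))
print(resolve_image_url(page, '//cdn.example.com/a.jpg'))
print(resolve_image_url(page, 'data:image/png;base64,AAAA'))
print(image_filename('http://example.com/static/logo.svg?v=3'))
```

Because neither helper touches the network, both can be checked on sample strings before being wired into the download loop.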

Make sure the requests and beautifulsoup4 libraries are installed; you can install them with pip install requests beautifulsoup4.

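Note: this example is for learning purposes only. In real-world use, follow the site's robots.txt rules, respect copyright and legal restrictions, and avoid downloading content illegally.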
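
One way to follow robots.txt rules in practice is the standard library's urllib.robotparser module. The sketch below is an illustration under assumptions, not a complete compliance check: the robots.txt content is a hypothetical sample inlined so the code runs offline, and the helper names build_parser and is_allowed are made up for this example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, inlined so the sketch runs without network access
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt):
    """Build a RobotFileParser from the text of a robots.txt file."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_allowed(parser, url, user_agent='*'):
    """Return True if the rules permit user_agent to fetch url."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(SAMPLE_ROBOTS_TXT)
print(is_allowed(parser, 'http://example.com/images/cat.png'))
print(is_allowed(parser, 'http://example.com/private/secret.png'))
```

Against a live site, the parser can load the real file by calling set_url() with the site's robots.txt address followed by read(), and is_allowed can then be checked before each image download.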
