Python Crawlers: Asynchronous Crawling
```python
import asyncio

import aiohttp


async def fetch(session, url):
    # Request a URL and return the response body as text.
    async with session.get(url) as response:
        return await response.text()


async def main():
    urls = ['http://httpbin.org/delay/1', 'http://httpbin.org/delay/2']
    async with aiohttp.ClientSession() as session:
        # Schedule all requests concurrently and wait for every result.
        tasks = [fetch(session, url) for url in urls]
        results = await asyncio.gather(*tasks)
        for result in results:
            print(result)


# asyncio.run() creates and closes the event loop for us; the older
# get_event_loop()/run_until_complete() pattern is deprecated since Python 3.10.
asyncio.run(main())
```
This code uses the aiohttp library for asynchronous HTTP requests and the asyncio library to manage the coroutines. The fetch function retrieves the content of a given URL, while main is the entry-point coroutine: it creates a single ClientSession and then runs the fetch calls concurrently via asyncio.gather. Because the requests overlap instead of running one after another, this substantially improves crawling performance, especially for network-I/O-bound workloads.
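In practice a crawler should also cap how many requests run at once, so it does not overwhelm the target server or exhaust local sockets. The sketch below is a minimal illustration of that pattern using asyncio.Semaphore; here asyncio.sleep stands in for the aiohttp request, and the URLs, delay, and concurrency limit are made-up values for the example.

```python
import asyncio
import time


async def fetch_simulated(sem, url, delay):
    # The semaphore allows at most `max_concurrency` coroutines past
    # this point at the same time; the rest wait their turn.
    async with sem:
        # Stand-in for `await session.get(url)`; sleeps instead of
        # doing real network I/O so the example is self-contained.
        await asyncio.sleep(delay)
        return f"body of {url}"


async def crawl(urls, max_concurrency=2):
    sem = asyncio.Semaphore(max_concurrency)
    tasks = [fetch_simulated(sem, url, 0.1) for url in urls]
    return await asyncio.gather(*tasks)


start = time.perf_counter()
results = asyncio.run(crawl([f"http://example.com/{i}" for i in range(4)]))
elapsed = time.perf_counter() - start
print(results)
print(f"elapsed: {elapsed:.2f}s")
```

With four simulated 0.1 s requests and a limit of two, the tasks run in two overlapping batches, so the total time is roughly 0.2 s rather than the 0.4 s a sequential loop would take.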