Python crawlers: multi-task asynchronous crawling with aiohttp
import asyncio
import aiohttp

async def fetch(session, url, semaphore):
    # Only start the request once the semaphore grants a slot,
    # so concurrency never exceeds the semaphore's limit.
    async with semaphore:
        async with session.get(url) as response:
            return await response.text()

async def main():
    urls = ['http://httpbin.org/delay/1', 'http://httpbin.org/delay/2']
    semaphore = asyncio.Semaphore(5)  # at most 5 requests in flight at once
    async with aiohttp.ClientSession() as session:
        tasks = [fetch(session, url, semaphore) for url in urls]
        results = await asyncio.gather(*tasks)
        for result in results:
            print(result)

asyncio.run(main())
This code uses the aiohttp library to send asynchronous HTTP GET requests, with asyncio.Semaphore capping how many requests are in flight at the same time. It is a simple multi-task asynchronous crawler, suitable as a starting point for scenarios that need concurrent requests. (Note that the older loop = asyncio.get_event_loop() / loop.run_until_complete() pattern is deprecated in modern Python; asyncio.run(main()) is the idiomatic entry point.)
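In practice a crawler also wants per-request timeouts and error handling, so that one slow or failing URL does not abort the whole asyncio.gather call. Below is a minimal sketch of that idea; the fetch_safe name, the ClientTimeout(total=10) value, and the choice to return None on failure are illustrative assumptions, not part of the original example.

import asyncio
import aiohttp

async def fetch_safe(session, url, semaphore):
    # Hypothetical variant of fetch(): swallow per-URL errors so one bad
    # URL does not cancel the other tasks gathered alongside it.
    async with semaphore:
        try:
            async with session.get(url) as response:
                response.raise_for_status()
                return await response.text()
        except (aiohttp.ClientError, asyncio.TimeoutError):
            return None  # assumption: callers treat None as "fetch failed"

async def main():
    urls = ['http://httpbin.org/delay/1', 'http://httpbin.org/status/500']
    semaphore = asyncio.Semaphore(5)
    # total=10 is an assumed per-request timeout; tune it for your targets.
    timeout = aiohttp.ClientTimeout(total=10)
    async with aiohttp.ClientSession(timeout=timeout) as session:
        results = await asyncio.gather(
            *(fetch_safe(session, url, semaphore) for url in urls))
    for url, result in zip(urls, results):
        print(url, '->', 'failed' if result is None else f'{len(result)} chars')

asyncio.run(main())

Catching the errors inside each task keeps results aligned one-to-one with urls; an alternative is passing return_exceptions=True to asyncio.gather and inspecting the raw exceptions afterwards.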