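Python Basics: A Detailed Guide to the requests Module for Web Scraping
requests is a widely used third-party Python library for sending HTTP requests. It can reproduce much of what a browser does at the HTTP level, such as fetching web pages and uploading files.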
- Sending a GET request
import requests
response = requests.get('https://www.google.com/')
print(response.text)
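GET requests often carry query-string parameters. The sketch below shows one way to see how requests encodes a `params` dictionary into the URL. It builds the request with `requests.Request(...).prepare()`, so nothing is sent over the network. The `/search` path and the parameter names are assumptions made up for illustration.

```python
import requests

# Hypothetical search endpoint and parameters, chosen only for illustration.
request = requests.Request(
    'GET',
    'https://www.example.com/search',
    params={'q': 'python requests', 'page': 2},
)

# prepare() builds the request that would be sent, without any network traffic.
prepared = request.prepare()
print(prepared.method)
print(prepared.url)
```

Passing the same `params` argument to `requests.get()` encodes the URL in the same way before sending it.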
- Sending a POST request
import requests
response = requests.post('https://www.example.com/login', data={'username': 'user', 'password': 'pass'})
print(response.text)
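The `data` argument sends a form-encoded body, while the `json` argument sends a JSON body. The following sketch compares the two by preparing the requests locally instead of sending them. Whether a given login endpoint accepts JSON is an assumption that depends on the server.

```python
import json
import requests

URL = 'https://www.example.com/login'  # illustrative URL, nothing is sent to it
credentials = {'username': 'user', 'password': 'pass'}

# data= produces a form-encoded body, json= produces a JSON body.
form_request = requests.Request('POST', URL, data=credentials).prepare()
json_request = requests.Request('POST', URL, json=credentials).prepare()

print(form_request.headers['Content-Type'])
print(form_request.body)
print(json_request.headers['Content-Type'])
print(json.loads(json_request.body))
```

In both cases requests sets the `Content-Type` header automatically, so it rarely needs to be set by hand.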
- Sending a request with custom headers
import requests
headers = {
    'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:77.0) Gecko/20100101 Firefox/77.0',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
    'Accept-Language': 'en-US,en;q=0.5',
}
response = requests.get('https://www.example.com', headers=headers)
print(response.text)
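When many requests share the same headers, a `requests.Session` can hold them as defaults. The sketch below is one possible arrangement: the `my-crawler/0.1` User-Agent string is a hypothetical value, and the request is only prepared, not sent.

```python
import requests

session = requests.Session()
# Defaults applied to every request made through this session.
session.headers.update({
    'User-Agent': 'my-crawler/0.1',  # hypothetical identifier
    'Accept-Language': 'en-US,en;q=0.5',
})

# Per-request headers are merged with, and take precedence over, the session defaults.
request = requests.Request('GET', 'https://www.example.com', headers={'Accept': 'text/html'})
prepared = session.prepare_request(request)

print(prepared.headers['User-Agent'])
print(prepared.headers['Accept'])
```

Calling `session.get(url)` applies the same merged headers when the request is actually sent.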
- Using proxies
import requests
proxies = {
    'http': 'http://10.10.1.10:3128',
    'https': 'http://10.10.1.10:1080',
}
response = requests.get('https://www.example.com', proxies=proxies)
print(response.text)
- Handling cookies
import requests
response = requests.get('https://www.example.com')
print(response.cookies)
response = requests.get('https://www.example.com', cookies={'authenticated': 'true'})  # cookie values must be strings, not booleans
print(response.text)
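To reuse cookies across several requests, a `requests.Session` keeps them in its own cookie jar, including cookies set by server responses. The sketch below sets one cookie by hand and checks the `Cookie` header that would be sent. The cookie name and value are illustrative assumptions, and no network request is made.

```python
import requests

session = requests.Session()
# Hypothetical cookie; the value must be a string.
session.cookies.set('authenticated', 'true')

# Preparing the request shows the Cookie header without sending anything.
prepared = session.prepare_request(requests.Request('GET', 'https://www.example.com'))
print(prepared.headers.get('Cookie'))
```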
- Handling timeouts
import requests
response = requests.get('https://www.example.com', timeout=5)
print(response.text)
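If the server does not respond within the timeout, requests raises an exception rather than returning a response. The following is a minimal sketch of one way to handle that. The helper name `fetch_text` and the `(3.05, 5)` connect/read timeout pair are assumptions chosen for illustration.

```python
import requests


def fetch_text(url, timeout=(3.05, 5)):
    """Return the response body as text, or None if the request fails.

    timeout is a (connect, read) pair in seconds; a single number applies to both.
    """
    try:
        response = requests.get(url, timeout=timeout)
        response.raise_for_status()
    except requests.exceptions.Timeout:
        print(f'Timed out fetching {url}')
        return None
    except requests.exceptions.RequestException as exc:
        # Base class for the other errors requests raises (connection, HTTP status, ...).
        print(f'Request failed: {exc}')
        return None
    return response.text
```

A call such as `fetch_text('https://www.example.com')` then returns either the page text or `None`, so the caller can decide whether to retry.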
- Uploading a file
import requests
# Open the file in binary mode; the with block closes it once the upload finishes
with open('report.xls', 'rb') as f:
    response = requests.post('https://www.example.com/upload', files={'file': f})
print(response.text)
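The `files` argument also accepts a `(filename, file object, content type)` tuple when the filename or MIME type needs to be set explicitly. The sketch below uses an in-memory stand-in for `report.xls` and only prepares the request, so no file or network access is needed. The placeholder bytes are not a real spreadsheet.

```python
import io
import requests

# In-memory stand-in for the contents of report.xls (placeholder bytes only).
fake_file = io.BytesIO(b'placeholder spreadsheet bytes')

# (filename, file object, content type) gives explicit control over the upload part.
files = {'file': ('report.xls', fake_file, 'application/vnd.ms-excel')}
prepared = requests.Request('POST', 'https://www.example.com/upload', files=files).prepare()

# requests encodes the body as multipart/form-data and picks the boundary itself.
print(prepared.headers['Content-Type'])
print(b'filename="report.xls"' in prepared.body)
```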
- Working with the response
import requests
response = requests.get('https://www.example.com')
print(response.status_code) # status code
print(response.headers) # response headers
print(response.cookies) # cookies
print(response.text) # body decoded as text
print(response.content) # raw body as bytes
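Beyond printing these attributes, a common pattern is to check the status code and decode a JSON body. The sketch below shows one way to do that. The helper name `fetch_json` is an assumption, and it presumes the endpoint returns JSON.

```python
import requests


def fetch_json(url, timeout=5):
    """Fetch a URL and return its decoded JSON body."""
    response = requests.get(url, timeout=timeout)
    # Raises requests.exceptions.HTTPError for 4xx and 5xx status codes.
    response.raise_for_status()
    # Raises a ValueError (or a subclass) if the body is not valid JSON.
    return response.json()
```

For example, `fetch_json('https://www.example.com/api')` (a hypothetical URL) would return a dict or list on success and raise an exception otherwise.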
Those are the most commonly used features of the requests module; they cover most everyday HTTP request scenarios.