[Learning Python] Web Scraping: Fetching JD Product Comments and Drawing a Bar Chart
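import requests
from bs4 import BeautifulSoup
import matplotlib.pyplot as plt


# Fetch the comments of a JD product
def get_jd_comments(url):
    headers = {
        'User-Agent': 'Mozilla/5.0',
        'Referer': 'https://item.jd.com/100012043978.html'  # Replace with the URL of the product page you want to scrape
    }
    r = requests.get(url, headers=headers, timeout=10)
    r.raise_for_status()  # Fail early on HTTP errors
    soup = BeautifulSoup(r.text, 'lxml')  # The 'lxml' parser requires the lxml package
    # If the page loads its comments with JavaScript, the static HTML will not
    # contain them and this list will be empty
    comments = soup.find_all('p', class_='comment-content')
    return [comment.text.strip() for comment in comments]


# Analyze the comments and draw a bar chart
def analyze_and_draw_bar(comments):
    words = []
    for comment in comments:
        # split() separates on whitespace only; it does not segment Chinese text into words
        words.extend(comment.split())
    if not words:
        print('No comments found; nothing to plot.')
        return
    word_count = {}
    for word in words:
        word_count[word] = word_count.get(word, 0) + 1
    words = list(word_count.keys())
    counts = [word_count[word] for word in words]
    plt.bar(words, counts)
    plt.show()


# Main function
def main():
    # Replace with the URL of the comment page
    url = 'https://item.jd.com/100012043978.html'
    comments = get_jd_comments(url)
    analyze_and_draw_bar(comments)


if __name__ == '__main__':
    main()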
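This code first defines a function, get_jd_comments, that fetches the comments of a JD product. It uses the Requests library to send the HTTP request and BeautifulSoup to parse the page. It then defines analyze_and_draw_bar, which counts how many times each word occurs and draws a bar chart with Matplotlib. Finally, main calls both functions to fetch the comments and analyze them.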
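The counting step can also be written with collections.Counter from the standard library. The sketch below is one possible variant, not the code above: the names top_words and draw_top_words and the default n=10 are assumptions made for illustration. It keeps the same whitespace split, so it shares the same limitation (Chinese text without spaces is not segmented into words), but it plots only the most frequent words so that the x-axis stays readable.

```python
from collections import Counter


def top_words(comments, n=10):
    """Return the n most frequent whitespace-separated words as (word, count) pairs.

    Hypothetical helper for illustration; it is not part of the original script.
    """
    counter = Counter()
    for comment in comments:
        # Same tokenization as the original code: split on whitespace only
        counter.update(comment.split())
    return counter.most_common(n)


def draw_top_words(comments, n=10):
    """Draw a bar chart of the n most frequent words."""
    # Imported here so that top_words can be used without Matplotlib installed
    import matplotlib.pyplot as plt

    pairs = top_words(comments, n)
    if not pairs:
        print('No words to plot.')
        return
    labels, counts = zip(*pairs)
    plt.bar(labels, counts)
    plt.xticks(rotation=45, ha='right')  # Tilt the labels so they do not overlap
    plt.tight_layout()
    plt.show()
```

For example, draw_top_words(get_jd_comments(url), n=20) would chart the 20 most frequent words, assuming get_jd_comments returned a non-empty list.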