PHP Crawler in Practice: Scraping Data from Twitter

Author: System  Date: August 14, 2024  Categories: All, php  Word count: 818





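<?php
// Set up the request: target URL and a custom User-Agent header.
$url = 'https://twitter.com/i/profiles/show/1350719055196418561/timeline/tweets';
$options = [
    'http' => [
        'header' => "User-Agent: MyBot/1.0\r\n"
    ]
];
$context = stream_context_create($options);
$tweets = [];

// Send the request; file_get_contents() returns false on failure.
$response = file_get_contents($url, false, $context);
if ($response === false) {
    exit("Request failed: $url\n");
}

// Decode the JSON response into an associative array.
$data = json_decode($response, true);
if (!is_array($data)) {
    exit('Invalid JSON response: ' . json_last_error_msg() . "\n");
}

// Extract the tweet fields. The key names are the ones this example expects;
// missing keys yield null instead of raising a warning.
foreach ($data['globalObjects']['tweets'] ?? [] as $tweet) {
    $tweets[] = [
        'id' => $tweet['id'] ?? null,
        'text' => $tweet['fullText'] ?? null,
        'created_at' => $tweet['createdAt'] ?? null,
        'user_id' => $tweet['userId'] ?? null
    ];
}

// Print the result.
print_r($tweets);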

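This code uses file_get_contents() with a stream context to send the HTTP request and json_decode() to parse the response. It also sets a custom User-Agent header. Note that a value such as MyBot/1.0 identifies the client as a bot rather than disguising it as a real browser, so on its own it will not keep the server from recognizing the request as a crawler. Finally, it extracts the tweets of a specific user from the response and prints them as an array.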
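
The decoding and extraction step can be separated from the network request so that it can be tried on a fixed string. The following is a minimal sketch, not a definitive implementation: the function name extractTweets and the sample payload are hypothetical, the globalObjects / tweets layout and the camelCase field names are simply taken from the listing above rather than from a documented API, and JSON_THROW_ON_ERROR assumes PHP 7.3 or later.

```php
<?php
declare(strict_types=1);

/**
 * Decode a JSON response body and return a flat list of tweets.
 * Hypothetical helper: the key names mirror the article's listing.
 *
 * @throws JsonException if the body is not valid JSON
 */
function extractTweets(string $json): array
{
    // Throw on malformed JSON instead of silently returning null.
    $data = json_decode($json, true, 512, JSON_THROW_ON_ERROR);

    // Fall back to an empty list when the expected keys are missing.
    $items = $data['globalObjects']['tweets'] ?? [];
    if (!is_array($items)) {
        return [];
    }

    $tweets = [];
    foreach ($items as $tweet) {
        $tweets[] = [
            'id'         => $tweet['id'] ?? null,
            'text'       => $tweet['fullText'] ?? null,
            'created_at' => $tweet['createdAt'] ?? null,
            'user_id'    => $tweet['userId'] ?? null,
        ];
    }
    return $tweets;
}

// Made-up sample payload, for illustration only.
$sample = '{"globalObjects":{"tweets":{"1":{"id":1,"fullText":"hello","createdAt":"2024-08-14","userId":42}}}}';
print_r(extractTweets($sample));
```

Keeping the parsing in its own function means a failed request, invalid JSON, and an unexpected response layout can each be handled separately, and the extraction logic can be checked without contacting the server.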
