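Solved
Selenium (practice): extracting data from the Douyu website
Asked by user 在路上 152852. Asked: 2023-09-19 01:03:07. Views: 52
Best answer
When run, the code opens the Douyu website and prints the information for each room in turn: title, type, owner, viewer count, and cover image URL.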
import time

from selenium import webdriver
from selenium.webdriver.common.by import By


class Douyu(object):
    def __init__(self):
        self.url = 'https://www.douyu.com/directory/all'
        self.driver = webdriver.Edge()

    def parse_data(self):
        # Give the page a few seconds to render the room list
        time.sleep(3)
        room_list = self.driver.find_elements(By.XPATH, '//*[@id="listAll"]/section[2]/div[2]/ul/li/div')
        # print(len(room_list))
        # Iterate over the rooms and collect one dict per room
        data_list = []
        for room in room_list:
            temp = {}
            temp['title'] = room.find_element(By.XPATH, './a/div[2]/div[1]/h3').text
            temp['type'] = room.find_element(By.XPATH, './a/div[2]/div[1]/span').text
            temp['owner'] = room.find_element(By.XPATH, './a/div[2]/div[2]/h2').text
            temp['num'] = room.find_element(By.XPATH, './a/div[2]/div[2]/span').text
            temp['picture'] = room.find_element(By.XPATH, './a/div[1]/div[1]/picture/img').get_attribute('src')
            # print(temp)
            data_list.append(temp)
        return data_list

    def save_data(self, data_list):
        for data in data_list:
            print(data)

    def run(self):
        # url
        # driver
        # get
        self.driver.get(self.url)
        while True:
            # parse
            data_list = self.parse_data()
            # save
            self.save_data(data_list)
            # next page
            try:
                el_next = self.driver.find_element(By.XPATH, '//*[@class= "dy-Pagination-next"]')
                self.driver.execute_script('scrollTo(0,10000000)')
                el_next.click()
            except Exception:
                # Stop when the next-page button cannot be found or clicked
                break


if __name__ == '__main__':
    douyu = Douyu()
    douyu.run()
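The save_data step above only prints each dictionary. If you would rather keep the results, here is a minimal sketch that appends them to a CSV file using only the standard library. The function name save_rows_to_csv, the FIELDS list, and the file name rooms.csv are hypothetical and not part of the original answer. The field names simply mirror the keys built in parse_data.

```python
import csv
from pathlib import Path

# Column order mirrors the keys built in parse_data (assumed, not from the site)
FIELDS = ["title", "type", "owner", "num", "picture"]


def save_rows_to_csv(data_list, path="rooms.csv"):
    """Append room dictionaries to a CSV file, writing the header only once."""
    path = Path(path)
    # Write the header only when the file is new or empty
    write_header = not path.exists() or path.stat().st_size == 0
    with path.open("a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if write_header:
            writer.writeheader()
        writer.writerows(data_list)
    return len(data_list)
```

To try it, one option is to call save_rows_to_csv(data_list) inside save_data instead of (or in addition to) print. Because the file is opened in append mode, each page's rows are added to the same file.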