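Web Scraping in Practice 4: Using Selenium and bs4 Together
Summary: Use Selenium to simulate browser behavior and log in to the website, parse the resulting HTML with bs4 to extract the required text, then write that text to a CSV file and view it in Excel.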
00 Import the required libraries
import time
from selenium import webdriver
from selenium.webdriver.common.action_chains import ActionChains
from bs4 import BeautifulSoup
01 Open the website
dri=webdriver.Firefox(executable_path=r'D:\geckodriver.exe')
url='https://account.jishulink.com/login'
dri.get(url)
time.sleep(1)
02 Log in
dri.find_element_by_css_selector(".login-list > li:nth-child(1) > input:nth-child(2)").clear()
dri.find_element_by_css_selector(".login-list > li:nth-child(1) > input:nth-child(2)").send_keys('xxxx')
dri.find_element_by_css_selector(".login-list > li:nth-child(2) > input:nth-child(2)").clear()
dri.find_element_by_css_selector(".login-list > li:nth-child(2) > input:nth-child(2)").send_keys('xxxx')
time.sleep(0.1)
dri.find_element_by_css_selector(".login-btnStyle").click()
time.sleep(1)
03 Navigate to the target page
mouse=dri.find_element_by_css_selector('#personParent > img:nth-child(1)')
ActionChains(dri).move_to_element(mouse).perform()
time.sleep(1)
dri.find_element_by_css_selector('.top-p-link01 > a:nth-child(1)').click()
time.sleep(1)
dri.find_element_by_css_selector('.myContent-tab > li:nth-child(2) > a:nth-child(1)').click()
time.sleep(0.2)
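The fixed time.sleep() calls above work only if the page finishes loading within the chosen delay. As a hedged alternative, here is a minimal polling helper; the name wait_until and its parameters are hypothetical and not part of the original script. Selenium also ships its own WebDriverWait class for the same purpose.

```python
import time


def wait_until(condition, timeout=10.0, interval=0.2):
    """Poll `condition` until it returns a truthy value or `timeout` seconds pass.

    Returns the truthy value, or raises TimeoutError if the deadline is reached.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} s")
        time.sleep(interval)


# Stand-in condition that becomes true on the third poll,
# so the sketch runs without a browser.
attempts = {"count": 0}


def ready_on_third_poll():
    attempts["count"] += 1
    return attempts["count"] >= 3


wait_until(ready_on_third_poll, timeout=2.0, interval=0.01)
```

With the driver from this post, the condition could be something like lambda: dri.find_elements_by_css_selector('.myContent-tab'), which returns an empty (falsy) list until the element exists.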
04 Extract text from the current page
html=dri.page_source
soup=BeautifulSoup(html,'lxml')
tlist0=soup.find_all('h2',attrs={'ng-if':'post.subject'})
tt=[]
for tlist1 in tlist0:
    # Each matching <h2> holds one post; its title is the text of the <a> carrying ng-href
    tlist2=tlist1.find('a',attrs={'ng-href':True})
    tt.append(tlist2.string.strip())
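The extraction above raises an AttributeError if a heading has no matching link, or if the link contains nested tags (in which case .string is None). A more defensive version might look like the sketch below. The function name extract_titles and the sample markup are made up for illustration, and the built-in html.parser is used instead of lxml only so that the sketch has no extra dependency.

```python
from bs4 import BeautifulSoup


def extract_titles(html):
    """Return the link text of every <h2 ng-if="post.subject"> heading."""
    soup = BeautifulSoup(html, "html.parser")
    titles = []
    for heading in soup.find_all("h2", attrs={"ng-if": "post.subject"}):
        link = heading.find("a", attrs={"ng-href": True})
        if link is None:
            continue  # heading without a matching link: skip instead of crashing
        # get_text copes with nested tags, where .string would be None
        text = link.get_text(" ", strip=True)
        if text:
            titles.append(text)
    return titles


# Made-up sample markup, only to exercise the function.
sample_html = """
<h2 ng-if="post.subject"><a ng-href="/post/1"> First post </a></h2>
<h2 ng-if="post.subject"><a ng-href="/post/2">Second <b>post</b></a></h2>
<h2 ng-if="post.subject">No link here</h2>
<h2><a ng-href="/post/3">Heading without ng-if</a></h2>
"""
print(extract_titles(sample_html))
```

In the script itself, the call would be extract_titles(dri.page_source).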
05 Repeat for the remaining 16 pages
for i in range(16):
    # Click the pager link that advances to the next page, then wait for it to render
    dri.find_element_by_css_selector('.page > div:nth-child(1) > a:nth-child(12)').click()
    time.sleep(1)
    # Parse the new page exactly as in step 04 and append its titles to tt
    html=dri.page_source
    soup=BeautifulSoup(html,'lxml')
    tlist0=soup.find_all('h2',attrs={'ng-if':'post.subject'})
    for tlist1 in tlist0:
        tlist2=tlist1.find('a',attrs={'ng-href':True})
        tt.append(tlist2.string.strip())
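Steps 04 and 05 repeat the same parsing code. One possible way to remove the duplication is to pass the page-fetching, page-turning, and extraction steps into a single loop, as sketched below. All names here (collect_titles and the fake_* stand-ins) are hypothetical; the stand-ins replace the browser so that the sketch runs on its own.

```python
def collect_titles(get_html, go_to_next_page, extract, extra_pages):
    """Extract titles from the current page, then from `extra_pages` further pages."""
    titles = list(extract(get_html()))
    for _ in range(extra_pages):
        go_to_next_page()
        titles.extend(extract(get_html()))
    return titles


# Stand-in "browser": a list of fake pages and a cursor, so no Selenium is needed.
fake_pages = ["a,b", "c", "d,e"]
cursor = {"index": 0}


def fake_get_html():
    return fake_pages[cursor["index"]]


def fake_next_page():
    cursor["index"] += 1


def fake_extract(html):
    return html.split(",")


all_titles = collect_titles(fake_get_html, fake_next_page, fake_extract, extra_pages=2)
print(all_titles)
```

With the real driver, get_html could be lambda: dri.page_source, go_to_next_page would click the pager link and wait, and extra_pages would be 16.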
06 The output is as follows

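07 Write to a CSV file
import csv
with open('jsl.csv','w',newline='',encoding='GB2312') as csvobj:
    csvfile=csv.writer(csvobj)
    csvfile.writerow(tt)  # all titles are written as one row
# The with block closes the file on exit, so no explicit close() is needed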
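The code above puts every title into a single row, and the GB2312 encoding raises UnicodeEncodeError for any character outside that character set. As a hedged variant, the sketch below writes one title per row in UTF-8 with a byte order mark, which Excel generally uses to detect the encoding. The function name write_titles, the "title" header, and the demo path are assumptions for illustration.

```python
import csv
import os
import tempfile


def write_titles(path, titles):
    """Write a header row, then one title per row, as BOM-prefixed UTF-8."""
    with open(path, "w", newline="", encoding="utf-8-sig") as csv_file:
        writer = csv.writer(csv_file)
        writer.writerow(["title"])
        writer.writerows([title] for title in titles)


# Demo with made-up titles, written to a temporary directory.
demo_path = os.path.join(tempfile.mkdtemp(), "jsl_rows.csv")
write_titles(demo_path, ["First post", "第二篇, with a comma"])
```

In the script, the call would be write_titles('jsl.csv', tt).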
08 View the CSV file in Excel
