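Web Scraping in Practice 5: Handling Multiple Pages in Selenium
Abstract: While scraping, clicking a link on the current page may open another page, so that several pages (browser windows or tabs) are open at the same time. This makes locating page elements harder, because the driver keeps working on the original page. The approach in this case is to first switch to the page you need, and then locate the elements on that page.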
00 Import the required libraries
import time
from selenium import webdriver
01 Open the website
dri=webdriver.Firefox(executable_path=r'D:\geckodriver.exe')
url='https://sh.58.com/'
dri.get(url)
time.sleep(1)
02 Log in
dri.find_element_by_css_selector("#commonTopbar_login > a:nth-child(1)").click()
dri.find_element_by_css_selector(".change_qrcode > span:nth-child(1)").click()
time.sleep(0.1)
dri.find_element_by_css_selector("#username").clear()
dri.find_element_by_css_selector("#username").send_keys('xxxx')
dri.find_element_by_css_selector("#password").clear()
dri.find_element_by_css_selector("#password").send_keys('xxxx')
time.sleep(0.1)
dri.find_element_by_css_selector("#btn_account").click()
time.sleep(1)
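Each login field above is looked up twice, once for clear() and once for send_keys(). A minimal sketch of a helper that looks the field up only once could look like the following. fill_field is a hypothetical name, not part of Selenium, and the sketch assumes the driver exposes find_element_by_css_selector, as the code in this article does.

```python
def fill_field(driver, css_selector, text):
    """Find an input once, clear it, and type `text` into it.

    Hypothetical helper (not part of Selenium). It assumes `driver`
    offers find_element_by_css_selector, as used in this article.
    """
    field = driver.find_element_by_css_selector(css_selector)
    field.clear()          # remove any pre-filled value
    field.send_keys(text)  # type the new value
    return field
```

With it, the username step would shrink to fill_field(dri, "#username", 'xxxx'), and the password step likewise.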
03 Go to the target page (note that two pages are open from this point on)
handle1=dri.current_window_handle  # record the handle of the first page before clicking
dri.find_element_by_css_selector("div.noBorder:nth-child(1) > "
    "div:nth-child(2) > em:nth-child(26) > a:nth-child(1)").click()
time.sleep(1)
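The fixed one-second sleep above assumes the second page has opened by then. As a hedged alternative, the sketch below polls the driver's window_handles until a handle appears that was not known before the click. wait_for_new_window is a hypothetical helper, and its timeout and poll defaults are illustrative assumptions. Selenium also ships its own WebDriverWait utility for waits of this kind.

```python
import time


def wait_for_new_window(driver, known_handles, timeout=10.0, poll=0.2):
    """Poll driver.window_handles until a handle outside known_handles appears.

    Returns the new handle, or raises TimeoutError once `timeout` seconds
    have passed. Hypothetical helper; the default values are illustrative.
    """
    known = set(known_handles)
    deadline = time.monotonic() + timeout
    while True:
        for handle in driver.window_handles:
            if handle not in known:
                return handle
        if time.monotonic() >= deadline:
            raise TimeoutError("no new window appeared within %.1f s" % timeout)
        time.sleep(poll)  # brief pause before checking again
```

One way to use it is to store dri.window_handles before the click and pass that list in afterwards; the returned handle can then be given to dri.switch_to.window().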
04 Locate elements on the second page
handles=dri.window_handles
for handle in handles:
    if handle!=handle1:
        dri.switch_to.window(handle)  # switch to the second page first
dri.find_element_by_css_selector('#local > a:nth-child(13)').click()  # then locate an element on that page
dri.find_element_by_css_selector('.subarea > a:nth-child(14)').click()  # locate another element on that page
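The "switch to whichever page is not the first one" step can be wrapped in a small function so that it is reusable. The sketch below is one possible shape. switch_to_other_window is a hypothetical name, and it relies only on the driver's window_handles attribute and switch_to.window() call.

```python
def switch_to_other_window(driver, original_handle):
    """Switch to the first window whose handle differs from original_handle.

    Returns the handle that was switched to, or None if no other window
    is open. Hypothetical helper, not part of Selenium.
    """
    for handle in driver.window_handles:
        if handle != original_handle:
            driver.switch_to.window(handle)
            return handle
    return None
```

Calling switch_to_other_window(dri, handle1) would move the driver to the second page, and dri.switch_to.window(handle1) returns it to the first page once the work there is done.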
05 Result of the run
