Web Scraping in Practice 5: Handling Multiple Pages in Selenium

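Abstract: While scraping, clicking a link on the current page may open another page, so that several pages are open at once. This makes locating page elements harder. The approach in this case is to first switch to the page you need and then locate the elements on that page.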

00 Import the required libraries

import time
from selenium import webdriver

01 Open the website

dri=webdriver.Firefox(executable_path=r'D:\geckodriver.exe')
url='https://sh.58.com/'
dri.get(url)
time.sleep(1)

02 Log in

dri.find_element_by_css_selector("#commonTopbar_login > a:nth-child(1)").click()
dri.find_element_by_css_selector(".change_qrcode > span:nth-child(1)").click()
time.sleep(0.1)
dri.find_element_by_css_selector("#username").clear()
dri.find_element_by_css_selector("#username").send_keys('xxxx')
dri.find_element_by_css_selector("#password").clear()
dri.find_element_by_css_selector("#password").send_keys('xxxx')
time.sleep(0.1)
dri.find_element_by_css_selector("#btn_account").click()
time.sleep(1)
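
The clear-then-type pattern used for the username and password fields can be factored into a small helper. The following is only a sketch: `fill_field` is a hypothetical name that does not appear in the original article, and it uses the generic `find_element` call with the locator string "css selector" (the value behind `By.CSS_SELECTOR`) instead of `find_element_by_css_selector`.

```python
def fill_field(driver, css_selector, text):
    """Clear the input matched by css_selector, then type text into it.

    fill_field is a hypothetical helper, not part of Selenium itself.
    """
    # "css selector" is the locator-strategy string behind By.CSS_SELECTOR
    field = driver.find_element("css selector", css_selector)
    field.clear()
    field.send_keys(text)
    return field
```

With this helper, the login step could be written as `fill_field(dri, "#username", 'xxxx')` followed by `fill_field(dri, "#password", 'xxxx')`.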

03 Go to the target page (note that two pages are open from this point on)

handle1=dri.current_window_handle  # remember the handle of the first page
dri.find_element_by_css_selector("div.noBorder:nth-child(1) > "
    "div:nth-child(2) > em:nth-child(26) > a:nth-child(1)").click()
time.sleep(1)
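
A fixed `time.sleep(1)` assumes the second page opens within one second. As a hedged alternative, one could poll `window_handles` until a handle other than the known ones appears. The helper name `wait_for_new_window` and its `timeout` and `poll` parameters are assumptions made for this sketch, not part of the original article or of Selenium.

```python
import time


def wait_for_new_window(driver, known_handles, timeout=10.0, poll=0.2):
    """Poll driver.window_handles until a handle outside known_handles appears.

    Returns the new handle, or raises TimeoutError if none shows up in time.
    wait_for_new_window is a hypothetical helper, not a Selenium API.
    """
    known = set(known_handles)
    deadline = time.monotonic() + timeout
    while True:
        # Check the currently open windows for one we have not seen before
        for handle in driver.window_handles:
            if handle not in known:
                return handle
        if time.monotonic() >= deadline:
            raise TimeoutError("no new window appeared within %.1f s" % timeout)
        time.sleep(poll)
```

For example, `handle2 = wait_for_new_window(dri, [handle1])` would return the handle of the second page as soon as it exists, which can then be passed to `dri.switch_to.window`.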

04 Locate elements on the second page

handles=dri.window_handles
for handle in handles:
    if handle!=handle1:
        dri.switch_to.window(handle)  # first switch to the second page (switch_to_window is deprecated)
        dri.find_element_by_css_selector('#local > a:nth-child(13)').click()  # then locate an element on that page
        dri.find_element_by_css_selector('.subarea > a:nth-child(14)').click()  # locate another element on that page
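
After working on the second page, the driver stays focused on it. If later steps need the first page again, one possible pattern is a context manager that switches to a given window and switches back on exit. This is a sketch under that assumption: `in_window` is a hypothetical name, and it relies only on `current_window_handle` and `switch_to.window`.

```python
from contextlib import contextmanager


@contextmanager
def in_window(driver, handle):
    """Temporarily switch the driver to the window identified by handle.

    in_window is a hypothetical helper, not a Selenium API.
    """
    original = driver.current_window_handle
    driver.switch_to.window(handle)
    try:
        yield driver
    finally:
        # Return focus to the window that was active before the block
        driver.switch_to.window(original)
```

Inside `with in_window(dri, handle):` element lookups target the second page. Once the block ends, lookups target the first page again.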


05 The result of the run is shown below


[Figure 1 of "Web Scraping in Practice 5: Handling Multiple Pages in Selenium"]
