Python. Wrapping Up Selenium (Web Scraping) [4]

Preface

In this article, I will briefly cover the Selenium capabilities that were not described in the previous parts, namely:

  • Keyboard input
  • Switching to the active window (example: the Steamexchange confirmation window)
  • Running JavaScript
  • Scrolling the page

Keyboard input

Create a new Python script and type the following code into it.

from selenium import webdriver
from selenium.webdriver.common.by import By

def get_webdriver():
    option = webdriver.FirefoxOptions()
    driver = webdriver.Firefox(options=option)
    return driver

driver = get_webdriver()
driver.get('https://www.google.com/')
# find_element_by_css_selector was removed in Selenium 4.3; use find_element with By instead
element = driver.find_element(By.CSS_SELECTOR, 'input')
element.send_keys('tutorial')
element.submit()

We are only interested in the last 2 lines of code, because the rest was covered in the previous articles.

element.send_keys('tutorial')

Here, using the send_keys function, we type the text "tutorial" into the element with the tag "input" (the input field).

element.submit()

Then we submit the form that the element belongs to, which has the same effect as pressing Enter.

So we've created a bot that googles the word "tutorial".
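As a small extension of the send_keys idea, here is a rough sketch of typing text one character at a time and then pressing Enter through the Keys class. The names type_slowly and demo are my own, not part of Selenium, and the name="q" selector is only an assumption about Google's markup. demo() is never called automatically because it needs Firefox and geckodriver.

```python
import time


def type_slowly(element, text, delay=0.1):
    """Send `text` to `element` one character at a time, pausing between keys."""
    for char in text:
        element.send_keys(char)
        time.sleep(delay)


def demo():
    """Hypothetical usage; requires Selenium, Firefox and geckodriver."""
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.common.keys import Keys

    driver = webdriver.Firefox()
    driver.get('https://www.google.com/')
    # Assumption: the search box has name="q"; adjust the selector if it does not.
    element = driver.find_element(By.CSS_SELECTOR, '[name="q"]')
    type_slowly(element, 'tutorial')
    # Special keys (Enter, Tab, arrows, ...) are constants of the Keys class.
    element.send_keys(Keys.ENTER)
```

Sending Keys.ENTER is an alternative to element.submit() that also works for fields that are not inside a form.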

Switch to the active window

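To switch to the active window, use driver.switch_to.window, which replaces the deprecated switch_to_window function. It takes a window handle, and window_handles[-1] is the last handle in the list, which is usually the most recently opened window.

driver.switch_to.window(driver.window_handles[-1])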

I am not sure how often you will need this, but it is worth knowing that the option exists.
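In practice you usually want to come back to the original window afterwards. Here is a minimal sketch of that, assuming Selenium 4 (where the call is spelled driver.switch_to.window). The helper name switch_to_newest_window is my own, and it relies on the assumption that the last handle in window_handles is the newest window.

```python
def switch_to_newest_window(driver):
    """Switch to the last window in window_handles; return the previous handle."""
    previous = driver.current_window_handle
    # Assumption: the last handle belongs to the most recently opened window.
    driver.switch_to.window(driver.window_handles[-1])
    return previous


# Hypothetical usage with a real driver:
#   original = switch_to_newest_window(driver)
#   ...work inside the new window...
#   driver.close()                      # close the new window
#   driver.switch_to.window(original)   # go back to where we started
```

Saving the previous handle before switching makes it easy to return to the main window once the confirmation window has been handled.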

Execute JavaScript

To execute scripts, use the execute_script() function, which takes JS code as an argument. Here's an example of how it works.

from selenium import webdriver

def get_webdriver():
    option = webdriver.FirefoxOptions()
    driver = webdriver.Firefox(options=option)
    return driver

driver = get_webdriver()
driver.get('https://under-prog.ru/')
driver.execute_script("alert('script working in selenium')")

In this example, we ran a very simple script on the site: it shows an alert box with a message.
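execute_script can also send data in both directions: a JS "return" statement hands a value back to Python, and extra positional arguments show up in JS as the arguments array. The sketch below illustrates both. The helper names get_page_title and add_in_browser are made up for this example.

```python
def get_page_title(driver):
    """Return document.title as evaluated inside the browser."""
    # Without "return" in the JS code, execute_script gives back None.
    return driver.execute_script("return document.title;")


def add_in_browser(driver, a, b):
    """Pass two Python values into JS and return their sum computed there."""
    # Extra positional arguments are available in JS as arguments[0], arguments[1], ...
    return driver.execute_script("return arguments[0] + arguments[1];", a, b)
```

With a real driver, get_page_title(driver) should give the same string as driver.title, which makes it a handy sanity check that scripts are running.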

Scroll through the page

Page scrolling is done by executing JS code, the same way websites themselves do it.

The window.scrollTo function is used for scrolling:

window.scrollTo(x,y)

It takes the x and y coordinates (in pixels) of the position to scroll to.

If you need to scroll to the bottom, use this:

driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")

document.body.scrollHeight returns the full height of the page content, including the part that is not visible on screen.

You can also use smooth scrolling, so as not to arouse the site's suspicion.

driver.execute_script("window.scrollTo({ top: document.body.scrollHeight, left: 0, behavior: 'smooth'});")
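On pages that load more content as you scroll, a single scrollTo call is often not enough. Here is a rough sketch of repeating the scroll until document.body.scrollHeight stops growing. The function name scroll_to_bottom and the pause and max_rounds parameters are my own choices, and the default values are guesses you will probably need to tune for a real site.

```python
import time


def scroll_to_bottom(driver, pause=1.0, max_rounds=20):
    """Scroll down until the page height stops growing; return rounds used."""
    last_height = driver.execute_script("return document.body.scrollHeight;")
    for round_number in range(1, max_rounds + 1):
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(pause)  # give lazily loaded content time to appear
        new_height = driver.execute_script("return document.body.scrollHeight;")
        if new_height == last_height:
            # Nothing new was loaded, so we have reached the real bottom.
            return round_number
        last_height = new_height
    return max_rounds  # gave up: the page kept growing
```

The max_rounds limit is there so the loop cannot run forever on a page with an endless feed.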

Conclusion

So we're done with Selenium. In the next article we'll move on to what is, in my opinion, the best web scraping method. I'm talking about the requests + bs4 bundle, which, unlike Selenium:

  • Works much faster.
  • Does not require a web driver (geckodriver.exe).
  • Does not require a pre-installed browser (Firefox).
  • Consumes much less RAM.

However:

  • This method does not work on all sites (for example, sites protected by Cloudflare or relying on complex JS).
  • Writing a script takes a little longer.
