Unlike most articles on the web, this one is intended to give you a basic understanding of Webdriver: its origins, a high-level introduction to how it works behind the scenes, and how it is used in automation testing.

Webdriver is a framework that drives the browser directly, with no manual effort. Through language bindings, it acts on elements of the page (the DOM), such as buttons or hyperlinks, by sending commands using the JSON wire protocol over HTTP.

Webdriver Beginnings → The story starts around 2004 at ThoughtWorks, a global software consultancy. A developer named Jason Huggins was building a tool called "JavaScriptTestRunner", which later became Selenium Core, for testing an internal application. He and his team wrote pieces of code that acted on the browser. To make this clearer, here is an example. Open the Chrome browser and go to any website. Right-click any button on the page, click Inspect in the menu, and then open the Console tab. In the console, type document.getElementById("buttonID").click(); replacing buttonID with the id of the button you selected. You will see a click being performed on the web page. As automation engineers, we do not have the time to write every single command this way. To resolve this, the developers at ThoughtWorks wrote many such commands and wrapped them into a framework, which grew into what we now know as Selenium Webdriver.

Architecture

The image is a simplified sketch, drawn to aid understanding.

Language Bindings → What are language bindings? Selenium provides client libraries, called language bindings, that let you write automation scripts in the programming or scripting language of your choice, be it Java, Python, Ruby, C#, or JavaScript.

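If you are using Java + Selenium, a Webdriver session is created with syntax such as: WebDriver driver = new ChromeDriver();

But if you are using Selenium + Python, the syntax is different, for example: driver = webdriver.Chrome() (after from selenium import webdriver).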

The language bindings are what enable us to write test automation scripts regardless of the chosen language. Simple :-)
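To make the idea a little more concrete, here is a minimal sketch of what a binding does conceptually: it turns a method call in your language into a language-neutral command. TinyBinding, its methods, and the command names are hypothetical and much simpler than the real Selenium libraries; nothing is sent to a browser here, the commands are only recorded.

```python
class TinyBinding:
    """Hypothetical, much-simplified stand-in for a Selenium language binding."""

    def __init__(self):
        # Commands are only recorded here; a real binding would send them on.
        self.sent = []

    def _execute(self, name, params):
        # Every call ends up as the same language-neutral structure.
        command = {"name": name, "params": params}
        self.sent.append(command)
        return command

    def get(self, url):
        # Python-style call: driver.get("https://example.com")
        return self._execute("get", {"url": url})

    def find_element_by_id(self, element_id):
        return self._execute("findElement", {"using": "id", "value": element_id})


driver = TinyBinding()
driver.get("https://example.com")
driver.find_element_by_id("buttonID")
for command in driver.sent:
    print(command)
```

A Java or Ruby binding would expose different method names and syntax, but the commands it produces would look the same, which is why the rest of the stack does not care which language you chose.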

JSON wire protocol → Although Selenium is moving towards the W3C protocol, understanding the JSON wire protocol is still worthwhile. When we send a command such as driver.get("url.com") or any other Selenium command, the client (your test script, whether you run it from Eclipse, IntelliJ, or any other IDE) generates a REST API call, which is an HTTP GET or POST request. Put simply, the JSON wire protocol is the list of APIs that defines how each command is converted into a request sent to the driver. Think of it as a mediator which helps in sending the request and receiving the response. The reference link below will show you the documentation.

Reference:

https://github.com/SeleniumHQ/selenium/wiki/JsonWireProtocol
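As a rough illustration of the mapping described above, the sketch below builds the HTTP verb, path, and JSON body for a handful of commands. The endpoint paths follow the JSON wire protocol documentation linked above; the function name build_request, the command names used as dictionary keys, and the session id abc123 are assumptions made for this example, and no request is actually sent.

```python
import json

# Small subset of the JSON wire protocol: command -> (HTTP verb, path template).
ENDPOINTS = {
    "newSession": ("POST", "/session"),
    "get": ("POST", "/session/{session_id}/url"),
    "getCurrentUrl": ("GET", "/session/{session_id}/url"),
    "findElement": ("POST", "/session/{session_id}/element"),
    "clickElement": ("POST", "/session/{session_id}/element/{element_id}/click"),
}


def build_request(command, session_id=None, element_id=None, params=None):
    """Return (verb, path, body) for a command; body is None for GET requests."""
    verb, template = ENDPOINTS[command]
    path = template.format(session_id=session_id, element_id=element_id)
    body = json.dumps(params or {}) if verb == "POST" else None
    return verb, path, body


# What driver.get("https://example.com") might look like on the wire.
print(build_request("get", session_id="abc123",
                    params={"url": "https://example.com"}))
# Clicking a previously located element (element id "0" is made up).
print(build_request("clickElement", session_id="abc123", element_id="0"))
```

Keeping the endpoint table as plain data makes it easy to see that the protocol is essentially a lookup from command to URL plus a JSON payload.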

Driver → So far, we’ve discussed that a request is sent over the JSON wire protocol to the driver. But what is the driver? The driver is the component that tells the browser to act on a given command. It sends a request to the browser and returns the response it receives. For example, a check is performed to see whether a button is present on the page. The result of that check comes from the browser to the driver, which passes it back to the client. Requests and responses flow continuously between the driver and the browser.
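The button check described above can be sketched as a toy driver that answers a "find element" request. This is only a simulation under stated assumptions: fake_driver, the set of ids standing in for the page, the session id, and the element reference element-0 are all hypothetical. The response envelope (sessionId, status, value) and the status codes 0 (success) and 7 (no such element) come from the JSON wire protocol.

```python
import json

# JSON wire protocol status codes used below.
SUCCESS = 0
NO_SUCH_ELEMENT = 7


def fake_driver(session_id, request_body, ids_on_page):
    """Answer a 'find element' request the way a driver might (simplified)."""
    locator = json.loads(request_body)
    # Stand-in for asking the browser whether the element exists.
    found = locator["using"] == "id" and locator["value"] in ids_on_page
    if found:
        # A real driver returns an opaque element reference, not the DOM id.
        status, value = SUCCESS, {"ELEMENT": "element-0"}
    else:
        status, value = NO_SUCH_ELEMENT, {"message": "no such element"}
    return json.dumps({"sessionId": session_id, "status": status, "value": value})


page = {"buttonID", "loginForm"}  # ids assumed to exist on the page
print(fake_driver("abc123", '{"using": "id", "value": "buttonID"}', page))
print(fake_driver("abc123", '{"using": "id", "value": "missing"}', page))
```

The client library reads the status field of each response and, for a non-zero status, typically raises an exception in your test.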

There are many tools on the market built as wrappers on top of Webdriver. Protractor is one of them: think of it as a wrapper around Selenium-webdriver, built for Angular applications, though it can also be used for non-Angular applications with a few changes to the config file. With Protractor, you no longer have to add waits or sleeps in your tests; it executes the next command when the element is ready to be acted upon.
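To show the general idea of waiting for readiness instead of sleeping for a fixed time, here is a small polling helper. It is not how Protractor works internally (Protractor synchronises with Angular itself); wait_until, element_is_ready, and the timings are hypothetical names and values chosen for this sketch.

```python
import time


def wait_until(condition, timeout=5.0, poll_interval=0.1):
    """Poll condition() until it returns True or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)


# Hypothetical stand-in for "is the element ready?": true on the third check.
checks = {"count": 0}


def element_is_ready():
    checks["count"] += 1
    return checks["count"] >= 3


print(wait_until(element_is_ready, timeout=2.0, poll_interval=0.01))
```

The test proceeds as soon as the condition holds, so it is neither slowed down by an overly long sleep nor broken by one that is too short.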

A Human, Quality Engineer, Developer