Browser Automation Using Puppeteer in Node.js

In simple terms, Puppeteer is a Node library that provides a high-level API to control a headless Chrome or Chromium instance.

According to its Git repository:

“Puppeteer is a Node library that provides a high-level API to control Chrome or Chromium over the DevTools Protocol. Puppeteer runs headless by default but can be configured to run full (non-headless) Chrome or Chromium.”

You could use Puppeteer for a variety of purposes, ranging from simple browser automation and testing to scraping websites or generating PDFs and screenshots of them. The range of possible uses is huge.

So, let’s dive in.

Install Puppeteer

npm i puppeteer

Using Puppeteer

Let’s start by simply opening the browser, navigating to a URL, taking a screenshot, and closing the browser.

const puppeteer = require('puppeteer');

(async ()=>{
    const browser = await puppeteer.launch({
        headless : false
    });
    const page = await browser.newPage();
    await page.goto("http://13.235.19.19")
    await page.screenshot({path:"codecalls-100.jpeg", quality:100})
    await browser.close()
})();

The code is simple, but let’s quickly walk through it. We declare an IIFE (Immediately Invoked Function Expression), which runs as soon as it is defined. Inside it, we use Puppeteer to launch a browser instance. Setting headless to false opens a visible Chrome window. Its default is true, so if you don’t set a value you get a truly headless Chrome instance.

The newPage() method opens a new page in the browser, while the goto() method navigates the page to whatever URL is passed to it.
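One thing the walkthrough leaves open is what happens when goto() throws: browser.close() is never reached and the Chrome process keeps running. Below is a minimal sketch of one way to guard against that. The helper name withPage is hypothetical (it is not part of Puppeteer), and it receives the puppeteer module as an argument:

```javascript
// Hypothetical helper (not part of Puppeteer): launches a browser, opens a
// page, runs `task` with it, and always closes the browser afterwards,
// even when `task` throws.
async function withPage(puppeteer, task, launchOptions = {}) {
    const browser = await puppeteer.launch(launchOptions);
    try {
        const page = await browser.newPage();
        return await task(page);
    } finally {
        // Runs on success and on failure alike.
        await browser.close();
    }
}
```

It could be called as withPage(require('puppeteer'), async (page) => { await page.goto(url); return page.title(); }), where url is whatever address you want to open.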

The screenshot() method, as the name suggests, takes a screenshot of the webpage. You can pass it a variety of options, such as quality, whose value ranges from 0 to 100 and which is not applicable if the image type is png.

[Image: screenshot taken with quality set to 0]

[Image: screenshot taken with quality set to 100]

For the complete list of options that can be passed to the screenshot() method, see the Puppeteer API docs.
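Because quality only applies to JPEG output, it can help to build the options object from the file name. Here is a rough sketch, assuming a hypothetical helper called buildScreenshotOptions that handles only .png, .jpg, and .jpeg paths:

```javascript
const path = require('path');

// Hypothetical helper: builds an options object for page.screenshot().
// `quality` (0-100) is kept only for JPEG files, since it does not apply to PNG.
function buildScreenshotOptions(filePath, quality) {
    const ext = path.extname(filePath).toLowerCase();
    if (ext === '.png') {
        return { path: filePath, type: 'png' };
    }
    if (ext === '.jpg' || ext === '.jpeg') {
        const options = { path: filePath, type: 'jpeg' };
        if (quality !== undefined) {
            // Clamp to the valid 0-100 range.
            options.quality = Math.min(100, Math.max(0, Math.round(quality)));
        }
        return options;
    }
    throw new Error(`Unsupported screenshot extension: ${ext}`);
}
```

The result can be passed straight to the call from the first example: await page.screenshot(buildScreenshotOptions('codecalls-100.jpeg', 100)).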

Creating a PDF of the webpage

Similarly, we can save the webpage as a PDF (make sure that headless is left at its default value of true):

    await page.pdf({path: 'page.pdf'});

Additionally, you can pass options such as format and landscape. For the complete list of options that can be passed, see the Puppeteer API docs.
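As an illustration of those options, here is a small sketch that saves a landscape A4 PDF. The helper name savePdf and the chosen values are assumptions for the example; page is an already opened Puppeteer page:

```javascript
// Sketch: saves the current page as a landscape A4 PDF.
// `page` is a Puppeteer Page; the option values are illustrative.
async function savePdf(page, filePath) {
    return page.pdf({
        path: filePath,        // where the file is written
        format: 'A4',          // paper size
        landscape: true,       // landscape instead of portrait
        printBackground: true  // include CSS background graphics
    });
}
```

Called as await savePdf(page, 'page.pdf'), it writes the same file as the one-liner above, only with a fixed paper size and orientation.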

Emulate a device

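To emulate a device such as an iPhone, pass a device descriptor to emulate(). The snippet below assumes a Puppeteer version that exposes the built-in descriptors as puppeteer.devices; more recent releases export the same list as KnownDevices.

    const iPhone = puppeteer.devices['iPhone 6']; // built-in descriptor: user agent plus viewport
    await page.emulate(iPhone);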
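As far as emulate() is concerned, a device descriptor is just an object with a userAgent string and a viewport. Below is a minimal sketch with a hand-written descriptor. The names customPhone and openAsPhone and the specific values are illustrative assumptions, not entries from Puppeteer's built-in list:

```javascript
// Illustrative descriptor in the shape page.emulate() accepts.
// The values approximate a small phone and are assumptions for this sketch.
const customPhone = {
    userAgent: 'Mozilla/5.0 (iPhone; CPU iPhone OS 11_0 like Mac OS X) ' +
        'AppleWebKit/604.1.38 (KHTML, like Gecko) Version/11.0 Mobile/15A372 Safari/604.1',
    viewport: {
        width: 375,
        height: 667,
        deviceScaleFactor: 2,
        isMobile: true,
        hasTouch: true,
        isLandscape: false
    }
};

// Hypothetical helper: emulate first, then navigate, so the site sees the
// mobile user agent and viewport from the very first request.
async function openAsPhone(page, url) {
    await page.emulate(customPhone);
    await page.goto(url);
}
```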

Change the size of the viewport

To change the browser’s default display size, add a defaultViewport object to the options passed to the launch method:

(async ()=>{
    const browser = await puppeteer.launch({
        headless : false,
        defaultViewport:{
            width:1366,
            height: 768
        }
    });
    const page = await browser.newPage();
    await page.goto("http://13.235.19.19")
    await browser.close()
})();
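defaultViewport applies to every page the browser opens. If only one page needs a different size, page.setViewport() can be called on that page instead. A small sketch, with useDesktopViewport as a hypothetical helper name:

```javascript
// Sketch: resizes a single page after launch instead of using defaultViewport.
// `page` is a Puppeteer Page; the default size matches the example above.
async function useDesktopViewport(page, width = 1366, height = 768) {
    await page.setViewport({ width, height });
    // viewport() returns the viewport currently set on the page.
    return page.viewport();
}
```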

Open Devtools in Puppeteer Chrome Instance

To open Chrome DevTools automatically, set devtools to true in the options object passed to the launch method:

(async ()=>{
    const browser = await puppeteer.launch({
        headless : false,
        defaultViewport:{
            width:1366,
            height: 768
        },
        devtools: true
    });
    const page = await browser.newPage();
    await page.goto("http://13.235.19.19")
    await browser.close()
})();

Getting Page Title, URL, Source Code, and Cookies

To get the page title, URL, source code, and cookies, call the following methods after the goto method:

    let title = await page.title()
    let url = page.url() // url() is synchronous, so no await is needed
    let pageSource = await page.content()
    let cookies = await page.cookies()
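To show what these calls return, here is a rough sketch that collects them into one object. The helper name getPageInfo and the shape of the returned summary are assumptions made for this example:

```javascript
// Hypothetical helper: gathers basic information about the loaded page.
// `page` is a Puppeteer Page that has already navigated somewhere.
async function getPageInfo(page) {
    const title = await page.title();      // text of the <title> element
    const html = await page.content();     // full HTML of the page as a string
    const cookies = await page.cookies();  // array of cookie objects
    return {
        title,
        url: page.url(),                   // synchronous, returns a string
        htmlLength: html.length,
        cookieNames: cookies.map((cookie) => cookie.name)
    };
}
```

Logging the result with console.log(await getPageInfo(page)) right after goto() is a quick way to check that the page loaded as expected.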

Emulating a Keyboard Type Event

Next, we are going to run a Google search using Puppeteer. We will open google.com, find the selector of the search bar’s input field, type our search query, and press Enter, all through Puppeteer.

    const page = await browser.newPage();
    await page.goto("https://www.google.com")
    // selector holds the CSS selector of the search input, found with DevTools
    await page.type(selector, "Hello World",{delay: 500})
    await page.keyboard.press('Enter');

Find the selector of the input field using DevTools. The second argument to the type function is the string we want to type into the input field. The delay is the time in milliseconds to wait between key presses. Use it to make the typing look more human-like.

[Image: end result with delay set to 500]
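Pressing Enter triggers a navigation to the results page, so a script that reads the results right away may run before they have loaded. One common pattern is to start waiting for the navigation before pressing the key. Here is a sketch, assuming a hypothetical helper named typeAndSubmit and a selector supplied by the caller:

```javascript
// Hypothetical helper: types `text` into the field matched by `selector`,
// submits with Enter, and waits for the resulting navigation to finish.
async function typeAndSubmit(page, selector, text, delay = 100) {
    // Make sure the field exists before typing into it.
    await page.waitForSelector(selector);
    await page.type(selector, text, { delay });
    // Start waiting before pressing Enter so the navigation is not missed.
    await Promise.all([
        page.waitForNavigation(),
        page.keyboard.press('Enter')
    ]);
    return page.url(); // URL of the page we ended up on
}
```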

There is much more you can do with Puppeteer. Read the API docs for the list of all the functions that Puppeteer provides.
