Demonstration of how to use async Python to control multiple Playwright browsers for web scraping

Last update: Oct 27, 2022

Overview

Playwright Browser Pool

This example shows how a pool of browsers can retrieve pages by URL concurrently within a single asynchronous process.

import asyncio

# NOTE: BrowserPool is the pool class provided by this project. Its definition
# is not shown in this excerpt, so import it from wherever the project defines it.


async def run():
    # some example urls
    urls = [
        "https://www.airbnb.com/experiences/2496585",
        "https://www.airbnb.com/experiences/2488061",
        "https://www.airbnb.com/experiences/2563542",
        "https://www.airbnb.com/experiences/3010357",
        "https://www.airbnb.com/experiences/2624432",
        "https://www.airbnb.com/experiences/3033250",
    ]
    # start a browser pool
    async with BrowserPool(pool_size=3, browser_type="chromium", browser_kwargs={"headless": True}) as pool:
        # concurrently execute page retrieval;
        # as_completed yields awaitables in the order they finish
        for next_done in asyncio.as_completed(
            [pool.get_page(url) for url in urls]
        ):
            data = await next_done
            print(data)
            # will print a dict of the form:
            # {
            #   "content": ...
            #   "response": ...  (contains response status, headers etc.)
            # }


if __name__ == '__main__':
    asyncio.run(run())
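
The BrowserPool class itself is not shown in the example above. As a rough illustration of the pooling idea only, here is a minimal sketch that hands out idle workers through an asyncio.Queue. Everything in it is an assumption made for illustration: SimplePool, FakeBrowser and demo are hypothetical names, and FakeBrowser only simulates a page load instead of driving a real Playwright browser. It is not the project's actual implementation.

```python
import asyncio


class FakeBrowser:
    """Hypothetical stand-in; a real pool would hold Playwright browsers."""

    def __init__(self, name):
        self.name = name

    async def fetch(self, url):
        await asyncio.sleep(0.01)  # simulate the time a page load takes
        return {"content": f"<html>{url}</html>", "browser": self.name}


class SimplePool:
    """Hypothetical pool: idle browsers wait in an asyncio.Queue."""

    def __init__(self, pool_size):
        self.pool_size = pool_size
        self._idle = asyncio.Queue()

    async def __aenter__(self):
        # create the workers up front and mark them all as idle
        for i in range(self.pool_size):
            self._idle.put_nowait(FakeBrowser(f"browser-{i}"))
        return self

    async def __aexit__(self, *exc):
        # a real pool would close every browser here
        while not self._idle.empty():
            self._idle.get_nowait()

    async def get_page(self, url):
        browser = await self._idle.get()  # waits while all browsers are busy
        try:
            return await browser.fetch(url)
        finally:
            self._idle.put_nowait(browser)  # return the browser to the pool


async def demo(urls, pool_size=3):
    async with SimplePool(pool_size) as pool:
        results = []
        for next_done in asyncio.as_completed([pool.get_page(u) for u in urls]):
            results.append(await next_done)
        return results
```

With six URLs and a pool size of three, at most three fetches run at the same time; the remaining get_page calls wait on the queue until a browser is returned.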

Owner

Bernardas Ališauskas

I like python, education and free software. More on https://gitlab.com/granitosaurus

Footballmapies - Football mapies for learning webscraping and use of gmplot module in python

1 Jan 28, 2022

The core packages of security analyzer web crawler

Security Analyzer 🐍 A large scale web crawler (considered also as vulnerability scanner tool) to take an overview about security of Moroccan sites Cu

10 Jul 03, 2022

A simple reddit scraper to get memes (only images) from r/ProgrammerHumor.

memey A simple reddit scraper to get memes (only images) from r/ProgrammerHumor. Note Only works if you have firefox installed (yet). Instructions foo

2 Nov 16, 2021

Scrap-mtg-top-8 - A top 8 mtg scraper using python

1 Jan 24, 2022

A package designed to scrape data from Yahoo Finance.

yahoostock A package designed to scrape data from Yahoo Finance. Installation The most simple installation method is through PIP. pip install yahoosto

2 May 28, 2022

A simple code to fetch comments below an Instagram post and save them to a csv file

fetch_comments A simple code to fetch comments below an Instagram post and save them to a csv file usage First you have to enter your username and pas

2 Jul 14, 2022

Bigdata - This Scrapy project uses Redis and Kafka to create a distributed on demand scraping cluster

Scrapy Cluster This Scrapy project uses Redis and Kafka to create a distributed

0 Jan 06, 2022

Free-Game-Scraper is a useful script that allows you to track down free games and DLCs on many platforms.

Game Scraper Free-Game-Scraper is a useful script that allows you to track down free games and DLCs on many platforms. Join the discord About The Proj

2 Mar 28, 2022

Scrape data on SpaceX: Capsules, Rockets, Cores, Roadsters, SpaceX Info

SpaceX Software I developed software to scrape data on SpaceX: Capsules, Rockets, Cores, Roadsters, SpaceX Info to use the software you need Python a

16 Aug 02, 2022

Telegram group scraper tool

Telegram Group Scrapper

2 Jan 11, 2022

Examine.com supplement research scraper!

ExamineScraper Examine.com supplement research scraper! Why I want to be able to search pages for a specific term. For example, I want to be able to s

15 Dec 06, 2022

Create a crawler to get new products with the maximum discount on the banimode website

crawler-banimode create crawler and get some new products with maximum discount in banimode website. This small project is for learning and working with the Selenium tool

2 Feb 17, 2022

Google Developer Profile Badge Scraper

Google Developer Profile Badge Scraper GDev Profile Badge Scraper is a Google Developer Profile Web Scraper which scrapes for specific badges in a use

7 Jan 10, 2022

A high-level distributed crawling framework.

Cola: high-level distributed crawling framework Overview Cola is a high-level distributed crawling framework, used to crawl pages and extract structur

1.5k Jan 04, 2023

Works very well and you can ask for the type of image you want the scraper to collect.

Works very well and you can ask for the type of image you want the scraper to collect. Also follows a specific URL path depending on keyword selection.

1 Feb 17, 2022

CPF and CNPJ lookup at the Receita Federal using web scraping

Repository containing Python scripts that look up CPF and CNPJ numbers directly on the Receita Federal website.

5 Nov 29, 2021

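Brute-forcing sites that use CAPTCHAs, for safe and legal testing

Usage: python3 main.py + a prepared config file. python3 main.py Verify.json python3 main.py NoVerify.json These correspond to the demo with a CAPTCHA and the demo without a CAPTCHA, respectively. Tips: you can load a config file named after the domain: python3 main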

47 Nov 09, 2022

Instagram_scrapper - This project allows you to scrape the list of followers, following or both from a public Instagram account, and create a csv or excel file easily.

Instagram_scrapper This project allows you to scrape the list of followers, following or both from a public Instagram account, and create a csv or exce

5 Oct 17, 2022

Find thumbnails and original images from URL or HTML file.

Haul Find thumbnails and original images from URL or HTML file. Demo Hauler on Heroku Installation on Ubuntu $ sudo apt-get install build-essential py

150 Oct 15, 2022

Optimized version of the JD.com (Jingdong) Moutai purchase-sniping tool

1.8k Mar 18, 2022