A scrapy pipeline that provides an easy way to store files and images using various folder structures.

Last update: Oct 23, 2022

Overview

scrapy-folder-tree

This is a Scrapy pipeline that stores downloaded files and images in one of several folder structures.

Supported folder structures:

Given the scraped file 05b40af07cb3284506acbf395452e0e93bfc94c8.jpg, you can choose one of the following folder structures:

Using file name

full
├── 0
.   ├── 5
.   .   ├── b
.   .   .   ├── 05b40af07cb3284506acbf395452e0e93bfc94c8.jpg
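
The layout above can be sketched in a few lines of Python. This is only an illustration of the idea, assuming each folder level is one leading character of the file name; the helper `name_tree_path` and its parameters are hypothetical and are not part of the scrapy_folder_tree API.

```python
from pathlib import PurePosixPath


def name_tree_path(file_name: str, depth: int = 3, root: str = "full") -> str:
    """Nest a file under one folder per leading character of its name."""
    stem = PurePosixPath(file_name).stem
    # Take at most `depth` leading characters; each becomes one folder level.
    levels = list(stem[:depth])
    return str(PurePosixPath(root, *levels, file_name))


print(name_tree_path("05b40af07cb3284506acbf395452e0e93bfc94c8.jpg"))
# full/0/5/b/05b40af07cb3284506acbf395452e0e93bfc94c8.jpg
```

Because scraped file names are typically hashes, spreading them by leading characters keeps any single folder from growing too large.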

Using crawling time

full
├── 0
.   ├── 11
.   .   ├── 48
.   .   .   ├── 05b40af07cb3284506acbf395452e0e93bfc94c8.jpg

Using crawling date

full
├── 2022
.   ├── 1
.   .   ├── 24
.   .   .   ├── 05b40af07cb3284506acbf395452e0e93bfc94c8.jpg
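
The two timestamp-based layouts can be sketched the same way. The snippet below is an illustration under assumptions, not the package's implementation: it assumes the time layout nests by hour, minute, and second, that the date layout nests by year, month, and day, and that the numbers are not zero-padded (as the trees above suggest). The helper `timestamp_tree_path` and its parameters are hypothetical.

```python
from datetime import datetime
from pathlib import PurePosixPath


def timestamp_tree_path(file_name, crawled_at, by="date", depth=3, root="full"):
    """Nest a file under folders derived from the moment it was crawled."""
    if by == "date":
        parts = (crawled_at.year, crawled_at.month, crawled_at.day)
    elif by == "time":
        parts = (crawled_at.hour, crawled_at.minute, crawled_at.second)
    else:
        raise ValueError("by must be 'date' or 'time'")
    # Keep only the first `depth` components, without zero padding.
    levels = [str(part) for part in parts[:depth]]
    return str(PurePosixPath(root, *levels, file_name))


crawled_at = datetime(2022, 1, 24, 0, 11, 48)  # example timestamp
name = "05b40af07cb3284506acbf395452e0e93bfc94c8.jpg"
print(timestamp_tree_path(name, crawled_at, by="time"))
# full/0/11/48/05b40af07cb3284506acbf395452e0e93bfc94c8.jpg
print(timestamp_tree_path(name, crawled_at, by="date"))
# full/2022/1/24/05b40af07cb3284506acbf395452e0e93bfc94c8.jpg
```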

Installation

pip install scrapy_folder_tree

Usage

Use the following settings in your project:

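ITEM_PIPELINES = {
    # Enable the pipeline; 300 sets its order relative to other item pipelines.
    'scrapy_folder_tree.FilesHashTreePipeline': 300,
}

# Number of nested folder levels, as in the three-level trees above.
FOLDER_TREE_DEPTH = 3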
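
A slightly fuller settings sketch follows. It assumes that FilesHashTreePipeline builds on Scrapy's built-in FilesPipeline, which is disabled unless FILES_STORE is set and which downloads the URLs listed in an item's file_urls field; that relationship is an assumption here, and the "downloads" path is a placeholder.

```python
# settings.py (sketch)

ITEM_PIPELINES = {
    "scrapy_folder_tree.FilesHashTreePipeline": 300,
}

# Assumption: as with Scrapy's FilesPipeline, downloaded files are written
# under this root directory. "downloads" is a placeholder path.
FILES_STORE = "downloads"

# Number of nested folder levels to create under the storage root.
FOLDER_TREE_DEPTH = 3
```

If that assumption holds, the spider only needs to yield items carrying a file_urls list; the pipeline then decides where each file lands.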

Owner

Panagiotis Simakis

Download images from forum threads

This is a Python API to scrape search results from a URL.

This is a web scraper, built with the Python framework Scrapy, that extracts data from the Deals of the Day section of the Mercado Livre website.

Scraping the data from each page of biocides listed on the BAUA website into a CSV file

Simple tool to scrape and download cross country ski timings and results from live.skidor.com

Download NCERT books using Scrapy

A utility library to scrape data from TikTok, Instagram, Twitch, YouTube, Twitter or Reddit in one line!

A simple django-rest-framework API using web scraping

Dude is a very simple framework for writing web scrapers using Python decorators

Scraping connections' info on LinkedIn

eBay web scraper for getting the average product price

A multithreaded tool for searching and downloading images from popular search engines. It is straightforward to set up and run!

Xuexi Qiangguo automation: 100% correct, instant quiz answering, worth 45 points

SearchifyX, predecessor to Searchify, is a fast Quizlet, Quizizz, and Brainly webscraper with various stealth features.

Scrape all the media from an OnlyFans account - Updated regularly

A web scraper that checks the price of a product regularly and sends a price alert by email if the price drops.

Lovely Scrapper

Python script for snapping up JD.com flash-sale products

Parsel lets you extract data from XML/HTML documents using XPath or CSS selectors
