Python code to crawl computer vision papers from top CV conferences. Currently it supports CVPR, ICCV, ECCV, NeurIPS, ICML, ICLR, SIGGRAPH

Overview

Crawling-CV-Conference-Papers

News

  • 2021-6-21 Support CVPR-2021

Download all CVPR-2021 papers in one click: set the local download directory in download_cvpr2021.py and run it. Make sure you have a ChromeDriver ready whose version matches your installed Chrome browser.

  • 2021-6-20 Support resuming an interrupted download from where it stopped, instead of re-downloading everything from scratch.
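One common way to get this kind of resume behaviour is to skip every paper whose PDF is already on disk. The sketch below only illustrates that idea and is not necessarily how this project implements it. The function name pending_downloads and the title + ".pdf" naming scheme are assumptions made for the example.

```python
import os
import tempfile


def pending_downloads(titles, urls, root):
    """Return the (path, url) pairs whose target file is not on disk yet.

    Hypothetical helper: assumes each paper is saved as <root>/<title>.pdf.
    """
    todo = []
    for title, url in zip(titles, urls):
        path = os.path.join(root, title + ".pdf")
        # A non-empty file is treated as a finished download and skipped.
        if os.path.isfile(path) and os.path.getsize(path) > 0:
            continue
        todo.append((path, url))
    return todo


# Demo: a temporary directory stands in for the download folder,
# and "Paper A" is pretended to be downloaded already.
with tempfile.TemporaryDirectory() as root:
    with open(os.path.join(root, "Paper A.pdf"), "wb") as f:
        f.write(b"%PDF-1.4")
    remaining = pending_downloads(
        ["Paper A", "Paper B"],
        ["https://example.com/a.pdf", "https://example.com/b.pdf"],
        root,
    )
    print([os.path.basename(path) for path, _ in remaining])
```

A file that was cut off mid-download would still count as finished here. Writing to a temporary name and renaming on success is one way to avoid that.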

Introduction

Python code to crawl computer vision papers from top CV conferences. It currently supports CVPR, ICCV, ECCV, NeurIPS, ICML, ICLR, and SIGGRAPH. It uses Selenium, a web testing framework, to crawl the paper titles and PDF URLs from the conference website, then downloads the papers one by one with some simple anti-anti-crawler tricks.

Websites of older conferences are not guaranteed to work without bugs, since this project is based on the newest website structure.

Using it together with Mendeley is recommended. You will get a juicy academic corpus.

Currently only single-threaded downloading is implemented, so downloading thousands of papers is slow (it takes several hours). A good approach is to start the script before bed so that it has finished by the time you get back to work :)

Multi-threaded downloading is coming soon!
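Until that lands, here is a rough sketch of what multi-threaded downloading could look like with the standard library's thread pool. This is not the project's code. download_all_concurrently and the download_one callback are hypothetical names, and the callback stands in for whatever function fetches a single PDF.

```python
from concurrent.futures import ThreadPoolExecutor


def download_all_concurrently(urls, download_one, max_workers=4):
    """Run download_one(url) for every url on a small pool of threads.

    Results come back in the same order as the input urls.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(download_one, urls))


def fake_download(url):
    # Stand-in for a real download; it only reports what it would fetch.
    return "fetched " + url


print(download_all_concurrently(
    ["https://example.com/a.pdf", "https://example.com/b.pdf"],
    fake_download,
    max_workers=2,
))
```

Keeping max_workers small matters here, since many parallel requests put more load on the conference server.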

Requirements

pip install selenium slugify

Besides, download chromedriver.exe (matching your Chrome version) to any local path you like.

Usage

To run the crawler, execute download.py or download.ipynb (they are basically the same). Before running, a few variables need to be set:

conference = 'neurips'
conference_url = "https://papers.nips.cc/paper/2019" # the conference url to download papers from
chromedriver_path = '.../chromedriver.exe' # the chromedriver.exe path
root = './NeurIPS-2019-ALL' # file path to save the downloaded papers

Here are some conference url examples:

cvpr: https://openaccess.thecvf.com/CVPR2020 (CVPR 2020)
eccv: https://openaccess.thecvf.com/ECCV2018 (ECCV 2018) (changed in 2020)
eccv: https://www.ecva.net/papers.php (ECCV 2020) 
iccv: https://openaccess.thecvf.com/ICCV2019 (ICCV 2019)
icml: http://proceedings.mlr.press/v119/ (ICML 2020)
neurips: https://papers.nips.cc/paper/2020 (NeurIPS 2020)
iclr: https://openreview.net/group?id=ICLR.cc/2021/Conference (ICLR 2021)
siggraph: https://dl.acm.org/toc/tog/2020/39/4 (SIGGRAPH 2020)

Replace the URL and the conference name with your choice.

If you want to crawl papers from another conference website, all you need to do is write a retrieve function like the ones in retrieve_titles_urls_from_websites.py that parses the HTML and collects the paper titles and PDF URLs into two lists.
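As a rough illustration of such a retrieve function, the sketch below parses a made-up page layout with the standard library's HTML parser and returns the two lists. The real functions in retrieve_titles_urls_from_websites.py may take different arguments (for example a Selenium driver) and will target each site's actual markup. The names PdfLinkParser and retrieve_from_example_site, and the sample HTML, are assumptions for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class PdfLinkParser(HTMLParser):
    """Collect the link text and absolute URL of every <a> pointing at a .pdf."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.titles = []
        self.urls = []
        self._href = None   # href of the PDF link currently being read
        self._text = []     # text fragments seen inside that link

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href") or ""
            if href.lower().endswith(".pdf"):
                self._href = href
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Collapse runs of whitespace in the title.
            self.titles.append(" ".join("".join(self._text).split()))
            self.urls.append(urljoin(self.base_url, self._href))
            self._href = None


def retrieve_from_example_site(html, base_url):
    """Hypothetical retrieve function: returns (titles, pdf_urls)."""
    parser = PdfLinkParser(base_url)
    parser.feed(html)
    parser.close()
    return parser.titles, parser.urls


sample = (
    '<ul><li><a href="/papers/a.pdf">Paper  A</a></li>'
    '<li><a href="about.html">About</a></li></ul>'
)
titles, urls = retrieve_from_example_site(sample, "https://example.com/conf/")
print(titles, urls)
```

On many real conference pages the title and the PDF link sit in different elements, so the matching logic usually needs to be adapted per site.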

Others

Warning: Crawling conference websites has reportedly gotten some IPs banned (this hasn't happened to me so far). I am not sure how large the risk is.

Warning: This project is for learning purposes only. Do not crawl the same website frequently, as this burdens the server.
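One simple way to go easier on the server is to pause for a random interval between downloads. The sketch below shows that pattern only and is not taken from this project. polite_download_all, the download_one callback, and the delay values are assumptions for the example.

```python
import random
import time


def polite_download_all(items, download_one, min_delay=1.0, max_delay=3.0,
                        sleep=time.sleep):
    """Call download_one(item) for each item, pausing between requests.

    The pause is a random number of seconds in [min_delay, max_delay].
    The sleep parameter exists so the delay can be replaced in tests.
    """
    results = []
    for index, item in enumerate(items):
        if index > 0:
            # No pause before the first request, one before every later one.
            sleep(random.uniform(min_delay, max_delay))
        results.append(download_one(item))
    return results


# Demo with tiny delays and a stand-in download function.
print(polite_download_all(["a.pdf", "b.pdf"], lambda name: "saved " + name,
                          min_delay=0.01, max_delay=0.02))
```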

Pull requests are welcome if you find any bugs or would like to add support for other conferences!

Maintainer

Xiaoyang Huang

Email: [email protected]
