100 data puzzles for pandas, ranging from short and simple to super tricky (60% complete)

Overview

100 pandas puzzles

Puzzles notebook

Solutions notebook

Inspired by 100 Numpy exerises, here are 100* short puzzles for testing your knowledge of pandas' power.

Since pandas is a large library with many different specialist features and functions, these excercises focus mainly on the fundamentals of manipulating data (indexing, grouping, aggregating, cleaning), making use of the core DataFrame and Series objects. Many of the excerises here are straightforward in that the solutions require no more than a few lines of code (in pandas or NumPy - don't go using pure Python!). Choosing the right methods and following best practices is the underlying goal.

The exercises are loosely divided in sections. Each section has a difficulty rating; these ratings are subjective, of course, but should be a seen as a rough guide as to how elaborate the required solution needs to be.

Good luck solving the puzzles!

* the list of puzzles is not yet complete! Pull requests or suggestions for additional exercises, corrections and improvements are welcomed.

Overview of puzzles

Section Name Description Difficulty
Importing pandas Getting started and checking your pandas setup Easy
DataFrame basics A few of the fundamental routines for selecting, sorting, adding and aggregating data in DataFrames Easy
DataFrames: beyond the basics Slightly trickier: you may need to combine two or more methods to get the right answer Medium
DataFrames: harder problems These might require a bit of thinking outside the box... Hard
Series and DatetimeIndex Exercises for creating and manipulating Series with datetime data Easy/Medium
Cleaning Data Making a DataFrame easier to work with Easy/Medium
Using MultiIndexes Go beyond flat DataFrames with additional index levels Medium
Minesweeper Generate the numbers for safe squares in a Minesweeper grid Hard
Plotting Explore pandas' part of plotting functionality to see trends in data Medium

Setting up

To tackle the puzzles on your own computer, you'll need a Python 3 environment with the dependencies (namely pandas) installed.

One way to do this is as follows. I'm using a bash shell, the procedure with Mac OS should be essentially the same. Windows, I'm not sure about.

  1. Check you have Python 3 installed by printing the version of Python:
python -V
  1. Clone the puzzle repository using Git:
git clone https://github.com/ajcr/100-pandas-puzzles.git
  1. Install the dependencies (caution: if you don't want to modify any Python modules in your active environment, consider using a virtual environment instead):
python -m pip install -r requirements.txt
  1. Launch a jupyter notebook server:
jupyter notebook --notebook-dir=100-pandas-puzzles

You should be able to see the notebooks and launch them in your web browser.

Contributors

This repository has benefitted from numerous contributors, with those who have sent puzzles and fixes listed in CONTRIBUTORS.

Thanks to everyone who has raised an issue too.

Other links

If you feel like reading up on pandas before starting, the official documentation useful and very extensive. Good places get a broader overview of pandas are:

There are may other excellent resources and books that are easily searchable and purchaseable.

Owner
Alex Riley
Alex Riley
Collection of data visualizing projects through Tableau, Data Wrapper, and Power BI

Data-Visualization-Projects Collection of data visualizing projects through Tableau, Data Wrapper, and Power BI Indigenous-Brands-Social-Movements Pyt

Jinwoo(Roy) Yoon 1 Feb 05, 2022
在原神中使用围栏绘图

yuanshen_draw 在原神中使用围栏绘图 文件说明 toLines.py 将一张图片转换为对应的线条集合,视频可以按帧转换。 draw.py 在原神家园里绘制一张线条图。 draw_video.py 在原神家园里绘制视频(自动按帧摆放,截图(win)并回收) cat_to_video.py

14 Oct 08, 2022
Example scripts for generating plots of Bohemian matrices

Bohemian Eigenvalue Plotting Examples This repository contains examples of generating plots of Bohemian eigenvalues. The examples in this repository a

Bohemian Matrices 5 Nov 12, 2022
A shimmer pre-load component for Plotly Dash

dash-loading-shimmer A shimmer pre-load component for Plotly Dash Installation Get it with pip: pip install dash-loading-extras Or maybe you prefer Pi

Lucas Durand 4 Oct 12, 2022
Gallery of applications built using bqplot and widget libraries like ipywidgets, ipydatagrid etc.

bqplot Gallery This is a gallery of bqplot examples. View the gallery at https://bqplot.github.io/bqplot-gallery. Contributing new examples Clone this

8 Aug 23, 2022
Painlessly create beautiful matplotlib plots.

Announcement Thank you to everyone who has used prettyplotlib and made it what it is today! Unfortunately, I no longer have the bandwidth to maintain

Olga Botvinnik 1.6k Jan 06, 2023
:art: Diagram as Code for prototyping cloud system architectures

Diagrams Diagram as Code. Diagrams lets you draw the cloud system architecture in Python code. It was born for prototyping a new system architecture d

MinJae Kwon 27.5k Dec 30, 2022
China and India Population and GDP Visualization

China and India Population and GDP Visualization Historical Population Comparison between India and China This graph shows the population data of Indi

Nicolas De Mello 10 Oct 27, 2021
A Simple Flask-Plotly Example for NTU 110-1 DSSI Class

A Simple Flask-Plotly Example for NTU 110-1 DSSI Class Live Demo Prerequisites We will use Flask and Ploty to build a Flask application. If you haven'

Ting Ni Wu 1 Dec 11, 2021
GitHub Stats Visualizations : Transparent

GitHub Stats Visualizations : Transparent Generate visualizations of GitHub user and repository statistics using GitHub Actions. ⚠️ Disclaimer The pro

YuanYap 7 Apr 05, 2022
Python library that makes it easy for data scientists to create charts.

Chartify Chartify is a Python library that makes it easy for data scientists to create charts. Why use Chartify? Consistent input data format: Spend l

Spotify 3.2k Jan 01, 2023
A python wrapper for creating and viewing effects for Matt Parker's christmas tree.

Christmas Tree Visualizer A python wrapper for creating and viewing effects for Matt Parker's christmas tree. Displays py or csv effect files and allo

4 Nov 22, 2022
Generate a 3D Skyline in STL format and a OpenSCAD file from Gitlab contributions

Your Gitlab's contributions in a 3D Skyline gitlab-skyline is a Python command to generate a skyline figure from Gitlab contributions as Github did at

Félix Gómez 70 Dec 22, 2022
An easy to use burndown chart generator for GitHub Project Boards.

Burndown Chart for GitHub Projects An easy to use burndown chart generator for GitHub Project Boards. Table of Contents Features Installation Assumpti

Joseph Hale 15 Dec 28, 2022
The repository is my code for various types of data visualization cases based on the Matplotlib library.

ScienceGallery The repository is my code for various types of data visualization cases based on the Matplotlib library. It summarizes the code and cas

Warrick Xu 2 Apr 20, 2022
WebApp served by OAK PoE device to visualize various streams, metadata and AI results

DepthAI PoE WebApp | Bootstrap 4 & Vue.js SPA Dashboard Based on dashmin (https:

Luxonis 6 Apr 09, 2022
Extract and visualize information from Gurobi log files

GRBlogtools Extract information from Gurobi log files and generate pandas DataFrames or Excel worksheets for further processing. Also includes a wrapp

Gurobi Optimization 56 Nov 17, 2022
Flame Graphs visualize profiled code

Flame Graphs visualize profiled code

Brendan Gregg 14.1k Jan 03, 2023
Generate visualizations of GitHub user and repository statistics using GitHub Actions.

GitHub Stats Visualization Generate visualizations of GitHub user and repository statistics using GitHub Actions. This project is currently a work-in-

JoelImgu 3 Dec 14, 2022
A tool to plot and execute Rossmos's Formula, that helps to catch serial criminals using mathematics

Rossmo Plotter A tool to plot and execute Rossmos's Formula using python, that helps to catch serial criminals using mathematics Author: Amlan Saha Ku

Amlan Saha Kundu 3 Aug 29, 2022