Efficient Python Tricks and Tools for Data Scientists

Overview

View on GitHub View Book

Efficient Python Tricks and Tools for Data Scientists

Why efficient Python? Because using Python more efficiently will make your code more readable and run more efficiently.

Why for data scientist? Because Python has a wide application. The Python tools used in the data science field are not necessarily useful for other fields such as web development.

The goal of this book is to spread the awareness of efficient ways to do Python. They include:

  • efficient built-in methods and libraries to work with iterator, dictionary, function, and class
  • efficient methods to work with popular data science libraries such as pandas and NumPy
  • efficient tools to incorporate in a data science project
  • efficient tools to incorporate in any project
  • efficient tools to work with Jupyter Notebook.

image

What Should You Expect From This Book?

This book expects you to have some basic knowledge of Python and data science.

You should also expect bite-size code snippets for each section. This will allow you to obtain multiple pieces of knowledge in fewer than one minute. I included the link to the resources for every tools introduced in case you want to explore them further.

About This Book

This book includes more than 300 tips and tools I have shared daily on my website, Data Science Simplified. If you want to get the updated of new tips on your mailbox, you can subscribe to my website.

About The Author

image

Khuyen Tran is a data science writer at NVIDIA and a data science intern at Ocelot Consulting. She wrote over 150 data science articles with 100k+ views per month on Towards Data Science. She also wrote 300+ daily data science tips at Data Science Simplified. Her current mission is to make open-source more accessible to the data science community.

Supporters

Special thanks to these supporters for supporting this project!

Owner
Khuyen Tran
Data Scientist | Data Science Writer at NVIDIA & Towards Data Science
Khuyen Tran
Book on Julia for Data Science

Book on Julia for Data Science

Julia Data Science 349 Dec 25, 2022
PennyLane is a cross-platform Python library for differentiable programming of quantum computers.

PennyLane is a cross-platform Python library for differentiable programming of quantum computers. Train a quantum computer the same way as a neural network.

PennyLaneAI 1.6k Jan 04, 2023
🍊 :bar_chart: :bulb: Orange: Interactive data analysis

Orange Data Mining Orange is a data mining and visualization toolbox for novice and expert alike. To explore data with Orange, one requires no program

Bioinformatics Laboratory 3.9k Jan 05, 2023
A computer algebra system written in pure Python

SymPy See the AUTHORS file for the list of authors. And many more people helped on the SymPy mailing list, reported bugs, helped organize SymPy's part

SymPy 9.9k Jan 08, 2023
Validated, scalable, community developed variant calling, RNA-seq and small RNA analysis

Validated, scalable, community developed variant calling, RNA-seq and small RNA analysis. You write a high level configuration file specifying your in

Blue Collar Bioinformatics 915 Dec 29, 2022
Doing bayesian data analysis - Python/PyMC3 versions of the programs described in Doing bayesian data analysis by John K. Kruschke

Doing_bayesian_data_analysis This repository contains the Python version of the R programs described in the great book Doing bayesian data analysis (f

Osvaldo Martin 851 Dec 27, 2022
Mathics is a general-purpose computer algebra system (CAS). It is an open-source alternative to Mathematica

Mathics is a general-purpose computer algebra system (CAS). It is an open-source alternative to Mathematica. It is free both as in "free beer" and as in "freedom".

Mathics 535 Jan 04, 2023
CoCalc: Collaborative Calculation in the Cloud

logo CoCalc Collaborative Calculation and Data Science CoCalc is a virtual online workspace for calculations, research, collaboration and authoring do

SageMath, Inc. 1k Dec 29, 2022
Read-only mirror of https://gitlab.gnome.org/GNOME/pybliographer

Pybliographer Pybliographer provides a framework for working with bibliographic databases. This software is licensed under the GPLv2. For more informa

GNOME Github Mirror 15 May 07, 2022
CONCEPT (COsmological N-body CodE in PyThon) is a free and open-source simulation code for cosmological structure formation

CONCEPT (COsmological N-body CodE in PyThon) is a free and open-source simulation code for cosmological structure formation. The code should run on any Linux system, from massively parallel computer

Jeppe Dakin 62 Dec 08, 2022
CS 506 - Computational Tools for Data Science

CS 506 - Computational Tools for Data Science Code, slides, and notes for Boston University CS506 Fall 2021 The Final Project Repository can be found

Lance Galletti 14 Mar 23, 2022
SeqLike - flexible biological sequence objects in Python

SeqLike - flexible biological sequence objects in Python Introduction A single object API that makes working with biological sequences in Python more

186 Dec 23, 2022
Data intensive science for everyone.

InVesalius InVesalius generates 3D medical imaging reconstructions based on a sequence of 2D DICOM files acquired with CT or MRI equipments. InVesaliu

Galaxy Project 1k Jan 08, 2023
3D medical imaging reconstruction software

InVesalius InVesalius generates 3D medical imaging reconstructions based on a sequence of 2D DICOM files acquired with CT or MRI equipments. InVesaliu

443 Jan 01, 2023
PsychoPy is an open-source package for creating experiments in behavioral science.

PsychoPy is an open-source package for creating experiments in behavioral science. It aims to provide a single package that is: precise enoug

PsychoPy 1.3k Dec 31, 2022
A flexible package manager that supports multiple versions, configurations, platforms, and compilers.

Spack Spack is a multi-platform package manager that builds and installs multiple versions and configurations of software. It works on Linux, macOS, a

Spack 3.1k Dec 31, 2022
A logical, reasonably standardized, but flexible project structure for doing and sharing data science work.

Cookiecutter Data Science A logical, reasonably standardized, but flexible project structure for doing and sharing data science work. Project homepage

Jon C Cline 0 Sep 05, 2021
Metaflow is a human-friendly Python/R library that helps scientists and engineers build and manage real-life data science projects

Metaflow Metaflow is a human-friendly Python/R library that helps scientists and engineers build and manage real-life data science projects. Metaflow

Netflix, Inc. 6.3k Jan 03, 2023
A simple computer program made with Python on the brachistochrone curve.

Brachistochrone-curve This is a simple computer program made with Python on the brachistochrone curve. I decided to write it after a physics lesson on

Diego Romeo 1 Dec 16, 2021
Algorithms covered in the Bioinformatics Course part of the Cambridge Computer Science Tripos

Bioinformatics This is a repository of all the algorithms covered in the Bioinformatics Course part of the Cambridge Computer Science Tripos Algorithm

16 Jun 30, 2022