Uses MIT/MEDSL, New York Times, and US Census data sources to analyze per-county COVID-19 deaths.

Last update: Dec 22, 2021

Covid County

Executive summary

Setup

Install Miniconda, then run the following on the command line:

conda create -n covid-county
conda activate covid-county
conda install pandas ipython matplotlib tabulate

(Let me know if you want pure-Python no-Conda instructions via venv.)

2020 US presidential election

I've already downloaded countypres_2000-2020.csv from https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/VOQCHQ and committed it to this repo, but you can download it again to check that I haven't committed bad data.

The 2020 data appears to be missing vote counts for the District of Columbia (FIPS 11001), so its party split is taken from the 2016 election.
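
A minimal sketch of that fallback, not the actual code in main.py. It assumes the MEDSL column names year, county_fips, party, and candidatevotes; the vote counts are made up for illustration, and two_party_votes is a hypothetical helper.

```python
import pandas as pd

# Toy rows in the assumed shape of countypres_2000-2020.csv (made-up counts).
votes = pd.DataFrame({
    "year": [2016, 2016, 2020, 2020],
    "county_fips": [11001, 11001, 1001, 1001],
    "party": ["DEMOCRAT", "REPUBLICAN", "DEMOCRAT", "REPUBLICAN"],
    "candidatevotes": [90, 10, 30, 70],
})


def two_party_votes(votes: pd.DataFrame, year: int) -> pd.DataFrame:
    """Sum Democratic and Republican votes per county for one election year."""
    keep = (votes["year"] == year) & votes["party"].isin(["DEMOCRAT", "REPUBLICAN"])
    return votes[keep].pivot_table(
        index="county_fips",
        columns="party",
        values="candidatevotes",
        aggfunc="sum",
        fill_value=0,
    )


split_2020 = two_party_votes(votes, 2020)
split_2016 = two_party_votes(votes, 2016)

# If DC has no 2020 rows, borrow its 2016 split instead.
DC_FIPS = 11001
if DC_FIPS not in split_2020.index:
    split_2020 = pd.concat([split_2020, split_2016.loc[[DC_FIPS]]])

dem_share = split_2020["DEMOCRAT"] / (split_2020["DEMOCRAT"] + split_2020["REPUBLICAN"])
print(dem_share)
```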

Census

I downloaded co-est2020.csv via the "Annual Resident Population Estimates for States and Counties: April 1, 2010 to July 1, 2019; April 1, 2020; and July 1, 2020 (CO-EST2020)" link at https://www.census.gov/programs-surveys/popest/technical-documentation/research/evaluation-estimates/2020-evaluation-estimates/2010s-counties-total.html. It's committed in this repo, but you can download it yourself too.
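
To join the census populations with the election and Covid data, each county needs a five-digit FIPS code. The sketch below shows one way to build it, assuming the file has SUMLEV, STATE, COUNTY, and POPESTIMATE2020 columns and that SUMLEV 50 marks county rows. The rows are made up, and the real file may need an explicit encoding such as latin-1 when read.

```python
import io

import pandas as pd

# Toy stand-in for co-est2020.csv (made-up populations, assumed column names).
toy_csv = io.StringIO(
    "SUMLEV,STATE,COUNTY,STNAME,CTYNAME,POPESTIMATE2020\n"
    "40,1,0,Alabama,Alabama,300\n"
    "50,1,1,Alabama,Autauga County,100\n"
    "50,1,3,Alabama,Baldwin County,200\n"
)
census = pd.read_csv(toy_csv)

# Keep county-level rows only; state totals (SUMLEV 40) are dropped.
counties = census[census["SUMLEV"] == 50].copy()

# County FIPS = two-digit state code followed by three-digit county code.
counties["fips"] = counties["STATE"] * 1000 + counties["COUNTY"]

population = counties.set_index("fips")["POPESTIMATE2020"]
print(population)
```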

Covid

Install Git, then run this in this directory (it might take a while): git clone --depth 1 https://github.com/nytimes/covid-19-data.git

Note that the New York Times data combines the five boroughs of NYC into a single "county". To account for this, the 2020 presidential votes from all five boroughs are merged into a single county too. Since the Covid deaths can't be split into individual boroughs, this is the best we can do. The fix follows the recommendation in upstream issue 105.
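
The borough merge might look roughly like the sketch below. The five FIPS codes are the real codes for the Bronx, Kings, New York, Queens, and Richmond counties. The "NYC" key, the column layout, and the vote counts are assumptions for illustration, not what main.py necessarily does.

```python
import pandas as pd

# FIPS codes of the five NYC boroughs (Bronx, Kings, New York, Queens, Richmond).
NYC_BOROUGH_FIPS = {"36005", "36047", "36061", "36081", "36085"}
NYC_KEY = "NYC"  # hypothetical label for the merged pseudo-county

# Made-up per-county vote counts; 36001 (Albany) stands in for every other county.
votes = pd.DataFrame({
    "fips": ["36005", "36047", "36061", "36081", "36085", "36001"],
    "DEMOCRAT": [5, 4, 6, 3, 1, 7],
    "REPUBLICAN": [1, 2, 1, 2, 3, 5],
})

# Relabel the borough rows with the shared key, leave all other counties alone.
is_borough = votes["fips"].isin(NYC_BOROUGH_FIPS)
votes["region"] = votes["fips"].where(~is_borough, NYC_KEY)

# Summing by region collapses the five boroughs into one row.
merged = votes.groupby("region")[["DEMOCRAT", "REPUBLICAN"]].sum()
print(merged)
```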

Run

python main.py

(Takes ~45 seconds on my 2015-vintage laptop.)

More results

party bin	total Covid-19 deaths
Rep 80+%	38284
Rep 60–79%	211416
Rep 50–59%	123587
Dem 50–59%	196084
Dem 60–79%	210070
Dem 80+%	18331
unknown	5243

Simply by party:

Dem: 424485
Rep: 373287
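
For reference, totals like the ones above can be produced by binning each county on its two-party vote share and summing deaths per bin. This is a sketch under assumptions, not the code in main.py: the dem_share and deaths columns are hypothetical names, the numbers are made up, and the handling of counties that fall exactly on a bin edge is a guess.

```python
import pandas as pd

# Made-up counties: Democratic share of the two-party vote and Covid deaths.
# The last county has no known vote share.
counties = pd.DataFrame({
    "dem_share": [0.10, 0.35, 0.45, 0.55, 0.70, 0.90, None],
    "deaths": [1, 2, 3, 4, 5, 6, 7],
})

labels = ["Rep 80+%", "Rep 60–79%", "Rep 50–59%",
          "Dem 50–59%", "Dem 60–79%", "Dem 80+%"]
edges = [0.0, 0.2, 0.4, 0.5, 0.6, 0.8, 1.0]

# pd.cut assigns each share to a bin; counties with no share become "unknown".
counties["party bin"] = (
    pd.cut(counties["dem_share"], bins=edges, labels=labels, include_lowest=True)
    .cat.add_categories("unknown")
    .fillna("unknown")
)

by_bin = counties.groupby("party bin", observed=False)["deaths"].sum()

# The "simply by party" totals are the sums of the three bins on each side.
rep_total = by_bin.loc[labels[:3]].sum()
dem_total = by_bin.loc[labels[3:]].sum()
print(by_bin)
print("Dem:", dem_total, "Rep:", rep_total)
```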

Owner

Ahmed Fasih
