Notebooks, slides and dataset of the CorrelAid Machine Learning Winter School

Overview

CorrelAid Machine Learning Winter School

Welcome to the CorrelAid ML Winter School!

Task

The problem we want to solve is to classify trees in Roosevelt National Forest.

Setup

Please make sure you have a modern Python 3 installation. We recommend the Python distribution Miniconda that is available for all OS.

The easiest way to get started is with a clean virtual environment. You can do so by running the following commands, assuming that you have installed Miniconda or Anaconda.

$ conda create -n winter-school python=3.9
$ conda activate winter-school
(winter-school) $ pip install -r requirements.txt
(winter-school) $ python -m ipykernel install --user --name winter-school --display-name "Python 3.9 (winter-school)"

The first command will create a new environment with Python 3.9. To use this environment, you call conda activate <name> with the name of the environment as second step. Once activated, you can install packages as usual with the pip package manager. You will install all listed requirements from the provided requirements.txt as a third step. Finally, to actually make your new environment available as kernel within a Jupyter notebook, you need to run ipykernel install, which is the fourth command.

Once the setup is complete, you can run any notebook by calling

(winter-school) $ <jupyter-lab|jupyter notebook>

jupyter lab is opening your browser with a local version of JupyterLab, which is a web-based interactive development environment that is somewhat more powerful and more modern than the older Jupyter Notebook. Both work fine, so you can choose the tool that is more to your liking. We recommend to go with Jupyter Lab as it provides a file browser, among other improvements.

Data

The data to be analyzed is one of the classic data sets from the UCI Machine Learning Repository, the Forest Cover Type Dataset.

The dataset contains tree observations from four areas of the Roosevelt National Forest in Colorado. All observations are cartographic variables (no remote sensing) from 30 meter x 30 meter sections of forest. There are over half a million measurements total!

The dataset includes information on tree type, shadow coverage, distance to nearby landmarks (roads etcetera), soil type, and local topography.

Note: We provide the data set as it can be downloaded from kaggle and not in its original form from the UCI repository.

Attribute Information:

Given is the attribute name, attribute type, the measurement unit and a brief description. The forest cover type is the classification problem. The order of this listing corresponds to the order of numerals along the rows of the database.

Name / Data Type / Measurement / Description

  • Elevation / quantitative /meters / Elevation in meters
  • Aspect / quantitative / azimuth / Aspect in degrees azimuth
  • Slope / quantitative / degrees / Slope in degrees
  • Horizontal_Distance_To_Hydrology / quantitative / meters / Horz Dist to nearest surface water features
  • Vertical_Distance_To_Hydrology / quantitative / meters / Vert Dist to nearest surface water features
  • Horizontal_Distance_To_Roadways / quantitative / meters / Horz Dist to nearest roadway
  • Hillshade_9am / quantitative / 0 to 255 index / Hillshade index at 9am, summer solstice
  • Hillshade_Noon / quantitative / 0 to 255 index / Hillshade index at noon, summer soltice
  • Hillshade_3pm / quantitative / 0 to 255 index / Hillshade index at 3pm, summer solstice
  • Horizontal_Distance_To_Fire_Points / quantitative / meters / Horz Dist to nearest wildfire ignition points
  • Wilderness_Area (4 binary columns) / qualitative / 0 (absence) or 1 (presence) / Wilderness area designation
  • Soil_Type (40 binary columns) / qualitative / 0 (absence) or 1 (presence) / Soil Type designation
  • Cover_Type (7 types) / integer / 1 to 7 / Forest Cover Type designation
Owner
CorrelAid
Soziales Engagement 2.0 - Datenanalyse für den guten Zweck
CorrelAid
Align before Fuse: Vision and Language Representation Learning with Momentum Distillation

This is the official PyTorch implementation of the ALBEF paper [Blog]. This repository supports pre-training on custom datasets, as well as finetuning on VQA, SNLI-VE, NLVR2, Image-Text Retrieval on

Salesforce 805 Jan 09, 2023
PyTorch version of the paper 'Enhanced Deep Residual Networks for Single Image Super-Resolution' (CVPRW 2017)

About PyTorch 1.2.0 Now the master branch supports PyTorch 1.2.0 by default. Due to the serious version problem (especially torch.utils.data.dataloade

Sanghyun Son 2.1k Dec 27, 2022
Official PyTorch implementation of the paper "Recycling Discriminator: Towards Opinion-Unaware Image Quality Assessment Using Wasserstein GAN", accepted to ACM MM 2021 BNI Track.

RecycleD Official PyTorch implementation of the paper "Recycling Discriminator: Towards Opinion-Unaware Image Quality Assessment Using Wasserstein GAN

Yunan Zhu 23 Nov 05, 2022
A selection of State Of The Art research papers (and code) on human locomotion (pose + trajectory) prediction (forecasting)

A selection of State Of The Art research papers (and code) on human trajectory prediction (forecasting). Papers marked with [W] are workshop papers.

Karttikeya Manglam 40 Nov 18, 2022
Deep Learning for Natural Language Processing SS 2021 (TU Darmstadt)

Deep Learning for Natural Language Processing SS 2021 (TU Darmstadt) Task Training huge unsupervised deep neural networks yields to strong progress in

2 Aug 05, 2022
Unsupervised Feature Ranking via Attribute Networks.

FRANe Unsupervised Feature Ranking via Attribute Networks (FRANe) converts a dataset into a network (graph) with nodes that correspond to the features

7 Sep 29, 2022
AfriBERTa: Exploring the Viability of Pretrained Multilingual Language Models for Low-resourced Languages

AfriBERTa: Exploring the Viability of Pretrained Multilingual Language Models for Low-resourced Languages This repository contains the code for the pa

Kelechi 40 Nov 24, 2022
GitHub repository for the ICLR Computational Geometry & Topology Challenge 2021

ICLR Computational Geometry & Topology Challenge 2022 Welcome to the ICLR 2022 Computational Geometry & Topology challenge 2022 --- by the ICLR 2022 W

42 Dec 13, 2022
PyTorch code of my WACV 2022 paper Improving Model Generalization by Agreement of Learned Representations from Data Augmentation

Improving Model Generalization by Agreement of Learned Representations from Data Augmentation (WACV 2022) Paper ArXiv Why it matters? When data augmen

Rowel Atienza 5 Mar 04, 2022
Public Models considered for emotion estimation from EEG

Emotion-EEG Set of models for emotion estimation from EEG. Composed by the combination of two deep-learing models learning together (RNN and CNN) with

Victor Delvigne 21 Dec 23, 2022
Transformers based fully on MLPs

Awesome MLP-based Transformers papers An up-to-date list of Transformers based fully on MLPs without attention! Why this repo? After transformers and

Fawaz Sammani 35 Dec 30, 2022
HiddenMarkovModel implements hidden Markov models with Gaussian mixtures as distributions on top of TensorFlow

Class HiddenMarkovModel HiddenMarkovModel implements hidden Markov models with Gaussian mixtures as distributions on top of TensorFlow 2.0 Installatio

Susara Thenuwara 2 Nov 03, 2021
Spatial Single-Cell Analysis Toolkit

Single-Cell Image Analysis Package Scimap is a scalable toolkit for analyzing spatial molecular data. The underlying framework is generalizable to spa

Laboratory of Systems Pharmacology @ Harvard 30 Nov 08, 2022
Real-time Neural Representation Fusion for Robust Volumetric Mapping

NeuralBlox: Real-Time Neural Representation Fusion for Robust Volumetric Mapping Paper | Supplementary This repository contains the implementation of

ETHZ ASL 106 Dec 24, 2022
Distributionally robust neural networks for group shifts

Distributionally Robust Neural Networks for Group Shifts: On the Importance of Regularization for Worst-Case Generalization This code implements the g

151 Dec 25, 2022
Progressive Domain Adaptation for Object Detection

Progressive Domain Adaptation for Object Detection Implementation of our paper Progressive Domain Adaptation for Object Detection, based on pytorch-fa

96 Nov 25, 2022
Combining Latent Space and Structured Kernels for Bayesian Optimization over Combinatorial Spaces

This repository contains source code for the paper Combining Latent Space and Structured Kernels for Bayesian Optimization over Combinatorial Spaces a

9 Nov 21, 2022
A TensorFlow Implementation of "Deep Multi-Scale Video Prediction Beyond Mean Square Error" by Mathieu, Couprie & LeCun.

Adversarial Video Generation This project implements a generative adversarial network to predict future frames of video, as detailed in "Deep Multi-Sc

Matt Cooper 704 Nov 26, 2022
Text to image synthesis using thought vectors

Text To Image Synthesis Using Thought Vectors This is an experimental tensorflow implementation of synthesizing images from captions using Skip Though

Paarth Neekhara 2.1k Jan 05, 2023
Tiny Object Detection in Aerial Images.

AI-TOD AI-TOD is a dataset for tiny object detection in aerial images. [Paper] [Dataset] Description AI-TOD comes with 700,621 object instances for ei

jwwangchn 116 Dec 30, 2022