Python MapReduce library written in Cython.

Related tags

Miscellaneoushadoopy
Overview
Brandyn White 
  
Andrew Miller 
  
   

Source  
   https://github.com/bwhite/hadoopy/
Issues  
   https://github.com/bwhite/hadoopy/issues
Docs    
   http://bwhite.github.com/hadoopy/

IRC: #hadoopy @ freenode.net

Requirements
python development headers (python-dev), build tools (build-essential)

Optional
cython (>=.13) (without this it falls back to the pregenerated .c files)

Features
- oozie support
- Automated job parallelization 'auto-oozie' available in the hadoopy_flow project (maintained out of branch)
- typedbytes support (very fast)
- Local execution of unmodified MapReduce job with launch_local
- Read/write sequence files of TypedBytes directly to HDFS from python (readtb, writetb)
- Works on OS X
- Allows printing to stdout and stderr in Hadoop tasks without causing problems (uses the 'pipe hopping' technique, both are available in the task's stderr)
- critical path is in Cython
- works on clusters without any extra installation, Python, or any Python libraries (uses Pyinstaller that is included in this source tree)
- Simple HDFS access (readtb and ls) inside Python, even inside running jobs
- Unit test interface
- Reporting using status and counters (and print statements! no need to be scared of them in Hadoopy)
- Supports design patterns in the Lin/Dyer book (
   http://www.umiacs.umd.edu/~jimmylin/book.html)

Limitations
- Hadoop Local currently unsupported due to a bug in Hadoop's handling of the distributed cache in this mode.  Use psuedo-distributed instead for now.  (
   https://github.com/bwhite/hadoopy/issues/40)

Used in
- A Case for Query by Image and Text Content: Searching Computer Help using Screenshots and Keywords (to appear in WWW'11)
- Web-Scale Computer Vision using MapReduce for Multimedia Data Mining (at KDD'10)
- Vitrieve: Visual Search engine
- Picarus: Hadoop computer vision toolbox

Ubuntu Install (others are similar)
sudo apt-get install python-dev build-essential
sudo python setup.py install
  
 
Owner
Brandyn White
Brandyn White
PyDateWaiter helps waiting special day & calculating remain days till that day with Python code.

PyDateWaiter (v.Beta) PyDateWaiter helps waiting special day(aniversary) & calculating remain days till that day with Python code. Made by wallga gith

wallga 1 Jan 14, 2022
CPython extension implementing Shared Transactional Memory with native-looking interface

CPython extension implementing Shared Transactional Memory with native-looking interface

21 Jul 22, 2022
GUI for the Gammu library.

Wammu GUI for the Gammu library. Homepage https://wammu.eu/ License GNU GPL version 3 or later. First start On first start you will be asked for set

Gammu 60 Dec 14, 2022
Biohacking con Python honeycon21

biohacking-honeycon21 This repository includes the slides of the public presentation 'Biohacking con Python' in the Hack&Beers of HoneyCON21 (PPTX and

3 Nov 13, 2021
Procedural modeling of fruit and sandstorm in Blender (bpy).

SandFruit Procedural modelling of fruit and sandstorm. Created by Adriana Arcia and Maya Boateng. Last updated December 19, 2020 Goal & Inspiration Ou

Adriana Arcia 2 Mar 20, 2022
Tensorboard plugin 3d with python

tensorboard-plugin-3d Overview In this example, we render a run selector dropdown component. When the user selects a run, it shows a preview of all sc

KitwareMedical 26 Nov 14, 2022
A simple calculator made with tkinter.

Simple Calculator A simple calculator made with tkinter. Requirements None, only you need to have windows 😉 ...Enjoy! Installation Clone this reposit

Abhyush 2 Jan 11, 2022
Return-Parity-MDP - Towards Return Parity in Markov Decision Processes

Towards Return Parity in Markov Decision Processes Code for the AISTATS 2022 pap

Jianfeng Chi 3 Nov 27, 2022
Yet another Python Implementation of the Elo rating system.

Python Implementation - Elo Rating System Yet another Python Implementation of the Elo rating system (how innovative am I right?). Only supports 1vs1

Kraktoos 5 Dec 22, 2022
A clock purely made with python(turtle)...

Clock A clock purely made with python(turtle)... Requirements Pythone3 IDE or any other IDE Installation Clone this repository Running Open this proje

Abhyush 1 Jan 11, 2022
A Blender addon to align the origin to the top, center or bottom of a mesh object

Align Origin Blender Addon. Align Origin Blender Addon. What? This simple addon lets you align the origin to the top, center or bottom of a mesh objec

VA79 7 Nov 30, 2022
A country information finder module

A country information finder module

Fayas Noushad 3 Nov 28, 2021
HSPICE can not perform Monte Carlo (MC) simulations while considering aging effects

HSPICE can not perform Monte Carlo (MC) simulations while considering aging effects. I developed a python wrapper that automatically performs MC and aging simulations using HPSICE to save engineering

Habib Kazemi 2 Nov 22, 2021
This repository contains the exercices for the robotics class at Supaero, 2022.

Supaero robotics, 2022 This repository contains the exercices for the robotics class at Supaero, 2022. The exercices are organized by notebook. Each n

Gepetto team, LAAS-CNRS 5 Aug 01, 2022
Interfaces between napari and pymeshlab library to allow import, export and construction of surfaces.

napari-pymeshlab Interfaces between napari and the pymeshlab library to allow import, export and construction of surfaces. This is a WIP and feature r

Zach Marin 4 Oct 12, 2022
OpenSea NFT API App using Python and Streamlit

opensea-nft-api-tutorial OpenSea NFT API App using Python and Streamlit Tutorial Video Walkthrough https://www.youtube.com/watch?v=49SupvcFC1M Instruc

64 Oct 28, 2022
An universal linux port of deezer, supporting both Flatpak and AppImage

Deezer for linux This repo is an UNOFFICIAL linux port of the official windows-only Deezer app. Being based on the windows app, it allows downloading

Aurélien Hamy 154 Jan 06, 2023
🔵Open many google dorks in a fasted way

Dorkinho 🔵 The author is not responsible for misuse of the tool, use it in good practices like Pentest and CTF OSINT challenges. Dorkinho is a script

SidHawks 2 May 02, 2022
Python script to commit to your github for a perfect commit streak. This is purely for education purposes, please don't use this script to do bad stuff.

Daily-Git-Commit Commit to repo every day for the perfect commit streak Requirments pip install -r requirements.txt Setup Download this repository. Cr

JareBear 34 Dec 14, 2022
MySQL Connectivity based project. Contains various functions of a Store-Management-System

An Intermediate Level Python - MySQL Connectivity based project. Contains various functions of a Store-Management-System.

Yash Wadhvani 2 Nov 21, 2022