Animal Sound Classification (Cats Vrs Dogs Audio Sentiment Classification)

Overview

Animal Sound Classification (Cats Vrs Dogs Audio Sentiment Classification)

This is a simple audio classification api build to classify the sound of an audio, weather it is the cat or dog sound.

alt

Response

Given a .wav audio the model will classify what does the sound the audio belongs to either cat or dog.

{
  "predictions": {
    "class": "dog",
    "label": 1,
    "probability": 1.0
  },
  "success": true
}

Starting the server

To start server and start audio classification first you need to make sure you are in the server folder and run the following commands:

  1. creating a virtual environment
virtualenv venv && .\venv\Scripts\activate.bat
  1. installing packages
pip install -r requirements.txt
  1. Starting the server
python api/app.py

The server will start on a default port of 3001 and you will be able to make api request to the server to do audio classification.

Model Metrics

The following table shows all the metrics summary we get after training the model for few 15 epochs.

model name model description test accuracy validation accuracy train accuracy test loss validation loss train loss
cats-dogs-sound-cnn.pt audio sentiment classification for dogs and cats CNN. 90.7% 90.7% 93.5% 0.621 0.218 0.209

Classification report

The following is the classification report for the model on the test dataset.

# precision recall f1-score support
accuracy - - 90% 2305
macro avg 91% 90% 90% 2305
weighted avg 92% 89% 90% 2305

Confusion matrix

The following figure shows a confusion matrix for the classification model.

Audio Sentiment classification

If you hit the server at http://localhost:3001/classify you will be able to get the following expected response that is if the request method is POST and you provide the file expected by the server.

Expected Response

The expected response at http://localhost:3001/classify with a file audio of the right format will yield the following json response to the client.

{
  "predictions": {
    "class": "dog",
    "label": 1,
    "probability": 1.0
  },
  "success": true
}

Using curl

Make sure that you have the audio named cat.wav in the current folder that you are running your cmd otherwise you have to provide an absolute or relative path to the audio.

To make a curl POST request at http://localhost:3001/classify with the file cat.wav we run the following command.

# for cat
curl -X POST -F [email protected] http://127.0.0.1:3001/classify

# for dog
curl -X POST -F [email protected] http://127.0.0.1:3001/classify

Using Postman client

To make this request with postman we do it as follows:

  1. Change the request method to POST at http://127.0.0.1:3001/classify
  2. Click on form-data
  3. Select type to be file on the KEY attribute
  4. For the KEY type audio and select the audio you want to predict under value
  5. Click send

If everything went well you will get the following response depending on the face you have selected:

{
  "predictions": { "class": "dog", "label": 1, "probability": 1.0 },
  "success": true
}

Using JavaScript fetch api.

  1. First you need to get the input from html
  2. Create a formData object
  3. make a POST requests
res.json()) .then((data) => console.log(data));">
const input = document.getElementById("input").files[0];
let formData = new FormData();
formData.append("audio", input);
fetch("http://127.0.0.1:3001/classify", {
  method: "POST",
  body: formData,
})
  .then((res) => res.json())
  .then((data) => console.log(data));

If everything went well you will be able to get expected response.

{
  "predictions": { "class": "dog", "label": 1, "probability": 1.0 },
  "success": true
}

Notebooks

  • All notebooks for training and saving the models are found in the notebooks folder of this repository.
Owner
crispengari
ai || software development. (creator of initialiseur)
crispengari
Face Mask Detection System built with OpenCV, TensorFlow using Computer Vision concepts

Face mask detection Face Mask Detection System built with OpenCV, TensorFlow using Computer Vision concepts in order to detect face masks in static im

Vaibhav Shukla 1 Oct 27, 2021
Tutoriais publicados nas nossas redes sociais para obtenção de dados, análises simples e outras tarefas relevantes no mercado financeiro.

Tutoriais Públicos Tutoriais publicados nas nossas redes sociais para obtenção de dados, análises simples e outras tarefas relevantes no mercado finan

Trading com Dados 68 Oct 15, 2022
Loopy belief propagation for factor graphs on discrete variables, in JAX!

PGMax implements general factor graphs for discrete probabilistic graphical models (PGMs), and hardware-accelerated differentiable loopy belief propagation (LBP) in JAX.

Vicarious 62 Dec 23, 2022
Home for cuQuantum Python & NVIDIA cuQuantum SDK C++ samples

Welcome to the cuQuantum repository! This public repository contains two sets of files related to the NVIDIA cuQuantum SDK: samples: All C/C++ sample

NVIDIA Corporation 147 Dec 27, 2022
Lbl2Vec learns jointly embedded label, document and word vectors to retrieve documents with predefined topics from an unlabeled document corpus.

Lbl2Vec Lbl2Vec is an algorithm for unsupervised document classification and unsupervised document retrieval. It automatically generates jointly embed

sebis - TUM - Germany 61 Dec 20, 2022
PIGLeT: Language Grounding Through Neuro-Symbolic Interaction in a 3D World [ACL 2021]

piglet PIGLeT: Language Grounding Through Neuro-Symbolic Interaction in a 3D World [ACL 2021] This repo contains code and data for PIGLeT. If you like

Rowan Zellers 51 Oct 08, 2022
Romanian Automatic Speech Recognition from the ROBIN project

RobinASR This repository contains Robin's Automatic Speech Recognition (RobinASR) for the Romanian language based on the DeepSpeech2 architecture, tog

RACAI 10 Jan 01, 2023
A PyTorch implementation of "SelfGNN: Self-supervised Graph Neural Networks without explicit negative sampling"

SelfGNN A PyTorch implementation of "SelfGNN: Self-supervised Graph Neural Networks without explicit negative sampling" paper, which will appear in Th

Zekarias Tilahun 24 Jun 21, 2022
SOTA model in CIFAR10

A PyTorch Implementation of CIFAR Tricks 调研了CIFAR10数据集上各种trick,数据增强,正则化方法,并进行了实现。目前项目告一段落,如果有更好的想法,或者希望一起维护这个项目可以提issue或者在我的主页找到我的联系方式。 0. Requirement

PJDong 58 Dec 21, 2022
SMORE: Knowledge Graph Completion and Multi-hop Reasoning in Massive Knowledge Graphs

SMORE: Knowledge Graph Completion and Multi-hop Reasoning in Massive Knowledge Graphs SMORE is a a versatile framework that scales multi-hop query emb

Google Research 135 Dec 27, 2022
PyTorch implementation of the paper: "Preference-Adaptive Meta-Learning for Cold-Start Recommendation", IJCAI, 2021.

PAML PyTorch implementation of the paper: "Preference-Adaptive Meta-Learning for Cold-Start Recommendation", IJCAI, 2021. (Continuously updating ) Int

15 Nov 18, 2022
Fuzzing the Kernel Using Unicornafl and AFL++

Unicorefuzz Fuzzing the Kernel using UnicornAFL and AFL++. For details, skim through the WOOT paper or watch this talk at CCCamp19. Is it any good? ye

Security in Telecommunications 283 Dec 26, 2022
PyDeepFakeDet is an integrated and scalable tool for Deepfake detection.

PyDeepFakeDet An integrated and scalable library for Deepfake detection research. Introduction PyDeepFakeDet is an integrated and scalable Deepfake de

Junke, Wang 49 Dec 11, 2022
Implicit Deep Adaptive Design (iDAD)

Implicit Deep Adaptive Design (iDAD) This code supports the NeurIPS paper 'Implicit Deep Adaptive Design: Policy-Based Experimental Design without Lik

Desi 12 Aug 14, 2022
A Marvelous ChatBot implement using PyTorch.

PyTorch Marvelous ChatBot [Update] it's 2019 now, previously model can not catch up state-of-art now. So we just move towards the future a transformer

JinTian 223 Oct 18, 2022
CMUA-Watermark: A Cross-Model Universal Adversarial Watermark for Combating Deepfakes (AAAI2022)

CMUA-Watermark The official code for CMUA-Watermark: A Cross-Model Universal Adversarial Watermark for Combating Deepfakes (AAAI2022) arxiv. It is bas

50 Nov 26, 2022
Code for the paper "Attention Approximates Sparse Distributed Memory"

Attention Approximates Sparse Distributed Memory - Codebase This is all of the code used to run analyses in the paper "Attention Approximates Sparse D

Trenton Bricken 14 Dec 05, 2022
PyTorch code for: Learning to Generate Grounded Visual Captions without Localization Supervision

Learning to Generate Grounded Visual Captions without Localization Supervision This is the PyTorch implementation of our paper: Learning to Generate G

Chih-Yao Ma 41 Nov 17, 2022
An abstraction layer for mathematical optimization solvers.

MathOptInterface Documentation Build Status Social An abstraction layer for mathematical optimization solvers. Replaces MathProgBase. Citing MathOptIn

JuMP-dev 284 Jan 04, 2023
Implementation of Shape Generation and Completion Through Point-Voxel Diffusion

Shape Generation and Completion Through Point-Voxel Diffusion Project | Paper Implementation of Shape Generation and Completion Through Point-Voxel Di

Linqi Zhou 103 Dec 29, 2022