An easy-to-use app to visualise attentions of various VQA models.

Last update: Nov 13, 2022

Overview

Ask Me Anything: A tool for visualising Visual Question Answering (AMA)

An easy-to-use app to visualise attentions of various VQA models. Please click here to see a live demo of the app!

• Models
• Requirements
• Installation
• How to run
• How to use
• Contributing
• Acknowledgements

Models

• MFB - Multi-modal Factorized Bilinear Pooling with Co-Attention Learning for Visual Question Answering
Zhou Yu, Jun Yu, Jianping Fan, Dacheng Tao
Arxiv

• (Coming soon) MCAN - Deep Modular Co-Attention Networks for Visual Question Answering
Zhou Yu, Jun Yu, Yuhao Cui, Dacheng Tao, Qi Tian
Arvix

Requirements

Please check the requirements.txt file for the version numbers.

opencv_python==4.4.0.46
numpy==1.19.4
pandas==1.1.4
torch==1.4.0
matplotlib==3.3.2
gdown==3.12.2
seaborn==0.11.0
dotmap==1.3.23
streamlit==0.70.0
Pillow==8.0.1
PyYAML==5.3.1

Installation

Install Anaconda
Clone this repository and cd into it.
git clone https://github.com/apugoneappu/ask_me_anything.git && cd ask_me_anything
In a new environment (new_env)
pip install -r requirements.txt

How to run

From the directory of this repository, do the following -

conda activate new_env
streamlit run main.py
In a browser tab, open the Network URL displayed in your terminal.

Done! 🎉

How to use

Contributing

First of all, thank you for wanting to contribute to this work! I will try and make your job as easy as possible. Detailed instructions coming soon ...

Acknowledgements

This repository has been built by modifying the OpenVQA repository.

I would also like to thank Yash Khandelwal, Nikhil Shah and Chinmay Singh for their support and amazing suggestions!

Huge thanks to Streamlit for making all of this possible and for Streamlit Sharing that enables free hosting of this app! ❤️

An easy-to-use app to visualise attentions of various VQA models.

Related tags

Overview

Ask Me Anything: A tool for visualising Visual Question Answering (AMA)

Models

Requirements

Installation

How to run

How to use

Contributing

Acknowledgements

Owner

Apoorve

End-to-End Speech Processing Toolkit

Lip Reading - Cross Audio-Visual Recognition using 3D Convolutional Neural Networks

PyTorch implementation of neural style transfer algorithm

We evaluate our method on different datasets (including ShapeNet, CUB-200-2011, and Pascal3D+) and achieve state-of-the-art results, outperforming all the other supervised and unsupervised methods and 3D representations, all in terms of performance, accuracy, and training time.

Official repository for Jia, Raghunathan, Göksel, and Liang, "Certified Robustness to Adversarial Word Substitutions" (EMNLP 2019)

A tutorial on DataFrames.jl prepared for JuliaCon2021

Code for our work "Activation to Saliency: Forming High-Quality Labels for Unsupervised Salient Object Detection".

An ML & Correlation platform for transforming disparate data points of interest into usable intelligence.

The pytorch implementation of SOKD (BMVC2021).

RE3: State Entropy Maximization with Random Encoders for Efficient Exploration

A Python script that creates subtitles of a given length from text paragraphs that can be easily imported into any Video Editing software such as FinalCut Pro for further adjustments.

Simple object detection app with streamlit

The code from the paper Character Transformations for Non-Autoregressive GEC Tagging

Representing Long-Range Context for Graph Neural Networks with Global Attention

PyTorch implementation of Pay Attention to MLPs

DeepLab resnet v2 model in pytorch

Official repository of my book: "Deep Learning with PyTorch Step-by-Step: A Beginner's Guide"

Editing a classifier by rewriting its prediction rules

Model-free Vehicle Tracking and State Estimation in Point Cloud Sequences

Faune proche - Retrieval of Faune-France data near a google maps location