An attempt at the implementation of Glom, Geoffrey Hinton's new idea that integrates neural fields, predictive coding, top-down-bottom-up, and attention (consensus between columns)

Last update: Dec 14, 2022

Overview

GLOM - Pytorch (wip)

An attempt at the implementation of Glom, Geoffrey Hinton's new idea that integrates neural fields, predictive coding, top-down-bottom-up, and attention (consensus between columns) for emergent part-whole heirarchies from data.

Citations

@misc{hinton2021represent,
    title   = {How to represent part-whole hierarchies in a neural network}, 
    author  = {Geoffrey Hinton},
    year    = {2021},
    eprint  = {2102.12627},
    archivePrefix = {arXiv},
    primaryClass = {cs.CV}
}

Comments

help

Hello, when I tried to reproduce your model, I got this error. I'm not sure how to correct it， can y help me?

Traceback (most recent call last): File "main.py", line 172, in outputs = custom_model(images,iters = 12) File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 727, in _call_impl result = self.forward(*input, **kwargs) File "/root/class/glom_pytorch/glom_pytorch.py", line 109, in forward consensus = self.attention(levels) File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 727, in call_impl result = self.forward(*input, **kwargs) File "/root/class/glom_pytorch/glom_pytorch.py", line 49, in forward sim.masked_fill(self_mask, TOKEN_ATTEND_SELF_VALUE) RuntimeError: Expected object of scalar type Bool but got scalar type Float for argument #2 'mask' in call to th_masked_fill_bool

opened by DDxk369 1
Levels token

Hello, thank you for your good work. I was trying to implement the idea you shared in this todo:

https://github.com/lucidrains/glom-pytorch/projects/1#card-56284841

The text reads: allow each level to be represented by a list of tokens, updated with attention, simliar to https://github.com/lucidrains/transformer-in-transformer

I was going to implement it with a simple token at each level, but I was wondering if you had any suggestion on how to implement it correctly. Thank you.

opened by zenos4mbu 0
Implementing geometric mean for consensus opinion/levels_mean

Hi, I'm trying to implement the consensus opinion (levels_mean) as a geometric mean of the top-down predictions, bottom-up predictions, attention-weighted average of same-level embeddings, and embeddings of the previous time step as described by the original paper. Any ideas on how the weights should be set?

At first I thought this could be a learnable parameter, but section 9.1 reads

For interpreting a static image with no temporal context, the weights used for this weighted geometric mean need to change during the iterations that occur after a new fixation.

which leads me to believe that these might need to be outputted on the fly a la vanilla attention as opposed to being learned. Maybe an MLP that takes in the four source embeddings and outputs four scalars as weights?

opened by ryan-caesar-ramos 0
Classification
Hi @lucidrains ! Do you have any idea/insight on how to supervise classification (let's say, for example, MNIST digits classification) after having trained GLOM in an unsupervised way as a denoising autoencoder? In the paper that seems to be the final goal. However, it's not clear to me which columns and/or levels should be used for the classification. Also, since GLOM it's dealing with patches, how can single black patches vote towards a certain digit?

In other words, after training GLOM as a denoising autoencoder on MNIST, what we have is:

p X p columns, where p is the number of patches per dimension (e.g. 7X7=49 patches)

6 levels for each column, where the top-most levels should in theory represent higher-level entities, so it seems natural to search for the digit information in these layers

6*2=12 iterations, to allow for information to be passed by both top-down and bottom-up networks

Just by applying dimensionality reduction on the top-most level at different iterations does not seem enough to make the digit clusters emerge. So I'm wondering if you (or anybody else) have some insights on this. Cheers!
opened by A7ocin 1
Bug in forward?

Hello, thank you for making this code available! I think there could be a potential bug in the first line of the forward function:

b, h, w, _, device = *img.shape, img.device

but the input image shape is of kind b c h w, so it could be fixed by replacing it with

b, _, h, w, device = *img.shape, img.device

Am I wrong?

opened by A7ocin 9

Releases(0.0.14)

0.0.14(Mar 27, 2021)

Source code(tar.gz)
Source code(zip)
0.0.12(Mar 6, 2021)

Source code(tar.gz)
Source code(zip)
0.0.11(Mar 5, 2021)

Source code(tar.gz)
Source code(zip)
0.0.10a(Mar 5, 2021)

Source code(tar.gz)
Source code(zip)
0.0.9a(Mar 5, 2021)

Source code(tar.gz)
Source code(zip)
0.0.8(Mar 5, 2021)

Source code(tar.gz)
Source code(zip)
0.0.7(Mar 5, 2021)

Source code(tar.gz)
Source code(zip)
0.0.6(Mar 5, 2021)

Source code(tar.gz)
Source code(zip)
0.0.5(Mar 5, 2021)

Source code(tar.gz)
Source code(zip)
0.0.4(Mar 5, 2021)

Source code(tar.gz)
Source code(zip)
0.0.3(Mar 5, 2021)

Source code(tar.gz)
Source code(zip)
0.0.2(Mar 5, 2021)

Source code(tar.gz)
Source code(zip)

Owner

Phil Wang

Working with Attention. It's all we need.

GitHub Repository

The implementation of "Shuffle Transformer: Rethinking Spatial Shuffle for Vision Transformer"

Shuffle Transformer The implementation of "Shuffle Transformer: Rethinking Spatial Shuffle for Vision Transformer" Introduction Very recently, window-

87 Nov 29, 2022

😇A pyTorch implementation of the DeepMoji model: state-of-the-art deep learning model for analyzing sentiment, emotion, sarcasm etc

------ Update September 2018 ------ It's been a year since TorchMoji and DeepMoji were released. We're trying to understand how it's being used such t

865 Dec 24, 2022

Towards Understanding Quality Challenges of the Federated Learning: A First Look from the Lens of Robustness

FL Analysis This repository contains the code and results for the paper "Towards Understanding Quality Challenges of the Federated Learning: A First L

3 Oct 17, 2022

An end-to-end image translation model with weight-map for color constancy

CCUnet An end-to-end image translation model with weight-map for color constancy 1. Download the dataset (take Colorchecker_recommended dataset as an

1 Dec 21, 2021

Code for the Population-Based Bandits Algorithm, presented at NeurIPS 2020.

Population-Based Bandits (PB2) Code for the Population-Based Bandits (PB2) Algorithm, from the paper Provably Efficient Online Hyperparameter Optimiza

22 Nov 16, 2022

Robotic Process Automation in Windows and Linux by using Driagrams.net BPMN diagrams.

BPMN_RPA Robotic Process Automation in Windows and Linux by using BPMN diagrams. With this Framework you can draw Business Process Model Notation base

23 Dec 14, 2022

A very lightweight monitoring system for Raspberry Pi clusters running Kubernetes.

OMNI A very lightweight monitoring system for Raspberry Pi clusters running Kubernetes. Why? When I finished my Kubernetes cluster using a few Raspber

148 Dec 29, 2022

Lacmus is a cross-platform application that helps to find people who are lost in the forest using computer vision and neural networks.

lacmus The program for searching through photos from the air of lost people in the forest using Retina Net neural nwtwork. The project is being develo

168 Dec 27, 2022

3D AffordanceNet is a 3D point cloud benchmark consisting of 23k shapes from 23 semantic object categories, annotated with 56k affordance annotations and covering 18 visual affordance categories.

3D AffordanceNet This repository is the official experiment implementation of 3D AffordanceNet benchmark. 3D AffordanceNet is a 3D point cloud benchma

49 Dec 01, 2022

Code for Understanding Pooling in Graph Neural Networks

Select, Reduce, Connect This repository contains the code used for the experiments of: "Understanding Pooling in Graph Neural Networks" Setup Install

37 Dec 13, 2022

The world's largest toxicity dataset.

The Toxicity Dataset by Surge AI Saving the internet is fun. Combing through thousands of online comments to build a toxicity dataset isn't. That's wh

134 Dec 19, 2022

Creating a Linear Program Solver by Implementing the Simplex Method in Python with NumPy

Creating a Linear Program Solver by Implementing the Simplex Method in Python with NumPy Simplex Algorithm is a popular algorithm for linear programmi

2 Oct 12, 2022

Implementation of "DeepOrder: Deep Learning for Test Case Prioritization in Continuous Integration Testing".

DeepOrder Implementation of DeepOrder for the paper "DeepOrder: Deep Learning for Test Case Prioritization in Continuous Integration Testing". Project

6 Nov 07, 2022

High-Resolution Image Synthesis with Latent Diffusion Models

Latent Diffusion Models Requirements A suitable conda environment named ldm can be created and activated with: conda env create -f environment.yaml co

5.6k Jan 04, 2023

Monocular 3D Object Detection: An Extrinsic Parameter Free Approach (CVPR2021)

Monocular 3D Object Detection: An Extrinsic Parameter Free Approach (CVPR2021) Yunsong Zhou, Yuan He, Hongzi Zhu, Cheng Wang, Hongyang Li, Qinhong Jia

51 Dec 14, 2022

Labelbox is the fastest way to annotate data to build and ship artificial intelligence applications

Labelbox Labelbox is the fastest way to annotate data to build and ship artificial intelligence applications. Use this github repository to help you s

1.7k Dec 29, 2022

Official code of ICCV2021 paper "Residual Attention: A Simple but Effective Method for Multi-Label Recognition"

CSRA This is the official code of ICCV 2021 paper: Residual Attention: A Simple But Effective Method for Multi-Label Recoginition Demo, Train and Vali

163 Dec 22, 2022

Bayesian Image Reconstruction using Deep Generative Models

Bayesian Image Reconstruction using Deep Generative Models R. Marinescu, D. Moyer, P. Golland For technical inquiries, please create a Github issue. F

51 Nov 23, 2022

基于Paddle框架的fcanet复现

fcanet-Paddle 基于Paddle框架的fcanet复现 fcanet 本项目基于paddlepaddle框架复现fcanet，并参加百度第三届论文复现赛，将在2021年5月15日比赛完后提供AIStudio链接～敬请期待参考项目： frazerlin-fcanet 数据准备本项目已挂

7 Mar 07, 2022

Food recognition model using convolutional neural network & computer vision

Food recognition model using convolutional neural network & computer vision. The goal is to match or beat the DeepFood Research Paper

1 Jan 13, 2022

An attempt at the implementation of Glom, Geoffrey Hinton's new idea that integrates neural fields, predictive coding, top-down-bottom-up, and attention (consensus between columns)

Related tags

Overview

GLOM - Pytorch (wip)

Citations

Comments

help

Levels token

Implementing geometric mean for consensus opinion/levels_mean

Classification

Bug in forward?

Releases(0.0.14)

0.0.14(Mar 27, 2021)

0.0.12(Mar 6, 2021)

0.0.11(Mar 5, 2021)

0.0.10a(Mar 5, 2021)

0.0.9a(Mar 5, 2021)

0.0.8(Mar 5, 2021)

0.0.7(Mar 5, 2021)

0.0.6(Mar 5, 2021)

0.0.5(Mar 5, 2021)

0.0.4(Mar 5, 2021)

0.0.3(Mar 5, 2021)

0.0.2(Mar 5, 2021)