Changing the Mind of Transformers for Topically-Controllable Language Generation

Overview

Changing the Mind of Transformers for Topically-Controllable Language Generation

We will first introduce the how to run the IPython notebook demo by downloading our pretrained models. Then, we will introduce how to run our training and evaluation code.

Image of our model

Requirements and Setup

  • An Unix like OS with at least one GPU
  • To set up the python environment, run pip install -r requirements.txt. I use python 3.7 and pytorch 1.3.1, but I think other python 3 or pytorch > 1.0 versions might also be fine or just require very simple revision of the code. Our codes also use IPython notebook (for running the interactive demo), Spacy (for tokenization), nltk (for running evaluation and pplm), and gensim (for running the LDA baseline).
  • If your python path is not ~/anaconda3/bin/python, change your PY_PATH in the all the scripts in ./bin

Running IPython Notebook Demo

  • Download the pretrained models and dictionary file from here or following the instructions for training code below
  • Use IPython notebook to open ./src/evaluation/test_conditional_LM.ipynb
  • Run the 1st block after putting the models into the corresponding directory or revising the paths of TOPIC_MODEL_DIR, GENERATION_MODEL_DIR, DICT_FILE in the first block.
  • Modify the input context prompt in the 2nd block and run the block to see the generated topics
  • Choose some topics or specify some words and run the 3rd block to see the generated continuations that start with conditional x:. We will also generate the continuation without the condition that start with original x: as a baseline. The topical words that appear in the continuation will be highlighted.
  • You can append a genearted continuation to the 2nd block and repeat the process

Preprocessing Wikipedia for Training and Evaluation

  • First, download only the text from Wikipedia into json format using WikiExtractor
  • Check the path in ./bin/preprocessing_single_proc.sh and run the script. In the preprocessing, we will run Spacy tokenizer and GPT2 tokenizer, heuristically align their resulting tokens, split the corpus into training/validation/testing sets, and store the word indices into tensors.
  • Note that ./bin/preprocessing_single_proc.sh might be slow because it does not parallelize the tokenization processes. If you use job scheduler like slurm in your server, you might want to see the parallized scripts for tokenization in ./bin/old/tokenize_all_wiki_gpt2.sh and ./bin/old/tokenize_all_wiki.sh

Running Training

  • Prepare a word embedding file (e.g., we download the GloVe embedding from here)
  • Train our option generator using ./bin/train_option_generator.sh
  • Train our conditional text generator using ./bin/train_conditional_generator.sh (could train option generator and text generator at the same time)
  • You can start from original GPT2 model or start from our pretrained models. In our paper, we use learning rate = 1e-4. You can also try other values between 1e-4 and 1e-5.

Running Evaluation using Automatic Metrics

  • To evaluate/visualize conditional text generator, update the GENERATION_MODEL_DIR and TOPIC_MODEL_DIR using the model path from the previous step to run ./bin/train_conditional_generator.sh.
  • To evaluate/visualize option generator, update the GENERATION_MODEL_DIR and TOPIC_MODEL_DIR and run ./bin/eval_option_generator.sh. Set VISUALIZATION='Y' to visualize the topics given some randomly selected prompt. Set AUTO_EVAL_TOPICS='Y' to compare the quality of topics from different methods as we did in Table 1 in our EACL paper. Set AUTO_EVAL_GENRATION='Y' to evaluate the topics by the quality of text that is generated given these topics as we did in Table 6 in our paper appendix.
  • Our scores are stored at the end of each OUT_FILE file when AUTO_EVAL*='Y'. Our text generator is called "model condition", and our option generator is called NSD_topic in our code, where NSD stands for neural set decoder.
  • In our code, we also evaluate some globally clustering baselines such as LDA and kmeans. In order to test them, you can train a LDA model by following the steps here. You can also see an example code at ./src/preprocessing/tools/train_LDA_model.py. For kmeans clustering, we use ./src/preprocessing/tools/word_emb_global_clustering.py. If you do not want to test them, just remove LDA_org and global_centers from METHOD_LIST

Running Evaluation using Amazon Mechanical Turk

  • Download STSb dataset from here
  • Preprocessing STS using ./src/evaluation/filter_STS_for_GPT2.py and remove the duplication by sort sts-train_longer.csv | uniq > sts-train_longer_uniq.csv
  • Set OUTPUT_CSV_FOR_MTURK='Y' in ./bin/train_conditional_generator.sh and ./bin/eval_option_generator.sh to generate CSV files for MTurk tasks.
  • Our crowdsourcing templates and responses from workers could be found in ./MTurk_eval

Citation

If you use the code in a publication, please cite our paper.

Haw-Shiuan Chang, Jiaming Yuan, Mohit Iyyer, and Andrew McCallum,
“Changing the Mind of Transformers for Topically-Controllable Language Generation.” 
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2021
Owner
IESL
IESL
Visualize Camera's Pose Using Extrinsic Parameter by Plotting Pyramid Model on 3D Space

extrinsic2pyramid Visualize Camera's Pose Using Extrinsic Parameter by Plotting Pyramid Model on 3D Space Intro A very simple and straightforward modu

JEONG HYEONJIN 106 Dec 28, 2022
Pretrained models for Jax/Haiku; MobileNet, ResNet, VGG, Xception.

Pre-trained image classification models for Jax/Haiku Jax/Haiku Applications are deep learning models that are made available alongside pre-trained we

Alper Baris CELIK 14 Dec 20, 2022
Mercury: easily convert Python notebook to web app and share with others

Mercury Share your Python notebooks with others Easily convert your Python notebooks into interactive web apps by adding parameters in YAML. Simply ad

MLJAR 2.2k Dec 27, 2022
PyTorch code to run synthetic experiments.

Code repository for Invariant Risk Minimization Source code for the paper: @article{InvariantRiskMinimization, title={Invariant Risk Minimization}

Facebook Research 345 Dec 12, 2022
Metric learning algorithms in Python

metric-learn: Metric Learning in Python metric-learn contains efficient Python implementations of several popular supervised and weakly-supervised met

1.3k Jan 02, 2023
Social Distancing Detector

Computer vision has opened up a lot of opportunities to explore into AI domain that were earlier highly limited. Here is an application of haarcascade classifier and OpenCV to develop a social distan

Ashish Pandey 2 Jul 18, 2022
Official Pytorch implementation of 6DRepNet: 6D Rotation representation for unconstrained head pose estimation.

6D Rotation Representation for Unconstrained Head Pose Estimation (Pytorch) Paper Thorsten Hempel and Ahmed A. Abdelrahman and Ayoub Al-Hamadi, "6D Ro

Thorsten Hempel 284 Dec 23, 2022
5 Jan 05, 2023
Lite-HRNet: A Lightweight High-Resolution Network

LiteHRNet Benchmark 🔥 🔥 Based on MMsegmentation 🔥 🔥 Cityscapes FCN resize concat config mIoU last mAcc last eval last mIoU best mAcc best eval bes

16 Dec 12, 2022
Source codes for Improved Few-Shot Visual Classification (CVPR 2020), Enhancing Few-Shot Image Classification with Unlabelled Examples

Source codes for Improved Few-Shot Visual Classification (CVPR 2020), Enhancing Few-Shot Image Classification with Unlabelled Examples (WACV 2022) and Beyond Simple Meta-Learning: Multi-Purpose Model

PLAI Group at UBC 42 Dec 06, 2022
Implementation of the Triangle Multiplicative module, used in Alphafold2 as an efficient way to mix rows or columns of a 2d feature map, as a standalone package for Pytorch

Triangle Multiplicative Module - Pytorch Implementation of the Triangle Multiplicative module, used in Alphafold2 as an efficient way to mix rows or c

Phil Wang 22 Oct 28, 2022
A simple interface for editing natural photos with generative neural networks.

Neural Photo Editor A simple interface for editing natural photos with generative neural networks. This repository contains code for the paper "Neural

Andy Brock 2.1k Dec 29, 2022
HeartRate detector with ArduinoandPython - Use Arduino and Python create a heartrate detector.

Syllabus of Contents Syllabus of Contents Introduction Of Project Features Develop With Python code introduction Installation License Developer Contac

1 Jan 05, 2022
EfficientMPC - Efficient Model Predictive Control Implementation

efficientMPC Efficient Model Predictive Control Implementation The original algo

Vin 8 Dec 04, 2022
3DV 2021: Synergy between 3DMM and 3D Landmarks for Accurate 3D Facial Geometry

SynergyNet 3DV 2021: Synergy between 3DMM and 3D Landmarks for Accurate 3D Facial Geometry Cho-Ying Wu, Qiangeng Xu, Ulrich Neumann, CGIT Lab at Unive

Cho-Ying Wu 239 Jan 06, 2023
A small library for creating and manipulating custom JAX Pytree classes

Treeo A small library for creating and manipulating custom JAX Pytree classes Light-weight: has no dependencies other than jax. Compatible: Treeo Tree

Cristian Garcia 58 Nov 23, 2022
CLIP+FFT text-to-image

Aphantasia This is a text-to-image tool, part of the artwork of the same name. Based on CLIP model, with FFT parameterizer from Lucent library as a ge

vadim epstein 690 Jan 02, 2023
MLOps will help you to understand how to build a Continuous Integration and Continuous Delivery pipeline for an ML/AI project.

page_type languages products description sample python azure azure-machine-learning-service azure-devops Code which demonstrates how to set up and ope

1 Nov 01, 2021
Replication attempt for the Protein Folding Model

RGN2-Replica (WIP) To eventually become an unofficial working Pytorch implementation of RGN2, an state of the art model for MSA-less Protein Folding f

Eric Alcaide 36 Nov 29, 2022
Code for paper entitled "Improving Novelty Detection using the Reconstructions of Nearest Neighbours"

NLN: Nearest-Latent-Neighbours A repository containing the implementation of the paper entitled Improving Novelty Detection using the Reconstructions

Michael (Misha) Mesarcik 4 Dec 14, 2022