NeuralWOZ: Learning to Collect Task-Oriented Dialogue via Model-based Simulation (ACL-IJCNLP 2021)

Overview

NeuralWOZ

This code is official implementation of "NeuralWOZ: Learning to Collect Task-Oriented Dialogue via Model-based Simulation".

Sungdong Kim, Minsuk Chang, Sang-woo Lee
In ACL 2021.

Citation

@inproceedings{kim2021neuralwoz,
  title={NeuralWOZ: Learning to Collect Task-Oriented Dialogue via Model-based Simulation},
  author={Kim, Sungdong and Chang, Minsuk and Lee, Sang-woo},
  booktitle={ACL},
  year={2021}
}

Requirements

python3.6
torch==1.4.0
transformers==2.11.0

Please install apex for the mixed precision training.
See details in requirements.txt

Data Download and Preprocessing

1. Download dataset

Please run this script at first. It will create data repository, and save and preprocess MultiWOZ 2.1 dataset.

python3 create_data.py

2. Preprocessing

To train NeuralWOZ under various settings, you should create each training instances with running below script.

python3 neuralwoz/preprocess.py --exceptd $TARGET_DOMAIN --fewshot_ratio $FEWSHOT_RATIO
  • exceptd: Specify "target domain" to exclude from training dataset for leave-one-out scheme. It is one of the (hotel|restaurant|attraction|train|taxi).
  • fewshot_ratio: Choose proportion of examples in the target domain to include. Default is 0. which means zero-shot. It is one of the (0.|0.01|0.05|0.1). You can check the fewshot examples in the assets/fewshot_key.json.

This script will create "$TARGET_DOMAIN_$FEWSHOT_RATIO_collector_(train|dev).json" and "$TARGET_DOMAIN_$FEWSHOT_RATIO_labeler_train.h5".

Training NeuralWOZ

You should specify output_path to save the trained model.
Each output consists of the below four files after the training.

  • pytorch_model.bin
  • config.json
  • vocab.json
  • merges.txt

For each zero/few-shot settings, you should set the TRAIN_DATA and DEV_DATA from the preprocessing. For example, hotel_0.0_collector_(train|dev).json should be used for the Collector training when the target domain is hotel in the zero-shot domain transfer task.

We use N_GPU=4 and N_ACCUM=2 for Collector training and N_GPU=2 and N_ACCUM=2 for Labeler training to fit 32 for batch size based on V100 32GB GPU.

1. Collector

python3 neuralwoz/train_collector.py \
  --dataset_dir data \
  --output_path $OUTPUT_PATH \
  --model_name_or_path facebook/bart-large \
  --train_data $TRAIN_DATA \
  --dev_data $DEV_DATA \
  --n_gpu $N_GPU \
  --per_gpu_train_batch_size 4 \
  --num_train_epochs 30 \
  --learning_rate 1e-5 \
  --gradient_accumulation_steps $N_ACCUM \
  --warmup_steps 1000 \
  --fp16

2. Labeler

python3 neuralwoz/train_labeler.py \
  --dataset_dir data \
  --output_path $OUTPUT_PATH \
  --model_name_or_path roberta-base-dream \
  --train_data $TRAIN_DATA \
  --dev_data labeler_dev_data.json \
  --n_gpu $N_GPU \
  --per_gpu_train_batch_size 8 \
  --num_train_epochs 10 \
  --learning_rate 1e-5 \
  --gradient_accumulation_steps $N_ACCUM \
  --warmup_steps 1000 \
  --beta 5. \
  --fp16

Download Synthetic Dialogues from NeuralWOZ

Please download synthetic dialogues from here

  • The naming convention is nwoz_{target_domain}_{fewshot_proportion}.json
  • Each dataset contains synthesized dialogues from our NeuralWOZ
  • Specifically, It contains synthetic dialogues for the target_domain while excluding original dialogues for the target domain (leave-one-out setup)
  • You can check the i-th synthesized dialogue in each files with aug_{target_domain}_{fewshot_proprotion}_{i} for dialogue_idx key.
  • You can use the json file to directly train zero/few-shot learner for DST task
  • Please see readme for training TRADE and readme for training SUMBT using the dataset
  • If you want to synthesize your own dialogues, please see below sections.

Download Pretrained Models

Pretrained models are available in this link. The naming convention is like below

  • NEURALWOZ: (Collector|Labeler)_{target_domain}_{fewshot_proportion}.tar.gz
  • TRADE: nwoz_TRADE_{target_domain}_{fewshot_proportion}.tar.gz
  • SUMBT: nwoz_SUMBT_{target_domain}_{fewshot_proportion}.tar.gz

To synthesize your own dialogues, please download and unzip both of Collector and Labeler in same target domain and fewshot_proportion at $COLLECTOR_PATH and $LABELER_PATH, repectively.

Please use tar -zxvf MODEL.tar.gz for the unzipping.

Generate Synthetic Dialogues using NeuralWOZ

python3 neuralwoz/run_neuralwoz.py \
  --dataset_dir data \
  --output_dir data \
  --output_file_name neuralwoz-output.json \
  --target_data collector_dev_data.json \
  --include_domain $TARGET_DOMAIN \
  --collector_path $COLLECTOR_PATH \
  --labeler_path $LABELER_PATH \
  --num_dialogues $NUM_DIALOGUES \
  --batch_size 16 \
  --num_beams 1 \
  --top_k 0 \
  --top_p 0.98 \
  --temperature 0.9 \
  --include_missing_dontcare

License

Copyright 2021-present NAVER Corp.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
Owner
NAVER AI
Official account of NAVER AI, Korea No.1 Industrial AI Research Group
NAVER AI
Fast, flexible and fun neural networks.

Brainstorm Discontinuation Notice Brainstorm is no longer being maintained, so we recommend using one of the many other,available frameworks, such as

IDSIA 1.3k Nov 21, 2022
Implementation of Perceiver, General Perception with Iterative Attention, in Pytorch

Perceiver - Pytorch Implementation of Perceiver, General Perception with Iterative Attention, in Pytorch Install $ pip install perceiver-pytorch Usage

Phil Wang 876 Dec 29, 2022
FPGA: Fast Patch-Free Global Learning Framework for Fully End-to-End Hyperspectral Image Classification

FPGA & FreeNet Fast Patch-Free Global Learning Framework for Fully End-to-End Hyperspectral Image Classification by Zhuo Zheng, Yanfei Zhong, Ailong M

Zhuo Zheng 92 Jan 03, 2023
[CVPR 2022 Oral] TubeDETR: Spatio-Temporal Video Grounding with Transformers

TubeDETR: Spatio-Temporal Video Grounding with Transformers Website • STVG Demo • Paper This repository provides the code for our paper. This includes

Antoine Yang 108 Dec 27, 2022
PyTorch source code for Distilling Knowledge by Mimicking Features

LSHFM.detection This is the PyTorch source code for Distilling Knowledge by Mimicking Features. And this project contains code for object detection wi

Guo-Hua Wang 4 Dec 17, 2022
Motion Planner Augmented Reinforcement Learning for Robot Manipulation in Obstructed Environments (CoRL 2020)

Motion Planner Augmented Reinforcement Learning for Robot Manipulation in Obstructed Environments [Project website] [Paper] This project is a PyTorch

Cognitive Learning for Vision and Robotics (CLVR) lab @ USC 49 Nov 28, 2022
Lama-cleaner: Image inpainting tool powered by LaMa

Lama-cleaner: Image inpainting tool powered by LaMa

Qing 5.8k Jan 05, 2023
This Deep Learning Model Predicts that from which disease you are suffering.

Deep-Learning-Project This Deep Learning Model Predicts that from which disease you are suffering. This Project Covers the Topics of Deep Learning Int

Jai Viral Doshi 0 Jan 20, 2022
An All-MLP solution for Vision, from Google AI

MLP Mixer - Pytorch An All-MLP solution for Vision, from Google AI, in Pytorch. No convolutions nor attention needed! Yannic Kilcher video Install $ p

Phil Wang 784 Jan 06, 2023
Self-Supervised CNN-GCN Autoencoder

GCNDepth Self-Supervised CNN-GCN Autoencoder GCNDepth: Self-supervised monocular depth estimation based on graph convolutional network To be published

53 Dec 14, 2022
PiRank: Learning to Rank via Differentiable Sorting

PiRank: Learning to Rank via Differentiable Sorting This repository provides a reference implementation for learning PiRank-based models as described

54 Dec 17, 2022
PyTorch implementation of PSPNet

PSPNet with PyTorch Unofficial implementation of "Pyramid Scene Parsing Network" (https://arxiv.org/abs/1612.01105). This repository is just for caffe

Kazuto Nakashima 52 Nov 16, 2022
Diverse Image Generation via Self-Conditioned GANs

Diverse Image Generation via Self-Conditioned GANs Project | Paper Diverse Image Generation via Self-Conditioned GANs Steven Liu, Tongzhou Wang, David

Steven Liu 147 Dec 03, 2022
Official PyTorch implementation of "Rapid Neural Architecture Search by Learning to Generate Graphs from Datasets" (ICLR 2021)

Rapid Neural Architecture Search by Learning to Generate Graphs from Datasets This is the official PyTorch implementation for the paper Rapid Neural A

48 Dec 26, 2022
Code and model benchmarks for "SEVIR : A Storm Event Imagery Dataset for Deep Learning Applications in Radar and Satellite Meteorology"

NeurIPS 2020 SEVIR Code for paper: SEVIR : A Storm Event Imagery Dataset for Deep Learning Applications in Radar and Satellite Meteorology Requirement

USAF - MIT Artificial Intelligence Accelerator 46 Dec 15, 2022
A Partition Filter Network for Joint Entity and Relation Extraction EMNLP 2021

EMNLP 2021 - A Partition Filter Network for Joint Entity and Relation Extraction

zhy 127 Jan 04, 2023
ATOMIC 2020: On Symbolic and Neural Commonsense Knowledge Graphs

(Comet-) ATOMIC 2020: On Symbolic and Neural Commonsense Knowledge Graphs Paper Jena D. Hwang, Chandra Bhagavatula, Ronan Le Bras, Jeff Da, Keisuke Sa

AI2 152 Dec 27, 2022
Torch code for our CVPR 2018 paper "Residual Dense Network for Image Super-Resolution" (Spotlight)

Residual Dense Network for Image Super-Resolution This repository is for RDN introduced in the following paper Yulun Zhang, Yapeng Tian, Yu Kong, Bine

Yulun Zhang 494 Dec 30, 2022
End-to-End Dense Video Captioning with Parallel Decoding (ICCV 2021)

PDVC Official implementation for End-to-End Dense Video Captioning with Parallel Decoding (ICCV 2021) [paper] [valse论文速递(Chinese)] This repo supports:

Teng Wang 118 Dec 16, 2022
Distributed DataLoader For Pytorch Based On Ray

Dpex——用户无感知分布式数据预处理组件 一、前言 随着GPU与CPU的算力差距越来越大以及模型训练时的预处理Pipeline变得越来越复杂,CPU部分的数据预处理已经逐渐成为了模型训练的瓶颈所在,这导致单机的GPU配置的提升并不能带来期望的线性加速。预处理性能瓶颈的本质在于每个GPU能够使用的C

Dalong 23 Nov 02, 2022