[ACL 20] Probing Linguistic Features of Sentence-level Representations in Neural Relation Extraction

Overview

REval

Table of Contents

πŸŽ“   Introduction

REval is a simple framework for probing sentence-level representations of Relation Extraction models.

βœ…   Requirements

REval is tested with:

  • Python 3.7

πŸš€   Installation

With pip

<TBD>

From source

git clone https://github.com/DFKI-NLP/REval
cd REval
pip install -r requirements.txt

πŸ”¬   Probing

Supported Datasets

  • SemEval 2010 Task 8 (CoreNLP annotated version) [LINK]
  • TACRED (obtained via LDC) [LINK]

Probing Tasks

Task SemEval 2010 TACRED
ArgTypeHead βœ”οΈ βœ”οΈ
ArgTypeTail βœ”οΈ βœ”οΈ
Length βœ”οΈ βœ”οΈ
EntityDistance βœ”οΈ βœ”οΈ
ArgumentOrder βœ”οΈ
EntityExistsBetweenHeadTail βœ”οΈ βœ”οΈ
PosTagHeadLeft βœ”οΈ βœ”οΈ
PosTagHeadRight βœ”οΈ βœ”οΈ
PosTagTailLeft βœ”οΈ βœ”οΈ
PosTagTailRight βœ”οΈ βœ”οΈ
TreeDepth βœ”οΈ βœ”οΈ
SDPTreeDepth βœ”οΈ βœ”οΈ
ArgumentHeadGrammaticalRole βœ”οΈ βœ”οΈ
ArgumentTailGrammaticalRole βœ”οΈ βœ”οΈ

πŸ”§   Usage

Step 1: create the probing task datasets from the original datasets.

SemEval 2010 Task 8

python reval.py generate-all-from-semeval \
    --train-file <SEMEVAL DIR>/train.json \
    --validation-file <SEMEVAL DIR>/dev.json \
    --test-file <SEMEVAL DIR>/test.json \
    --output-dir ./data/semeval/

TACRED

python reval.py generate-all-from-tacred \
    --train-file <TACRED DIR>/train.json \
    --validation-file <TACRED DIR>/dev.json \
    --test-file <TACRED DIR>/test.json \
    --output-dir ./data/tacred/

Step 2: Run the probing tasks on a model.

For example, download a Relation Extraction model trained with RelEx, e.g., the CNN trained on SemEval.

mkdir -p models/cnn_semeval
wget --content-disposition https://cloud.dfki.de/owncloud/index.php/s/F3gf9xkeb2foTFe/download -P models/cnn_semeval
python probing_task_evaluation.py \
    --model-dir ./models/cnn_semeval/ \
    --data-dir ./data/semeval/ \
    --dataset semeval2010 \
    --cuda-device 0 \
    --batch-size 64 \
    --cache-representations

After the run is completed, the results are stored to probing_task_results.json in the model-dir.

{
    "ArgTypeHead": {
        "acc": 75.82,
        "devacc": 78.96,
        "ndev": 670,
        "ntest": 2283
    },
    "ArgTypeTail": {
        "acc": 75.4,
        "devacc": 78.79,
        "ndev": 627,
        "ntest": 2130
    },
    [...]
}

πŸ“š   Citation

If you use REval, please consider citing the following paper:

@inproceedings{alt-etal-2020-probing,
    title={Probing Linguistic Features of Sentence-level Representations in Neural Relation Extraction},
    author={Christoph Alt and Aleksandra Gabryszak and Leonhard Hennig},
    year={2020},
    booktitle={Proceedings of ACL},
    url={https://arxiv.org/abs/2004.08134}
}

πŸ“˜   License

REval is released under the terms of the MIT License.

Owner
Speech and Language Technology (SLT) Group of the Berlin lab of the German Research Center for Artificial Intelligence (DFKI)
Minimal diffusion models - Minimal code and simple experiments to play with Denoising Diffusion Probabilistic Models (DDPMs)

Minimal code and simple experiments to play with Denoising Diffusion Probabilist

Rithesh Kumar 16 Oct 06, 2022
A modular domain adaptation library written in PyTorch.

A modular domain adaptation library written in PyTorch.

Kevin Musgrave 225 Dec 29, 2022
Joint detection and tracking model named DEFT, or ``Detection Embeddings for Tracking.

DEFT: Detection Embeddings for Tracking DEFT: Detection Embeddings for Tracking, Mohamed Chaabane, Peter Zhang, J. Ross Beveridge, Stephen O'Hara

Mohamed Chaabane 253 Dec 18, 2022
A simple program for training and testing vit

Vit This is a simple program for training and testing vit. Key requirements: torch, torchvision and timm. Dataset I put 5 categories of the cub classi

xiezhenyu 2 Oct 11, 2022
TensorLight - A high-level framework for TensorFlow

TensorLight is a high-level framework for TensorFlow-based machine intelligence applications. It reduces boilerplate code and enables advanced feature

Benjamin Kan 10 Jul 31, 2022
House-GAN++: Generative Adversarial Layout Refinement Network towards Intelligent Computational Agent for Professional Architects

House-GAN++ Code and instructions for our paper: House-GAN++: Generative Adversarial Layout Refinement Network towards Intelligent Computational Agent

122 Dec 28, 2022
LexGLUE: A Benchmark Dataset for Legal Language Understanding in English

LexGLUE: A Benchmark Dataset for Legal Language Understanding in English βš–οΈ πŸ† πŸ§‘β€πŸŽ“ πŸ‘©β€βš–οΈ Dataset Summary Inspired by the recent widespread use of th

95 Dec 08, 2022
SAT: 2D Semantics Assisted Training for 3D Visual Grounding, ICCV 2021 (Oral)

SAT: 2D Semantics Assisted Training for 3D Visual Grounding SAT: 2D Semantics Assisted Training for 3D Visual Grounding by Zhengyuan Yang, Songyang Zh

Zhengyuan Yang 22 Nov 30, 2022
DI-smartcross - Decision Intelligence Platform for Traffic Crossing Signal Control

DI-smartcross DI-smartcross - Decision Intelligence Platform for Traffic Crossin

OpenDILab 213 Jan 02, 2023
ICLR 2021, Fair Mixup: Fairness via Interpolation

Fair Mixup: Fairness via Interpolation Training classifiers under fairness constraints such as group fairness, regularizes the disparities of predicti

Ching-Yao Chuang 49 Nov 22, 2022
Live Hand Tracking Using Python

Live-Hand-Tracking-Using-Python Project Description: In this project, we will be

Hassan Shahzad 2 Jan 06, 2022
Active window border replacement for window managers.

xborder Active window border replacement for window managers. Usage git clone https://github.com/deter0/xborder cd xborder chmod +x xborders ./xborder

deter 250 Dec 30, 2022
PyTorch Lightning implementation of Automatic Speech Recognition

lasr Lightening Automatic Speech Recognition An MIT License ASR research library, built on PyTorch-Lightning, for developing end-to-end ASR models. In

Soohwan Kim 40 Sep 19, 2022
Adversarial Learning for Modeling Human Motion

Adversarial Learning for Modeling Human Motion This repository contains the open source code which reproduces the results for the paper: Adversarial l

wangqi 6 Jun 15, 2021
PyTorch-Multi-Style-Transfer - Neural Style and MSG-Net

PyTorch-Style-Transfer This repo provides PyTorch Implementation of MSG-Net (ours) and Neural Style (Gatys et al. CVPR 2016), which has been included

Hang Zhang 906 Jan 04, 2023
Functional deep learning

Pipeline abstractions for deep learning. Full documentation here: https://lf1-io.github.io/padl/ PADL: is a pipeline builder for PyTorch. may be used

LF1 101 Nov 09, 2022
Implicit Model Specialization through DAG-based Decentralized Federated Learning

Federated Learning DAG Experiments This repository contains software artifacts to reproduce the experiments presented in the Middleware '21 paper "Imp

Operating Systems and Middleware Group 5 Oct 16, 2022
Deeprl - Standard DQN and dueling network for simple games

DeepRL This code implements the standard deep Q-learning and dueling network with experience replay (memory buffer) for playing simple games. DQN algo

Yao Zhou 6 Apr 12, 2020
A Traffic Sign Recognition Project which can help the driver recognise the signs via text as well as audio. Can be used at Night also.

Traffic-Sign-Recognition In this report, we propose a Convolutional Neural Network(CNN) for traffic sign classification that achieves outstanding perf

Mini Project 64 Nov 19, 2022
Source code for our EMNLP'21 paper γ€ŠRaise a Child in Large Language Model: Towards Effective and Generalizable Fine-tuning》

Child-Tuning Source code for EMNLP 2021 Long paper: Raise a Child in Large Language Model: Towards Effective and Generalizable Fine-tuning. 1. Environ

46 Dec 12, 2022