PyTorch implementation of "A Two-Stage End-to-End System for Speech-in-Noise Hearing Aid Processing"

Last update: Aug 19, 2022

Overview

Implementation of the Sheffield entry for the first Clarity enhancement challenge (CEC1)

This repository contains the PyTorch implementation of "A Two-Stage End-to-End System for Speech-in-Noise Hearing Aid Processing", the Sheffield entry for the first Clarity enhancement challenge (CEC1). The system consists of a Conv-TasNet-based denoising module and a finite-impulse-response (FIR) filter-based amplification module. A differentiable approximation to the Cambridge MSBG model released in CEC1 is used in the loss function.
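As a rough illustration of the amplification idea, a learnable FIR filter can be written as a 1-D convolution whose kernel holds the filter taps. The sketch below is not the code from this repository: the class name `LearnableFIR`, the number of taps, and the unit-impulse initialisation are assumptions chosen for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class LearnableFIR(nn.Module):
    """Hypothetical learnable FIR filter (not the repository's module)."""

    def __init__(self, num_taps: int = 65):
        super().__init__()
        # Start from a unit impulse so the filter initially passes audio unchanged.
        taps = torch.zeros(num_taps)
        taps[0] = 1.0
        self.taps = nn.Parameter(taps)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x has shape (batch, samples).
        num_taps = self.taps.numel()
        # Left-pad so the filter is causal and the output keeps the input length.
        x = F.pad(x.unsqueeze(1), (num_taps - 1, 0))
        # conv1d computes cross-correlation, so flip the taps to get convolution.
        kernel = self.taps.flip(0).view(1, 1, -1)
        return F.conv1d(x, kernel).squeeze(1)


fir = LearnableFIR(num_taps=65)
signal = torch.randn(2, 16000)
out = fir(signal)
```

Because the taps are an `nn.Parameter`, any differentiable loss computed on `out` can be backpropagated to them, which is what makes a filter of this kind trainable end to end.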

Requirements

To run the training recipe of the amplification module, the MSBG package and PyTorch STOI are needed.

Training

The system is built in two stages. In the first stage, the Conv-TasNet-based denoising module is trained; its scripts are in recipe_den_convtasnet. In the second stage, the FIR-based amplification module is trained; its scripts are in recipe_amp_fir. The MBSTOI folder contains the MBSTOI implementation from the CEC1 project, along with a DBSTOI implementation.
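The two-stage pattern can be sketched as follows. This is a toy illustration, not the training scripts in the recipe folders: the single-layer stand-in modules, the random data, and the mean-squared-error `placeholder_loss` are all hypothetical, and freezing the denoiser during the second stage is an assumption made here for clarity.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Tiny stand-ins for the two modules; the real ones live in the recipe folders.
denoiser = nn.Conv1d(1, 1, kernel_size=9, padding=4)
amplifier = nn.Conv1d(1, 1, kernel_size=9, padding=4)

# Random tensors standing in for noisy inputs and clean targets.
noisy = torch.randn(4, 1, 800)
clean = torch.randn(4, 1, 800)


def placeholder_loss(estimate, target):
    # Stand-in only; the repository's loss uses a differentiable MSBG approximation.
    return torch.mean((estimate - target) ** 2)


# Stage 1: update only the denoiser.
opt1 = torch.optim.Adam(denoiser.parameters(), lr=1e-3)
for _ in range(3):
    opt1.zero_grad()
    placeholder_loss(denoiser(noisy), clean).backward()
    opt1.step()

# Stage 2 (assumption): freeze the denoiser and update only the amplifier.
for p in denoiser.parameters():
    p.requires_grad_(False)
frozen = [p.clone() for p in denoiser.parameters()]

opt2 = torch.optim.Adam(amplifier.parameters(), lr=1e-3)
for _ in range(3):
    opt2.zero_grad()
    placeholder_loss(amplifier(denoiser(noisy)), clean).backward()
    opt2.step()
```

The optimiser for the second stage is given only the amplifier's parameters, so the denoiser weights stay at their stage-one values.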

References

[1] Luo Y, Mesgarani N. Conv-tasnet: Surpassing ideal time–frequency magnitude masking for speech separation[J]. IEEE/ACM transactions on audio, speech, and language processing, 2019, 27(8): 1256-1266.
[2] Andersen A H, de Haan J M, Tan Z H, et al. Refinement and validation of the binaural short time objective intelligibility measure for spatially diverse conditions[J]. Speech Communication, 2018, 102: 1-13.
[3] Taal C H, Hendriks R C, Heusdens R, Jensen J. A short-time objective intelligibility measure for time-frequency weighted noisy speech. ICASSP 2010, Dallas, Texas.

Citation

If you use this work, please cite:

@inproceedings{tutwo,
  title={A Two-Stage End-to-End System for Speech-in-Noise Hearing Aid Processing},
  author={Tu, Zehai and Zhang, Jisi and Ma, Ning and Barker, Jon},
  year={2021},
  booktitle={The Clarity Workshop on Machine Learning Challenges for Hearing Aids (Clarity-2021)},
}
