A two-stage U-Net for high-fidelity denoising of historical recordings

Last update: Jan 05, 2023

Overview

Official repository of the paper (not submitted yet):

E. Moliner and V. Välimäki, "A two-stage U-Net for high-fidelity denoising of historical recordings", in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Singapore, May 2022

Abstract

Enhancing the sound quality of historical music recordings is a long-standing problem. This paper presents a novel denoising method based on a fully-convolutional deep neural network. A two-stage U-Net model architecture is designed to model and suppress the degradations with high fidelity. The method processes the time-frequency representation of audio, and is trained using realistic noisy data to jointly remove hiss, clicks, thumps, and other common additive disturbances from old analog discs. The proposed model outperforms previous methods in both objective and subjective metrics. The results of a formal blind listening test show that the method can denoise real gramophone recordings with an excellent quality. This study shows the importance of realistic training data and the power of deep learning in audio restoration.

Listen to our audio samples

Requirements

You will need at least Python 3.7, plus CUDA 10.1 if you want to use a GPU. See requirements.txt for the required package versions.

To install the environment through Anaconda, run:

conda env update -f environment.yml
conda activate historical_denoiser
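Once the environment is active, a short script can confirm that the stated requirements are met. The sketch below is illustrative and not part of the repository: the helper names `python_is_supported` and `visible_gpus` are hypothetical, and it assumes the TensorFlow 2.x package pinned in requirements.txt is the framework in use.

```python
import sys

MIN_PYTHON = (3, 7)  # minimum Python version stated in the requirements


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the given interpreter version meets the minimum."""
    return tuple(version_info[:2]) >= minimum


def visible_gpus():
    """Return the names of GPUs TensorFlow can see, or [] if TensorFlow is missing."""
    try:
        import tensorflow as tf  # assumption: TensorFlow 2.x is installed
    except ImportError:
        return []
    return [device.name for device in tf.config.list_physical_devices("GPU")]


if __name__ == "__main__":
    print("Python version OK:", python_is_supported())
    print("GPUs visible to TensorFlow:", visible_gpus())
```

An empty GPU list does not necessarily mean the model cannot run; it only shows that TensorFlow would fall back to the CPU.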

Denoising Recordings

Run the following commands to clone the repository and install the pretrained weights of the two-stage U-Net model:

git clone https://github.com/eloimoliner/denoising-historical-recordings.git
cd denoising-historical-recordings
wget https://github.com/eloimoliner/denoising-historical-recordings/releases/download/v0.0/checkpoint.zip
unzip checkpoint.zip -d experiments/trained_model/
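On systems without wget or unzip, the download and extraction steps can be approximated in Python. This is a minimal sketch, not a script shipped with the repository: `download_checkpoint` and `extract_checkpoint` are hypothetical helper names, and the destination folder is assumed to be experiments/trained_model relative to the repository root, as in the command above.

```python
import urllib.request
import zipfile
from pathlib import Path

# Release asset URL taken from the wget command above.
CHECKPOINT_URL = (
    "https://github.com/eloimoliner/denoising-historical-recordings"
    "/releases/download/v0.0/checkpoint.zip"
)


def download_checkpoint(zip_path="checkpoint.zip", url=CHECKPOINT_URL):
    """Download the pretrained weights archive and return its local path."""
    urllib.request.urlretrieve(url, zip_path)
    return Path(zip_path)


def extract_checkpoint(zip_path, target_dir="experiments/trained_model"):
    """Extract the archive into target_dir and return the archive member names."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)  # assumed destination folder
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
        return archive.namelist()


# Typical use, run from the repository root:
#   extract_checkpoint(download_checkpoint())
```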

If the environment is installed correctly, you can denoise an audio file by running:

bash inference.sh "file name"

This generates ".wav" files in the same directory as the input file: the denoised version, the residual noise, and the original signal converted to mono.
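The relationship between these outputs can be sketched as follows. This is an assumption about what the files represent, not the repository's code: it supposes that the mono signal is the average of the input channels and that the residual noise is the mono input minus the denoised output. The function names `to_mono` and `residual_noise` are hypothetical.

```python
import numpy as np


def to_mono(audio):
    """Average the channels of an array shaped (samples,) or (samples, channels)."""
    audio = np.asarray(audio, dtype=np.float64)
    return audio if audio.ndim == 1 else audio.mean(axis=1)


def residual_noise(noisy, denoised):
    """Assumed definition: residual = mono input - denoised output."""
    mono = to_mono(noisy)
    denoised = np.asarray(denoised, dtype=np.float64)
    n = min(len(mono), len(denoised))  # guard against off-by-a-few-samples lengths
    return mono[:n] - denoised[:n]
```

Under this assumption, adding the residual back to the denoised file reconstructs the mono input, which makes it easy to listen to exactly what was removed.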

Training

TODO

Comments

Will it work in Windows without CUDA?

Hello, The readme says: "You will need at least python 3.7 and CUDA 10.1 if you want to use GPU."

Unfortunately, my first attempt to run it on Windows without a CUDA-capable graphics card failed. Is there really no separate environment file for CPU-only use? Is it possible to make it work without massive changes to the code?
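One thing that may be worth trying is hiding all CUDA devices from TensorFlow so that it runs on the CPU. The sketch below uses the standard CUDA_VISIBLE_DEVICES environment variable; whether the repository's code then runs unchanged on a CPU-only Windows machine is not confirmed here.

```python
import os

# Hide every CUDA device; this must happen before TensorFlow is imported.
os.environ["CUDA_VISIBLE_DEVICES"] = "-1"

try:
    import tensorflow as tf  # assumption: TensorFlow 2.x is installed

    # With the devices hidden, this list is expected to be empty.
    print("GPUs visible to TensorFlow:", tf.config.list_physical_devices("GPU"))
except ImportError:
    print("TensorFlow is not installed in this environment.")
```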

opened by vitacon 15

installation without conda

Hi,

could you leave some hints about how to install this without conda? Your readme appears to be very specific to this one case. It also seems that you develop under Linux, so you use bash to execute. A hint for Windows users would be helpful here too.

I am trying to get this to run under Windows and so far have had no success. I will update if I get further. All the best!

opened by GitHubGeniusOverlord 9

strange tensorflow version in requirements.txt

Hi,

when running python -m pip install tensorflow==2.3.0 as indicated in your requirements file, I get

ERROR: Could not find a version that satisfies the requirement tensorflow==2.3.0 (from versions: 2.5.0rc0, 2.5.0rc1, 2.5.0rc2, 2.5.0rc3, 2.5.0, 2.5.1, 2.5.2, 2.6.0rc0, 2.6.0rc1, 2.6.0rc2, 2.6.0, 2.6.1, 2.6.2, 2.7.0rc0, 2.7.0rc1, 2.7.0, 2.8.0rc0)
ERROR: No matching distribution found for tensorflow==2.3.0

It seems this version isn't even supported by pip anymore. Upgrade to 2.5.0?

The same is true for scipy==1.4.1. Not sure about which version to take there.
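A possible explanation, offered as an assumption rather than a diagnosis: pip only lists releases that publish wheels for the running interpreter, and the list starting at 2.5.0rc0 is consistent with Python 3.9. The sketch below encodes the Python ranges for which TensorFlow 2.3.0 and 2.5.0 publish wheels on PyPI; the table and the helper name `can_install` are illustrative and should be checked against the TensorFlow release notes.

```python
import sys

# Assumed (min, max) Python versions with PyPI wheels for each TensorFlow release.
TF_PYTHON_RANGE = {
    "2.3.0": ((3, 5), (3, 8)),
    "2.5.0": ((3, 6), (3, 9)),
}


def can_install(tf_version, python_version=sys.version_info[:2]):
    """Return True if the interpreter version falls in the assumed wheel range."""
    low, high = TF_PYTHON_RANGE[tf_version]
    return low <= tuple(python_version) <= high


if __name__ == "__main__":
    for version in TF_PYTHON_RANGE:
        print(f"tensorflow=={version} installable here:", can_install(version))
```

If this is the cause, creating the environment with Python 3.7 or 3.8 may allow the pinned versions to install without editing requirements.txt.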

opened by GitHubGeniusOverlord 3

Update inference.sh

Small change to allow spaces in file names. Bash expands the variable $1 correctly even when it is in double quotes, so Python receives a single argument rather than several when the file name contains spaces.
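The effect of quoting on word splitting can be illustrated with Python's `shlex` module, which follows POSIX shell splitting rules. It does not perform variable expansion, so this only demonstrates how a quoted file name stays in one piece; the file name "old record.wav" is a made-up example.

```python
import shlex

# With double quotes the file name stays a single argument.
quoted = shlex.split('bash inference.sh "old record.wav"')

# Without quotes the space splits the file name into two arguments.
unquoted = shlex.split("bash inference.sh old record.wav")

print(quoted)    # ['bash', 'inference.sh', 'old record.wav']
print(unquoted)  # ['bash', 'inference.sh', 'old', 'record.wav']
```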

opened by JorenSix 1

How to start training for denoising?

I would like to train the model for a denoising task where I have clean signals (in the "clean" folder) and noisy signals (in the "noise" folder). How do I start training?
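Since the abstract describes the disturbances as additive, one common way to build training pairs from such folders is to add a noise segment to a clean segment at a chosen signal-to-noise ratio. The sketch below shows that generic technique only; it is not the authors' training pipeline (the Training section is still marked TODO), it assumes the "noise" folder holds noise-only recordings, and `mix_at_snr` is a hypothetical helper.

```python
import numpy as np


def mix_at_snr(clean, noise, snr_db):
    """Return clean + scaled noise so that the mixture has the requested SNR in dB."""
    clean = np.asarray(clean, dtype=np.float64)
    noise = np.asarray(noise, dtype=np.float64)
    if len(noise) < len(clean):
        raise ValueError("noise segment must be at least as long as the clean segment")
    noise = noise[: len(clean)]

    clean_power = np.mean(clean ** 2)
    noise_power = np.mean(noise ** 2)
    if noise_power == 0:
        raise ValueError("noise segment is silent")

    # Gain chosen so that clean_power / (gain**2 * noise_power) == 10**(snr_db / 10).
    gain = np.sqrt(clean_power / (noise_power * 10 ** (snr_db / 10)))
    return clean + gain * noise
```

Each (mixture, clean) pair produced this way could serve as an input and target for supervised training.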

opened by listener17 1

Releases (v0.0)

v0.0 (Aug 31, 2021)

Uploading pretrained model
Source code (tar.gz)
Source code (zip)
checkpoint.zip (251.80 MB)

Owner

Eloi Moliner Juanpere

Doctoral candidate in audio signal processing at Aalto University.
