Use VITS and Opencpop to develop singing voice synthesis; maybe it will become VISinger.

Last update: Dec 23, 2022

Related tags

Deep Learning VITS_Singing

Overview

Init

This project is based on

https://github.com/jaywalnut310/vits
https://github.com/SJTMusicTeam/Muskits/
https://wenet.org.cn/opencpop/ (singing voice data)

Preprocess the data with Muskits to obtain the initial data

cd egs/opencpop/svs1/
./local/data.sh

VISinger_data
--lable
--midi_dump
--wav_dump

Sample rate conversion

python wave_16k.py
--wav_dump
--wav_dump_16k
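
The contents of wave_16k.py are not shown here, so the following is only a rough sketch of what a conversion to 16 kHz can look like. It assumes 16-bit mono PCM input and uses plain linear interpolation with no anti-aliasing filter; the function names `resample_linear` and `convert_wav` are hypothetical. A dedicated resampling library would normally give better audio quality.

```python
import array
import sys
import wave


def resample_linear(samples, src_rate, dst_rate):
    """Resample a mono sequence of numbers by linear interpolation.

    Naive on purpose: no low-pass filter is applied before downsampling.
    """
    if not samples:
        return []
    last = len(samples) - 1
    n_out = max(1, int(len(samples) * dst_rate / src_rate))
    out = []
    for i in range(n_out):
        pos = i * src_rate / dst_rate       # fractional index in the source
        left = min(int(pos), last)
        right = min(left + 1, last)
        frac = pos - left
        out.append(samples[left] * (1.0 - frac) + samples[right] * frac)
    return out


def convert_wav(src_path, dst_path, dst_rate=16000):
    """Read 16-bit mono PCM from src_path and write it at dst_rate."""
    with wave.open(src_path, "rb") as src:
        if src.getnchannels() != 1 or src.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit mono PCM")
        src_rate = src.getframerate()
        pcm = array.array("h")
        pcm.frombytes(src.readframes(src.getnframes()))
    if sys.byteorder == "big":              # WAV data is little-endian
        pcm.byteswap()
    resampled = resample_linear(pcm, src_rate, dst_rate)
    clipped = array.array(
        "h", (max(-32768, min(32767, round(v))) for v in resampled)
    )
    if sys.byteorder == "big":
        clipped.byteswap()
    with wave.open(dst_path, "wb") as dst:
        dst.setnchannels(1)
        dst.setsampwidth(2)
        dst.setframerate(dst_rate)
        dst.writeframes(clipped.tobytes())
```

Calling `convert_wav` once per file in wav_dump, writing to the same file name under wav_dump_16k, would reproduce the directory layout listed above.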

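Use Muskits to convert the data into the VITS format

1. Split the lable file
python muskit/data_label_single.py

label_dump, midi_dump, wav_dump: one annotation per file

Note: label and lable are used interchangeably here (both spellings are accepted)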

VISinger_data
--label_dump
--midi_dump
--wav_dump
--wav_dump_16k

2. Convert label and midi into frame-aligned phonetic units and notes (pitch)
python muskit/data_format_vits.py
VISinger_data
--label_vits
--label_dump
--midi_dump
--wav_dump
--wav_dump_16k
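
As an illustration of what "frame-aligned" means in step 2, here is a minimal sketch that expands per-unit durations into one entry per frame. It is not the code in data_format_vits.py: the function names, the MIDI-number representation of notes, and the default `hop_length` are assumptions made for the example.

```python
def seconds_to_frames(durations, sample_rate=16000, hop_length=256):
    """Convert per-unit durations (seconds) into per-unit frame counts.

    Boundaries are rounded on the cumulative time so that rounding
    errors do not accumulate over a long phrase.
    """
    counts = []
    elapsed = 0.0
    prev_frame = 0
    for dur in durations:
        elapsed += dur
        frame = round(elapsed * sample_rate / hop_length)
        counts.append(frame - prev_frame)
        prev_frame = frame
    return counts


def expand_to_frames(units, notes, durations,
                     sample_rate=16000, hop_length=256):
    """Repeat each phonetic unit and its note once per frame it covers."""
    counts = seconds_to_frames(durations, sample_rate, hop_length)
    frame_units, frame_notes = [], []
    for unit, note, n in zip(units, notes, counts):
        frame_units.extend([unit] * n)
        frame_notes.extend([note] * n)
    return frame_units, frame_notes
```

For example, with a hop of 160 samples at 16 kHz (100 frames per second), `expand_to_frames(["n", "i"], [60, 62], [0.02, 0.03], 16000, 160)` yields two frames of "n" on note 60 followed by three frames of "i" on note 62.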

3. Generate the files VITS needs and split them into train and dev; a test set is not needed (it can be designed by hand)
python muskit/data_format_vits.py

Content format of vits_file.txt: wave path|label path|pitch path
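
A small sketch of writing and reading one line in this format, assuming the three fields are separated by `|` and that the trailing semicolon in the original note is punctuation rather than part of the format. The helper names and the example paths are hypothetical.

```python
def make_filelist_line(wave_path, label_path, pitch_path):
    """Join the three paths into one 'wave|label|pitch' line."""
    fields = [str(wave_path), str(label_path), str(pitch_path)]
    for field in fields:
        if "|" in field:
            raise ValueError("paths must not contain the '|' separator")
    return "|".join(fields)


def parse_filelist_line(line):
    """Split one filelist line back into (wave, label, pitch) paths."""
    parts = line.strip().split("|")
    if len(parts) != 3:
        raise ValueError("expected 3 fields, got %d" % len(parts))
    return tuple(parts)


# Hypothetical example paths, following the directory names above.
example = make_filelist_line(
    "wav_dump_16k/0001.wav",
    "label_vits/0001_label.npy",
    "label_vits/0001_pitch.npy",
)
```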

cp vits_file.txt VISinger/filelists/
cd VISinger/

python preprocess.py  # split into train and dev
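
What preprocess.py does internally is not shown here. As a rough idea of a train/dev split over filelist lines, the sketch below shuffles with a fixed seed and holds out a few lines for dev; the function name, `dev_count`, and `seed` are hypothetical choices for the example.

```python
import random


def split_filelist(lines, dev_count=2, seed=1234):
    """Shuffle filelist lines reproducibly and hold out dev_count for dev."""
    items = [ln.strip() for ln in lines if ln.strip()]
    if dev_count < 0 or dev_count > len(items):
        raise ValueError("dev_count must be between 0 and the number of lines")
    rng = random.Random(seed)   # local RNG, leaves global random state alone
    rng.shuffle(items)
    dev = items[:dev_count]
    train = items[dev_count:]
    return train, dev
```

Using a fixed seed keeps the split stable between runs, so dev results stay comparable while the model is being modified.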

VITS training

cd VISinger
CUDA_VISIBLE_DEVICES=0 python train.py -c configs/singing_base.json -m singing_base 2>exit_error.log;cat exit_error.log
python vsinging_infer.py

Using 16 kHz audio saves memory and makes the model easier to modify

Edit the MIDI, then test

cd ../;python muskit/infer_midi.py;cd -;python vsinging_edit.py

Sample audio

vits_singing_样例.wav

Comments

couple of questions

Hello, how are you! Very cool stuff you have here; I can clearly see from your forks and repos that you love singing voice synthesis (SVS)! I wanted to ask: is this a fully working VISinger, or is it your attempt at making VITS sing? For example, can it be tested on custom English data and give results the same as or near the demo in the paper? Also, do you have other samples I can hear? I know that you tested it on Opencpop, which has almost 5.2 hours of singing data, and in the paper they trained VISinger for 600k iterations, right? How many iterations did you reach on Opencpop to get the result linked below (vits_singing_样例.wav)? To be honest, I thought VITS was data hungry like Tacotron 2 or FastSpeech (i.e. it needs a lot of data to get great results), so your Opencpop result is very impressive for 5.2 hours of data. I also wonder whether you lowered the sample rate of Opencpop from 44.1 kHz to 22 kHz, as I heard that training at 44.1 kHz takes about 10 times as long.

Looking forward to hearing from you :)

opened by dutchsing009 5
Question

python prepare/data_vits.py outputs
1, ../VISinger_data/label_vits/XXX._label.npy|XXX_score.npy|XXX_pitch.npy|XXX_slurs.npy
2, filelists/vits_file.txt, content format: wave path|label path|score path|pitch path|slurs path;

How are steps 1 and 2 carried out?

opened by baipeng0110 3
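Training results

The model currently lacks a duration prediction model and a pitch prediction model. Below is the result of modifying the lyrics of a sentence from the training corpus.

Original lyrics: 雨淋湿了天空灰得更讲究

https://user-images.githubusercontent.com/16432329/164953151-4c2513cb-f336-416b-8f04-604f13e63368.MP4

Modified lyrics: 你闹够了没有让我更难受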

https://user-images.githubusercontent.com/16432329/164953155-16c72670-cc89-40bc-99fe-42781c9dcdc0.MP4
help wanted

opened by MaxMax2016 0
About release models and VISinger

Hi

This is one of the most fantastic projects I have ever seen.

Could you please share the released model? The inference step says "using the released model".

Also, is there any plan to implement the VISinger model?

Thank you!

opened by shiyanpei0826 1

Releases(0.0.1)

0.0.1(Sep 8, 2022)

Source code(tar.gz)
Source code(zip)
visinger.pth(85.76 MB)

Owner

AmorTX

Speech
