Azure Text-to-speech service for Home Assistant

Last update: Aug 06, 2022

Overview

The Azure text-to-speech platform uses the online Azure Text-to-Speech cognitive service to read text aloud in a natural-sounding voice.

The main reason behind this custom integration is to decouple the Microsoft TTS service from the Python library pycsspeechtts used by the "official" integration.

This integration uses the native Azure Cognitive Speech Service Text-to-speech REST API directly (I know... it is too long for a service name).
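To make "calling the REST API directly" concrete, here is a minimal Python sketch of how such a request could be assembled with only the standard library. It is not this integration's code. The endpoint pattern and header names follow the public Azure Speech REST API; the function name `build_tts_request`, the default region, the output format, and the User-Agent string are assumptions chosen for illustration.

```python
import urllib.request


def build_tts_request(
    ssml: str,
    api_key: str,
    region: str = "westeurope",
    output_format: str = "audio-16khz-128kbitrate-mono-mp3",
) -> urllib.request.Request:
    """Build (but do not send) a POST request for the Azure TTS REST endpoint.

    Hypothetical helper: the name and defaults are illustrative only.
    """
    url = f"https://{region}.tts.speech.microsoft.com/cognitiveservices/v1"
    headers = {
        # Subscription key of the Azure Speech resource.
        "Ocp-Apim-Subscription-Key": api_key,
        # The request body is an SSML document.
        "Content-Type": "application/ssml+xml",
        # Audio encoding the service should return.
        "X-Microsoft-OutputFormat": output_format,
        # Arbitrary application name (assumed value).
        "User-Agent": "azure-tts-sketch",
    }
    # The body must be bytes, so the SSML string is encoded as UTF-8.
    return urllib.request.Request(
        url, data=ssml.encode("utf-8"), headers=headers, method="POST"
    )


# Sending the request needs a valid key and network access:
# with urllib.request.urlopen(build_tts_request(ssml, api_key)) as response:
#     audio_bytes = response.read()
```

Building the request separately from sending it keeps the sketch testable without network access or a real key.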

Features

Supports multiple languages. You can find the full list of languages here.
Supports SSML.
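As an illustration of what SSML allows, the following Python sketch builds a small SSML document that adjusts speaking rate and pitch through the standard `prosody` element. It is an assumption-laden example, not part of the integration: the helper name `build_ssml` and the default rate and pitch values are hypothetical, and how this integration expects SSML to be passed in is not described here.

```python
from xml.sax.saxutils import escape, quoteattr


def build_ssml(
    text: str,
    language: str = "hu-HU",
    voice: str = "hu-HU-NoemiNeural",
    rate: str = "+20%",
    pitch: str = "-10%",
) -> str:
    """Return an SSML document that speaks `text` with adjusted rate and pitch.

    Hypothetical helper: name and default values are illustrative only.
    """
    return (
        '<speak version="1.0" '
        'xmlns="http://www.w3.org/2001/10/synthesis" '
        f"xml:lang={quoteattr(language)}>"
        f"<voice name={quoteattr(voice)}>"
        # Relative percentages: faster than default, slightly lower pitch.
        f"<prosody rate={quoteattr(rate)} pitch={quoteattr(pitch)}>"
        # Escape &, < and > so arbitrary text cannot break the XML.
        f"{escape(text)}"
        "</prosody></voice></speak>"
    )


ssml = build_ssml("20 óra 30 perc")
```

Escaping the text and quoting the attributes keeps the document well-formed even when the message contains characters such as `&` or `<`.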

Basic Configuration

# Text to speech
tts:
  - platform: azure_tts
    service_name: azure_say
    api_key: <your_api_key>

Configuration variables

This integration accepts the same configuration variables as the out-of-the-box Microsoft TTS integration.

Comments

init and concatenate str error

Hi, I got two errors with your integration. My configuration.yaml is:

    #https://github.com/yassineselmi/homeassistant-azure-tts
  - platform: azure_tts
    service_name: tts_microsoft_noemi_notok
    cache: false
    api_key: ####################
    language: hu-HU
    gender: Female
    #type: hu-HU-NoemiNeural
    type: NoemiNeural
    rate: 100
    volume: 100
    pitch: default
    contour: (0, 0) (100, 100)
    region: westeurope

My automation is:

alias: Announcement, Time (Microsoft)
description: ''
trigger:
  - platform: time_pattern
    minutes: /15
condition: []
action:
  - service: tts.tts_microsoft_noemi_notok
    data:
      entity_id: media_player.living_room_speaker, media_player.bedroom_speaker
      message: {{ now().hour}} óra {{ "%0.02d" | format(now().strftime("%-M") | int) }} perc
mode: single

Error1

Error on init TTS: No TTS from azure_tts for 'message: 20 óra 30 perc'
8:30:51 PM – (ERROR) Text-to-Speech (TTS)

Logger: homeassistant.components.tts
Source: components/tts/__init__.py:188
Integration: Text-to-Speech (TTS) (documentation, issues)
First occurred: 8:30:51 PM (1 occurrences)
Last logged: 8:30:51 PM

Error on init TTS: No TTS from azure_tts for 'message: 20 óra 30 perc'

Error2

Error occurred for Azure TTS: can only concatenate str (not "bytes") to str
8:30:51 PM – (ERROR) azure_tts (custom integration)

Logger: custom_components.azure_tts.tts
Source: custom_components/azure_tts/tts.py:415
Integration: azure_tts (documentation, issues)
First occurred: 8:30:51 PM (1 occurrences)
Last logged: 8:30:51 PM

Error occurred for Azure TTS: can only concatenate str (not "bytes") to str

Do you have a solution for this issue?

Also, I'd like to make the pitch of the voice a bit deeper. On the Microsoft sample site and in Azure it is possible to change this attribute. I'd like to use 0.9 for pitch and 1.2 for speed.

Thanks, Zoltan

PS: it works with this other integration: https://github.com/georgezhao2010/azure_cognitive_speech

  - platform: azure_cognitive_speech
    service_name: tts_microsoft_noemi
    cache: false
    api_key: #############
    region: westeurope
    default_voice: Noemi

opened by vzoltan

Releases (0.1.2)

0.1.2 (Oct 30, 2022)
Fixed error when trying to log the request body
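For context, `can only concatenate str (not "bytes") to str` is the message Python raises when a `str` and a `bytes` object are joined with `+`. Whether that is exactly what this release fixed is an assumption. The sketch below is not the integration's code; it only illustrates how logging a bytes request body can fail and two ways to avoid it. All function and logger names are hypothetical.

```python
import logging

_LOGGER = logging.getLogger("azure_tts_sketch")  # hypothetical logger name


def describe_body_broken(body: bytes) -> str:
    # Joining str and bytes with "+" raises TypeError at runtime.
    return "Request body: " + body


def describe_body(body: bytes) -> str:
    # Decode the bytes first; undecodable bytes are replaced, not fatal.
    return "Request body: " + body.decode("utf-8", errors="replace")


def log_body(body: bytes) -> None:
    # Lazy %-formatting never concatenates, so bytes are safe here too.
    _LOGGER.debug("Request body: %s", body)
```

Decoding before concatenation, or letting the logging module format the argument, both avoid the type mismatch.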

0.1.1 (Nov 8, 2021)


Owner

Yassine Selmi
