DCL - An easy-to-use library for manipulating diacritics and accents.

Diacritics Library

This library adds diacritics to strings and removes them.

Getting started

Start by importing the module:

import dcl

DCL currently supports a multitude of diacritics:

  • acute
  • breve
  • caron
  • cedilla
  • grave
  • interpunct
  • macron
  • ogonek
  • ring
  • ring_and_acute
  • slash
  • stroke
  • stroke_and_acute
  • tilde
  • tittle
  • umlaut/diaeresis
  • umlaut_and_macron

Each diacritic has its own function, accessible directly from the dcl module.

dcl.acute('a')
>>> 'á'

These functions return a Character object, a handy wrapper around the accented character. Its attributes give further information about the character and the diacritic applied to it.

char = dcl.ogonek('a')

repr(char)
>>> <ogonek 'ą'>

char.character  # the same as str(char)
>>> 'ą'

char.diacritic  # the diacritic mark on its own
>>> '˛'

char.diacritic_name
>>> 'ogonek'

char.raw  # returns the raw representation of our character
>>> '\U00000105'

char.raw_diacritic 
>>> '\U000002db'

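To show where this kind of information can come from, here is a minimal sketch that uses only the standard library's unicodedata module. It is not DCL's implementation, and describe is a hypothetical helper. Note that Unicode decomposition yields the combining ogonek (U+0328), whereas char.diacritic above shows the spacing form (U+02DB).

```python
import unicodedata


def describe(char: str) -> dict:
    """Hypothetical helper: split one precomposed letter into base and marks."""
    # NFD turns a precomposed letter into its base letter followed by
    # zero or more combining marks.
    decomposed = unicodedata.normalize("NFD", char)
    base, marks = decomposed[0], decomposed[1:]
    return {
        "character": char,
        "base": base,
        "mark_names": [unicodedata.name(mark) for mark in marks],
        # Eight-digit escape, in the same style as the raw value shown above.
        "raw": f"\\U{ord(char):08x}",
    }


info = describe("\u0105")  # U+0105 is the letter a with ogonek
print(info["base"])        # a
print(info["mark_names"])  # ['COMBINING OGONEK']
print(info["raw"])         # \U00000105
```
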
Not every letter can take every diacritic. For example, the letter h cannot take a cedilla. In that case the function raises a DiacriticError, which you can import from dcl.errors.

from dcl.errors import DiacriticError

try:
    char = dcl.cedilla('h')
except DiacriticError as e:
    print(e)
else:
    print(repr(char))

>>> 'Character h cannot take a cedilla diacritic'

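The same "reject unsupported combinations" pattern can be sketched with the standard library alone. This is an illustration under stated assumptions, not DCL's code: UnsupportedDiacritic and add_acute are hypothetical names, and the rule used here (Unicode has a single precomposed code point for the pair) may differ from the set of letters DCL itself accepts.

```python
import unicodedata


class UnsupportedDiacritic(ValueError):
    """Hypothetical stand-in for an error such as DiacriticError."""


COMBINING_ACUTE = "\u0301"


def add_acute(letter: str) -> str:
    """Return the letter with an acute accent, or raise if none exists."""
    # NFC merges base + combining mark into one code point when Unicode
    # defines a precomposed form; otherwise the pair stays two code points.
    composed = unicodedata.normalize("NFC", letter + COMBINING_ACUTE)
    if len(composed) != 1:
        raise UnsupportedDiacritic(
            f"Character {letter} cannot take an acute diacritic"
        )
    return composed


print(add_acute("a"))  # á
try:
    add_acute("q")
except UnsupportedDiacritic as error:
    print(error)  # Character q cannot take an acute diacritic
```
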
If you prefer, you can use the DiacriticApplicant object from dcl.objects. The functions above use this object internally, and the principle is virtually the same, except that each diacritic is exposed as a property and the class simply holds the string. As with the functions above, these properties return the same Character object.

from dcl.objects import DiacriticApplicant

da = DiacriticApplicant('a')
repr(da.ogonek)
>>> <ogonek 'ą'>

There is also the clean_diacritics function, accessible straight from the dcl module. It strips every diacritic from a string.

dcl.clean_diacritics("Krëûšàdå")
>>> 'Kreusada'

dcl.clean_diacritics("Café")
>>> 'Cafe'

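For comparison, a common standard-library way to get a similar result is to decompose the text and drop the combining marks. This is a sketch of that technique, not necessarily how clean_diacritics works; strip_marks is a hypothetical name. Letters with no Unicode decomposition, such as ø or ł, pass through this version unchanged.

```python
import unicodedata


def strip_marks(text: str) -> str:
    """Hypothetical helper: remove combining marks after NFD decomposition."""
    decomposed = unicodedata.normalize("NFD", text)
    # unicodedata.combining() returns 0 for base characters and a
    # non-zero class for combining marks, so only base characters are kept.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(strip_marks("Krëûšàdå"))  # Kreusada
print(strip_marks("Café"))      # Cafe
```
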
Along with this function, there's also count_diacritics, get_diacritics and has_diacritics.

The has_diacritics function checks whether a string contains at least one character with a diacritic.

dcl.has_diacritics("Café")
>>> True

dcl.has_diacritics("dcl")
>>> False

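A check like this can be approximated with unicodedata by looking for any combining mark in the decomposed text. The function name contains_marked_letter is hypothetical, and the result may differ from has_diacritics for characters that Unicode does not decompose.

```python
import unicodedata


def contains_marked_letter(text: str) -> bool:
    """Hypothetical helper: True if any character carries a combining mark."""
    decomposed = unicodedata.normalize("NFD", text)
    return any(unicodedata.combining(ch) for ch in decomposed)


print(contains_marked_letter("Café"))  # True
print(contains_marked_letter("dcl"))   # False
```
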
The get_diacritics function finds every diacritic in a string and returns a dictionary. Each key is the index of an accented character in the string, and each value is the corresponding Character object.

dcl.get_diacritics("Café")
>>> {3: <acute 'é'>}

dcl.get_diacritics("Krëûšàdå")
>>> {2: <umlaut 'ë'>, 3: <circumflex 'û'>, 4: <caron 'š'>, 5: <grave 'à'>, 7: <ring 'å'>}

The count_diacritics function counts the diacritics in a string. Its implementation simply returns the length of the dictionary produced by get_diacritics.

dcl.count_diacritics("Café")
>>> 1

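The index-to-character mapping, and the count derived from it, can be sketched with the standard library as follows. The names find_marked_letters and count_marked_letters are hypothetical, and the values here are plain tuples rather than DCL's Character objects.

```python
import unicodedata


def find_marked_letters(text: str) -> dict:
    """Hypothetical helper: map index -> (character, combining mark names)."""
    # Normalise to NFC first so each accented letter is one code point
    # and the indices refer to whole letters.
    text = unicodedata.normalize("NFC", text)
    found = {}
    for index, ch in enumerate(text):
        decomposed = unicodedata.normalize("NFD", ch)
        marks = [unicodedata.name(m) for m in decomposed if unicodedata.combining(m)]
        if marks:
            found[index] = (ch, marks)
    return found


def count_marked_letters(text: str) -> int:
    """The count is just the size of the mapping, as described above."""
    return len(find_marked_letters(text))


print(sorted(find_marked_letters("Krëûšàdå")))  # [2, 3, 4, 5, 7]
print(count_marked_letters("Café"))             # 1
```
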
Creating an end user program

Creating a program would be pretty simple for this, and I'd love to be able to help you out with a base idea. Have a look at this for example:

import dcl
import string

from dcl.errors import DiacriticError

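char = input("Enter a character: ")
# Require exactly one ASCII letter; a bare `in` test would also accept "" or "ab".
if len(char) != 1 or char not in string.ascii_letters:
    print("Please enter a single letter from a-z or A-Z.")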
else:
    accent = input("Enter an accent (choose from " + ", ".join(dcl.diacritic_list) + "): ")
    if not dcl.isdiacritictype(accent):
        print("That was not a valid accent.")
    else:
        try:
            function = getattr(dcl, accent)  # or dcl.objects.DiacriticApplicant
            output = function(char)
        except DiacriticError as e:
            print(e)
        else:
            print(str(output))

It's worth checking that the provided accent is a diacritic type before calling getattr. Without the check, the user could enter the name of any module-level attribute, such as __file__, and getattr would return it even though it is not a diacritic function.

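An alternative to validating and then calling getattr is to look the name up in an explicit table of allowed functions, so that nothing outside the table can ever be reached. The sketch below is self-contained and does not use DCL: ALLOWED, apply_named and the two sample entries are hypothetical stand-ins.

```python
import unicodedata
from typing import Callable, Dict


def _compose(letter: str, mark: str) -> str:
    # Merge the letter and combining mark into a precomposed form if one exists.
    return unicodedata.normalize("NFC", letter + mark)


# Only names listed here can be called, whatever the user types.
ALLOWED: Dict[str, Callable[[str], str]] = {
    "acute": lambda letter: _compose(letter, "\u0301"),
    "grave": lambda letter: _compose(letter, "\u0300"),
}


def apply_named(name: str, letter: str) -> str:
    """Apply the accent called `name`, rejecting anything not in ALLOWED."""
    try:
        handler = ALLOWED[name]
    except KeyError:
        raise ValueError(f"Unknown accent: {name!r}") from None
    return handler(letter)


print(apply_named("acute", "e"))  # é
try:
    apply_named("__file__", "e")
except ValueError as error:
    print(error)  # Unknown accent: '__file__'
```
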
You can also create a program which can remove diacritics from a string. It's made easy!

import dcl

string = str(input("Enter the string which you want to be cleared from diacritics: "))
print("Here is your cleaned string: " + dcl.clean_diacritics(string))

Or perhaps your program wants to count the number of diacritics contained within your string.

import dcl

text = input("This program counts the diacritics in your input. Enter a string: ")
count = dcl.count_diacritics(text)
# Pick singular or plural wording to match the count.
if count == 1:
    print("There is 1 diacritic in your string.")
else:
    print(f"There are {count} diacritics in your string.")
Owner
Kreus Amredes
Python developer, contributor and maintainer. 🐍