
eSpeak is a compact open source software speech synthesizer for Linux and other platforms. It uses a formant synthesis method, providing many languages in a small size.

Much of the programming for eSpeak's languages was based on information found on Wikipedia, with some subsequent feedback from native speakers.

Projects using eSpeak include NVDA, Ubuntu and OLPC, and it has also been used by Google Translate.

eSpeak is derived from "Speak", a British English speech synthesizer for Acorn RISC OS computers that was originally written in 1995.

A rewritten version for Linux appeared in February 2006 and a Windows SAPI 5 version in January 2007. Subsequent development has added and improved support for additional languages.
Because of its small size and many languages, it is included as the default speech synthesizer in the NVDA open source screen reader for Windows, and on the Ubuntu and other Linux installation discs.

The quality of the language voices varies greatly. Some have had more work or feedback from native speakers than others. Most of the people who have helped to improve the various languages are blind users of text-to-speech.

eSpeak provides two methods of synthesis: the original eSpeak synthesizer and a Klatt synthesizer.
In addition, eSpeak can be used as a front-end, providing text-to-phoneme translation and prosody, to MBROLA diphone voices.

The eSpeak and Klatt synthesizers use different types of formant synthesis.

The eSpeak synthesizer creates voiced speech sounds such as vowels and sonorant consonants by adding together sine waves to make the formant peaks. Unvoiced consonants such as /s/ are made by playing recorded sounds. Voiced consonants such as /z/ are made by mixing a synthesized voiced sound with a recorded unvoiced sound.
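
To make the "adding together sine waves" idea concrete, here is a minimal sketch of additive synthesis for one vowel-like sound. It is not eSpeak's actual code: the function name, the Gaussian weighting of harmonics, and the formant values (roughly an open vowel) are all assumptions chosen for illustration.

```python
import math


def additive_vowel(f0, formants, sample_rate=16000, duration=0.05):
    """Build a voiced sound by summing sine waves (illustrative sketch only).

    f0       -- pitch of the voice in Hz
    formants -- list of (centre_hz, width_hz, amplitude) tuples; hypothetical values
    """
    n_samples = int(sample_rate * duration)

    # Harmonics of the pitch, up to the Nyquist frequency.
    harmonics = [k * f0 for k in range(1, int((sample_rate / 2) // f0) + 1)]

    def gain(freq):
        # A harmonic is louder the closer it sits to a formant peak.
        # The Gaussian shape is an assumption, not taken from eSpeak.
        return sum(
            amp * math.exp(-(((freq - centre) / width) ** 2))
            for centre, width, amp in formants
        )

    gains = [gain(h) for h in harmonics]

    samples = []
    for n in range(n_samples):
        t = n / sample_rate
        samples.append(
            sum(g * math.sin(2 * math.pi * h * t) for g, h in zip(gains, harmonics))
        )

    # Normalise so the loudest sample has magnitude 1.0.
    peak = max(abs(s) for s in samples) or 1.0
    return [s / peak for s in samples]


# Assumed formant set for an open vowel: (centre Hz, width Hz, amplitude).
EXAMPLE_FORMANTS = [(700, 100, 1.0), (1200, 120, 0.6), (2600, 160, 0.3)]
samples = additive_vowel(120, EXAMPLE_FORMANTS)
```

The result is a list of floats between -1.0 and 1.0 that could be written to a WAV file with the standard `wave` module.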

The Klatt synthesizer mostly uses the same formant data as the eSpeak synthesizer. It produces voiced sounds by starting with a waveform which is rich in harmonics (simulating the vibration of the vocal cords) and then applying digital filters in order to produce speech sounds.
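
The source-and-filter approach described above can be sketched as follows. This is an illustration under stated assumptions, not the code eSpeak ships: the harmonically rich source is a plain impulse train, the filter is the standard two-pole digital resonator, and the formant frequencies and bandwidths are made-up example values.

```python
import math


def impulse_train(f0, sample_rate, n_samples):
    """One pulse per pitch period: a crude, harmonically rich voicing source."""
    period = round(sample_rate / f0)
    return [1.0 if n % period == 0 else 0.0 for n in range(n_samples)]


def resonator(signal, freq, bandwidth, sample_rate):
    """Two-pole digital resonator: y[n] = a*x[n] + b*y[n-1] + c*y[n-2]."""
    t = 1.0 / sample_rate
    c = -math.exp(-2 * math.pi * bandwidth * t)
    b = 2 * math.exp(-math.pi * bandwidth * t) * math.cos(2 * math.pi * freq * t)
    a = 1 - b - c  # gives unity gain at 0 Hz
    y1 = y2 = 0.0
    out = []
    for x in signal:
        y = a * x + b * y1 + c * y2
        out.append(y)
        y2, y1 = y1, y
    return out


def filtered_vowel(f0, formants, sample_rate=16000, duration=0.05):
    """Pass the source through one resonator per formant, in cascade."""
    signal = impulse_train(f0, sample_rate, int(sample_rate * duration))
    for freq, bandwidth in formants:
        signal = resonator(signal, freq, bandwidth, sample_rate)
    return signal


# Assumed (frequency Hz, bandwidth Hz) pairs for an open vowel.
EXAMPLE_FORMANTS = [(700, 60), (1200, 90), (2600, 150)]
vowel = filtered_vowel(120, EXAMPLE_FORMANTS)
```

Compared with summing sine waves, this design shapes the spectrum by filtering, so the same formant data can drive either method.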

  • eSpeak can be used as a command-line program, or as a shared library.
  • It supports Speech Synthesis Markup Language (SSML).
  • Language voices are identified by the language's ISO 639-1 code. They can be modified by "voice variants". These are text files which can change characteristics such as pitch range, add effects such as echo, whisper and croaky voice, or make systematic adjustments to formant frequencies to change the sound of the voice. For example, "af" is the Afrikaans voice. "af+f2" is the Afrikaans voice modified with the "f2" voice variant which changes the formants and the pitch range to give a female sound.
  • eSpeak uses an ASCII representation of phoneme names which is loosely based on the Kirshenbaum system.
  • Phonetic representations can be included within text input by including them within double square-brackets. For example: espeak -v en "Hello [[w3:ld]]" will say "Hello world" in English.
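
The command-line points above can be tied together in a small Python wrapper. This is a sketch that assumes the `espeak` program is installed and on the PATH. The helper names are hypothetical. The `-v` option comes from the example above, and `-m` is the eSpeak option that tells it to interpret SSML markup.

```python
import shutil
import subprocess


def build_espeak_command(text, language="en", variant=None, ssml=False):
    """Return the argument list for an espeak call (helper name is hypothetical)."""
    # A voice variant is appended to the language code with "+", e.g. "af+f2".
    voice = language if variant is None else f"{language}+{variant}"
    command = ["espeak", "-v", voice]
    if ssml:
        command.append("-m")  # treat the input text as SSML markup
    command.append(text)
    return command


def with_phonemes(text, phonemes):
    """Append a phonetic representation in double square brackets."""
    return f"{text} [[{phonemes}]]"


def speak(command):
    """Run the command if espeak is available; return False otherwise."""
    if shutil.which(command[0]) is None:
        return False
    subprocess.run(command, check=True)
    return True


# Same call as the example above: espeak -v en "Hello [[w3:ld]]"
hello = build_espeak_command(with_phonemes("Hello", "w3:ld"), language="en")

# Afrikaans voice with the "f2" variant; the sample text is an assumption.
afrikaans = build_espeak_command("Goeie dag", language="af", variant="f2")
```

Passing `hello` to `speak()` plays the phrase on a machine where eSpeak is installed. Building the argument list separately keeps the voice selection easy to inspect without producing any audio.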

About Hugo Repetto
