Romanian TTS - Online text-to-speech system

Online demo

Demo under maintenance

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
THE CONTRIBUTORS TO THIS WORK DISCLAIM ALL WARRANTIES WITH REGARD TO THIS DATA, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE CONTRIBUTORS BE LIABLE FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS DATA.

About

This online demo of Romanian text-to-speech systems is a result of two different projects:

1) The PRODOC Project, funded by the European Social Fund under grant agreement POSDRU/6/1.5/S/5, which offered a 6-month research scholarship to Adriana Stan at The Centre for Speech Technology Research, University of Edinburgh, UK, under the supervision of Prof. Simon King, Dr. Junichi Yamagishi and Dr. Matthew Aylett. The first version of the Romanian TTS system was developed during this visit, based on the RSS Database and CereProc's front-end framework.

2) The SWARA Project, funded by the Romanian Ministry of Education under grant agreement PN-II-PT-PCCA-2013-4 No 6/2014, which aims to provide a portable, fast and easy-to-use assistive speech synthesis system for laryngectomized patients, enabling them to interact in an almost natural manner with other social participants by using a customised voice.

SWARA is a collaborative project between The Technical University of Cluj-Napoca, SC FORTECH SRL, Iuliu Haţieganu University of Medicine and Pharmacy Cluj-Napoca and Babeş-Bolyai University, Cluj-Napoca.

Two of the major results of this project are the SWARA Corpus and the SWARA Front-end processor for Romanian. These two components are used in the demo above.

More information on the original TTS system can be found in this article:

Adriana Stan, Junichi Yamagishi, Simon King, Matthew Aylett, The Romanian Speech Synthesis (RSS) corpus: building a high quality HMM-based speech synthesis system using a high sampling rate, Speech Communication, vol. 53, pp. 442-450, 2011, doi: 10.1016/j.specom.2010.12.002

Romanian Speech Synthesis (RSS) Database

The Romanian speech synthesis (RSS) corpus was recorded in a hemianechoic chamber (anechoic walls and ceiling; floor partially anechoic) at the University of Edinburgh. We used three high quality studio microphones: a Neumann u89i (large diaphragm condenser), a Sennheiser MKH 800 (small diaphragm condenser with very wide bandwidth) and a DPA 4035 (headset-mounted condenser). Although the current release includes only speech data recorded via Sennheiser MKH 800, we may release speech data recorded via other microphones in the future. All recordings were made at 96 kHz sampling frequency and 24 bits per sample, then downsampled to 48 kHz sampling frequency. For recording, downsampling and bit rate conversion, we used ProTools HD hardware and software. We conducted 8 sessions over the course of a month, recording about 500 sentences in each session. At the start of each session, the speaker listened to a previously recorded sample, in order to attain a similar voice quality and intonation.

DOWNLOAD WEBSITE: http://romaniantts.com/rssdb/

The RSS database is described more thoroughly in the Speech Communication paper cited in the About section above.

The SWARA Corpus

The SWARA Corpus is a result of the SWARA Project, funded by the Romanian Ministry of Education under grant agreement PN-II-PT-PCCA-2013-4 No 6/2014. The corpus contains over 21 hours of high quality recordings from 17 different speakers. The data is segmented into 19,279 utterances and includes their orthographic transcripts and semi-automatic phone-level alignments.
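
To give a rough sense of scale, the figures quoted above can be turned into per-utterance and per-speaker averages. The sketch below is only illustrative: it treats "over 21 hours" as exactly 21 hours, so the durations are lower bounds, and it assumes the data is spread evenly across speakers, which the description does not state. The variable names are hypothetical and not part of any corpus tooling.

```python
# Figures quoted in the corpus description. "Over 21 hours" is taken as
# exactly 21 hours, so every derived duration is a lower bound.
TOTAL_HOURS = 21
NUM_SPEAKERS = 17
NUM_UTTERANCES = 19_279

total_seconds = TOTAL_HOURS * 3600

# Mean utterance length across the whole corpus.
avg_utterance_seconds = total_seconds / NUM_UTTERANCES

# Per-speaker averages; an even split across speakers is an assumption.
avg_utterances_per_speaker = NUM_UTTERANCES / NUM_SPEAKERS
avg_hours_per_speaker = TOTAL_HOURS / NUM_SPEAKERS

print(f"Average utterance length: at least {avg_utterance_seconds:.2f} s")
print(f"Average utterances per speaker: about {avg_utterances_per_speaker:.0f}")
print(f"Average recording time per speaker: at least {avg_hours_per_speaker:.2f} h")
```

Under these assumptions an utterance lasts a little under four seconds on average, which is consistent with short read sentences.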

DOWNLOAD WEBSITE: http://speech.utcluj.ro/swarasc/

A complete description of the SWARA Corpus is presented in the following paper:

Adriana Stan, Florina Dinescu, Cristina Țiple, Șerban Meza, Bogdan Orza, Magdalena Chirilă and Mircea Giurgiu, The SWARA Speech Corpus: A Large Parallel Romanian Read Speech Dataset, in Proceedings of the 9th Conference on Speech Technology and Human-Computer Dialogue, Bucharest, Romania, July 6-9, 2017
