SDS Resources | ÚFAL

This page summarises some of the available resources for building spoken dialogue systems.

General tools

OpenFST - http://www.openfst.org

OpenGRM - http://www.opengrm.org/

Automatic speech recognition

Julius - http://julius.sourceforge.jp/en/

C decoder
real-time decoding with a 60k-word vocabulary
generates confusion networks
free for commercial use

Kaldi - http://kaldi.sourceforge.net/

C++
WFST-based decoder
Also includes acoustic model (AM) training tools

JUICER - http://juicer.amiproject.org/juicer/

WFST-based decoder
can be used with OpenFST
free for commercial use

CMU Sphinx - http://cmusphinx.sourceforge.net/

C decoder (PocketSphinx)
The rest is in Java
Also includes acoustic and language model training tools

RWTH ASR - http://www-i6.informatik.rwth-aachen.de/rwth-asr/

C++
WFST
generates confusion networks
not free for commercial use

SRTK - https://bitbucket.org/yotaro/srtk

Python
OpenFST
probably not maintained any more

The acoustic models can be trained using:

HTK - http://htk.eng.cam.ac.uk/
CMU Sphinx - http://cmusphinx.sourceforge.net/

Acoustic data:

VoxForge - http://voxforge.org/
LDC - Linguistic Data Consortium - http://www.ldc.upenn.edu/

Speech synthesis

Flite - http://www.speech.cs.cmu.edu/flite/

C
Speech synthesis only
Pre-built English voices
Python interface - http://code.google.com/p/py-audio/

Festival - http://www.cstr.ed.ac.uk/projects/festival/

C++
System for building voices
Includes unit selection synthesis and HTS

Project Festvox - http://festvox.org/

Source of new voices for Festival and Flite

Mary TTS - http://mary.dfki.de/

Java
Complete toolkit
Unit selection and HTS