This page summarises some of the available resources for building spoken dialogue systems.
General tools
OpenFST - http://www.openfst.org
OpenGRM - http://www.opengrm.org/
Automatic speech recognition
Julius - http://julius.sourceforge.jp/en/
- C decoder
- real-time decoding with a 60k-word vocabulary
- generates confusion networks
- free for commercial use
Kaldi - http://kaldi.sourceforge.net/
- C++
- WFST-based decoder
- Also includes acoustic model (AM) training tools
JUICER - http://juicer.amiproject.org/juicer/
- WFST-based decoder
- can be used with OpenFST
- free for commercial use
CMU Sphinx - http://cmusphinx.sourceforge.net/
- C decoder (Pocketsphinx)
- The rest is in Java
- Also includes acoustic and language model training tools
- C++
- WFST
- generates confusion networks
- non-free for commercial use
- Python
- OpenFST
- probably not maintained any more
The acoustic models can be trained using:
- HTK - http://htk.eng.cam.ac.uk/
- CMU Sphinx - http://cmusphinx.sourceforge.net/
Acoustic data:
- VOXFORGE - http://voxforge.org/
- LDC - Linguistic Data Consortium - http://www.ldc.upenn.edu/
Speech synthesis
- C++
- Speech synthesis only
- Pre-built English voices
- Python interface - http://code.google.com/p/py-audio/
Festival - http://www.cstr.ed.ac.uk/projects/festival/
- C++
- System for building voices
- Includes unit selection synthesis and HTS
Project Festvox - http://festvox.org/
- Source of new voices for Festival and Flite
Mary TTS - http://mary.dfki.de/
- Java
- Complete toolkit
- Unit selection and HTS