Saturday 2 p.m.–2:30 p.m.
Karaoke-style Read-aloud System
Renyuan Lyu
Audience level: Novice
Category: Education
Description
Using cloud text-to-speech (TTS) services, such as Google Translate and iSpeech, together with simple speech-recognition technology, we can create a speech-text synchronization file from an original text-only file. That file can be used to show time-aligned, highlighted text, karaoke style, which is very useful for language learning.
Abstract
Starting from a text-only file,
using a cloud-based text-to-speech (TTS) technology,
like Google Translate,
and also a speech-recognition technology,
like the Hidden Markov Model Toolkit (HTK),
we can generate its associated timed text file,
which aligns the text with the speech waveform file
along the time axis.
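As a rough illustration of what such a timed text file can hold, here is a minimal Python sketch that reads word-level alignment lines into (start, end, word) records. It assumes the aligner writes HTK-style label lines of the form "start end word", with times in HTK's 100 ns units; the talk does not specify the actual file format, and the names TimedWord and parse_label_lines are hypothetical.

```python
from typing import Iterable, List, NamedTuple

# HTK label files express times in units of 100 ns, i.e. 10,000,000 per second.
HTK_UNITS_PER_SECOND = 10_000_000


class TimedWord(NamedTuple):
    """One aligned word: start and end time in seconds, plus the word text."""
    start: float
    end: float
    text: str


def parse_label_lines(lines: Iterable[str]) -> List[TimedWord]:
    """Parse 'start end word' lines (assumed HTK-style) into TimedWord records."""
    words = []
    for line in lines:
        parts = line.split()
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        start = int(parts[0]) / HTK_UNITS_PER_SECOND
        end = int(parts[1]) / HTK_UNITS_PER_SECOND
        words.append(TimedWord(start, end, parts[2]))
    return words


# Hypothetical alignment output for a two-word text.
sample = ["0 3500000 hello", "3500000 8000000 world"]
timed = parse_label_lines(sample)
```

Each record ties one word of the original text to an interval of the synthesized waveform, which is the information a karaoke-style display needs.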
Python is used not only as glue
to link the different kinds of software resources,
like Google Translate and HTK,
but also as a powerful tool
to deal with all text processing tasks in this project.
From such a timed text file, we also provide a browsing web app
to display the time-aligned, highlighted text
at the word level, like a karaoke machine,
which is considered very useful for language learning.
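The word-level highlighting can be sketched as a lookup from the current playback time to the word whose interval contains it. The Python below is only an illustration of that logic under assumed data (word start and end times in seconds); it is not the web app described in the talk, and the function names are hypothetical.

```python
from bisect import bisect_right
from typing import List, Optional


def active_word_index(starts: List[float], ends: List[float], t: float) -> Optional[int]:
    """Return the index of the word being spoken at time t, or None if no word is."""
    # Find the last word whose start time is <= t (starts must be sorted).
    i = bisect_right(starts, t) - 1
    if i >= 0 and t < ends[i]:
        return i
    return None


def render_highlight(words: List[str], index: Optional[int]) -> str:
    """Return the text with the active word wrapped in square brackets."""
    return " ".join(f"[{w}]" if i == index else w for i, w in enumerate(words))


# Assumed alignment for three words (times in seconds).
words = ["read", "aloud", "system"]
starts = [0.0, 0.35, 0.8]
ends = [0.35, 0.8, 1.2]

line_at_half_second = render_highlight(words, active_word_index(starts, ends, 0.5))
line_after_end = render_highlight(words, active_word_index(starts, ends, 1.5))
```

A player would call the lookup repeatedly as the audio position advances and redraw the line whenever the returned index changes.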