Saturday 2 p.m.–2:30 p.m.

Karaoke-style Read-aloud System

Renyuan Lyu

Audience level:


By combining cloud text-to-speech (TTS) services, such as Google Translate or iSpeech, with simple speech-recognition technology, we establish a method for creating a speech-text synchronization file from an original text-only file. That file can be used to display time-aligned, highlighted text in the style of karaoke, which is very useful for language learning.


Starting from a text-only file, we use a cloud-based text-to-speech (TTS) service, such as Google Translate, together with a speech-recognition toolkit, such as the Hidden Markov Model Toolkit (HTK), to generate an associated timed text file that aligns the text with the speech waveform along the time axis. Python serves not only as the glue that links these different kinds of software resources, such as Google Translate and HTK, but also as a powerful tool for all of the text-processing tasks in this project. From this timed text file, we also provide a browser-based web app that demonstrates time-aligned, highlighted text at the word level, like a karaoke machine, which we consider very useful for language learning.
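As a rough illustration of how a timed text file can drive word-level highlighting, here is a minimal Python sketch. It rests on assumptions the abstract does not state: that the word timings are stored as HTK-style label lines (`start end word`, with times in 100 ns units), and that the player asks "which word is active at time t?". The function names `parse_htk_labels` and `active_word_index` and the sample data are hypothetical, not taken from the talk.

```python
from bisect import bisect_right

# HTK label files express times in units of 100 ns.
HTK_TICKS_PER_SECOND = 10_000_000


def parse_htk_labels(text):
    """Parse 'start end word' lines into (start_sec, end_sec, word) tuples.

    Assumed input format; the talk's actual timed text format is not specified.
    """
    words = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        start, end, word = int(parts[0]), int(parts[1]), parts[2]
        words.append(
            (start / HTK_TICKS_PER_SECOND, end / HTK_TICKS_PER_SECOND, word)
        )
    return words


def active_word_index(words, t):
    """Return the index of the word spoken at time t (seconds), or None."""
    starts = [w[0] for w in words]
    i = bisect_right(starts, t) - 1  # last word starting at or before t
    if i >= 0 and t < words[i][1]:
        return i
    return None


# Hypothetical alignment for a two-word utterance.
SAMPLE = """\
0 3000000 hello
3000000 8000000 world
"""

words = parse_htk_labels(SAMPLE)
```

A player could call `active_word_index(words, current_time)` on each playback tick and highlight the returned word, which gives the karaoke effect at the word level.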