Web-based prototype application for the LetsRead project
A tool has been developed to analyze children's utterances, detect disfluencies, and compute an overall reading performance score.
- Previously, the opinions of 108 teachers were gathered on 150 children from the LetsRead database (10 teachers per child on average), to serve as ground truth for overall reading performance.
- The same utterances of the 150 children can be selected and heard in the application.
- The current sentence is shown in a large font, simulating an application where a child would have to read it live.
- Along with the audio signal, an automatic annotation is presented that marks the detected extra content and mispronunciations.
- Statistics on correct words and an overall score are computed per sentence, then accumulated and re-computed as more utterances from that child are heard.
- The mean and standard deviation of the overall performance scores given by teachers (0-5) are presented.
- Finally, the computed and accumulated scores are graphically displayed and compared to the ground truth.
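The accumulation step described above can be sketched roughly as follows. This is only an illustration: the class `ChildScore`, the function `teacher_summary`, and in particular the scoring rule (fraction of correct words scaled to 0-5) are assumptions made for the example, not the formula used by the application.

```python
from dataclasses import dataclass
from statistics import mean, stdev


@dataclass
class ChildScore:
    """Running totals for one child (hypothetical structure)."""
    correct_words: int = 0
    total_words: int = 0

    def add_utterance(self, correct: int, total: int) -> float:
        """Accumulate one sentence and return the re-computed score."""
        self.correct_words += correct
        self.total_words += total
        return self.score()

    def score(self) -> float:
        # Assumption: score = fraction of correct words, scaled to 0-5.
        # The real application combines several features instead.
        if self.total_words == 0:
            return 0.0
        return 5.0 * self.correct_words / self.total_words


def teacher_summary(ratings):
    """Mean and sample standard deviation of teacher ratings (0-5)."""
    return mean(ratings), stdev(ratings)


if __name__ == "__main__":
    child = ChildScore()
    for correct, total in [(8, 10), (5, 10)]:   # made-up sentence results
        print("accumulated score:", child.add_utterance(correct, total))
    gt_mean, gt_std = teacher_summary([3, 4, 4, 5, 3])  # made-up ratings
    print("teachers:", gt_mean, "+/-", round(gt_std, 2))
```

Keeping raw word counts rather than averaging per-sentence scores means longer sentences weigh more in the accumulated value, which is one possible design choice among several.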
Automatic Processing
- Word-level alignment of the spoken content with the original prompt is obtained to detect disfluencies such as false starts, repetitions and mispronunciations.
- Acoustic models based on Hidden Markov Models and Neural Networks, targeted at children, were trained for this purpose.
- Several features are obtained and combined to compute an overall reading ability score.
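To give an idea of what word-level alignment against the prompt involves, here is a minimal text-only sketch. It assumes a word transcript of the child's speech is already available and aligns it to the prompt with Python's standard `difflib.SequenceMatcher`; the project itself works on the audio with acoustic models, so the function `align_words` and its labels are purely illustrative.

```python
from difflib import SequenceMatcher


def align_words(prompt, spoken):
    """Label each spoken word against the prompt.

    Returns (labels, omitted): labels is a list of (word, label) pairs with
    label in {"correct", "extra", "mispronounced"}; omitted lists prompt
    words with no spoken counterpart.
    """
    p = prompt.lower().split()
    s = spoken.lower().split()
    labels, omitted = [], []
    matcher = SequenceMatcher(a=p, b=s, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            labels += [(w, "correct") for w in s[j1:j2]]
        elif tag == "insert":
            # Spoken words absent from the prompt: repetitions, false starts.
            labels += [(w, "extra") for w in s[j1:j2]]
        elif tag == "replace":
            # Simplification: every substituted word counts as a mispronunciation.
            labels += [(w, "mispronounced") for w in s[j1:j2]]
        elif tag == "delete":
            omitted += p[i1:i2]
    return labels, omitted


if __name__ == "__main__":
    labels, omitted = align_words("the cat sat on the mat",
                                  "the the cat sat on the mat")
    print(labels)
    print(omitted)
```

With the repeated "the" in the example, one spoken word is labeled as extra and the remaining six as correct. Counts of each label are the kind of feature that could feed an overall reading score.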
The LetsRead project: http://lsi.co.it.pt/spl/projects_letsread.html