Disruptive Technology Innovation Fund - Data-centre Audio/Visual Intelligence on-Device (DTIF-DAVID)

[Image: A child seated in a green chair looking at a teddy bear seated in an adjacent green chair]

The DTIF-DAVID project is a collaboration between Xperi, SoapBox Labs and NUIG, funded by Enterprise Ireland (EI). Its main objective is the development of a low-cost, low-power multimodal (sound and vision) AI processing platform for the creation of voice-enabled smart toys.

Embedded (on-device) processing of data is currently the preferred approach across the smart toy industry for enabling Artificial Intelligence in smart toys. The key challenge is delivering a high-quality on-device AI experience, with long playtime (battery life), in a cost-effective way, when such an experience typically requires 'data-centre' levels of processing power of the kind normally provided by online cloud-based services.

An overview presentation of the DAVID platform with a focus on the privacy-by-design aspects of the platform can be found here.

Technologies built on the proposed embedded platform include Object/Gesture Detection/Recognition, Automatic Speech Recognition, Text-to-Speech Synthesis and more.

A detailed technical paper can be found here.

Our Collaborators

  • Xperi-DTS is a world-leading supplier of imaging solutions for the automotive and consumer markets.
  • SoapBox Labs is an award-winning voice technology company specializing in the development of automated speech recognition (ASR) for kids.


Child Speech Understanding and Synthesis for Edge-AI in Tomorrow’s Smart Toys

Primary Researcher:
Rishabh Jain

Principal Investigator:
Peter Corcoran

Description:
This research project focuses on speech and audio technology. Its main goal is the application of Text-To-Speech (TTS) technology to the domain of child speech. As current TTS research is focused almost exclusively on adult speech data, this project explores the use of child speech data to create better child speech synthesis models and tools. The project also aims to improve current Automatic Speech Recognition (ASR) technologies in the child speech domain by using synthetic data generated from TTS models. A full list of research papers from the DAVID project can be found here.
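The augmentation idea described above, mixing TTS-generated child speech with real recordings to enlarge an ASR training set, can be sketched as follows. This is a minimal illustration only: the file paths, record fields, and mixing ratio are hypothetical assumptions for the sketch, not the project's actual data pipeline.

```python
import random

# Hypothetical utterance records. In practice, "real" entries would point to
# recorded child speech and "tts" entries to audio synthesised by a
# child-voice TTS model, each paired with its transcript.
real_utterances = [
    {"audio": f"real/child_{i:03d}.wav", "text": f"real transcript {i}", "source": "real"}
    for i in range(100)
]
synthetic_utterances = [
    {"audio": f"tts/child_{i:03d}.wav", "text": f"synthetic transcript {i}", "source": "tts"}
    for i in range(300)
]

def build_training_manifest(real, synthetic, synth_ratio=0.5, seed=0):
    """Mix real and TTS-generated utterances so that synthetic data
    makes up at most `synth_ratio` of the final training manifest.

    Capping the synthetic share is a common precaution: too much
    synthetic audio can bias the ASR model toward TTS artefacts.
    """
    rng = random.Random(seed)
    # Number of synthetic items allowed for the requested final ratio.
    max_synth = int(len(real) * synth_ratio / (1.0 - synth_ratio))
    chosen_synth = rng.sample(synthetic, min(max_synth, len(synthetic)))
    manifest = real + chosen_synth
    rng.shuffle(manifest)  # interleave sources before training
    return manifest

manifest = build_training_manifest(real_utterances, synthetic_utterances)
```

With a 0.5 ratio and 100 real utterances, the resulting manifest holds 100 real and 100 synthetic entries; a real training setup would then feed these (audio, transcript) pairs to the ASR trainer.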