...
Associated JBoss community project(s): Drogue IoT
Code for Cause
Summary of idea:
In Drogue IoT we have an example/demo/PoC that implements a voice assistant based on our Drogue IoT cloud deployment. A wake-word (keyword, trigger-word) detection step initiates the audio recording. Once the snippet is recorded, it is processed by a speech-to-text backend, and the result is processed further.
For wake-word detection we currently use PocketSphinx. However, its detection rate is poor, and it doesn't run on embedded devices.
The outcome of the project should be a wake-word detection model created with TensorFlow (Lite), plus a program (script) that continuously listens to audio input (microphone). Once the trigger is detected, it records for up to X seconds or until it detects silence. A single command is sufficient ("hey rodney"); no full speech-to-text model is required. It is perfectly fine to re-use existing data sets and scripts, as long as they are open source and we can build this ourselves.
This should be able to run on a Raspberry Pi-like device, with the prospect of running on embedded devices (microcontrollers) using TensorFlow Lite. Bonus points for actually porting it to a microcontroller, but that is not required.
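To make the listening loop described above more concrete, here is a minimal sketch of the control flow only: wait for the trigger, then record until a frame limit is reached or a run of quiet frames is seen. Everything in it is an assumption for illustration, not part of the existing hey-rodney code: the names `record_after_trigger`, `rms`, `max_frames`, `silence_rms` and `silence_frames` are hypothetical, silence is approximated with a simple RMS threshold, and `is_trigger` is a placeholder callable standing in for whatever TensorFlow Lite model the project produces. Audio capture from a microphone is left out; the function takes any iterable of PCM frames.

```python
import math
from typing import Callable, Iterable, List, Optional, Sequence

Frame = Sequence[int]  # one chunk of 16-bit PCM samples (assumed format)


def rms(frame: Frame) -> float:
    """Root-mean-square amplitude of one frame (0.0 for an empty frame)."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def record_after_trigger(
    frames: Iterable[Frame],
    is_trigger: Callable[[Frame], bool],  # placeholder for the wake-word model
    max_frames: int,                      # "up to X seconds", expressed in frames
    silence_rms: float,                   # frames below this RMS count as quiet
    silence_frames: int,                  # consecutive quiet frames that end recording
) -> Optional[List[Frame]]:
    """Wait for the trigger, then collect frames until silence or the limit.

    Returns the recorded frames, or None if the trigger never fired.
    """
    stream = iter(frames)

    # Phase 1: discard audio until the detector fires.
    for frame in stream:
        if is_trigger(frame):
            break
    else:
        return None  # stream ended without a trigger

    # Phase 2: record until max_frames or a run of quiet frames.
    recorded: List[Frame] = []
    quiet_run = 0
    for frame in stream:
        recorded.append(frame)
        quiet_run = quiet_run + 1 if rms(frame) < silence_rms else 0
        if quiet_run >= silence_frames or len(recorded) >= max_frames:
            break
    return recorded


if __name__ == "__main__":
    # Synthetic frames instead of a microphone: a very loud frame plays the wake word.
    quiet, loud, wake = [0] * 160, [1000] * 160, [32000] * 160
    stream = [quiet, loud, wake, loud, loud, quiet, quiet, loud]
    clip = record_after_trigger(
        stream, lambda f: f[0] == 32000,
        max_frames=50, silence_rms=100.0, silence_frames=2,
    )
    print(len(clip))  # 4: two loud frames plus the two quiet frames that ended recording
```

Keeping the detector behind a plain callable means the same loop could be tested with synthetic frames and later wired to a real model and audio source, and the frame-based limits map directly onto seconds once a frame duration is chosen.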
Knowledge prerequisite:
- AI/ML - TensorFlow
- Python (probably)
- Linux
- Embedded programming (optional)
GitHub repo: https://github.com/drogue-iot/drogue-cloud, https://github.com/drogue-iot/hey-rodney
Skill level: Advanced
Project Chat: https://matrix.to/#/#drogue-iot:matrix.org
Contact(s) / potential mentor(s): Jens Reimann (jreimann@redhat.com)
Associated JBoss community project(s): Drogue IoT
EAT - Testing Infinite Software Project Versions
...