STELLA: A Machine-Learning Speech-to-Text Mobile App

Posted on Sunday, December 6th, 2020

Client	Todd Kelley
Professor(s)	Jason Mombourquette
Program	Computer Engineering Technology – Computing Science
Students	Johnathan Gonzalez, Liang Chu, Robin Saini, Mayank Khera, Nick Sturgeon, Justin Dennison

Project Description:

Machine Learning is an exciting field of computer science that begins to blur the line between man and machine. Given lots of real-world data along with the expected results, a computer can begin to form an accurate idea of how to interpret new data. This is the basis for our transcript-creating application, STELLA (Speech Transcript Extraction and Labelling Linguistics Application).

Our client, Todd Kelley, approached us with an idea for a mobile application that would record a meeting and produce an accurate transcript, including a written account of the spoken text with timestamps and speaker identification. The goal was to improve the efficiency of taking meeting minutes by automating the process.
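As an illustration of what such a transcript might look like in code, here is a minimal sketch of a data model. The class and field names (`TranscriptEntry`, `Meeting`, `start_seconds`, `speaker_id`, `speaker_names`) are hypothetical and chosen for this example; they are not taken from the STELLA codebase.

```python
from dataclasses import dataclass, field


@dataclass
class TranscriptEntry:
    """One utterance in a meeting (hypothetical structure)."""
    start_seconds: float  # offset from the start of the recording
    speaker_id: int       # label assigned by speaker identification
    text: str             # recognized words for this utterance


@dataclass
class Meeting:
    """A recorded meeting and its transcript (hypothetical structure)."""
    title: str
    entries: list[TranscriptEntry] = field(default_factory=list)
    # Maps speaker labels to display names, once someone has set them.
    speaker_names: dict[int, str] = field(default_factory=dict)

    def speaker_label(self, speaker_id: int) -> str:
        """Return the assigned name, or a generic fallback label."""
        return self.speaker_names.get(speaker_id, f"Speaker {speaker_id}")
```

Keeping the speaker label separate from the display name means a speaker can be named once and every entry attributed to that speaker picks up the name.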

Machine Learning was an obvious first step in creating our application. A trained algorithm could take recorded audio and transcribe it into text separated by individual speaker. We leveraged existing algorithms created by leaders in the field, such as Google, to form the basis of our application.

To increase our development efficiency, we used a framework that allowed us to create both an iOS and an Android application from a single codebase, rather than having to create two fully separate applications. This allowed us to focus on the core features instead of duplicating work to account for the differences between the two platforms.

We integrated user accounts into our application to allow users to access their meetings across devices. To streamline account creation, users can sign up directly from the application. All audio is encrypted in storage to protect the users’ privacy, and meetings can only be accessed by the user who recorded them.

Management of meetings was a big requirement for our application. Like humans, speech-to-text algorithms do not hear the spoken words accurately 100% of the time, so the ability to correct mistakes is a must. Within a meeting, users can edit individual entries and set the names of the identified speakers.
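The two editing operations can be sketched as small functions. This is an illustrative sketch only: entries are modelled as plain dictionaries with hypothetical keys (`start_seconds`, `speaker_id`, `text`), and the function names are assumptions rather than STELLA's actual API.

```python
def edit_entry(entries, index, new_text):
    """Return a new entry list with the text of one entry replaced.

    The timestamp and speaker of the edited entry are left untouched.
    """
    if not 0 <= index < len(entries):
        raise IndexError("no entry at that position")
    updated = dict(entries[index], text=new_text.strip())
    return entries[:index] + [updated] + entries[index + 1:]


def rename_speaker(speaker_names, speaker_id, name):
    """Return a new name mapping with one speaker's display name set."""
    return {**speaker_names, speaker_id: name}
```

Both functions return new objects instead of modifying their inputs, which makes it straightforward to discard an edit that the user cancels.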

When viewing a meeting, users can play back the original recorded audio. This provides a reference to the source for verifying any potential discrepancies. While the audio plays, the entry associated with the current timestamp is highlighted so that users can easily follow along.
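One way to find the entry to highlight is a binary search over the entry start times. The sketch below assumes that each entry records its start offset in seconds and that the offsets are sorted in ascending order; the function name `current_entry_index` is hypothetical.

```python
from bisect import bisect_right


def current_entry_index(start_times, position):
    """Return the index of the entry being spoken at `position` seconds.

    `start_times` must be sorted in ascending order. Returns None when
    playback has not yet reached the first entry.
    """
    # bisect_right counts the entries that start at or before `position`;
    # the last of those is the one currently playing.
    i = bisect_right(start_times, position) - 1
    return i if i >= 0 else None
```

For example, with entries starting at 0.0, 4.5, and 9.2 seconds, a playback position of 5.0 seconds falls in the second entry, so the function returns index 1.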

Users can also share a meeting using the mobile device’s native sharing features. This includes text messaging, email, or even just saving to a text file. Speakers, timestamps, and spoken text are included in the shared meetings.
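A shared meeting could be rendered as plain text along the following lines. This is a sketch under stated assumptions: the output layout, the dictionary keys (`start_seconds`, `speaker_id`, `text`), and the function names are invented for illustration and do not reflect STELLA's actual export format.

```python
def format_timestamp(seconds):
    """Format an offset in seconds as HH:MM:SS, dropping fractions."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def format_transcript(title, entries, speaker_names):
    """Render a meeting as text with timestamp, speaker, and spoken text."""
    lines = [title, ""]
    for entry in entries:
        speaker_id = entry["speaker_id"]
        # Fall back to a generic label when no name has been set.
        name = speaker_names.get(speaker_id, f"Speaker {speaker_id}")
        stamp = format_timestamp(entry["start_seconds"])
        lines.append(f"[{stamp}] {name}: {entry['text']}")
    return "\n".join(lines)
```

Because the result is an ordinary string, it can be handed to a text message, an email body, or a text file without further conversion.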

We believe that this application has the potential to greatly increase the efficiency of creating written records of meetings. Along the way, our team learned a great deal about developing mobile applications and working with Machine Learning.

Short Description:

STELLA (Speech Transcript Extraction and Labelling Linguistics Application) is a mobile app developed to ease taking meeting minutes by utilizing machine learning to identify and decipher spoken audio. It allows management and sharing of meetings.
