Feature #274: W3: Integrate Speech-to-Text (Whisper / OpenAI) - PoC Interactive avatar - Redmine

Feature #274

open

W3: Integrate Speech-to-Text (Whisper / OpenAI)

Added by Anonymous 3 months ago. Updated 3 months ago.

Status: Resolved
Priority: Normal
Assignee:
Start date: 11/20/2025
Due date: 12/04/2025 (about 3 months late)
% Done:
Estimated time: 3:30 h

Description

Use a Speech-to-Text service to convert incoming audio into text.
Handle STT latency and errors, and return the recognized text to Unreal.
Verify accuracy with multiple speech samples.
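
One way to exercise these three requirements together is a small evaluation harness. The sketch below is only an illustration, not the project's code: `transcribe` stands in for whatever function the backend uses to call the STT service, and `SampleResult`, `word_error_rate`, and `evaluate_samples` are hypothetical names. It times each call, records failures instead of raising, and scores each transcript against a reference using word error rate (word-level edit distance divided by the reference length).

```python
import time
from typing import Callable, Iterable, List, NamedTuple, Optional, Tuple


class SampleResult(NamedTuple):
    audio_path: str
    text: Optional[str]     # None when transcription failed
    error: Optional[str]    # error message, or None on success
    latency_s: float        # wall-clock time spent in the STT call
    wer: Optional[float]    # word error rate against the reference


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    # Dynamic-programming Levenshtein distance, one row at a time.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / max(len(ref), 1)


def evaluate_samples(
    transcribe: Callable[[str], str],          # hypothetical STT call
    samples: Iterable[Tuple[str, str]],        # (audio path, reference text)
) -> List[SampleResult]:
    results = []
    for audio_path, reference in samples:
        start = time.perf_counter()
        try:
            text, error = transcribe(audio_path), None
        except Exception as exc:  # record the failure and keep going
            text, error = None, str(exc)
        latency = time.perf_counter() - start
        wer = word_error_rate(reference, text) if text is not None else None
        results.append(SampleResult(audio_path, text, error, latency, wer))
    return results
```

Running `evaluate_samples` over a handful of recordings with known reference text gives per-sample latency, error, and accuracy figures that can be attached to the ticket. The reference transcripts and any acceptable thresholds would have to be chosen by the team.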

Updated by Anonymous 3 months ago

Status changed from Re-opened to New

Updated by Anonymous 3 months ago

Status changed from New to In Progress

Updated by Anonymous 3 months ago

Status changed from In Progress to Resolved

Updated by Anonymous 3 months ago

Unreal Engine handled the microphone recording automatically through the built-in Audio Capture system. My work focused on ensuring that the recorded audio was correctly exported so it could be used by the backend Speech-to-Text (Whisper/OpenAI) service.

What was implemented:

Configured and tested UE5’s automatic microphone recording pipeline.

Verified that UE5 correctly exported .wav files suitable for STT processing.

Prepared the logic to send the recorded audio file to the backend, which handles transcription using Whisper/OpenAI.

Confirmed that Unreal can detect when recording stops and that the audio file is ready for backend processing.
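
The check that an exported .wav file is usable could be automated on the backend side. The following is a minimal sketch using Python's standard `wave` module. The ticket does not define what "suitable for STT processing" means, so the criteria here (readable uncompressed PCM, mono or stereo, 16-bit samples, a minimum duration) and the function name `check_wav_for_stt` are assumptions for illustration.

```python
import wave
from typing import List


def check_wav_for_stt(path: str,
                      min_duration_s: float = 0.2,        # assumed threshold
                      expected_sample_width: int = 2      # assumed: 16-bit PCM
                      ) -> List[str]:
    """Return a list of problems found; an empty list means the file passed."""
    try:
        with wave.open(path, "rb") as wav:
            channels = wav.getnchannels()
            sample_width = wav.getsampwidth()
            sample_rate = wav.getframerate()
            frames = wav.getnframes()
    except (wave.Error, EOFError, OSError) as exc:
        # wave.open only reads uncompressed PCM WAV files.
        return [f"not a readable PCM WAV file: {exc}"]

    problems = []
    if channels not in (1, 2):
        problems.append(f"unexpected channel count: {channels}")
    if sample_width != expected_sample_width:
        problems.append(
            f"sample width is {sample_width} bytes, "
            f"expected {expected_sample_width}")
    duration = frames / sample_rate if sample_rate else 0.0
    if duration < min_duration_s:
        problems.append(f"recording too short: {duration:.3f} s")
    return problems
```

Calling `check_wav_for_stt("recording.wav")` on a file exported by UE5 returns an empty list when all assumed criteria hold, or a list of human-readable problems otherwise.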

What is not implemented yet:

Unreal Engine does not call Whisper/OpenAI directly.

STT transcription currently happens outside UE, on the backend.

JSON handling and displaying the returned transcript in UE were not part of this step.
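
For the outstanding JSON step, the handling logic might look roughly like the sketch below. It is written in Python purely to illustrate the logic; the Unreal side would use the engine's own JSON utilities. The response shape (a `text` field on success, an `error` field on failure) and the function name `parse_transcript_response` are assumptions, since the ticket does not specify the backend's response format.

```python
import json
from typing import Optional, Tuple


def parse_transcript_response(body: str) -> Tuple[Optional[str], Optional[str]]:
    """Return (transcript, error); exactly one of the two is not None.

    Assumes a hypothetical backend response such as
    {"text": "hello"} on success or {"error": "timeout"} on failure.
    """
    try:
        payload = json.loads(body)
    except json.JSONDecodeError as exc:
        return None, f"invalid JSON: {exc.msg}"
    if not isinstance(payload, dict):
        return None, "unexpected response shape"
    if payload.get("error"):
        return None, str(payload["error"])
    text = payload.get("text")
    if not isinstance(text, str) or not text.strip():
        return None, "response contained no transcript"
    return text.strip(), None
```

Returning an explicit error value instead of raising keeps the caller's display code simple: it shows either the transcript or a fallback message.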

This completes the Unreal-side preparation required for the backend STT pipeline.
