SPEECH_2_TEXT

Performs speech to text on the selected audio file.

Params:
file_path : File
    File name of the audio file.

Returns:
out : String
    The transcription of the audio file.

Python Code

from flojoy import flojoy, DataContainer, String, File
from typing import Optional
from huggingsound import SpeechRecognitionModel


@flojoy(deps={"huggingsound": "0.1.6"})
def SPEECH_2_TEXT(
    file_path: File | None = None,
    default: Optional[DataContainer] = None,
) -> Optional[DataContainer]:
    """Performs speech to text on the selected audio file.

    Parameters
    ----------
    file_path: File
        File name of the audio file.

    Returns
    -------
    String
        The transcription of the audio file.
    """
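    # An audio file must be selected; without one there is nothing to transcribe.
    if not file_path:
        raise ValueError("SPEECH_2_TEXT requires an audio file to be selected.")
    file_path = file_path.unwrap()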
    model = SpeechRecognitionModel("jonatasgrosman/wav2vec2-large-xlsr-53-english")

    audio_paths = [file_path]
    transcriptions = model.transcribe(audio_paths)

    return String(s=transcriptions[0]["transcription"])


Example


In this example, we perform speech to text with the RECORD AUDIO and SPEECH_2_TEXT blocks.

This example requires a working microphone.

Note that the RECORD AUDIO block requires a path and a file_name. The path is extracted from the file you choose; that file's name does not have to match file_name. The recording is therefore saved as {path}/{file_name}.wav. The SPEECH_2_TEXT block, however, requires you to choose the exact file. If the file has not been created yet, you may have to run the RECORD AUDIO block first.
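The path rule above can be sketched in a few lines of Python. This is only an illustration, not the RECORD AUDIO block's actual code: the helper names recorded_wav_path and recording_exists are hypothetical, and the sketch assumes that "path" means the parent directory of the chosen file.

```python
from pathlib import Path


def recorded_wav_path(chosen_file: str, file_name: str) -> Path:
    """Build {path}/{file_name}.wav from the chosen file and file_name.

    Assumption: {path} is the directory that contains the chosen file.
    The chosen file's own name is ignored.
    """
    directory = Path(chosen_file).parent
    return directory / f"{file_name}.wav"


def recording_exists(chosen_file: str, file_name: str) -> bool:
    """Return True if the recording is already on disk.

    If this is False, the recording step has to run before the
    transcription step can be pointed at the file.
    """
    return recorded_wav_path(chosen_file, file_name).is_file()
```

For example, choosing a file named notes.txt in some folder and setting file_name to my_recording would point at my_recording.wav in that same folder. That .wav file is the one to select in SPEECH_2_TEXT once recording_exists reports that it has been created.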