🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
Speech recognition framework allowing powerful Python-based scripting and extension of Dragon NaturallySpeaking (DNS), Windows Speech Recognition (WSR), Kaldi and CMU Pocket Sphinx
Port of OpenAI's Whisper model in C/C++
A voice-operated emailing mobile application that allows you to compose and send email messages through voice commands.
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
End-to-End Speech Processing Toolkit
Open-source, accurate and easy-to-use video speech recognition & clipping tool, with LLM-based AI clipping integrated.
OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference
Customizable TTS Chat Bot using OpenAI & Google Cloud TTS/ElevenLabs
Talk to Rawan voice-to-voice using speech recognition or text-to-speech, with ElevenLabs technology and ChatGPT on the web.
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.
ICASSP 2023-2024 Papers: A complete collection of influential and exciting research papers from the ICASSP 2023-24 conferences. Explore the latest advancements in acoustics, speech and signal processing. Code included. Star the repository to support the advancement of audio and signal processing!
StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.
OBS plugin for local speech recognition and captioning using AI
Tools for handling speech data in machine learning projects.
Official Python SDK for Deepgram's automated speech recognition APIs.
Official repository for the open-source text dataset for NMT for local languages in West Africa (EWE Corpus)
A library for real-time voice processing in web browsers
Running a speech-to-text model (whisper.cpp) in Unity3d on your local machine.
A simple speech-to-text and text-to-speech program/frontend.