asr

Here are 1,020 public repositories matching this topic...

NVIDIA / NeMo

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

machine-translation tts speech-synthesis neural-networks deeplearning speaker-recognition asr multimodal speech-translation large-language-models speaker-diariazation generative-ai

Updated Jun 3, 2024
Python

R3gm / SoniTranslate

Synchronized Translation for Videos. Video dubbing

text-to-speech translation tts speech-to-text stt audio-processing asr document-translator dubbing diarization automatic-dubbing subtitle-to-speech translate-audio translate-video video-dubbing

Updated Jun 2, 2024
Python

mkiol / dsnote

Speech Note Linux app. Note taking, reading and translating with offline Speech to Text, Text to Speech and Machine translation.

text-to-speech translator translation offline machine-translation sailfishos tts speech-synthesis speech-recognition speech-to-text nmt linux-desktop stt asr flatpak-applications

Updated Jun 2, 2024
C++

DmitryRyumin / ICASSP-2023-24-Papers

ICASSP 2023-2024 Papers: A complete collection of influential and exciting research papers from the ICASSP 2023-24 conferences. Explore the latest advancements in acoustics, speech and signal processing. Code included. Star the repository to support the advancement of audio and signal processing!

Updated Jun 2, 2024
Python

unnumsykar / knowledge-transfer-GenAI

How to compress a large knowledge base (.mp4, .mp3, .wav) and transfer it into a readable, short, summarized form for effective knowledge transfer.

asr gpt-4 genai-usecase

Updated Jun 2, 2024

k2-fsa / sherpa-onnx

Speech-to-text, text-to-speech, and speaker recognition using next-gen Kaldi with onnxruntime without Internet connection. Supports embedded systems, Android, iOS, Raspberry Pi, RISC-V, x86_64 servers, websocket server/client, C/C++, Python, Kotlin, C#, Go, NodeJS, Java, Swift

android windows macos linux raspberry-pi ios text-to-speech csharp cpp dotnet speech-to-text aarch64 mfc risc-v asr arm32 onnx vits openkylin

Updated Jun 2, 2024
C++

m-bain / whisperX

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

speech speech-recognition speech-to-text whisper asr

Updated Jun 2, 2024
Python

CheshireCC / faster-whisper-GUI

faster_whisper GUI with PySide6

openai vad whisper asr transcribe voice-transcription faster-whisper whisperx

Updated Jun 2, 2024
Python

swapnil233 / qualsearch-nextjs

Comprehensive qualitative data analysis software for UX research. User interview tagging, AI-supported analysis, team management, etc.

ux transcription asr ux-testing ux-research caqdas diarization ux-analytics deepgram thematic-analysis automatic-speaker-recognition

Updated Jun 1, 2024
TypeScript

speechbrain / speechbrain

A PyTorch-based Speech Toolkit

Updated Jun 1, 2024
Python

PaddlePaddle / PaddleSpeech

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.

Updated Jun 1, 2024
Python

wenet-e2e / wenet

Production First and Production Ready End-to-End Speech Recognition Toolkit

pytorch transformer speech-recognition automatic-speech-recognition production-ready whisper asr conformer e2e-models

Updated Jun 1, 2024
Python

deepgram / deepgram-python-sdk

Official Python SDK for Deepgram's automated speech recognition APIs.

python speech-recognition hacktoberfest asr deepgram automated-speech-recognition

Updated May 31, 2024
Python

AssemblyAI / assemblyai-node-sdk

The AssemblyAI JavaScript SDK provides an easy-to-use interface for interacting with the AssemblyAI API, which supports async and real-time transcription, audio intelligence models, as well as the latest LeMUR models.

nodejs ai speech-to-text transcription asr assemblyai llm

Updated May 31, 2024
TypeScript

KevKibe / African-Whisper

🚀 Framework for seamless fine-tuning of the Whisper model on a multilingual dataset and deployment to production.

speech speech-recognition speech-to-text whisper asr speech-translation speech-transcription

Updated May 31, 2024
Python

achrafash / voice-studio

An audio labeling tool for transcription, diarization, VAD, and more.

data machine-learning labeling asr

Updated May 31, 2024
TypeScript

k2-fsa / sherpa

Speech-to-text server framework with next-gen Kaldi

python cpp websocket pytorch speech-recognition transducer asr ctc end-to-end-asr

Updated May 31, 2024
C++

inworld-ai / inworld-web-sdk

Web SDK for Inworld.ai. Integrate AI characters into your browser.

ai character tts speech-recognition npc asr

Updated Jun 1, 2024
TypeScript

AssemblyAI / assemblyai-ruby-sdk

The AssemblyAI Ruby SDK provides an easy-to-use interface for interacting with the AssemblyAI API, which supports async and real-time transcription, audio intelligence models, as well as the latest LeMUR models.

ruby ai speech-to-text transcription stt asr assemblyai llm

Updated May 30, 2024
Ruby

AssemblyAI / assemblyai-java-sdk

The AssemblyAI Java SDK provides an easy-to-use interface for interacting with the AssemblyAI API, which supports async and real-time transcription, audio intelligence models, as well as the latest LeMUR models.

java ai speech-to-text transcription stt asr assemblyai llm

Updated May 31, 2024
Java
