Research on Automatic Speech Recognition for dysarthric speech (Jupyter Notebook, updated Aug 1, 2024)
A fast CPU-first video/audio transcriber for generating caption files with Whisper and CTranslate2, hosted on Hugging Face Spaces.
A production-first, production-ready end-to-end speech recognition toolkit.
On-device speech-to-text engine powered by deep learning
On-device streaming speech-to-text engine powered by deep learning
Text To Speech (TTS) and Automatic Speech Recognition (ASR).
Transcribe, translate, diarize, annotate and subtitle video (and audio) with Whisper ... fast!
An Android keyboard that performs speech-to-text (STT/ASR) with OpenAI Whisper and inputs the recognized text; supports English, Chinese, Japanese, and more, including mixed-language input.
This is the official implementation of our neural-network-based fast diffuse room impulse response generator (FAST-RIR) for generating room impulse responses (RIRs) for a given acoustic environment.
Thonburian Whisper: Open models for fine-tuned Whisper in Thai. Try our demo on Huggingface space:
⚡ TensorFlowASR: Almost state-of-the-art Automatic Speech Recognition in TensorFlow 2. Supports any language that can be tokenized into characters or subwords.
A multilingual automatic speech recognition and video captioning tool using faster-whisper. Supports real-time translation to English and runs on consumer-grade CPUs.
A synthetic data augmentation technique via LLM for Automatic Speech Recognition fine-tuning.
🐍📦 Rapidly calculate and analyze the Word Error Rate (WER) with this powerful yet lightweight Python package.
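For context on what such a WER package computes: word error rate is the word-level edit distance (substitutions + insertions + deletions) divided by the number of reference words. A minimal sketch in plain Python follows; it illustrates the metric itself and is not the API of the package above.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: word-level Levenshtein distance divided by
    the number of reference words (each substitution, insertion, or
    deletion costs 1)."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i                       # delete all reference words
    for j in range(len(hyp) + 1):
        dp[0][j] = j                       # insert all hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # substitution / match
    return dp[len(ref)][len(hyp)] / len(ref)

print(wer("the quick brown fox", "the quick fox"))  # one deletion -> 0.25
```

A dedicated package adds vectorized computation and per-error-type breakdowns on top of this same recurrence.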
Al Ajwad is a graduation project submitted to the Department of Computers and Systems Engineering, Minia University, in partial fulfilment of a B.Sc. degree. It is an ASR model trained to recognize the Tajweed rules of Holy Quran recitation.
Transcribe audio and generate SRT and VTT files using Whisper models and wit.ai.
A modification on the Sharif Emotional Speech Database
Scripts for data generation, scoring and data manifest preparation for CHiME-8 DASR task.
[UAI 2024 paper] DistriBlock: Identifying adversarial audio samples by leveraging characteristics of the output distribution.