Questions tagged [voice-recognition]
Voice recognition means identifying the person who is talking. The term is frequently misapplied to mean "speech recognition", which is identifying what is being said.
1,459
questions
0
votes
0
answers
9
views
Issue with Data Preprocessing and Tensor Concatenation for Whisper Model Training
I am trying to train a Whisper model for Jeju dialect speech recognition. However, I am encountering several errors related to tensor concatenation during the data preprocessing phase. Below is the ...
-4
votes
0
answers
20
views
How can I access these elements? [closed]
LIVEKIT_URL=
LIVEKIT_API_KEY=
LIVEKIT_API_SECRET=
DEEPGRAM_API_KEY=
A screenshot from my account related to the project.
Is the URL at the top of the page? I can'...
0
votes
0
answers
46
views
SpeechToText (Voice Recognition) result is an empty string on Android devices but not on Windows (.NET MAUI)
Can anyone solve or explain this problem?
The voice recognition result is fine on Windows but comes back as an empty string on Android devices.
Could this be a permission problem, or what else might cause this ...
0
votes
0
answers
38
views
Android, how to launch application via Google Assistant?
I'm implementing an application, and they provide the app through their website, not through the Google Play Console.
I just want to integrate Google Assistant into that application. All I need to ...
0
votes
0
answers
54
views
How to input MediaRecorder webm opus bytes to Whisper model?
I am recording voice on the client side using MediaRecorder and sending the resulting blob of WebM/Opus bytes to the server over a WebSocket, with this code:
<script type="text/...
0
votes
0
answers
19
views
Sending file from Raspberry Pi Pico to voice recognition API
I have a project for a bedside lamp with a voice assistant that uses special requests. I have this code on my PC:
import os
import shutil
import time
import requests
import speech_recognition as sr
from ...
0
votes
0
answers
62
views
Synthesizing Audio with Unseen Speakers Using Pre-trained VITS Model
I've been using a pre-trained VITS model (VCTK dataset) for text-to-speech synthesis. I've successfully obtained a list of available speakers using the command:
!tts --model_name tts_models/en/vctk/...
1
vote
1
answer
49
views
To address the issue of the microphone picking up both the user's voice and the NVDA (Nonvisual Desktop Access) voice output while recording
I have implemented a voice-to-text feature in my application, and I am using the connected earphone microphone for voice input via the navigator.mediaDevices.getUserMedia({audio: true}) API. The ...
2
votes
1
answer
236
views
Offline voice recognition for the Arabic language in Flutter?
I am using speech_to_text and it works very well online, but I want it to work offline. How can I do this in Flutter?
According to the package's README file:
In the Google app:
Install the Google app
Settings > Voice ...
0
votes
0
answers
22
views
Duplicate Recording Requests in Browser Extension for Mic/Screen sourced Audio Transcription
I have a website where one function lets the user transcribe audio from their mic and PC audio at the same time, to run other things (core function). Since the browser won't let computer audio be ...
0
votes
0
answers
83
views
SpeechBrain's SpeakerRecognition saves shortcuts/links/symlinks of used audio files in the working directory
I use SpeechBrain's speaker recognition in Python:
from speechbrain.inference.speaker import SpeakerRecognition
and I load a model in the following way
model = SpeakerRecognition....
-1
votes
1
answer
120
views
Do some LLMs understand the voice directly, or do they have to go through a text transcription stage? [closed]
I want to interact with an LLM via voice.
In order to select the right model, I'd like to know if there are LLMs that understand voice directly.
If not, I'll have to transcribe the user's voice into ...
0
votes
0
answers
35
views
Android SpeechRecognizer not working with Chinese
I can't get the SpeechRecognizer to work with Chinese even though I have the language downloaded under "Offline Voice Recognition" in Google settings.
Other languages like English work ...
0
votes
0
answers
25
views
Hotwords won't trigger on bumblebee-hotword-node
I'm trying to set up bumblebee (bumblebee-hotword-node) within a server that will listen to a Discord server for the 'bumblebee' hotword, derived from https://github.com/SteTR/Emost-Bot/tree/master
So ...
0
votes
0
answers
21
views
Why doesn't video-conferencing with subtitles exist?
Voice-to-text has existed for a long time, and is built into every major desktop and mobile OS.
Video conferencing exists on many platforms and across many protocols.
Language translation tools can ...