Questions tagged [voice-recognition]

Voice Recognition means identification of the person talking and is frequently misapplied to mean "Speech Recognition" - identification of what is being said.

voice-recognition
0 votes
0 answers
9 views

Issue with Data Preprocessing and Tensor Concatenation for Whisper Model Training

I am trying to train a Whisper model for Jeju dialect speech recognition. However, I am encountering several errors related to tensor concatenation during the data preprocessing phase. Below is the ...
dw26's user avatar
  • 1
-4 votes
0 answers
20 views

How can I access these elements? [closed]

LIVEKIT_URL= LIVEKIT_API_KEY= LIVEKIT_API_SECRET= DEEPGRAM_API_KEY= The screenshot from my account related to the project. Is the URL at the top of the page? I can'...
Raha's user avatar
  • 1
0 votes
0 answers
46 views

SpeechToText (Voice Recognition) result is empty string on Android device but not on Windows .NET MAUI

Can anyone solve or explain this problem? The voice recognition result is fine on Windows but returns an empty string on Android devices. Is there possibly some permission problem, or what else might cause this ...
Weissu's user avatar
  • 419
0 votes
0 answers
38 views

Android, how to launch application via Google Assistant?

I'm implementing an application and they are providing their app through their Website, not from Google Play Console. I just wanted to integrate Google Assistant into that application. All I need to ...
testivanivan's user avatar
  • 1,318
0 votes
0 answers
54 views

How to input MediaRecorder webm opus bytes to Whisper model?

I am recording voice, on the client side, using MediaRecorder, and sending the resulting blob of (webm, opus) bytes to the server using a WebSocket, with this code: <script type="text/...
Bob Bobson's user avatar
0 votes
0 answers
19 views

Sending file from Raspberry Pi Pico to voice recognition API

I am working on a bedside lamp project with a voice assistant using special requests. I have this code on my PC: import os import shutil import time import requests import speech_recognition as sr from ...
Melki Youssef's user avatar
0 votes
0 answers
62 views

Synthesizing Audio with Unseen Speakers Using Pre-trained VITS Model

I've been using a pre-trained VITS model (VCTK dataset) for text-to-speech synthesis. I've successfully obtained a list of available speakers using the command: !tts --model_name tts_models/en/vctk/...
Adil Ahmed Chowdhury's user avatar
1 vote
1 answer
49 views

To address the issue of the microphone picking up both the user's voice and the NVDA (Nonvisual Desktop Access) voice output while recording

I have implemented a voice-to-text feature in my application, and I am utilizing the connected earphone microphone for voice input using navigator.mediaDevices.getUserMedia({audio: true}) API. The ...
Moni's user avatar
  • 21
2 votes
1 answer
236 views

Offline voice recognition for the Arabic language in Flutter?

I am using speech_to_text and it works very well online, but I want it to work offline. How can I do this in Flutter? According to the package's README file, in the Google app: install Google app, Settings > Voice ...
Tarek_ElsaWy's user avatar
0 votes
0 answers
22 views

Duplicate Recording Requests in Browser Extension for Mic/Screen sourced Audio Transcription

I have a website where one function lets the user transcribe audio from their mic and PC audio at the same time, to run other things (core function). Since the browser won't let computer audio be ...
D M's user avatar
  • 13
0 votes
0 answers
83 views

SpeechBrain's SpeakerRecognition saves shortcuts/links/symlinks of used audio files in working directory

I use SpeechBrain's speaker recognition in Python: from speechbrain.inference.speaker import SpeakerRecognition, and I load a model in the following way: model = SpeakerRecognition....
Tütü's user avatar
  • 3
-1 votes
1 answer
120 views

Do some LLMs understand the voice directly, or do they have to go through a text transcription stage? [closed]

I want to interact with an LLM via voice. In order to select the right model, I'd like to know if there are LLMs that understand voice directly. If not, I'll have to transcribe the user's voice into ...
Heloddius's user avatar
0 votes
0 answers
35 views

Android SpeechRecognizer not working with Chinese

I can't get the SpeechRecognizer to work with Chinese even though I have the language downloaded under "Offline Voice Recognition" in the Google settings. Other languages like English work ...
Andrey Starenky's user avatar
0 votes
0 answers
25 views

Hotwords won't trigger on bumblebee-hotword-node

I'm trying to set up bumblebee (bumblebee-hotword-node) within a server that will listen to a Discord server for the 'bumblebee' hotword, derived from https://github.com/SteTR/Emost-Bot/tree/master So ...
Jack blundell's user avatar
0 votes
0 answers
21 views

Why doesn't video-conferencing with subtitles exist?

Voice-to-text has existed for a long time, and is built into every major desktop and mobile OS. Video conferencing exists on many platforms and across many protocols. Language translation tools can ...
Shane Millsom's user avatar
