Newest 'voice-recognition' Questions

0 votes

0 answers

9 views

Issue with Data Preprocessing and Tensor Concatenation for Whisper Model Training

I am trying to train a Whisper model for Jeju dialect speech recognition. However, I am encountering several errors related to tensor concatenation during the data preprocessing phase. Below is the ...

dw26

1

asked Jul 17 at 1:45

-4 votes

0 answers

20 views

How can I access to these elements? [closed]

LIVEKIT_URL= LIVEKIT_API_KEY= LIVEKIT_API_SECRET= DEEPGRAM_API_KEY= The screenshot is from my account related to the project. Is the URL at the top of the page? I can'...

Raha

1

asked Jul 16 at 15:20

0 votes

0 answers

46 views

SpeechToText (Voice Recognition) result is empty string on Android device but not on Windows .NET MAUI

Can anyone solve or explain this problem? The voice recognition result is fine on Windows but returns an empty string on Android devices. Is there possibly a permission problem, or what else might cause this ...

Weissu

419

asked Jun 30 at 9:08

0 votes

0 answers

38 views

Android, how to launch application via Google Assistant?

I'm implementing an application whose owners distribute it through their website, not through Google Play Console. I just want to integrate Google Assistant into that application. All I need to ...

testivanivan

1,318

asked Jun 18 at 16:12

0 votes

0 answers

54 views

How to input MediaRecorder webm opus bytes to Whisper model?

I am recording voice, on the client side, using MediaRecorder, and sending the resulting blob of (webm, opus) bytes to the server using a WebSocket, with this code: <script type="text/...

Bob Bobson

1

asked Jun 9 at 21:22

0 votes

0 answers

19 views

Sending file from Raspberry Pi Pico to voice recognition API

I am building a bedside lamp with a voice assistant that uses special requests. I have this code on my PC: import os import shutil import time import requests import speech_recognition as sr from ...

Melki Youssef

1

asked May 16 at 19:11

0 votes

0 answers

62 views

Synthesizing Audio with Unseen Speakers Using Pre-trained VITS Model

I've been using a pre-trained VITS model (VCTK dataset) for text-to-speech synthesis. I've successfully obtained a list of available speakers using the command: !tts --model_name tts_models/en/vctk/...

Adil Ahmed Chowdhury

365

asked May 12 at 21:29

1 vote

1 answer

49 views

To address the issue of the microphone picking up both the user's voice and the NVDA (Nonvisual Desktop Access) voice output while recording

I have implemented a voice-to-text feature in my application, and I am using the connected earphone microphone for voice input via the navigator.mediaDevices.getUserMedia({audio: true}) API. The ...

Moni

21

asked May 12 at 13:44

2 votes

1 answer

236 views

offline voice recognition for Arabic Language in flutter?

I am using speech_to_text and it works very well online, but I want it to work offline. How can I do this in Flutter? According to the package's README: install the Google app, then Settings > Voice ...

Tarek_ElsaWy

47

asked May 4 at 16:15

0 votes

0 answers

22 views

Duplicate Recording Requests in Browser Extension for Mic/Screen sourced Audio Transcription

I have a website whose core function lets the user transcribe audio from their mic and PC audio at the same time, to run other things. Since the browser won't let computer audio be ...

D M

13

asked Apr 30 at 14:53

0 votes

0 answers

83 views

Speechbrains SpeakerRecognition saves short cuts/links/symlinks of used audio files in working directory

I use the speaker recognition of speechbrain using the Python language: from speechbrain.inference.speaker import SpeakerRecognition and I load a model in the following way model = SpeakerRecognition....

Tütü

3

asked Apr 12 at 11:26

-1 votes

1 answer

120 views

Do some LLMs understand the voice directly, or do they have to go through a text transcription stage? [closed]

I want to interact with an LLM via voice. In order to select the right model, I'd like to know if there are LLMs that understand voice directly. If not, I'll have to transcribe the user's voice into ...

Heloddius

3

asked Apr 1 at 10:49

0 votes

0 answers

35 views

Android SpeechRecognizer not working with Chinese

I can't get the SpeechRecognizer to work with Chinese even though I have the language downloaded under "Offline Voice Recognition" in the Google settings. Other languages like English work ...

Andrey Starenky

64

asked Mar 12 at 7:06

0 votes

0 answers

25 views

Hotwords won't trigger on bumblebee-hotword-node

I'm trying to set up bumblebee (bumblebee-hotword-node) within a server that will listen to a Discord server for the 'bumblebee' hotword, derived from https://github.com/SteTR/Emost-Bot/tree/master So ...

Jack blundell

1

asked Mar 8 at 14:26

0 votes

0 answers

21 views

Why doesn't video-conferencing with subtitles exist?

Voice-to-text has existed for a long time, and is built into every major desktop and mobile OS. Video conferencing exists on many platforms and across many protocols. Language translation tools can ...

Shane Millsom

9

asked Mar 8 at 4:31
