5

To recognize speech via the Google server, I use the SpeechRecognizer class together with a RecognitionListener, as suggested in Stephan's answer to this question. In addition, I try to capture the audio signal being recognized by using the onBufferReceived() callback of RecognitionListener, like this:

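byte[] sig = new byte[500000];  // fixed-size capture buffer
int sigPos = 0;                 // next write position in sig
...
@Override
public void onBufferReceived(byte[] buffer) {
  // Copy only what still fits, so a long utterance cannot overflow sig.
  int n = Math.min(buffer.length, sig.length - sigPos);
  System.arraycopy(buffer, 0, sig, sigPos, n);
  sigPos += n;
}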
...

This seems to work fine, except when SpeechRecognizer fails to connect to the Google server: in that case an HTTP connection time-out exception is thrown and a chunk of audio is not copied into the sig array above. SpeechRecognizer eventually connects to the Google server, and the recognition results indicate that the complete audio signal was received; only the sig array is missing some audio chunk(s).

Has anybody experienced the same problem? Any hints for a solution? Thank you!

1
  • How did you process sig to get the original audio signal back and recognize the missing chunks?
    – CompEng88
    Commented Mar 14, 2012 at 7:34

3 Answers

1

I tend to say this might be an inconsistency in the behavior of the recognition service, maybe even a bug in the Android version you use. However, the documentation states that this method is not guaranteed to be called, so the behavior would still fit the specification. What I have noticed so far (on Android 2.3.4) is the following: I get the bytes while recording, but if, for example, a SocketTimeout occurs, the service tries to resend the data to the server after some time without calling onBufferReceived again for the same data. The code I used to test this was the same as the code you linked in your post.
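Since the callback is best-effort, the capturing side should probably not assume a fixed number or size of buffers. Here is a minimal sketch in plain Java of a growable capture buffer. The class name `CapturedAudio` and its methods are hypothetical (they are not part of the Android API); the idea is that `onBufferReceived(byte[] buffer)` would simply delegate to `append`.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical helper (not an Android class): collects whatever buffers
// the recognizer chooses to deliver, growing as needed.
class CapturedAudio {
    private final ByteArrayOutputStream out = new ByteArrayOutputStream();
    private int calls = 0;

    // Intended to be called from onBufferReceived(byte[] buffer).
    synchronized void append(byte[] buffer) {
        if (buffer == null || buffer.length == 0) {
            return; // nothing was delivered in this call
        }
        out.write(buffer, 0, buffer.length);
        calls++;
    }

    // Copy of all bytes received so far.
    synchronized byte[] toByteArray() {
        return out.toByteArray();
    }

    // Total number of bytes received so far.
    synchronized int size() {
        return out.size();
    }

    // Number of non-empty buffers received so far.
    synchronized int callCount() {
        return calls;
    }
}
```

This does not recover audio that was never delivered; it only avoids overflow and keeps a count of deliveries that can be logged when a time-out occurs.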

Why do you think some chunks are missing from the audio you received in the method? If only a few chunks were missing, it might even be that the recognition worked although those chunks were missing.
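One rough way to check for missing chunks is to compare the number of bytes received with the number expected for the time spent speaking. The sketch below assumes 16-bit single-channel audio, as the onBufferReceived documentation describes; the sample rate is implementation dependent, so it has to be supplied by you. `GapEstimator` and its methods are hypothetical names, and the duration is assumed to be measured by you, for example between onBeginningOfSpeech() and onEndOfSpeech().

```java
// Hypothetical helper: estimates how much audio is missing from a capture.
class GapEstimator {
    private static final int BYTES_PER_SAMPLE = 2; // 16-bit mono (assumed)

    // Bytes that a recording of the given length should contain.
    static long expectedBytes(int sampleRateHz, long durationMs) {
        return (long) sampleRateHz * BYTES_PER_SAMPLE * durationMs / 1000;
    }

    // Milliseconds of audio unaccounted for; 0 if nothing appears missing.
    static long missingMs(int sampleRateHz, long durationMs, long receivedBytes) {
        long missingBytes = expectedBytes(sampleRateHz, durationMs) - receivedBytes;
        if (missingBytes <= 0) {
            return 0;
        }
        return missingBytes * 1000 / ((long) sampleRateHz * BYTES_PER_SAMPLE);
    }
}
```

For example, at an assumed 8000 Hz, a 3-second utterance should yield 48000 bytes; if only 40000 arrived, `missingMs(8000, 3000, 40000)` reports about 500 ms missing. The estimate is only as good as the assumed sample rate and the measured duration.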

1

In modern Android versions onBufferReceived does not work; see "record/save audio from voice recognition intent" instead.

1

The best way to achieve this is the other way round: capture the audio data yourself using AudioRecord (I'd recommend VOICE_COMMUNICATION rather than MIC as the input source, so you get really clean audio), then pass it through to the SpeechRecognizer. :)
