public class VoiceActivityDetector extends Object implements SpeechProcessor
VoiceActivityDetector is a speech pipeline component that implements Voice Activity Detection (VAD) using the native WebRTC VAD component from the Chromium browser. The detector processes each audio frame and marks the speech context as speech or non-speech based on the result of the VAD algorithm. It supports 16-bit PCM samples.
This pipeline component supports the following configuration properties:
The detector uses a simple consecutive value filter to eliminate noisy transitions.
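A consecutive value filter of this kind can be sketched as follows. This is a minimal illustration, not the detector's actual implementation: the class name `ConsecutiveValueFilter` and the `riseLength` and `fallLength` parameters are hypothetical. The filtered state changes only after the raw per-frame decision has held the opposite value for a required number of consecutive frames.

```java
/**
 * Hypothetical debouncing filter for per-frame VAD decisions.
 * Names and parameters are assumptions made for illustration only.
 */
final class ConsecutiveValueFilter {
    private final int riseLength; // consecutive speech frames needed to switch to speech
    private final int fallLength; // consecutive non-speech frames needed to switch back
    private boolean state;        // filtered output, starts as non-speech
    private boolean lastRaw;      // most recent raw decision
    private int runLength;        // length of the current run of identical raw decisions

    ConsecutiveValueFilter(int riseLength, int fallLength) {
        if (riseLength < 1 || fallLength < 1) {
            throw new IllegalArgumentException("lengths must be >= 1");
        }
        this.riseLength = riseLength;
        this.fallLength = fallLength;
    }

    /** Feeds one raw decision and returns the filtered state. */
    boolean update(boolean raw) {
        if (raw == lastRaw) {
            runLength++;
        } else {
            lastRaw = raw;
            runLength = 1;
        }
        int required = raw ? riseLength : fallLength;
        if (raw != state && runLength >= required) {
            state = raw;
        }
        return state;
    }

    /** Clears all internal state. */
    void reset() {
        state = false;
        lastRaw = false;
        runLength = 0;
    }
}
```

With `riseLength = 3`, a single isolated speech frame does not flip the output; only the third consecutive speech frame does. Using separate rise and fall lengths lets the filter react to speech onset and speech end at different speeds.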
Modifier and Type | Field and Description |
---|---|
static String | DEFAULT_MODE: default voice detection mode (high precision). |
Constructor and Description |
---|
VoiceActivityDetector(SpeechConfig config): constructs a new detector instance. |
Modifier and Type | Method and Description |
---|---|
void | close(): destroys the unmanaged VAD instance. |
void | process(SpeechContext context, ByteBuffer frame): processes a frame of audio. |
void | reset(): resets all state internal to the stage. |
public static final String DEFAULT_MODE

public VoiceActivityDetector(SpeechConfig config)
Parameters:
config - the pipeline configuration instance

public void close()
Specified by:
close in interface AutoCloseable

public void process(SpeechContext context, ByteBuffer frame)
Specified by:
process in interface SpeechProcessor
Parameters:
context - the current speech context
frame - the audio frame to detect

public void reset()
Specified by:
reset in interface SpeechProcessor
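
The `frame` argument to `process` is a `ByteBuffer` of 16-bit PCM samples. The sketch below shows one way to size and fill such a buffer using only `java.nio`. The helper class `PcmFrames`, the native byte order, the direct buffer, and the sample rate and frame width in the usage note are assumptions for illustration; this page does not specify them.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical helpers for building 16-bit PCM frames. */
final class PcmFrames {
    private PcmFrames() { }

    /** Bytes in one mono frame: samples per frame times 2 bytes per sample. */
    static int frameBytes(int sampleRateHz, int frameWidthMs) {
        return sampleRateHz * frameWidthMs / 1000 * 2;
    }

    /** Packs samples into a direct buffer in native byte order (an assumption). */
    static ByteBuffer toFrame(short[] samples) {
        ByteBuffer frame = ByteBuffer
            .allocateDirect(samples.length * 2)
            .order(ByteOrder.nativeOrder());
        for (short sample : samples) {
            frame.putShort(sample);
        }
        frame.flip(); // make the written samples readable from position 0
        return frame;
    }
}
```

For example, at an assumed 16 kHz sample rate and 20 ms frame width, `PcmFrames.frameBytes(16000, 20)` gives 640 bytes, which is 320 samples per frame.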
Copyright © 2020. All rights reserved.