Overview
KrispVivaFilter is an audio processor that isolates the user’s voice in real-time audio streams using the Krisp VIVA SDK. It inherits from BaseAudioFilter and processes audio frames to improve audio quality, using Krisp’s voice isolation algorithms to filter out background noise and other voices.
The filter optionally supports TTS detection, which delays voice isolation until bot speech playback has stopped and so prevents suppression artifacts on subsequent human speech. Enable this feature by providing tts_model_path or by setting the KRISP_VIVA_TTS_MODEL_PATH environment variable.
To use Krisp, you need a Krisp SDK license. Get started at Krisp.ai.
Installation
See the Krisp guide to learn how to install the Krisp VIVA SDK.
Environment Variables
You need to provide the path to the Krisp noise cancellation model file (.kef extension). You can do this either by setting the KRISP_VIVA_FILTER_MODEL_PATH environment variable or by passing model_path to the constructor.
For SDK v1.6.1+, you also need to provide a Krisp API key via the api_key constructor parameter or the KRISP_VIVA_API_KEY environment variable.
Optionally, provide a TTS detection model path via KRISP_VIVA_TTS_MODEL_PATH or the tts_model_path constructor parameter to enable delayed voice isolation.
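The precedence described above (constructor argument first, environment variable as fallback) can be sketched in a few lines. This is an illustrative sketch only, not the filter's actual implementation: the helper name resolve_setting and the example path /models/nc.kef are hypothetical, and only the environment variable names come from this page.

```python
import os
from typing import Mapping, Optional


def resolve_setting(
    explicit: Optional[str],
    env_var: str,
    env: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Return the explicit value if given, else the environment variable, else None.

    Hypothetical helper for illustration; not part of the Krisp SDK or the filter API.
    """
    env = os.environ if env is None else env
    if explicit:  # a non-empty constructor argument wins
        return explicit
    return env.get(env_var) or None  # treat an empty variable as unset


# A fake environment keeps the example independent of the local machine.
fake_env = {"KRISP_VIVA_FILTER_MODEL_PATH": "/models/nc.kef"}  # placeholder path
model_path = resolve_setting(None, "KRISP_VIVA_FILTER_MODEL_PATH", fake_env)
tts_model_path = resolve_setting(None, "KRISP_VIVA_TTS_MODEL_PATH", fake_env)
```

In this sketch, an unset TTS model path resolves to None, which corresponds to TTS detection being disabled.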
Constructor Parameters
model_path
Path to the Krisp NC model file (.kef extension). You can set model_path directly. Alternatively, you can set the KRISP_VIVA_FILTER_MODEL_PATH environment variable to the model file path.
Voice isolation level for the filter.
api_key
Krisp SDK API key for licensing (required for SDK v1.6.1+). If empty, falls back to the KRISP_VIVA_API_KEY environment variable.
tts_model_path
Path to the Krisp TTS detection model file (.kef extension). If None, uses the KRISP_VIVA_TTS_MODEL_PATH environment variable. When not set, TTS detection is disabled and NC starts immediately.
Probability threshold (0–1) above which a frame is classified as containing TTS. Only used when tts_model_path is set.
Seconds to wait for TTS before starting NC. If no TTS is detected within this window, the NC filter activates immediately. Only used when tts_model_path is set.
Supported Sample Rates
The filter supports the following sample rates:
- 8000 Hz
- 16000 Hz
- 24000 Hz
- 32000 Hz
- 44100 Hz
- 48000 Hz
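Because only these rates are supported, it can help to validate the pipeline's input sample rate before constructing the filter. The check below is a minimal sketch; the names SUPPORTED_SAMPLE_RATES and check_sample_rate are hypothetical and not part of the filter's API, and how the filter itself reacts to an unsupported rate is not specified here.

```python
# Sample rates listed on this page, in Hz.
SUPPORTED_SAMPLE_RATES = (8000, 16000, 24000, 32000, 44100, 48000)


def check_sample_rate(sample_rate: int) -> int:
    """Return sample_rate unchanged if it is supported, otherwise raise ValueError."""
    if sample_rate not in SUPPORTED_SAMPLE_RATES:
        supported = ", ".join(str(rate) for rate in SUPPORTED_SAMPLE_RATES)
        raise ValueError(
            f"Unsupported sample rate {sample_rate} Hz; expected one of: {supported}"
        )
    return sample_rate
```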
Notes
When TTS detection is enabled (tts_model_path is set), the filter passes audio through unmodified until bot speech clears or the timeout expires. This prevents the noise cancellation filter from suppressing real human speech that immediately follows bot TTS playback. Once TTS is no longer detected (or the timeout elapses), noise cancellation activates for the remainder of the session.
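The gating behavior described above can be pictured as a small latch. The sketch below is an interpretation of this note, not the filter's real code: the class TtsGate, its default values, and the per-frame update method are all hypothetical, and it assumes the timeout is measured from the start of the stream and that a single frame below the threshold counts as bot speech having cleared.

```python
class TtsGate:
    """Hypothetical latch deciding when noise cancellation (NC) may start."""

    def __init__(self, threshold: float = 0.5, timeout_secs: float = 3.0):
        # Both defaults are assumptions for illustration, not documented values.
        self.threshold = threshold
        self.timeout_secs = timeout_secs
        self.elapsed_secs = 0.0
        self.tts_seen = False
        self.nc_active = False

    def update(self, tts_probability: float, frame_secs: float) -> bool:
        """Feed one frame's TTS probability; return True once NC should run."""
        if self.nc_active:
            return True  # latched on for the remainder of the session
        self.elapsed_secs += frame_secs
        is_tts = tts_probability > self.threshold
        if is_tts:
            self.tts_seen = True
        if self.tts_seen and not is_tts:
            self.nc_active = True  # bot speech was detected and has now cleared
        elif self.elapsed_secs >= self.timeout_secs:
            self.nc_active = True  # timeout elapsed
        return self.nc_active


# Five 20 ms frames: silence, three frames of bot speech, then a human speaks.
gate = TtsGate(threshold=0.5, timeout_secs=1.0)
states = [gate.update(p, frame_secs=0.02) for p in (0.1, 0.9, 0.8, 0.2, 0.9)]
```

While update returns False, audio would be passed through unmodified; once it returns True, it stays True, matching the "for the remainder of the session" behavior. A production implementation would likely smooth the per-frame decision rather than react to a single frame.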
Input Frames
Specific control frame to toggle filtering on/off