Overview

KrispVivaFilter is an audio processor that isolates the user’s voice in real-time audio streams using the Krisp VIVA SDK. It inherits from BaseAudioFilter and processes audio frames, using Krisp’s voice isolation algorithms to filter out background noise and other voices.

The filter optionally supports TTS detection, which delays voice isolation until bot speech playback has stopped and so prevents suppression artifacts on the human speech that follows. Enable this feature by providing tts_model_path or by setting the KRISP_VIVA_TTS_MODEL_PATH environment variable.

To use Krisp, you need a Krisp SDK license. Get started at Krisp.ai.

Installation

See the Krisp guide to learn how to install the Krisp VIVA SDK.

Environment Variables

You need to provide the path to the Krisp noise cancellation (NC) model file (.kef extension), either by setting the KRISP_VIVA_FILTER_MODEL_PATH environment variable or by passing model_path to the constructor. For SDK v1.6.1+, you also need to provide a Krisp API key via the api_key constructor parameter or the KRISP_VIVA_API_KEY environment variable. To enable delayed voice isolation, optionally provide a TTS detection model path via KRISP_VIVA_TTS_MODEL_PATH or the tts_model_path constructor parameter.
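
The fallback described above (constructor argument first, environment variable second) can be sketched as follows. This is only an illustration of the lookup order: resolve_setting is a hypothetical helper, not part of Pipecat or the Krisp SDK, and the filter's actual internals may differ.

```python
import os
from typing import Optional


def resolve_setting(explicit: Optional[str], env_var: str) -> Optional[str]:
    """Hypothetical helper: prefer the explicit value, else the environment variable.

    Returns None when neither is set (for the TTS model path, that would mean
    TTS detection stays disabled).
    """
    if explicit:  # a non-empty constructor argument wins
        return explicit
    return os.environ.get(env_var) or None


# Example lookups for the three settings named in this section.
model_path = resolve_setting(None, "KRISP_VIVA_FILTER_MODEL_PATH")
api_key = resolve_setting("", "KRISP_VIVA_API_KEY")
tts_model_path = resolve_setting(None, "KRISP_VIVA_TTS_MODEL_PATH")
```

Treating an empty string like None here follows the api_key description, which falls back to the environment variable when the argument is empty.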

Constructor Parameters

model_path
str
default:"None"
Path to the Krisp NC model file (.kef extension). You can set model_path directly or set the KRISP_VIVA_FILTER_MODEL_PATH environment variable to the model file path.
noise_suppression_level
int
default:"100"
Voice isolation level for the filter
api_key
str
default:""
Krisp SDK API key for licensing (required for SDK v1.6.1+). If empty, falls back to the KRISP_VIVA_API_KEY environment variable.
tts_model_path
str
default:"None"
Path to the Krisp TTS detection model file (.kef extension). If None, falls back to the KRISP_VIVA_TTS_MODEL_PATH environment variable. When neither is set, TTS detection is disabled and NC starts immediately.
tts_threshold
float
default:"0.5"
Probability threshold (0–1) above which a frame is classified as containing TTS. Only used when tts_model_path is set.
tts_detection_timeout
float
default:"3.0"
Seconds to wait for TTS before starting NC. If no TTS is detected within this window, the NC filter activates as soon as the window ends. Only used when tts_model_path is set.
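
As a rough sketch, the parameters above might be passed together like this. The paths and API key are placeholders rather than real values, and make_filter is a hypothetical wrapper that imports the filter lazily, since constructing it requires pipecat and the Krisp VIVA SDK to be installed.

```python
# Keyword arguments named after the constructor parameters listed above.
# All paths and the API key are placeholders; substitute your own values.
krisp_kwargs = {
    "model_path": "/path/to/nc_model.kef",       # hypothetical path
    "noise_suppression_level": 100,              # documented default
    "api_key": "YOUR_KRISP_API_KEY",             # placeholder (SDK v1.6.1+)
    "tts_model_path": "/path/to/tts_model.kef",  # hypothetical; enables TTS detection
    "tts_threshold": 0.5,                        # documented default
    "tts_detection_timeout": 3.0,                # documented default, in seconds
}


def make_filter(kwargs: dict):
    """Hypothetical wrapper: build the filter from a kwargs dict."""
    # Imported here so the dict above can be inspected without the SDK installed.
    from pipecat.audio.filters.krisp_viva_filter import KrispVivaFilter

    return KrispVivaFilter(**kwargs)
```

Omitting tts_model_path (and leaving KRISP_VIVA_TTS_MODEL_PATH unset) would leave TTS detection disabled, in which case the two tts_* tuning values are not used.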

Supported Sample Rates

The filter supports the following sample rates:
  • 8000 Hz
  • 16000 Hz
  • 24000 Hz
  • 32000 Hz
  • 44100 Hz
  • 48000 Hz
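
If you want to fail early on an unsupported rate, a small guard along these lines could run before the transport is created. It is a sketch only: check_sample_rate and SUPPORTED_SAMPLE_RATES are hypothetical names, and how the filter itself reacts to an unsupported rate is not covered here.

```python
# Sample rates listed in this section, in Hz.
SUPPORTED_SAMPLE_RATES = (8000, 16000, 24000, 32000, 44100, 48000)


def check_sample_rate(sample_rate: int) -> int:
    """Hypothetical guard: return the rate unchanged, or raise if it is not listed."""
    if sample_rate not in SUPPORTED_SAMPLE_RATES:
        raise ValueError(
            f"Unsupported sample rate {sample_rate} Hz; "
            f"expected one of {SUPPORTED_SAMPLE_RATES}"
        )
    return sample_rate
```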

Notes

When TTS detection is enabled (tts_model_path is set), the filter passes audio through unmodified until bot speech clears or the timeout expires. This prevents the noise cancellation filter from suppressing real human speech that immediately follows bot TTS playback. Once TTS is no longer detected (or the timeout elapses), noise cancellation activates for the remainder of the session.
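
The gating behavior described in this note can be pictured as a small state machine. The sketch below is an illustration, not Krisp's or Pipecat's implementation: TtsGate and its fields are hypothetical, it assumes the timeout only applies while no TTS has been seen, and it treats a single frame below the threshold as "bot speech has cleared", which a real detector would likely smooth over several frames.

```python
from dataclasses import dataclass


@dataclass
class TtsGate:
    """Hypothetical model of delayed NC activation."""

    threshold: float = 0.5   # mirrors tts_threshold
    timeout_s: float = 3.0   # mirrors tts_detection_timeout
    elapsed_s: float = 0.0
    tts_seen: bool = False
    nc_active: bool = False

    def update(self, tts_probability: float, frame_duration_s: float) -> bool:
        """Feed one frame's TTS probability; return True once NC should run."""
        if self.nc_active:
            return True  # once active, NC stays on for the rest of the session
        self.elapsed_s += frame_duration_s
        if tts_probability > self.threshold:
            self.tts_seen = True   # bot speech is playing: keep passing audio through
        elif self.tts_seen:
            self.nc_active = True  # bot speech was seen and has now cleared
        if not self.tts_seen and self.elapsed_s >= self.timeout_s:
            self.nc_active = True  # no TTS arrived within the window
        return self.nc_active


gate = TtsGate()
for probability in (0.9, 0.8, 0.1, 0.9):
    # False while TTS is playing, then True from the first clear frame onward
    active = gate.update(probability, frame_duration_s=0.5)
```

While update returns False, the audio would be passed through unmodified; once it returns True, frames would go through noise cancellation.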

Input Frames

FilterEnableFrame
Frame
Specific control frame to toggle filtering on/off
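from pipecat.frames.frames import FilterEnableFrame

# `worker` is assumed to be the running pipeline task (for example a PipelineTask)
# that the transport's pipeline is attached to.

# Disable voice isolation
await worker.queue_frame(FilterEnableFrame(False))

# Re-enable voice isolation
await worker.queue_frame(FilterEnableFrame(True))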

Usage Example

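from pipecat.audio.filters.krisp_viva_filter import KrispVivaFilter
from pipecat.transports.daily.transport import DailyParams, DailyTransport

# room_url and token are assumed to be defined elsewhere
# (your Daily room URL and meeting token).
transport = DailyTransport(
    room_url,
    token,
    "Respond bot",
    DailyParams(
        audio_in_enabled=True,
        # Enable Krisp voice isolation. With no arguments, the model path and
        # API key come from the environment variables described above.
        audio_in_filter=KrispVivaFilter(),
        audio_out_enabled=True,
    ),
)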