Push to Talk
Push to Talk (PTT) is a communication method where audio is only transmitted while the user actively holds down a button. This is useful in scenarios where you want to give users explicit control over when they are broadcasting, such as in gaming, conferencing, or radio-style communication.
This guide explains how to configure an AudioInput for PTT using its volumeGate setting.
Understanding VAD Defaults
Before implementing PTT, it's important to understand the default Voice Activity Detection (VAD) settings. The SDK provides VAD_DEFAULTS with two key components:
voiceActivity (unit: %)
Controls voice activity detection based on the probability that the audio contains speech:
attackThreshold: 0.9 - The probability (90%) required to start transmission
releaseThreshold: 0.8 - The probability (80%) below which transmission stops
volumeGate (unit: dBFS)
Controls volume-based gating to filter out quiet sounds:
attackThreshold: -30 - The volume level (in dBFS) required to open the gate and start transmission
releaseThreshold: -40 - The volume level (in dBFS) below which the gate closes and transmission stops
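To illustrate how an attack/release pair behaves, here is a minimal sketch of a gate with hysteresis. It does not use the SDK: the `Thresholds` type and the `nextGateState` and `traceGate` helpers are illustrative names, the numbers are the defaults listed above, and the treatment of values exactly at a threshold is an assumption.

```typescript
// Values copied from the defaults listed above. The type and helper names
// are illustrative and are not part of the SDK.
interface Thresholds {
  attackThreshold: number;
  releaseThreshold: number;
}

const voiceActivityDefaults: Thresholds = { attackThreshold: 0.9, releaseThreshold: 0.8 };
const volumeGateDefaults: Thresholds = { attackThreshold: -30, releaseThreshold: -40 };

// Computes the next open/closed state from the current state and one measurement.
// Assumption: a value exactly equal to a threshold counts as passing it.
function nextGateState(isOpen: boolean, level: number, t: Thresholds): boolean {
  const threshold = isOpen ? t.releaseThreshold : t.attackThreshold;
  return level >= threshold;
}

// Runs a sequence of measurements through the gate, starting closed.
function traceGate(levels: number[], t: Thresholds): boolean[] {
  let isOpen = false;
  const states: boolean[] = [];
  for (const level of levels) {
    isOpen = nextGateState(isOpen, level, t);
    states.push(isOpen);
  }
  return states;
}

console.log(traceGate([-50, -35, -25, -35, -45], volumeGateDefaults));
console.log(traceGate([0.5, 0.85, 0.95, 0.85, 0.7], voiceActivityDefaults));
```

Because the release threshold sits below the attack threshold, a level of -35 dBFS keeps an already open gate open but is not enough to open a closed one. This gap prevents the gate from flickering when the signal hovers around a single value.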
Creating an AudioInput for PTT
The best way to implement Push to Talk is to configure the InputSettings of the AudioInput so that volumeGate is set to -90. This very low threshold effectively keeps the gate closed until you explicitly open it when the PTT button is pressed.
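As a rough sketch, the starting configuration for PTT could be captured in an object like the one below. The `PttInputSettings` type, the `PTT_GATE_CLOSED_DBFS` constant, and the `createPttInputSettings` helper are illustrative names rather than SDK exports; only the `volumeGate` key and the -90 value come from this guide, and how the object is handed to the AudioInput depends on the SDK.

```typescript
// Illustrative shape only: the real InputSettings type comes from the SDK.
// Per this guide, volumeGate is -90 while idle and false while the button is held.
type PttInputSettings = { volumeGate: number | false };

// Gate value used whenever the PTT button is not pressed.
const PTT_GATE_CLOSED_DBFS = -90;

// Builds the settings an AudioInput would start with for PTT.
function createPttInputSettings(): PttInputSettings {
  return { volumeGate: PTT_GATE_CLOSED_DBFS };
}

const initialSettings = createPttInputSettings();
console.log(initialSettings.volumeGate);
```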
Adding AudioInput to a Room
Once you've created the AudioInput, add it to a Room.
Audio is only transmitted after an AudioInput has been attached to a room using room.addAudioInput(audioInput). With the PTT configuration, the volumeGate prevents transmission until the PTT button is pressed.
voiceActivity Variants
There are two approaches when implementing PTT, depending on whether you keep voiceActivity enabled or disabled.
Option 1: Keep voiceActivity enabled (Recommended)
Keeping voiceActivity enabled has the advantage that you won't transmit background noise even while the PTT button is held down. The downside is that the onAudioActivity indicator won't be active when pressing the PTT button if there's no detected speech.
This is the recommended approach for most use cases.
Option 2: Disable voiceActivity
If you want the audio activity indicator to always reflect the PTT button state (active when pressed, regardless of speech detection), you can disable voiceActivity.
With this configuration, audio will transmit whenever the PTT button is pressed, regardless of whether speech is detected. This may result in transmitting background noise.
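The difference between the two options can be summarised as a small decision function. This is a model of the behaviour described above, not SDK code: `isPttActive` and its parameters are hypothetical names, and the sketch assumes the audio activity indicator follows the same condition as transmission.

```typescript
// Models the behaviour of the two options; it does not call the SDK.
// The result stands for both "audio is transmitted" and "indicator is active".
function isPttActive(
  buttonHeld: boolean,
  speechDetected: boolean,
  voiceActivityEnabled: boolean,
): boolean {
  if (voiceActivityEnabled) {
    // Option 1: the button must be held and speech must be detected.
    return buttonHeld && speechDetected;
  }
  // Option 2: the button alone decides, so background noise can pass.
  return buttonHeld;
}

// Button held, nobody speaking:
console.log(isPttActive(true, false, true));  // Option 1
console.log(isPttActive(true, false, false)); // Option 2
```

The two options only differ while the button is held and no speech is detected: Option 1 stays silent, while Option 2 transmits whatever the microphone picks up.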
Implementing PTT Button Logic
To complete the PTT implementation, you need to handle the button press and release events. When the user presses the PTT button, set volumeGate to false to disable gating and allow transmission. When released, set it back to -90 to close the gate.
Make sure to handle the pointerleave event as well. This ensures the gate closes if the user moves their pointer away from the button while holding it down.
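The press, release, and leave handling described above might be wired up as in the following sketch. `attachPushToTalk`, `PointerEventSource`, and the `setVolumeGate` callback are hypothetical names: the callback stands in for whatever the SDK offers to update the volumeGate setting of an AudioInput, which this sketch does not assume.

```typescript
// Gate value applied while the button is idle (from this guide).
const GATE_CLOSED_DBFS = -90;

// Minimal structural type so the sketch also runs outside a browser.
interface PointerEventSource {
  addEventListener(type: string, listener: () => void): void;
}

// `setVolumeGate` is a hypothetical callback: it should forward the value to
// the SDK mechanism that updates the AudioInput's volumeGate setting.
function attachPushToTalk(
  button: PointerEventSource,
  setVolumeGate: (value: number | false) => void,
): void {
  const openGate = () => setVolumeGate(false);
  const closeGate = () => setVolumeGate(GATE_CLOSED_DBFS);

  button.addEventListener("pointerdown", openGate);
  button.addEventListener("pointerup", closeGate);
  // Close the gate if the pointer leaves the button while still pressed.
  button.addEventListener("pointerleave", closeGate);
}
```

In a browser, `button` would typically be the PTT button element. Note that `pointerleave` also fires when the pointer simply hovers over the button and moves away; in that case the handler just re-applies -90, which is harmless.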
Complete Example
Here's a complete example bringing all the pieces together:
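As a hedged sketch of how the pieces might fit together, the following uses in-memory stand-ins instead of the real SDK classes. Only the `room.addAudioInput(audioInput)` call is taken from this guide; `FakeAudioInput`, `FakeRoom`, the `settings` field, and the handler names are hypothetical and would be replaced by the SDK's own types and update mechanism.

```typescript
// In-memory stand-ins so the sketch runs without the SDK. Only
// room.addAudioInput(audioInput) mirrors a call named in this guide.
type VolumeGate = number | false;

class FakeAudioInput {
  settings: { volumeGate: VolumeGate };
  constructor(settings: { volumeGate: VolumeGate }) {
    this.settings = settings;
  }
}

class FakeRoom {
  readonly inputs: FakeAudioInput[] = [];
  addAudioInput(input: FakeAudioInput): void {
    this.inputs.push(input);
  }
}

const GATE_CLOSED = -90;

// 1. Create the input with the gate closed. voiceActivity is left at its
//    defaults, which corresponds to Option 1.
const audioInput = new FakeAudioInput({ volumeGate: GATE_CLOSED });

// 2. Attach the input to the room; nothing is transmitted before this step.
const room = new FakeRoom();
room.addAudioInput(audioInput);

// 3. Button handlers: disable gating while pressed, restore -90 on release.
function onPttDown(): void {
  audioInput.settings.volumeGate = false;
}

function onPttUp(): void {
  audioInput.settings.volumeGate = GATE_CLOSED;
}

// 4. In a browser these would be registered on the PTT button, for example:
//    button.addEventListener("pointerdown", onPttDown);
//    button.addEventListener("pointerup", onPttUp);
//    button.addEventListener("pointerleave", onPttUp);
```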