MeetStream Guide: Per-Participant Audio Streams
This guide explains how to capture individual audio recordings per participant in a meeting. Instead of a single mixed audio track, you receive a separate WebM audio file for each speaker.
Supported platforms: Google Meet, Zoom.
1) What you need
- A MeetStream API key (refer to Dashboard Setup).
- A meeting link (Google Meet or Zoom).
- Set audio_separate_streams: true in your Create Bot request.
2) Create a bot with per-participant audio
Use the Create Bot endpoint: https://docs.meetstream.ai/api-reference/ap-is/bot-endpoints/create-bot
Minimal example
Full example with recording config
Parameter reference
Tip: You can also pass audio_separate_streams nested inside recording_config. The top-level parameter takes precedence if both are set.
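As a rough sketch of the request body described in this section, the Python below assembles a Create Bot payload and mirrors the precedence rule from the tip. Only audio_separate_streams and recording_config come from this guide. The meeting_link field name, both helper functions, and the default of False when the flag is absent are assumptions for illustration, so check the Create Bot reference for the exact schema.

```python
from typing import Any, Dict, Optional


def build_create_bot_payload(
    meeting_link: str,
    separate_audio: bool = True,
    recording_config: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Assemble a Create Bot request body with per-participant audio enabled."""
    payload: Dict[str, Any] = {
        "meeting_link": meeting_link,  # assumed field name, not confirmed by this guide
        "audio_separate_streams": separate_audio,  # parameter named in this guide
    }
    if recording_config is not None:
        # Copy so the caller's dict is not mutated later.
        payload["recording_config"] = dict(recording_config)
    return payload


def effective_separate_audio(payload: Dict[str, Any]) -> bool:
    """Mirror the documented rule: the top-level flag wins over the nested one."""
    if "audio_separate_streams" in payload:
        return bool(payload["audio_separate_streams"])
    nested = payload.get("recording_config") or {}
    # Assumption: the feature is off when neither location sets the flag.
    return bool(nested.get("audio_separate_streams", False))
```

Setting the flag in only one place avoids relying on the precedence rule at all; the second helper is mainly useful for checking payloads that set it in both.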
3) What happens during the meeting
Once the bot joins, it automatically:
- Captures each participant’s audio as a separate file.
- Records up to 16 concurrent speaker streams.
- Continues recording throughout the meeting — no configuration needed per participant.
The bot also continues to produce the standard mixed audio file (audio.wav) regardless of whether audio_separate_streams is enabled.
Note on platform behaviour: The mechanism for capturing per-participant audio differs between Zoom and Google Meet and affects the isolation level of each file. See Section 7 for details.
4) Retrieve per-participant audio streams
After the bot leaves and audio processing completes, call the Get Audio Streams endpoint (/api/v1/bots/<BOT_ID>/get_audio_streams).
Response when processing is complete
Response when processing is still in progress
HTTP status 202 is returned when the bot has not yet left the meeting. Poll again after the bot exits.
Response when no audio streams are available
This is returned when audio_separate_streams was not enabled, or the meeting ended before any speech was detected.
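The three response cases above suggest a simple polling loop. The sketch below is one possible shape for it, not an official client: the fetch callable is a hypothetical stand-in for whatever HTTP call you make to get_audio_streams, and it is assumed to return the HTTP status code together with the parsed JSON body. The field names status, audio_status, and audio_streams_available are taken from this guide; the polling interval and attempt limit are arbitrary.

```python
import time
from typing import Any, Callable, Dict, Tuple

Response = Tuple[int, Dict[str, Any]]  # (HTTP status code, parsed JSON body)


def classify(status_code: int, body: Dict[str, Any]) -> str:
    """Map a get_audio_streams response to 'pending', 'ready', or 'unavailable'."""
    if status_code == 202 or body.get("status") == "in_progress":
        return "pending"  # bot has not left the meeting yet
    if body.get("audio_streams_available"):
        return "ready"
    if body.get("audio_status") == "Success":
        return "unavailable"  # processing finished, but no streams exist
    return "pending"  # assumption: processing may still be running


def wait_for_streams(
    fetch: Callable[[], Response],
    interval_seconds: float = 30.0,
    max_attempts: int = 20,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[str, Dict[str, Any]]:
    """Call fetch() until the response is no longer pending, then return it."""
    for attempt in range(max_attempts):
        status_code, body = fetch()
        state = classify(status_code, body)
        if state != "pending":
            return state, body
        if attempt < max_attempts - 1:
            sleep(interval_seconds)
    raise TimeoutError("audio streams were still pending after the last attempt")
```

Passing fetch and sleep as arguments keeps the loop independent of any particular HTTP library and makes it easy to exercise without network access.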
5) Download the audio files
Each segment in the response contains a url field — a presigned S3 URL that allows direct download without additional authentication.
Important: Presigned URLs expire after 10 minutes. If a URL has expired, call get_audio_streams again to get fresh URLs.
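A minimal download sketch using only the Python standard library follows. It assumes that an expired presigned URL is rejected with HTTP 403, as the troubleshooting section describes; the function name, its return convention, and the opener parameter are hypothetical.

```python
import shutil
import urllib.error
import urllib.request
from typing import Callable


def download_segment(
    url: str,
    dest_path: str,
    opener: Callable = urllib.request.urlopen,
) -> bool:
    """Save one segment to dest_path.

    Returns True on success and False when the server answers 403, which
    this guide associates with an expired presigned URL. Other HTTP errors
    are re-raised unchanged.
    """
    try:
        # The destination file is only created once the URL opened successfully.
        with opener(url) as response, open(dest_path, "wb") as out:
            shutil.copyfileobj(response, out)
    except urllib.error.HTTPError as err:
        if err.code == 403:
            return False  # caller should request fresh URLs and retry
        raise
    return True
```

When the function returns False, call get_audio_streams again and retry with the new url value rather than retrying the same link.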
File format
6) Understanding segments
Each participant has one segment (segment_index: 0) covering the full duration of their speech in the meeting. Silence gaps between speech periods are preserved with actual silence, so the file’s timeline aligns with the real meeting timeline.
Use duration_seconds to determine how long a participant was speaking.
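To illustrate how duration_seconds might be used, the sketch below totals speaking time per participant. It assumes the segments have already been flattened into a list of dictionaries that each carry participant_name and duration_seconds; the actual nesting of the response is not shown in this guide, so treat that shape and the function name as hypothetical.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable


def speaking_time_by_participant(
    segments: Iterable[Dict[str, Any]],
) -> Dict[str, float]:
    """Sum duration_seconds for each participant_name."""
    totals: Dict[str, float] = defaultdict(float)
    for segment in segments:
        totals[segment["participant_name"]] += float(segment["duration_seconds"])
    return dict(totals)
```

With one segment per participant the sum is just that segment's duration, but the same code keeps working if a participant ever appears more than once.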
7) Per-participant audio by platform
The quality of speaker isolation differs by platform due to how each platform delivers audio.
Zoom produces the cleanest per-participant files. Google Meet files contain the meeting’s mixed audio during the periods when that participant was the active speaker.
8) Using both mixed and per-participant audio
Per-participant audio and the standard mixed audio are always captured simultaneously. You do not need to choose one or the other.
This produces:
- A mixed audio WAV (all participants combined): retrieve via the Get Bot Audio endpoint (/api/v1/bots/<BOT_ID>/get_audio).
- Per-participant WebM files: retrieve via the Get Audio Streams endpoint (/api/v1/bots/<BOT_ID>/get_audio_streams).
9) Using per-participant audio with per-participant video
Both flags can be enabled together:
This produces separate audio and video files per participant. The files are not muxed together — audio and video are delivered as independent files. Match them by participant_name across the two API responses.
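Matching by participant_name can be sketched as a simple join. The code below assumes each API response has been reduced to a list of dictionaries that include participant_name; the shape of the video response is not documented in this guide, so both input shapes and the function name are hypothetical.

```python
from typing import Any, Dict, Iterable, Optional, Tuple

Item = Dict[str, Any]
Pair = Tuple[Optional[Item], Optional[Item]]  # (audio item, video item)


def pair_by_participant(
    audio_items: Iterable[Item],
    video_items: Iterable[Item],
) -> Dict[str, Pair]:
    """Join audio and video entries on participant_name.

    A participant present in only one list is paired with None.
    """
    video_by_name = {item["participant_name"]: item for item in video_items}
    pairs: Dict[str, Pair] = {}
    for audio in audio_items:
        name = audio["participant_name"]
        pairs[name] = (audio, video_by_name.get(name))
    for name, video in video_by_name.items():
        # Keep video-only participants, for example someone who never spoke.
        pairs.setdefault(name, (None, video))
    return pairs
```

Because the join key is a display name, two participants with identical names would collide in this sketch; handle that case separately if it can occur in your meetings.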
10) Webhook notifications
If you have webhooks configured, you will receive an audio.processed event when audio processing (including per-participant stream generation) completes. Poll get_audio_streams until audio_streams_available is true if you prefer polling over webhooks.
11) Troubleshooting
- audio_streams_available: false with audio_status: "Success"?
  - Confirm that audio_separate_streams: true was set when the bot was created.
  - The meeting may have ended before any speech was detected.
- Missing a participant’s audio?
  - The bot captures up to 16 concurrent speaker streams. Meetings with more than 16 active speakers may result in some participants not being captured.
  - Participants who never spoke will not appear in the response.
- Getting status in_progress?
  - The bot is still in the meeting. Audio streams are generated after the bot exits. Wait for the bot to leave and poll again.
- Presigned URL returns 403 Forbidden?
  - The URL has expired (10-minute lifetime). Call get_audio_streams again for fresh URLs.
- Audio file sounds like the whole meeting, not just one person?
  - This is expected on Google Meet; see Section 7. The file contains the meeting audio during that participant’s active speaking windows. On Zoom, files are fully isolated.
