feat(closed captions): Integration in the SDKs + UI Cookbooks #1508

Open: wants to merge 34 commits into base: main
Changes from 24 commits
Commits (34):
96ed9b9
docs: closed captions React cookbook
myandrienko Oct 3, 2024
1d26c9b
fix syntax highlighting
myandrienko Oct 3, 2024
ed47579
docs: closed captions (#1497)
szuperaz Oct 3, 2024
a32257a
Merge branch 'main' into docs/cc-react
oliverlaz Oct 3, 2024
4e0b750
feat: move closed caption handling to the SDK [wip]
oliverlaz Oct 3, 2024
933e754
feat: handle closed captions in the SDK
oliverlaz Oct 4, 2024
1895625
Merge branch 'main' into docs/cc-react
oliverlaz Oct 4, 2024
462c6fd
chore: more assertions
oliverlaz Oct 4, 2024
aaca354
chore: more tests
oliverlaz Oct 4, 2024
31ee11a
Update ts quickstart to use SDK state
szuperaz Oct 4, 2024
741e48c
feat: add cc start/stop endpoints
oliverlaz Oct 4, 2024
aa56be1
Update JS docs
szuperaz Oct 4, 2024
64eb786
Add start/stop endpoints
szuperaz Oct 4, 2024
10d5cb8
feat: enrich closed captions with speaker name
oliverlaz Oct 4, 2024
933cdbd
Merge branch 'main' into docs/cc-react
oliverlaz Oct 4, 2024
cf645da
Integrate speaker_name field to JS app
szuperaz Oct 4, 2024
da3bbbe
Remove unnecessary code
szuperaz Oct 4, 2024
7432c06
fix: use optimistic updates
oliverlaz Oct 7, 2024
1d0d2a6
fix: bubble the error up
oliverlaz Oct 7, 2024
a73c465
Merge branch 'main' into docs/cc-react
oliverlaz Oct 7, 2024
fb1e300
fix: update docs, missing hook, toggle button
oliverlaz Oct 7, 2024
c0fbeae
feat: add ClosedCaptions component for React Native
oliverlaz Oct 7, 2024
ee3720e
chore: missing imports
oliverlaz Oct 7, 2024
067c47e
chore: align cookbooks
oliverlaz Oct 7, 2024
6901fda
docs: update example, add screenshot
oliverlaz Oct 8, 2024
f09d1ec
feat: update to the newest OpenAPI schema; update the cookbooks
oliverlaz Oct 8, 2024
e44bfd6
fix: use correct types, update cookbook
oliverlaz Oct 8, 2024
f1cccbb
Merge branch 'refs/heads/main' into docs/cc-react
oliverlaz Oct 22, 2024
1dce3c2
Merge branch 'main' into docs/cc-react
oliverlaz Nov 1, 2024
026c1fa
feat: update schema
oliverlaz Nov 1, 2024
9e9bbbe
Merge branch 'main' into docs/cc-react
myandrienko Nov 7, 2024
e2e3828
update openapi types
myandrienko Nov 7, 2024
579950a
Merge branch 'main' into docs/cc-react
myandrienko Nov 14, 2024
740f195
revert doc updates
myandrienko Nov 14, 2024
@@ -45,32 +45,35 @@ Otherwise, `call.state` observables will emit empty values and you won't get rea

Here is an excerpt of the call state properties:

| Reactive value | Static value | Description |
| ------------------------ | ----------------------- | ------------------------------------------------------------------------------------------------------------ |
| `backstage$` | `backstage` | `true` when the call runs in `backstage` mode |
| `blockedUserIds$` | `blockedUserIds` | The list of blocked user IDs. |
| `callingState$` | `callingState` | Provides information about the call state. For example, `RINGING`, `JOINED` or `RECONNECTING`. |
| `callStatsReport$` | `callStatsReport` | When stats gathering is enabled, this observable will emit a new value at a regular (configurable) interval. |
| `createdAt$` | `createdAt` | The time the call was created. |
| `createdBy$` | `createdBy` | The user who created the call. |
| `custom$` | `custom` | Custom data attached to the call. |
| `dominantSpeaker$` | `dominantSpeaker` | The participant that is the current dominant speaker of the call. |
| `egress$` | `egress` | The egress data of the call (for broadcasting and livestreaming). |
| `endedAt$` | `endedAt` | The time the call was ended. |
| `endedBy$` | `endedBy` | The user who ended the call. |
| `hasOngoingScreenShare$` | `hasOngoingScreenShare` | It will return `true` if at least one participant is sharing their screen. |
| `ingress$` | `ingress` | The ingress data of the call (for broadcasting and livestreaming). |
| `members$` | `members` | The list of call members |
| `ownCapabilities$` | `ownCapabilities` | The capabilities of the local participant. |
| `pinnedParticipants$` | `pinnedParticipants` | The participants that are currently pinned. |
| `recording$` | `recording` | The recording state of the call. |
| `session$` | `session` | The data for the current call session. |
| `settings$` | `settings` | The settings of the call. |
| `startedAt$` | `startedAt` | The actual start time of the current call session. |
| `startsAt$` | `startsAt` | The time the call is scheduled to start. |
| `thumbnails$` | `thumbnails` | The thumbnails of the call. |
| `transcribing$` | `transcribing` | The transcribing state of the call. |
| `updatedAt$` | `updatedAt` | The time the call was updated. |
| Reactive value | Static value | Description |
| ------------------------------ | ----------------------------- | ------------------------------------------------------------------------------------------------------------ |
| `backstage$` | `backstage` | `true` when the call runs in `backstage` mode |
| `blockedUserIds$` | `blockedUserIds` | The list of blocked user IDs. |
| `callingState$` | `callingState` | Provides information about the call state. For example, `RINGING`, `JOINED` or `RECONNECTING`. |
| `callStatsReport$` | `callStatsReport` | When stats gathering is enabled, this observable will emit a new value at a regular (configurable) interval. |
| `captioning$` | `captioning` | Indicates whether closed captions are running for this call. |
| `closedCaptions$` | `closedCaptions` | The closed captions state of the call. |
| `createdAt$` | `createdAt` | The time the call was created. |
| `createdBy$` | `createdBy` | The user who created the call. |
| `custom$` | `custom` | Custom data attached to the call. |
| `dominantSpeaker$` | `dominantSpeaker` | The participant that is the current dominant speaker of the call. |
| `egress$` | `egress` | The egress data of the call (for broadcasting and livestreaming). |
| `endedAt$` | `endedAt` | The time the call was ended. |
| `endedBy$` | `endedBy` | The user who ended the call. |
| `hasOngoingScreenShare$` | `hasOngoingScreenShare` | It will return `true` if at least one participant is sharing their screen. |
| `ingress$` | `ingress` | The ingress data of the call (for broadcasting and livestreaming). |
| `members$` | `members` | The list of call members |
| `ownCapabilities$` | `ownCapabilities` | The capabilities of the local participant. |
| `pinnedParticipants$` | `pinnedParticipants` | The participants that are currently pinned. |
| `recording$` | `recording` | The recording state of the call. |
| `session$` | `session` | The data for the current call session. |
| `sessionParticipantsByUserId$` | `sessionParticipantsByUserId` | The participants of the call session by user ID. |
| `settings$` | `settings` | The settings of the call. |
| `startedAt$` | `startedAt` | The actual start time of the current call session. |
| `startsAt$` | `startsAt` | The time the call is scheduled to start. |
| `thumbnails$` | `thumbnails` | The thumbnails of the call. |
| `transcribing$` | `transcribing` | The transcribing state of the call. |
| `updatedAt$` | `updatedAt` | The time the call was updated. |

:::note
Your IDE of choice may help you to discover the other properties of the call state.
@@ -0,0 +1,106 @@
---
id: closed-captions
title: Closed Captions
description: How to add closed captions to your calls
---

The Stream API supports adding real-time closed captions (subtitles for participants) to your calls. This guide shows you how to implement this feature on the client side.

## Prerequisites

Make sure that the closed caption feature is enabled in your app's dashboard. The closed caption feature can be set on the call type level, and the available options are:

- `available`: the feature is available for your call and can be enabled.
- `disabled`: the feature is not available for your call. In this case, it's a good idea to "hide" any UI element you have related to closed captions.
- `auto-on`: the feature is available and will be enabled automatically once the user is connected to the call.

It's also possible to override the call type's default when creating a call:

```ts
await call.getOrCreate({
  data: {
    settings_override: {
      transcription: {
        mode: 'available',
        closed_caption_mode: 'available',
      },
    },
  },
});
```

You can check the current value like this:

```typescript
console.log(call.state.settings?.transcription.closed_caption_mode);
```
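
The three modes listed above map fairly directly onto UI decisions. The following is a minimal sketch of that mapping; `captionControlsFor` and the `CaptionControls` shape are hypothetical helpers for illustration and are not part of the SDK.

```typescript
// The three values documented above for `closed_caption_mode`.
type ClosedCaptionMode = 'available' | 'disabled' | 'auto-on';

// Hypothetical shape, for illustration only; not an SDK type.
interface CaptionControls {
  showToggle: boolean; // whether to render a captions toggle at all
  startsEnabled: boolean; // whether captions begin without user action
}

// Hypothetical helper: derive UI decisions from the configured mode.
// A missing mode is treated like `disabled`.
const captionControlsFor = (
  mode: ClosedCaptionMode | undefined,
): CaptionControls => {
  switch (mode) {
    case 'available':
      return { showToggle: true, startsEnabled: false };
    case 'auto-on':
      return { showToggle: true, startsEnabled: true };
    default:
      return { showToggle: false, startsEnabled: false };
  }
};

console.log(captionControlsFor('disabled')); // toggle hidden, captions off
```

In an application, the argument would typically come from `call.state.settings?.transcription.closed_caption_mode`.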

## Enabling, disabling and tweaking closed captions

If you set `closed_caption_mode` to `available`, you need to enable closed caption events when you want to see captions:

```typescript
await call.startClosedCaptions(); // enable closed captions
await call.stopClosedCaptions(); // disable closed captions

call.updateClosedCaptionSettings({
  retentionTimeInMs: 2700, // the duration a caption can stay in the queue
  queueSize: 2, // number of captions that can be stored in the queue
});
```
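
To make the two settings more concrete, here is a small, self-contained model of a bounded caption queue. It only illustrates what `queueSize` and `retentionTimeInMs` plausibly control; the `CaptionQueue` class is hypothetical and is not the SDK's actual implementation.

```typescript
// Hypothetical model of a bounded, time-limited caption queue.
// It illustrates the two settings; it is NOT the SDK's implementation.
interface QueuedCaption {
  text: string;
  addedAt: number; // timestamp in ms when the caption entered the queue
}

class CaptionQueue {
  private items: QueuedCaption[] = [];
  private readonly queueSize: number;
  private readonly retentionTimeInMs: number;

  constructor(queueSize: number, retentionTimeInMs: number) {
    this.queueSize = queueSize;
    this.retentionTimeInMs = retentionTimeInMs;
  }

  // Add a caption; drop the oldest entries once the queue is over capacity.
  push(text: string, now: number): void {
    this.items.push({ text, addedAt: now });
    while (this.items.length > this.queueSize) this.items.shift();
  }

  // Return the captions that are still within their retention window.
  visible(now: number): string[] {
    this.items = this.items.filter(
      (c) => now - c.addedAt < this.retentionTimeInMs,
    );
    return this.items.map((c) => c.text);
  }
}

const queue = new CaptionQueue(2, 2700);
queue.push('first', 0);
queue.push('second', 1000);
queue.push('third', 2000); // 'first' is evicted: capacity is 2
console.log(queue.visible(2500)); // [ 'second', 'third' ]
console.log(queue.visible(4000)); // [ 'third' ] ('second' is older than 2700 ms)
```

Passing the current time explicitly keeps the model deterministic, which makes the eviction and expiry rules easy to check by hand.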

## Check if closed captions are enabled

```tsx
const call = client.call(type, id);
const isCaptioningInProgress = call.state.captioning;

console.log(
  `Closed captions are ${isCaptioningInProgress ? 'enabled' : 'disabled'}`,
);

// alternatively, you can listen to the captioning state changes
call.state.captioning$.subscribe((isCaptioningInProgress) => {
  console.log(
    `Closed captions are ${isCaptioningInProgress ? 'enabled' : 'disabled'}`,
  );
});
```
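
A toggle button can combine this state check with the start and stop calls. The sketch below assumes a call object exposing `startClosedCaptions` and `stopClosedCaptions`, the methods this PR adds to `Call`; the `CaptionCapableCall` interface and the `toggleClosedCaptions` helper are hypothetical. According to the `Call.ts` changes in this PR, the SDK updates the `captioning` state optimistically, rolls it back on failure and rethrows the error, so the caller only needs to handle the rejection.

```typescript
// Minimal slice of a call object that this sketch relies on.
// The two methods mirror the ones this PR adds to `Call`;
// the interface itself is hypothetical.
interface CaptionCapableCall {
  state: { captioning: boolean };
  startClosedCaptions(): Promise<unknown>;
  stopClosedCaptions(): Promise<unknown>;
}

// Hypothetical helper: flip captions on or off based on the current state.
// Resolves to the requested state, or to the previous state if the request failed.
const toggleClosedCaptions = async (
  call: CaptionCapableCall,
): Promise<boolean> => {
  const wasCaptioning = call.state.captioning;
  try {
    if (wasCaptioning) {
      await call.stopClosedCaptions();
    } else {
      await call.startClosedCaptions();
    }
    return !wasCaptioning;
  } catch (err) {
    // The request failed; report it and keep the previous state.
    console.error('Failed to toggle closed captions', err);
    return wasCaptioning;
  }
};
```

A click handler could then call `await toggleClosedCaptions(call)` and use the result to update the button label.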

## Displaying the captions

You can access the most recent captions using the call state:

```typescript
import { StreamCallClosedCaption } from '@stream-io/video-client';

// define the handler before subscribing: the observable may emit immediately
const updateDisplayedCaptions = (captions: StreamCallClosedCaption[]) => {
  const innerHTML = captions
    .map((caption) => `<b>${caption.speaker_name}:</b> ${caption.text}`)
    .join('<br>');
  // render `innerHTML` into your captions container here
};

const subscription = call.state.closedCaptions$.subscribe((captions) =>
  updateDisplayedCaptions(captions),
);

// later, when captions are no longer needed, remember to unsubscribe
subscription.unsubscribe();
```
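
Because `speaker_name` and `text` are interpolated into an HTML string, it is worth escaping them before rendering. A minimal sketch follows; `escapeHtml`, `captionsToHtml` and the `CaptionLike` type are hypothetical local helpers and are not provided by the SDK.

```typescript
// Local stand-in for the two caption fields used here.
interface CaptionLike {
  speaker_name: string;
  text: string;
}

// Hypothetical helper: escape characters that are special in HTML.
const escapeHtml = (value: string): string =>
  value
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');

// Hypothetical helper: build the caption markup with escaped values.
const captionsToHtml = (captions: CaptionLike[]): string =>
  captions
    .map((c) => `<b>${escapeHtml(c.speaker_name)}:</b> ${escapeHtml(c.text)}`)
    .join('<br>');

console.log(captionsToHtml([{ speaker_name: 'Zita', text: 'Tom & Jerry <3' }]));
// <b>Zita:</b> Tom &amp; Jerry &lt;3
```

Setting `textContent` on separate DOM nodes would avoid the escaping step altogether; the string-based version is shown only because it matches the markup used above.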

This is what an example closed caption looks like:

```json
{
  "text": "Thank you, guys, for listening.",
  // When did the speaker start speaking
  "start_time": "2024-09-25T12:22:21.310735726Z",
  // When did the speaker finish saying the caption
  "end_time": "2024-09-25T12:22:24.310735726Z",
  "speaker_id": "zitaszuperagetstreamio",
  "speaker_name": "Zita"
}
```
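
The `start_time` and `end_time` fields can be used to work out how long a caption took to say. The sketch below is one way to do that; `toEpochMs`, `captionDurationMs` and the `CaptionTiming` type are hypothetical helpers. It trims the nanosecond fraction to milliseconds before parsing, on the assumption that not every JavaScript engine handles more than three fractional digits the same way.

```typescript
// Local stand-in for the timing fields of a caption.
interface CaptionTiming {
  start_time: string;
  end_time: string;
}

// Trim the fractional seconds to three digits (milliseconds) and parse.
const toEpochMs = (timestamp: string): number =>
  Date.parse(timestamp.replace(/(\.\d{3})\d+/, '$1'));

// Hypothetical helper: how long the speaker took to say the caption.
const captionDurationMs = (caption: CaptionTiming): number =>
  toEpochMs(caption.end_time) - toEpochMs(caption.start_time);

console.log(
  captionDurationMs({
    start_time: '2024-09-25T12:22:21.310735726Z',
    end_time: '2024-09-25T12:22:24.310735726Z',
  }),
); // 3000
```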

## See it in action

To see it all in action, check out our TypeScript sample application on [GitHub](https://github.com/GetStream/stream-video-js/tree/main/sample-apps/client/ts-quickstart) or in [CodeSandbox](https://codesandbox.io/p/sandbox/eloquent-glitter-99th3v).
2 changes: 1 addition & 1 deletion packages/client/openapitools.json
@@ -2,6 +2,6 @@
"$schema": "../../node_modules/@openapitools/openapi-generator-cli/config.schema.json",
"spaces": 2,
"generator-cli": {
"version": "7.5.0"
"version": "7.8.0"
}
}
47 changes: 46 additions & 1 deletion packages/client/src/Call.ts
@@ -54,12 +54,14 @@ import type {
SendCallEventResponse,
SendReactionRequest,
SendReactionResponse,
StartClosedCaptionsResponse,
StartHLSBroadcastingResponse,
StartRecordingRequest,
StartRecordingResponse,
StartTranscriptionRequest,
StartTranscriptionResponse,
StatsOptions,
StopClosedCaptionsResponse,
StopHLSBroadcastingResponse,
StopLiveResponse,
StopRecordingResponse,
@@ -81,6 +83,7 @@ import {
AudioTrackType,
CallConstructor,
CallLeaveOptions,
ClosedCaptionsSettings,
JoinCallData,
PublishOptions,
TrackMuteType,
@@ -552,6 +555,7 @@ export class Call {
this.dynascaleManager.setSfuClient(undefined);

this.state.setCallingState(CallingState.LEFT);
this.state.dispose();

// Call all leave call hooks, e.g. to clean up global event handlers
this.leaveCallHooks.forEach((hook) => hook());
@@ -1688,7 +1692,48 @@ export class Call {
};

/**
* Sends a `call.permission_request` event to all users connected to the call. The call settings object contains infomration about which permissions can be requested during a call (for example a user might be allowed to request permission to publish audio, but not video).
* Starts the closed captions of the call.
*/
startClosedCaptions = async (): Promise<StartClosedCaptionsResponse> => {
  const trx = this.state.setCaptioning(true); // optimistic update
  try {
    return await this.streamClient.post<StartClosedCaptionsResponse>(
      `${this.streamClientBasePath}/start_closed_captions`,
    );
  } catch (err) {
    trx.rollback(); // revert the optimistic update
    throw err;
  }
};

/**
* Stops the closed captions of the call.
*/
stopClosedCaptions = async (): Promise<StopClosedCaptionsResponse> => {
  const trx = this.state.setCaptioning(false); // optimistic update
  try {
    return await this.streamClient.post<StopClosedCaptionsResponse>(
      `${this.streamClientBasePath}/stop_closed_captions`,
    );
  } catch (err) {
    trx.rollback(); // revert the optimistic update
    throw err;
  }
};

/**
* Updates the closed caption settings.
*
* @param config the closed caption settings to apply
*/
updateClosedCaptionSettings = (config: Partial<ClosedCaptionsSettings>) => {
  this.state.updateClosedCaptionSettings(config);
};

/**
* Sends a `call.permission_request` event to all users connected to the call.
* The call settings object contains information about which permissions can be requested during a call
* (for example, a user might be allowed to request permission to publish audio, but not video).
*/
requestPermissions = async (
data: RequestPermissionRequest,
38 changes: 34 additions & 4 deletions packages/client/src/gen/coordinator/index.ts
@@ -48,6 +48,12 @@ export interface APIError {
* @memberof APIError
*/
more_info: string;
/**
* Flag that indicates if the error is unrecoverable, requests that return unrecoverable errors should not be retried, this error only applies to the request that caused it
* @type {boolean}
* @memberof APIError
*/
unrecoverable?: boolean;
}
/**
*
@@ -1457,7 +1463,6 @@ export interface CallSessionEndedEvent {
*/
type: string;
}

/**
* This event is sent when the participant counts in a call session are updated
* @export
@@ -4303,9 +4308,11 @@ export const OwnCapability = {
SEND_AUDIO: 'send-audio',
SEND_VIDEO: 'send-video',
START_BROADCAST_CALL: 'start-broadcast-call',
START_CLOSED_CAPTIONS_CALL: 'start-closed-captions-call',
START_RECORD_CALL: 'start-record-call',
START_TRANSCRIPTION_CALL: 'start-transcription-call',
STOP_BROADCAST_CALL: 'stop-broadcast-call',
STOP_CLOSED_CAPTIONS_CALL: 'stop-closed-captions-call',
STOP_RECORD_CALL: 'stop-record-call',
STOP_TRANSCRIPTION_CALL: 'stop-transcription-call',
UPDATE_CALL: 'update-call',
@@ -4718,7 +4725,6 @@ export interface PrivacySettings {
*/
typing_indicators?: TypingIndicators;
}

/**
*
* @export
@@ -4820,7 +4826,6 @@ export interface PushNotificationSettingsResponse {
*/
disabled_until?: string;
}

/**
*
* @export
@@ -5490,6 +5495,19 @@ export interface SortParamRequest {
*/
field?: string;
}
/**
*
* @export
* @interface StartClosedCaptionsResponse
*/
export interface StartClosedCaptionsResponse {
/**
*
* @type {string}
* @memberof StartClosedCaptionsResponse
*/
duration: string;
}
/**
*
* @export
@@ -5593,6 +5611,19 @@ export interface StatsOptions {
*/
reporting_interval_ms: number;
}
/**
*
* @export
* @interface StopClosedCaptionsResponse
*/
export interface StopClosedCaptionsResponse {
/**
* Duration of the request in milliseconds
* @type {string}
* @memberof StopClosedCaptionsResponse
*/
duration: string;
}
/**
*
* @export
@@ -6375,7 +6406,6 @@ export interface UserMuteResponse {
*/
user?: UserResponse;
}

/**
*
* @export