Diarize connector

Bearer Token | transcription | media | productivity | analytics

Connect to Diarize to transcribe and diarize audio and video content from YouTube, X, Instagram, and TikTok. Submit transcription jobs and retrieve...

Install the SDK
Node.js
```
npm install @scalekit-sdk/node
```
Python
```
pip install scalekit
```
Full SDK reference: Node.js | Python
Set your credentials

Add your Scalekit credentials to your .env file. Find values in app.scalekit.com > Developers > API Credentials.
.env
```
SCALEKIT_ENVIRONMENT_URL=<your-environment-url>
SCALEKIT_CLIENT_ID=<your-client-id>
SCALEKIT_CLIENT_SECRET=<your-client-secret>
```
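A startup check can catch a missing variable before the first SDK call. The sketch below is illustrative only: `missing_credentials` is a hypothetical helper, not part of the Scalekit SDK, and it assumes the three variable names shown in the .env file above.

```python
import os

# Variable names taken from the .env example above.
REQUIRED_VARS = (
    "SCALEKIT_ENVIRONMENT_URL",
    "SCALEKIT_CLIENT_ID",
    "SCALEKIT_CLIENT_SECRET",
)


def missing_credentials(env=None):
    """Return the names of required variables that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not (env.get(name) or "").strip()]
```

Calling `missing_credentials()` after the .env file is loaded, and stopping when the list is non-empty, tends to give a clearer failure than an authentication error later on.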
Set up the connector

Register your Diarize API key with Scalekit so it can authenticate and proxy transcription requests on behalf of your users. You do this once per environment. Unlike OAuth connectors, Diarize uses API key authentication, so there is no redirect URI or OAuth flow.
Dashboard setup steps
1. Get a Diarize API key
  - Sign in to diarize.io and go to Settings → API Keys.
  - Click + Create New Key, give it a name (e.g., Agent Auth), and confirm.
  - Copy the key value — store it securely, as you will not be able to view it again.
2. Create a connection in Scalekit
  - In Scalekit dashboard, go to AgentKit > Connections > Create Connection. Find Diarize and click Create.
  - Note the Connection name — you will use this as connection_name in your code (e.g., diarize).
  - Click Save.
3. Add a connected account
  
  Connected accounts link a specific user identifier in your system to a Diarize API key. Add accounts via the dashboard for testing, or via the Scalekit API in production.
  
  Via dashboard (for testing)
  - Open the connection you created and click the Connected Accounts tab → Add account.
  - Fill in:
    - Your User’s ID — a unique identifier for this user in your system (e.g., user_123)
    - API Key — the Diarize API key you copied in step 1
  - Click Create Account.
  Via API (for production)
  Node.js
  ```
  // Never hard-code API keys; read them from secure storage or user input
  const diarizeApiKey = getUserDiarizeKey(); // retrieve from your secure store

  await scalekit.actions.upsertConnectedAccount({
    connectionName: 'diarize',
    identifier: 'user_123', // your user's unique ID
    credentials: { token: diarizeApiKey },
  });
  ```
  Python
  ```
  # Never hard-code API keys; read them from secure storage or user input
  diarize_api_key = get_user_diarize_key()  # retrieve from your secure store

  scalekit_client.actions.upsert_connected_account(
      connection_name="diarize",
      identifier="user_123",  # your user's unique ID
      credentials={"token": diarize_api_key},
  )
  ```
  In production, call upsert_connected_account (Python) / upsertConnectedAccount (Node.js) when a user connects their Diarize account — for example, after they paste their API key into a settings page in your app.
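As a rough illustration of that settings-page flow, the sketch below validates the pasted key before registering it. `connect_diarize_account` is a hypothetical wrapper, not an SDK function; it assumes `actions` is the `scalekit_client.actions` object and reuses only the `upsert_connected_account` keyword arguments shown above.

```python
def connect_diarize_account(actions, user_id, api_key, connection_name="diarize"):
    """Validate the inputs, then register the key with the call shown above."""
    key = (api_key or "").strip()  # pasted keys often carry stray whitespace
    if not user_id:
        raise ValueError("user_id is required")
    if not key:
        raise ValueError("Diarize API key is empty")
    return actions.upsert_connected_account(
        connection_name=connection_name,
        identifier=user_id,
        credentials={"token": key},
    )
```

Rejecting an empty key locally avoids storing a connected account that can only fail on its first tool call.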
Diarize supports YouTube, X (Twitter), Instagram, and TikTok URLs. Direct audio or video file URLs are not supported — the URL must point to a public post on one of these platforms.
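Because unsupported URLs are rejected, a quick local pre-check can save a round trip. The following is a minimal sketch under stated assumptions: `is_supported_url` is a hypothetical helper, and the host list is a guess at the common domains for these four platforms, not a list published by Diarize. It also cannot tell whether a post is public.

```python
from urllib.parse import urlparse

# Assumed host list; Diarize may accept other domains or reject some of these.
SUPPORTED_HOSTS = {
    "youtube.com",
    "youtu.be",
    "x.com",
    "twitter.com",
    "instagram.com",
    "tiktok.com",
}


def is_supported_url(url):
    """Return True when the URL's host is one of the assumed platform domains."""
    try:
        parsed = urlparse(url)
    except ValueError:  # raised for malformed inputs such as a broken IPv6 host
        return False
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return False
    host = parsed.hostname.lower()
    # Accept the bare domain and any subdomain (www., m., and so on).
    return any(host == h or host.endswith("." + h) for h in SUPPORTED_HOSTS)
```

Treat a `True` result as "worth submitting" rather than a guarantee; the service remains the authority on what it accepts.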

Make your first call

Node.js
```
import { ScalekitClient } from '@scalekit-sdk/node'
import 'dotenv/config'

// Variable names match the .env file above
const scalekit = new ScalekitClient(
  process.env.SCALEKIT_ENVIRONMENT_URL,
  process.env.SCALEKIT_CLIENT_ID,
  process.env.SCALEKIT_CLIENT_SECRET,
)
const actions = scalekit.actions

const connector = 'diarize'
const identifier = 'user_123'

// Make your first call
const result = await actions.executeTool({
  connector,
  identifier,
  toolName: 'diarize_get_job_status',
  toolInput: { job_id: 'YOUR_JOB_ID' },
})
console.log(result)
```

Python
```
import os
from scalekit.client import ScalekitClient
from dotenv import load_dotenv
load_dotenv()

# Variable names match the .env file above
scalekit_client = ScalekitClient(
    env_url=os.getenv("SCALEKIT_ENVIRONMENT_URL"),
    client_id=os.getenv("SCALEKIT_CLIENT_ID"),
    client_secret=os.getenv("SCALEKIT_CLIENT_SECRET"),
)
actions = scalekit_client.actions

connection_name = "diarize"
identifier = "user_123"

# Make your first call
result = actions.execute_tool(
    tool_input={"job_id": "YOUR_JOB_ID"},
    tool_name="diarize_get_job_status",
    connection_name=connection_name,
    identifier=identifier,
)
print(result)
```

What you can do

Connect this agent connector to let your agent:

- Get job status: Retrieve the current status of a transcription job by its job ID
- Transcript download: Download the transcript output for a completed transcription job in JSON, TXT, SRT, or VTT format, including speaker diarization, segments, and word-level timestamps
- Create transcription job: Submit a new transcription and diarization job for an audio or video URL (YouTube, X, Instagram, TikTok)

Common workflows

Tool calling

Use this connector when you want an agent to transcribe and diarize audio or video from YouTube, X, Instagram, or TikTok.

1. Use diarize_create_transcription_job to submit a URL for transcription. It returns an id (job ID) and an estimatedTime (in seconds) for how long processing will take.
2. Use diarize_get_job_status to poll until status is COMPLETED or FAILED. Use estimatedTime to set a sensible timeout, and do not give up before that time has elapsed.
3. Use diarize_download_transcript to retrieve the result once complete. Choose json for structured speaker diarization data, or txt, srt, or vtt for plain-text and subtitle formats.

Python
```
import time

# Step 1: Submit a transcription job
create_result = actions.execute_tool(
    connection_name="diarize",
    identifier="user_123",
    tool_name="diarize_create_transcription_job",
    tool_input={
        "url": "https://www.youtube.com/watch?v=example",
        "language": "en",   # optional; omit for auto-detection
        "num_speakers": 2,  # optional; improves speaker diarization
    },
)
job_id = create_result.result["id"]
estimated_seconds = create_result.result.get("estimatedTime", 120)
deadline = time.time() + estimated_seconds * 2
print(f"Job {job_id} submitted. Estimated: {estimated_seconds}s")

# Step 2: Poll until complete
while True:
    if time.time() > deadline:
        raise TimeoutError(f"Job {job_id} timed out after {estimated_seconds * 2}s")
    time.sleep(15)
    status_result = actions.execute_tool(
        connection_name="diarize",
        identifier="user_123",
        tool_name="diarize_get_job_status",
        tool_input={"job_id": job_id},
    )
    status = status_result.result["status"]
    print("Status:", status)
    if status == "COMPLETED":
        break
    if status == "FAILED":
        raise RuntimeError(f"Job {job_id} failed")

# Step 3: Download the diarized transcript
transcript_result = actions.execute_tool(
    connection_name="diarize",
    identifier="user_123",
    tool_name="diarize_download_transcript",
    tool_input={"job_id": job_id, "format": "json"},
)
# handle the transcript_result
```

Node.js
```
// Step 1: Submit a transcription job
const createResult = await actions.executeTool({
  connector: 'diarize',
  identifier: 'user_123',
  toolName: 'diarize_create_transcription_job',
  toolInput: {
    url: 'https://www.youtube.com/watch?v=example',
    language: 'en',   // optional; omit for auto-detection
    num_speakers: 2,  // optional; improves speaker diarization
  },
});
const jobId = createResult.data.id;
const estimatedSeconds = createResult.data.estimatedTime ?? 120;
const deadline = Date.now() + estimatedSeconds * 2 * 1000;
console.log(`Job ${jobId} submitted. Estimated: ${estimatedSeconds}s`);

// Step 2: Poll until complete
let status = 'PENDING';
while (status !== 'COMPLETED' && status !== 'FAILED') {
  if (Date.now() > deadline) throw new Error(`Job ${jobId} timed out after ${estimatedSeconds * 2}s`);
  await new Promise(r => setTimeout(r, 15_000));
  const statusResult = await actions.executeTool({
    connector: 'diarize',
    identifier: 'user_123',
    toolName: 'diarize_get_job_status',
    toolInput: { job_id: jobId },
  });
  status = statusResult.data.status;
  console.log('Status:', status);
}
if (status === 'FAILED') throw new Error(`Job ${jobId} failed`);

// Step 3: Download the diarized transcript
const transcriptResult = await actions.executeTool({
  connector: 'diarize',
  identifier: 'user_123',
  toolName: 'diarize_download_transcript',
  toolInput: { job_id: jobId, format: 'json' },
});
// handle the transcriptResult
```
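
Both examples stop at the point where the transcript has to be handled. As a rough sketch of that step in Python, the helper below turns a JSON transcript into speaker-labelled lines. The payload shape is an assumption: it expects a top-level `segments` list whose items carry `speaker` and `text` keys, which this page does not specify, so inspect a real response and adjust the key names before relying on it. `speaker_lines` is a hypothetical helper, not part of the SDK.

```python
def speaker_lines(transcript):
    """Merge consecutive segments from the same speaker into one line each.

    Assumes transcript looks like {"segments": [{"speaker": ..., "text": ...}]}.
    """
    lines = []
    last_speaker = None
    for segment in transcript.get("segments", []):
        speaker = segment.get("speaker", "UNKNOWN")
        text = (segment.get("text") or "").strip()
        if not text:
            continue  # skip empty segments
        if speaker == last_speaker:
            lines[-1] += " " + text
        else:
            lines.append(f"{speaker}: {text}")
            last_speaker = speaker
    return lines
```

If the downloaded payload does match this shape, something like `print("\n".join(speaker_lines(payload)))` would give a readable conversation view, where `payload` is whatever dictionary the download call actually returns.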

Tool list

Use the exact tool names from the Tool list below when you call execute_tool. If you’re not sure which name to use, list the tools available for the current user first.

diarize_create_transcription_job (5 params)

Submit a new transcription and diarization job for an audio or video URL (YouTube, X, Instagram, TikTok). Returns a job ID that can be used to check status and download results.

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| url | string | required | The URL of the audio or video content to transcribe (e.g. YouTube, X, Instagram, TikTok link) |
| language | string | optional | Language code for transcription (e.g. 'en', 'es', 'fr'). Defaults to auto-detection if not provided. |
| num_speakers | integer | optional | Expected number of speakers in the audio. Helps improve diarization accuracy. |
| schema_version | string | optional | Optional schema version to use for tool execution |
| tool_version | string | optional | Optional tool version to use for execution |
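
Since only url is required, one way to assemble the input is to leave optional keys out entirely rather than send empty values. This is a minimal sketch; `build_create_job_input` is a hypothetical helper, and the positive-integer check on num_speakers is an assumption about what the service expects, not documented behavior.

```python
def build_create_job_input(url, language=None, num_speakers=None):
    """Build tool_input for diarize_create_transcription_job, omitting unset options."""
    if not url:
        raise ValueError("url is required")
    tool_input = {"url": url}
    if language is not None:
        tool_input["language"] = language  # omitted means auto-detection
    if num_speakers is not None:
        # bool is a subclass of int in Python, so reject it explicitly
        if isinstance(num_speakers, bool) or not isinstance(num_speakers, int) or num_speakers < 1:
            raise ValueError("num_speakers must be a positive integer")
        tool_input["num_speakers"] = num_speakers
    return tool_input
```

The returned dictionary can be passed as `tool_input` to `execute_tool` with `tool_name="diarize_create_transcription_job"`.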

diarize_download_transcript (4 params)

Download the transcript output for a completed transcription job in JSON, TXT, SRT, or VTT format, including speaker diarization, segments, and word-level timestamps.

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| job_id | string | required | The unique ID of the completed transcription job |
| format | string | optional | Output format for the transcript. Supported formats: 'json', 'txt', 'srt', 'vtt'. |
| schema_version | string | optional | Optional schema version to use for tool execution |
| tool_version | string | optional | Optional tool version to use for execution |
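
When transcripts are saved to disk, it can help to validate the requested format against the four supported values first. The sketch below is one possible approach; `transcript_filename` and its naming scheme are hypothetical, and only the format names come from the table above.

```python
SUPPORTED_FORMATS = ("json", "txt", "srt", "vtt")  # from the format parameter above


def transcript_filename(job_id, fmt):
    """Return '<job_id>.<fmt>' after checking that fmt is a supported format."""
    normalized = fmt.strip().lower()
    if normalized not in SUPPORTED_FORMATS:
        raise ValueError(
            f"Unsupported format {fmt!r}; expected one of {', '.join(SUPPORTED_FORMATS)}"
        )
    return f"{job_id}.{normalized}"
```

Passing the same normalized value as the `format` input keeps the file extension and the downloaded content in agreement.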

diarize_get_job_status (3 params)

Retrieve the current status of a transcription job by its job ID. Returns job state (pending, processing, completed, failed), metadata, and an estimatedTime field (in seconds) indicating how long processing is expected to take. Use estimatedTime to determine polling frequency and max wait duration. For example, a 49-minute episode may have an estimatedTime of ~891s (~15 mins), so the agent should wait at least that long before giving up.

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| job_id | string | required | The unique ID of the transcription job to check |
| schema_version | string | optional | Optional schema version to use for tool execution |
| tool_version | string | optional | Optional tool version to use for execution |
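
One way to turn estimatedTime into a polling schedule is sketched below. The specific numbers are assumptions rather than Diarize guidance: the interval is a tenth of the estimate, clamped between 5 and 60 seconds, and the maximum wait is twice the estimate, which mirrors the 2x deadline used in the workflow examples on this page. `polling_plan` is a hypothetical helper.

```python
def polling_plan(estimated_seconds, min_interval=5.0, max_interval=60.0, wait_factor=2.0):
    """Return (poll_interval, max_wait) in seconds derived from estimatedTime."""
    estimated = max(float(estimated_seconds), 0.0)
    # Poll roughly ten times over the estimate, within sensible bounds.
    interval = min(max(estimated / 10.0, min_interval), max_interval)
    # Never give up before the estimate has elapsed; allow extra headroom.
    max_wait = max(estimated * wait_factor, interval)
    return interval, max_wait
```

For the 891-second estimate mentioned above, this yields a 60-second interval and a maximum wait of 1782 seconds, so the agent keeps polling well past the estimate before giving up.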
