Documentation

JavaScript
import { LiveTranscription } from '@vocalstack/js-sdk';

const sdk = new LiveTranscription({ apiKey: 'YOUR-API-KEY' });

const stream = await sdk.connect({
  // Optional: Integrate this stream with a Polyglot session
  polyglot_id: 'YOUR-POLYGLOT-SESSION-ID',
  // Optional: language of the speech spoken
  // (this can be used to improve the transcription accuracy)
  language: 'en',
  // Optional: Translate the transcription to these languages
  translations: ['de'],
  // Optional: Stop the stream after this many seconds of inactivity
  timeout_period_s: 60,
  // Optional: Hard stop the stream after this many seconds
  max_duration_s: 300,
});

// Start the stream
stream.start();

// Get audio data from a microphone and send it to the stream
// stream.sendBuffer(buffer);
// *** This is a placeholder for the actual implementation ***

// Manually stop the stream (in this example, after 60 seconds)
// If max_duration_s is set, stopping the stream is optional
setTimeout(() => stream.stop(), 60000);

// Listen for stream transcription data
stream.onData((response) => {
  const { status, data } = response;
  console.log(status); // 'waiting', 'processing', 'done', 'stopping' or 'error'
  if (data) {
    console.log(data.timeline); // an object with the transcription timeline
  }
  if (status === 'done') {
    console.log(data.summary); // a summary of the transcription
    console.log(data.keywords); // an array of keywords
    console.log(data.paragraphs); // the entire transcription in paragraph form
  }
});
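
When only the final result matters, the status handling in the callback above can be wrapped in a Promise. The following is a minimal sketch, not part of the SDK: `waitForDone` is a hypothetical helper name, and it assumes that `stream.onData` delivers responses with the `status` and `data` fields shown above and that registering this callback does not interfere with any other handler you have set.

```javascript
// Hypothetical helper (not part of the VocalStack SDK).
// Resolves with the final data once the stream reports 'done',
// and rejects if the stream reports 'error'.
function waitForDone(stream) {
  return new Promise((resolve, reject) => {
    stream.onData((response) => {
      const { status, data } = response;
      if (status === 'done') {
        resolve(data); // contains summary, keywords and paragraphs
      } else if (status === 'error') {
        reject(new Error('Live transcription ended with status "error"'));
      }
      // Other statuses ('waiting', 'processing', 'stopping') are ignored here
    });
  });
}
```

With this helper, `const data = await waitForDone(stream);` waits for the transcription to finish, after which `data.summary` can be read directly.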

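How you obtain audio stream data depends on the environment in which you run the transcription. Here are a few examples:

On the server

In NextJS, you need to install a package that can capture audio data from your device, which you can then pass to the VocalStack API. For example: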

JavaScript
const mic = require('mic');

// Create a new instance of the microphone utility
const micInstance = mic();

// Get the audio input stream
const micStream = micInstance.getAudioStream();

// Capture the audio data from the microphone
micStream.on('data', (data) => {
  stream.sendBuffer(data); // send the buffer data to the VocalStack API
});

// Start capturing audio from the microphone
micInstance.start();

In the web browser

In the web browser, you can use the MediaRecorder API, as shown in the following example. (You may also want to use a package such as recordrtc, which improves browser compatibility.)

JavaScript
// Request access to the microphone
const mediaStream = await navigator.mediaDevices.getUserMedia({ audio: true });

// Create a MediaRecorder instance to capture audio data
const mediaRecorder = new MediaRecorder(mediaStream);

// Event handler to process audio data packets
mediaRecorder.ondataavailable = async (event) => {
  const blob = event.data; // this is the audio packet (Blob)
  const buffer = await blob.arrayBuffer(); // convert the Blob to an ArrayBuffer
  stream.sendBuffer(buffer); // send the buffer data to the VocalStack API
};

// Start capturing audio, and send it to the stream every second
mediaRecorder.start(1000);
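
The browser example above starts capturing but never releases the microphone. A minimal sketch of a teardown step follows; `stopCapture` is a hypothetical helper name, and it takes the `mediaRecorder`, `mediaStream` and `stream` objects from the example as arguments. It relies only on the standard `MediaRecorder` and `MediaStream` APIs plus `stream.stop()`. Note that the recorder emits one last audio packet when it stops, and whether the VocalStack stream still accepts a packet that arrives after `stream.stop()` is not covered here.

```javascript
// Hypothetical helper: stop recording, release the microphone,
// and end the transcription stream.
function stopCapture(mediaRecorder, mediaStream, stream) {
  // Only stop a recorder that is still recording or paused
  if (mediaRecorder.state !== 'inactive') {
    mediaRecorder.stop();
  }
  // Stopping every track releases the microphone
  mediaStream.getTracks().forEach((track) => track.stop());
  // Tell the transcription stream that no more audio is coming
  stream.stop();
}
```

This could be called from a "Stop" button handler, for example `stopButton.onclick = () => stopCapture(mediaRecorder, mediaStream, stream);`, where `stopButton` is whatever element your page uses.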

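To access the VocalStack API from a web client, you need to use an authentication token:

Client-side authentication tokens

Create a temporary authentication token for client-side requests. Safely implement API requests in web browsers without exposing your API keys.

Transcribe from an HLS LiveStream

The VocalStack API can be used to transcribe any HLS LiveStream URL, including sources such as Youtube Live, Facebook Live, and Twitch. Note that the stream URL must have the .m3u8 file extension, which indicates a valid HLS (HTTP Live Streaming) playlist file.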

JavaScript
import { LiveTranscription } from '@vocalstack/js-sdk';

const sdk = new LiveTranscription({ apiKey: 'YOUR-API-KEY' });

const stream = await sdk.connect({
  // must be a valid HLS (HTTP Live Streaming) playlist URL ending in .m3u8
  livestream_url:
    'http://a.files.bbci.co.uk/media/live/manifesto/audio/simulcast/hls/nonuk/sbr_low/ak/bbc_world_service.m3u8',

  // The rest of these options are the same as for microphone live transcriptions
});

stream.start();

stream.onData((response) => {
  // The response object is the same as the one
  // returned by microphone transcriptions
});
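
Because the stream URL must point to an .m3u8 playlist, it can be useful to check a URL before passing it as `livestream_url`. The following is a minimal sketch using the standard `URL` class; `isHlsPlaylistUrl` is a hypothetical helper name, and the check only inspects the URL's shape, so it does not confirm that the playlist actually exists or is playable.

```javascript
// Hypothetical helper: returns true when `value` is an absolute
// http(s) URL whose path ends in .m3u8.
function isHlsPlaylistUrl(value) {
  let url;
  try {
    url = new URL(value);
  } catch {
    return false; // not a parseable absolute URL
  }
  const isHttp = url.protocol === 'http:' || url.protocol === 'https:';
  // Check the path only, so query strings such as ?token=... are ignored
  return isHttp && url.pathname.toLowerCase().endsWith('.m3u8');
}
```

Checking `url.pathname` rather than the raw string means that a playlist URL carrying a query string is still accepted.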

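Integration with Polyglot

Integrating live transcription with Polyglot is as simple as adding the polyglot_id option to the transcription request, as shown in the example above.

Benefits

Polyglot creates a public shareable link associated with your transcription (the link can be password protected):

Users can read the transcription in real time using the link.
Users can choose the language in which to read the transcription in real time.
Users can read your transcription at a later time, along with any other transcriptions integrated with that particular Polyglot session.

Get transcription data

Get data from a pending or completed transcription. This includes the transcription timeline, keywords, summary, and paragraph segments.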
