Media API

The Media API provides endpoints for text-to-speech synthesis, audio transcription, and YouTube music search. These endpoints power the audio-related features in ryOS applications like iPod, Karaoke, and voice input.

All endpoints run on the Vercel Node.js runtime and implement rate limiting to prevent abuse.

/api/speech is implemented with the shared apiHandler utility. /api/audio-transcribe is a multipart upload route and still uses explicit body-parser/CORS handling.

Text-to-Speech

Convert text to spoken audio using OpenAI or ElevenLabs voice synthesis.

Endpoint

| Method | Path | Description |
| --- | --- | --- |
| POST | /api/speech | Convert text to speech audio |

Request

Headers:

| Header | Required | Description |
| --- | --- | --- |
| Content-Type | Yes | application/json |
| Authorization | No | Bearer {token} for authenticated requests |
| X-Username | No | Username for rate limit tracking |

Body (JSON):

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| text | string | Yes | Text to convert to speech |
| model | "openai" or "elevenlabs" | No | TTS provider (default: "elevenlabs") |

OpenAI Options:

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| voice | string | "alloy" | OpenAI voice name |
| speed | number | 1.1 | Speech speed multiplier |

ElevenLabs Options:

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| voice_id | string | "kAyjEabBEu68HYYYRAHR" | ElevenLabs voice ID |
| model_id | string | "eleven_turbo_v2_5" | ElevenLabs model |
| output_format | string | "mp3_44100_128" | Audio format |
| voice_settings | object | See below | Voice customization |

Voice Settings Object:
{
  "stability": 0.3,
  "similarity_boost": 0.8,
  "use_speaker_boost": true,
  "speed": 1.1
}
Output Format Options:
  • mp3_44100_128 - MP3 at 44.1kHz, 128kbps
  • mp3_22050_32 - MP3 at 22.05kHz, 32kbps
  • pcm_16000 - PCM at 16kHz
  • pcm_22050 - PCM at 22.05kHz
  • pcm_24000 - PCM at 24kHz
  • pcm_44100 - PCM at 44.1kHz
  • ulaw_8000 - μ-law at 8kHz
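
The ElevenLabs options above can be assembled and checked on the client before the request is sent. The following is a minimal sketch, not part of ryOS: buildElevenLabsBody and OUTPUT_FORMATS are hypothetical names, the format list is copied from this page rather than from the server's validation logic, and rejecting blank text on the client is an assumption.

```javascript
// Output formats copied from the list above
// (assumption: the server accepts exactly these values).
const OUTPUT_FORMATS = [
  'mp3_44100_128',
  'mp3_22050_32',
  'pcm_16000',
  'pcm_22050',
  'pcm_24000',
  'pcm_44100',
  'ulaw_8000',
];

// Hypothetical helper: builds a JSON body for POST /api/speech.
// Fields left out of `options` fall back to the server-side defaults.
function buildElevenLabsBody(text, options = {}) {
  if (typeof text !== 'string' || text.trim() === '') {
    throw new Error("'text' is required");
  }
  const { output_format: format } = options;
  if (format !== undefined && !OUTPUT_FORMATS.includes(format)) {
    throw new Error(`Unsupported output_format: ${format}`);
  }
  return { text, model: 'elevenlabs', ...options };
}

const body = buildElevenLabsBody('Hello, welcome to ryOS!', {
  output_format: 'mp3_22050_32',
  voice_settings: {
    stability: 0.3,
    similarity_boost: 0.8,
    use_speaker_boost: true,
    speed: 1.1,
  },
});
console.log(JSON.stringify(body, null, 2));
```

The resulting object can be passed to JSON.stringify and sent as the request body with Content-Type: application/json.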

Response

Success (200):

Returns audio stream with headers:

| Header | Value |
| --- | --- |
| Content-Type | audio/mpeg |
| Content-Length | Audio byte length |
| Cache-Control | no-store |

Error (400):
{
  "error": "'text' is required"
}
Rate Limit (429):
{
  "error": "rate_limit_exceeded",
  "scope": "burst" | "daily",
  "limit": 10,
  "windowSeconds": 60,
  "resetSeconds": 45,
  "identifier": "username or anon:ip"
}

Rate Limits

| Scope | Limit | Window |
| --- | --- | --- |
| Burst | 10 requests | 1 minute |
| Daily | 50 requests | 24 hours |

> Note: Authenticated admin users bypass rate limiting.
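
A client can use the fields of the 429 response to decide when to retry. This is a minimal sketch, assuming the response body has the shape documented in the Response section; retryDelayMs is a hypothetical helper, and the one-second fallback is an arbitrary choice rather than something the API specifies.

```javascript
// Hypothetical helper: returns how long to wait (in milliseconds) before
// retrying, or null when the response is not a rate-limit error.
function retryDelayMs(status, body) {
  if (status !== 429 || !body || body.error !== 'rate_limit_exceeded') {
    return null;
  }
  const seconds = Number(body.resetSeconds);
  // Assumption: wait 1 second when the response carries no usable resetSeconds.
  return Number.isFinite(seconds) && seconds > 0 ? seconds * 1000 : 1000;
}

const sampleRateLimit = {
  error: 'rate_limit_exceeded',
  scope: 'burst',
  limit: 10,
  windowSeconds: 60,
  resetSeconds: 45,
  identifier: 'anon:ip',
};

console.log(retryDelayMs(429, sampleRateLimit)); // 45000
console.log(retryDelayMs(200, {})); // null
```

For the daily scope the reset time can be many hours away, so a client may prefer to surface the error to the user instead of scheduling a retry.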

Example

Request:
curl -X POST https://your-domain.com/api/speech \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Hello, welcome to ryOS!",
    "model": "elevenlabs",
    "voice_id": "kAyjEabBEu68HYYYRAHR"
  }' \
  --output speech.mp3
With OpenAI:
curl -X POST https://your-domain.com/api/speech \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Hello, welcome to ryOS!",
    "model": "openai",
    "voice": "nova",
    "speed": 1.0
  }' \
  --output speech.mp3

Audio Transcription

Transcribe audio files to text using OpenAI Whisper.

Endpoint

| Method | Path | Description |
| --- | --- | --- |
| POST | /api/audio-transcribe | Transcribe audio to text |

Request

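Headers:

| Header | Required | Description |
| --- | --- | --- |
| Content-Type | Yes | multipart/form-data |

Form Data:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| audio | File | Yes | Audio file to transcribe |

File Constraints:

| Constraint | Value |
| --- | --- |
| Max Size | 2 MB |
| Type | Must start with audio/ |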
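
Checking a file against these constraints before uploading avoids a round trip that would end in a 400 response. This is a minimal sketch: validateAudioFile is a hypothetical helper, it assumes "2 MB" means 2 × 1024 × 1024 bytes (this page does not say), and the order of the checks is a guess rather than the server's actual order.

```javascript
// Assumption: "2 MB" is interpreted as 2 * 1024 * 1024 bytes.
const MAX_AUDIO_BYTES = 2 * 1024 * 1024;

// Hypothetical helper: returns an error message, or null if the file looks
// acceptable. Works with any object exposing `size` and `type`, such as a
// File or Blob in the browser.
function validateAudioFile(file) {
  if (!file) {
    return 'No audio file provided';
  }
  if (typeof file.type !== 'string' || !file.type.startsWith('audio/')) {
    return 'Invalid file type. Must be an audio file.';
  }
  if (file.size > MAX_AUDIO_BYTES) {
    return 'File exceeds maximum size of 2MB';
  }
  return null;
}

console.log(validateAudioFile({ type: 'audio/webm', size: 150000 })); // null
console.log(validateAudioFile({ type: 'video/mp4', size: 150000 }));
// Invalid file type. Must be an audio file.
```

The server still performs its own validation, so a client-side check like this only improves feedback; it does not replace handling of 400 responses.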

Response

Success (200):
{
  "text": "Transcribed text content here"
}
Error (400) - No file:
{
  "error": "No audio file provided"
}
Error (400) - Invalid type:
{
  "error": "Invalid file type. Must be an audio file."
}
Error (400) - Too large:
{
  "error": "File exceeds maximum size of 2MB"
}
Rate Limit (429):
{
  "error": "rate_limit_exceeded",
  "scope": "burst" | "daily",
  "limit": 10,
  "windowSeconds": 60,
  "resetSeconds": 45,
  "identifier": "ip:xxx.xxx.xxx.xxx"
}

Rate Limits

| Scope | Limit | Window |
| --- | --- | --- |
| Burst | 10 requests | 1 minute |
| Daily | 50 requests | 24 hours |

Example

Request:
curl -X POST https://your-domain.com/api/audio-transcribe \
  -F "audio=@recording.webm"
Response:
{
  "text": "This is the transcribed text from the audio recording."
}
JavaScript (FormData):
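// audioBlob is a Blob or File obtained elsewhere (for example from MediaRecorder).
const formData = new FormData();
formData.append('audio', audioBlob, 'recording.webm');

// Do not set Content-Type manually; the browser adds the multipart boundary.
const response = await fetch('/api/audio-transcribe', {
  method: 'POST',
  body: formData,
});

// Error responses carry an "error" field instead of "text".
const data = await response.json();
if (!response.ok) {
  throw new Error(data.error);
}
console.log('Transcription:', data.text);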

YouTube Search

Search YouTube for music videos. Results are filtered to the Music category.

Endpoint

| Method | Path | Description |
| --- | --- | --- |
| POST | /api/youtube-search | Search YouTube for music |

Request

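Headers:

| Header | Required | Description |
| --- | --- | --- |
| Content-Type | Yes | application/json |

Body (JSON):

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| query | string | Yes | - | Search query |
| maxResults | number | No | 10 | Results to return (1-25) |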
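
This page does not say how the server treats a maxResults value outside 1-25, so a client may want to normalize the value before sending it. A minimal sketch under that assumption; normalizeMaxResults and buildSearchBody are hypothetical helpers, not part of ryOS.

```javascript
// Hypothetical helper: coerces a user-supplied value into the documented
// maxResults range (1-25), using the documented default of 10 when the
// value is missing or not a number.
function normalizeMaxResults(value) {
  if (value === undefined || value === null) {
    return 10;
  }
  const n = Math.trunc(Number(value));
  if (!Number.isFinite(n)) {
    return 10;
  }
  return Math.min(25, Math.max(1, n));
}

// Hypothetical helper: builds the JSON body for POST /api/youtube-search.
function buildSearchBody(query, maxResults) {
  return { query, maxResults: normalizeMaxResults(maxResults) };
}

// maxResults is clamped from 50 down to 25.
console.log(buildSearchBody('Daft Punk Around the World', 50));
```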

Response

Success (200):
{
  "results": [
    {
      "videoId": "dQw4w9WgXcQ",
      "title": "Rick Astley - Never Gonna Give You Up",
      "channelTitle": "Rick Astley",
      "thumbnail": "https://i.ytimg.com/vi/dQw4w9WgXcQ/mqdefault.jpg",
      "publishedAt": "2009-10-25T06:57:33Z"
    }
  ]
}
Result Item:

| Field | Type | Description |
| --- | --- | --- |
| videoId | string | YouTube video ID |
| title | string | Video title |
| channelTitle | string | Channel name |
| thumbnail | string | Thumbnail URL (medium quality preferred) |
| publishedAt | string | ISO 8601 publish date |

Error (400) - Invalid body:
{
  "error": "Invalid request body"
}
Error (403) - API access denied:
{
  "error": "YouTube API access denied",
  "code": 403,
  "hint": "YouTube API access denied. Ensure the API key is valid and YouTube Data API v3 is enabled in Google Cloud Console."
}
Error (500) - Not configured:
{
  "error": "YouTube API is not configured",
  "hint": "Add YOUTUBE_API_KEY to your .env.local file and restart the API server"
}
Rate Limit (429):
{
  "error": "rate_limit_exceeded",
  "scope": "burst" | "daily"
}

Rate Limits

| Scope | Limit | Window |
| --- | --- | --- |
| Burst | 20 requests | 1 minute |
| Daily | 200 requests | 24 hours |

API Key Rotation

The endpoint supports multiple YouTube API keys for quota rotation:

  • YOUTUBE_API_KEY - Primary key
  • YOUTUBE_API_KEY_2 - Backup key

When the primary key's quota is exceeded, requests automatically fall back to the backup key.
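
The fallback described above could look roughly like the following. This is an illustrative sketch only, not the ryOS implementation: configuredKeys, searchWithKeyRotation, the searchOnce callback, and the quotaExceeded flag on errors are all hypothetical, and the real endpoint may detect quota errors differently.

```javascript
// Hypothetical helper: collects the configured keys in priority order.
function configuredKeys(env) {
  return [env.YOUTUBE_API_KEY, env.YOUTUBE_API_KEY_2].filter(Boolean);
}

// Tries each key in turn. `searchOnce(key)` is a caller-supplied async
// function (hypothetical) that rejects with an error whose `quotaExceeded`
// property is true when the key has run out of quota. Any other error is
// rethrown immediately without trying further keys.
async function searchWithKeyRotation(keys, searchOnce) {
  if (keys.length === 0) {
    throw new Error('YouTube API is not configured');
  }
  let lastError;
  for (const key of keys) {
    try {
      return await searchOnce(key);
    } catch (err) {
      if (!err.quotaExceeded) {
        throw err;
      }
      lastError = err; // quota exhausted: move on to the next key
    }
  }
  throw lastError;
}

// Usage with a fake search function in which the primary key is out of quota.
const fakeSearch = async (key) => {
  if (key === 'primary') {
    throw Object.assign(new Error('quota exceeded'), { quotaExceeded: true });
  }
  return { results: [], usedKey: key };
};

searchWithKeyRotation(['primary', 'backup'], fakeSearch)
  .then((out) => console.log(out.usedKey)); // backup
```

In a real handler, configuredKeys(process.env) would supply the key list and searchOnce would wrap the call to the YouTube Data API.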

Example

Request:
curl -X POST https://your-domain.com/api/youtube-search \
  -H "Content-Type: application/json" \
  -d '{
    "query": "Taylor Swift Shake It Off",
    "maxResults": 5
  }'
Response:
{
  "results": [
    {
      "videoId": "nfWlot6h_JM",
      "title": "Taylor Swift - Shake It Off",
      "channelTitle": "TaylorSwiftVEVO",
      "thumbnail": "https://i.ytimg.com/vi/nfWlot6h_JM/mqdefault.jpg",
      "publishedAt": "2014-08-19T04:00:02Z"
    }
  ]
}
JavaScript:
const response = await fetch('/api/youtube-search', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    query: 'Daft Punk Around the World',
    maxResults: 10,
  }),
});

const data = await response.json();
if (!response.ok) {
  // Error responses carry "error" and, for some failures, a "hint".
  throw new Error(data.hint || data.error);
}

data.results.forEach((video) => {
  console.log(`${video.title} - ${video.channelTitle}`);
});
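
Each result item carries enough information to build a playable link and a display label. A small sketch, assuming the result shape documented in the Response section; toListEntry is a hypothetical helper, and the watch URL pattern is YouTube's standard one rather than something this API returns.

```javascript
// Hypothetical helper: turns one result item into a label, a watch URL,
// and the (UTC) publication year.
function toListEntry(video) {
  return {
    label: `${video.title} - ${video.channelTitle}`,
    url: `https://www.youtube.com/watch?v=${encodeURIComponent(video.videoId)}`,
    year: new Date(video.publishedAt).getUTCFullYear(),
  };
}

const entry = toListEntry({
  videoId: 'dQw4w9WgXcQ',
  title: 'Rick Astley - Never Gonna Give You Up',
  channelTitle: 'Rick Astley',
  thumbnail: 'https://i.ytimg.com/vi/dQw4w9WgXcQ/mqdefault.jpg',
  publishedAt: '2009-10-25T06:57:33Z',
});

console.log(entry.url); // https://www.youtube.com/watch?v=dQw4w9WgXcQ
console.log(entry.year); // 2009
```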

Environment Variables

| Variable | Required For | Description |
| --- | --- | --- |
| OPENAI_API_KEY | TTS (OpenAI), Transcription | OpenAI API key |
| ELEVENLABS_API_KEY | TTS (ElevenLabs) | ElevenLabs API key |
| YOUTUBE_API_KEY | YouTube Search | Primary YouTube Data API v3 key |
| YOUTUBE_API_KEY_2 | YouTube Search | Backup YouTube API key (optional) |
| REDIS_KV_REST_API_URL | Rate Limiting | Upstash Redis URL |
| REDIS_KV_REST_API_TOKEN | Rate Limiting | Upstash Redis token |
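
A startup check can report which of these variables are missing for a given feature. This is a minimal sketch built from the table above; REQUIRED_ENV, its feature labels, and missingEnv are hypothetical names and are not part of ryOS.

```javascript
// Required variables per feature, copied from the table above.
// The feature labels are made up for this example.
const REQUIRED_ENV = {
  'tts-openai': ['OPENAI_API_KEY'],
  'tts-elevenlabs': ['ELEVENLABS_API_KEY'],
  transcription: ['OPENAI_API_KEY'],
  'youtube-search': ['YOUTUBE_API_KEY'], // YOUTUBE_API_KEY_2 is optional
  'rate-limiting': ['REDIS_KV_REST_API_URL', 'REDIS_KV_REST_API_TOKEN'],
};

// Hypothetical helper: returns the names of required variables that are
// unset or empty in `env` (for example process.env).
function missingEnv(feature, env) {
  const names = REQUIRED_ENV[feature];
  if (!names) {
    throw new Error(`Unknown feature: ${feature}`);
  }
  return names.filter((name) => !env[name]);
}

console.log(
  missingEnv('rate-limiting', {
    REDIS_KV_REST_API_URL: 'https://example.upstash.io',
  })
);
// [ 'REDIS_KV_REST_API_TOKEN' ]
```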

Related

  • Song API - Song library management for karaoke