AI System

Multi-provider AI with streaming responses, tool-loop orchestration, and a two-tier memory pipeline.

Providers

Provider	SDK	Models
OpenAI	`@ai-sdk/openai`	`gpt-5.4`
Anthropic	`@ai-sdk/anthropic`	`sonnet-4.6`
Google	`@ai-sdk/google`	`gemini-3-flash`, `gemini-3.1-pro-preview`

Default model: gpt-5.4

Specialized models used by specific flows:

gemini-3-flash-preview (proactive greeting, applet text mode, chat-room auto replies, memory extraction, and daily-notes processing)
gemini-3.1-flash-image-preview (applet image generation)

graph TD
    A[User Message] --> B[Chat API]
    B --> C{Provider Selection}
    C -->|OpenAI| D[gpt-5.4]
    C -->|Anthropic| E[claude-sonnet-4-6]
    C -->|Google| F[gemini-3-flash / gemini-3.1-pro-preview]
    D --> G[AI SDK Stream]
    E --> G
    F --> G
    G --> H[Tool Loop + Response Handler]
    H --> I[UI Update]

Available Tools

Tool	Description
`launchApp`	Open applications (supports Internet Explorer URL + year time-travel launch)
`closeApp`	Close applications
`ipodControl`	iPod playback control: toggle/play/pause/playKnown/addAndPlay/next/previous (+ video/fullscreen/lyrics translation options)
`karaokeControl`	Karaoke playback control (shared music library with iPod, independent playback state)
`generateHtml`	Create HTML applets with title and emoji icon
`aquarium`	Render interactive emoji aquarium in chat
`list`	List VFS items: `/Applets`, `/Documents` (includes document names), `/Applications`, `/Music`, `/Applets Store`
`mapsSearchPlaces`	Server-side Apple MapKit place search. Chat shows an inline card; tapping a result opens Maps in ryOS with that place selected, and the external link opens Apple Maps
`open`	Open files/apps/media from virtual file system
`read`	Read file contents (applets, documents, Applets Store items)
`write`	Create/modify markdown documents (overwrite/append/prepend modes)
`edit`	Edit existing files with precise text replacement
`searchSongs`	Search YouTube for songs (with API-key rotation and retry)
`settings`	Change language, theme, volume, speech, check-for-updates
`stickiesControl`	List/create/update/delete/clear sticky notes
`infiniteMacControl`	Control Infinite Mac emulator (launch system, screen read, mouse/keyboard actions, pause state)
`calendarControl`	Create, update, delete, and list calendar events; create, toggle, delete, and list todos
`contactsControl`	Search, create, update, and delete contacts (with Telegram field support)
`documentsControl`	List, read, write, and edit cloud-synced markdown documents
`memoryWrite`	Unified memory writer (`long_term` or `daily`)
`memoryRead`	Unified memory reader (`long_term` by key or `daily` by date)
`memoryDelete`	Delete long-term memory by key
`songLibraryControl`	Search ryOS song libraries and cached metadata from server (list/search/get/searchYoutube/add; scopes: user/global/any; Telegram/server-side)
`web_search`	OpenAI provider web search (GPT-5.4 only, authenticated users, with geolocation context)
`google_search`	Google provider web search (Gemini 3 Flash only, authenticated users)
`webFetch`	Server-side URL fetch with HTML-to-text extraction for Ryo (sanitized)
`tvControl`	TV lineup and playback: list/tune channels, AI `createChannel` fanout, add/remove videos on custom channels
`cursorCloudAgent`	Async Cursor Cloud repo-agent runs against `ryokun6/ryos` (owner-only): live stream card, PR link, follow-up turns
`listCursorCloudAgentRuns`	List recent Cursor Cloud agent runs with dashboard URLs (owner-only)

API Endpoints

Endpoint	Purpose
`/api/chat`	Main chat with streaming, tool-calling, and context-aware prompt assembly
`/api/ai/extract-memories`	Single-pass extraction of daily notes + long-term memories from chat history
`/api/ai/process-daily-notes`	Background processing of past daily notes into long-term memories
`/api/ai/ryo-reply`	Auto-reply generation for chat rooms
`/api/applet-ai`	Applet AI assistant (text + image mode, multimodal input)
`/api/ie-generate`	Internet Explorer time-travel page generation
`/api/speech`	Text-to-speech synthesis
`/api/audio-transcribe`	Audio transcription
`/api/webhooks/telegram`	Telegram bot webhook for DM chat (image support, web search, AI tool execution)
`/api/cron/telegram-heartbeat`	AI-powered proactive heartbeat messages via Telegram (cron-triggered)
`/api/telegram/link/create`	Generate Telegram account linking code
`/api/telegram/link/status`	Check Telegram link status
`/api/telegram/link/disconnect`	Disconnect linked Telegram account

Architecture

sequenceDiagram
    participant U as User
    participant C as Chat UI
    participant API as /api/chat
    participant M as AI Model
    participant T as Tool Runtime
    participant S as ryOS State

    U->>C: Send message
    C->>API: POST messages + systemState
    API->>M: streamText (static+dynamic prompts)
    M-->>API: Tool call(s)
    API->>T: Execute server tools / emit client tools
    T->>S: Read or mutate state
    S-->>T: Result
    T-->>M: Tool result
    M-->>API: Final tokens
    API-->>C: UI message stream
    C-->>U: Display response
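
The loop in the sequence above can be sketched without any SDK. This is a minimal, synchronous stand-in, not the actual /api/chat implementation: the model and tool runtime are stubbed, and every type and function name here (ToolCallRequest, ModelTurn, runToolLoop, and so on) is hypothetical. The real endpoint streams through the AI SDK rather than returning a string.

```typescript
// Hypothetical types for illustration; none of these names come from the ryOS codebase.
interface ToolCallRequest {
  toolName: string;
  input: string;
}

type ModelTurn =
  | { kind: "tool-calls"; calls: ToolCallRequest[] }
  | { kind: "final"; text: string };

// Stand-in for the model: given the transcript so far, returns the next turn.
type ModelStub = (transcript: readonly string[]) => ModelTurn;

// Stand-in for the tool runtime: reads or mutates state and returns a result.
type ToolRunner = (call: ToolCallRequest) => string;

function runToolLoop(
  userMessage: string,
  model: ModelStub,
  runTool: ToolRunner,
  maxSteps = 5,
): string {
  const transcript: string[] = [`user: ${userMessage}`];

  for (let step = 0; step < maxSteps; step++) {
    const turn = model(transcript);
    if (turn.kind === "final") {
      return turn.text; // no further tool calls: this is the final response
    }
    // Feed every tool result back so the model can continue.
    for (const call of turn.calls) {
      transcript.push(`tool:${call.toolName} -> ${runTool(call)}`);
    }
  }
  throw new Error(`tool loop did not finish within ${maxSteps} steps`);
}

// Usage: the stub asks for one tool call, then answers.
const openedApps: string[] = [];
const loopReply = runToolLoop(
  "open the ipod",
  (transcript) =>
    transcript.length === 1
      ? { kind: "tool-calls", calls: [{ toolName: "launchApp", input: "ipod" }] }
      : { kind: "final", text: "Opened iPod." },
  (call) => {
    openedApps.push(call.input);
    return "ok";
  },
);
```

The step cap is an assumption of this sketch; it only guards against a model that never stops requesting tools.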

Tool Handlers

Backend tool registry lives in api/chat/tools/:

api/chat/tools/types.ts - Tool constants and TypeScript contracts
api/chat/tools/schemas.ts - Zod input schemas and action-specific validation
api/chat/tools/executors.ts - Server-side executors (generateHtml, searchSongs, memory tools, calendar, stickies, contacts, documents)
api/chat/tools/index.ts - createChatTools() registry with profile-based filtering (all, memory, telegram)

Tool profiles control which tools are available per channel:

all (web chat): Full tool set — all client-side and server-side tools
telegram: Server-side subset — memory, calendar, stickies, contacts, documents, songLibraryControl (all execute server-side via Redis)
memory: Memory tools only
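
Profile-based filtering could look roughly like the sketch below. The three profile names come from the list above; the table shape, the toolNamesForProfile helper, and the small subset of tools shown are assumptions for illustration and do not mirror the real createChatTools() registry.

```typescript
type ToolProfile = "all" | "memory" | "telegram";

// Hypothetical table shape: which profiles expose each tool.
interface ToolProfileEntry {
  toolName: string;
  profiles: readonly ToolProfile[];
}

// Illustrative subset only, not the full tool list.
const TOOL_PROFILE_TABLE: readonly ToolProfileEntry[] = [
  { toolName: "launchApp", profiles: ["all"] },
  { toolName: "ipodControl", profiles: ["all"] },
  { toolName: "calendarControl", profiles: ["all", "telegram"] },
  { toolName: "stickiesControl", profiles: ["all", "telegram"] },
  { toolName: "songLibraryControl", profiles: ["all", "telegram"] },
  { toolName: "memoryWrite", profiles: ["all", "telegram", "memory"] },
  { toolName: "memoryRead", profiles: ["all", "telegram", "memory"] },
  { toolName: "memoryDelete", profiles: ["all", "telegram", "memory"] },
];

// Returns the sorted names of the tools a given channel profile may use.
function toolNamesForProfile(profile: ToolProfile): string[] {
  return TOOL_PROFILE_TABLE
    .filter((entry) => entry.profiles.includes(profile))
    .map((entry) => entry.toolName)
    .sort();
}

const memoryOnlyTools = toolNamesForProfile("memory");
const telegramTools = toolNamesForProfile("telegram");
```

Keeping the profile membership next to each tool, rather than maintaining three separate lists, makes it harder for a new tool to be exposed on a channel by accident.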

Client execution handlers remain in src/apps/chats/tools/:

appHandlers.ts - Launch/close app execution
ipodHandler.ts / karaokeHandler.ts - Media control execution
calendarHandler.ts - Calendar event management execution
contactsHandler.ts - Contact management execution
settingsHandler.ts - System settings updates
stickiesHandler.ts - Sticky note operations
infiniteMacHandler.ts - Infinite Mac control bridge
tvHandler.ts - TV channel lineup and tuning execution

Shared conversation preparation lives in api/_utils/ryo-conversation.ts:

prepareRyoConversationModelInput() - Unified entry point for both web chat and Telegram channels
Assembles static system prompt, dynamic context (memories, daily notes, system state), and tools
Handles model selection, message enrichment, and OpenAI web search injection
Cursor Cloud agent completion notifications sent to Telegram are built by formatCursorRunCompletionTelegramMessage in api/chat/tools/cursor-repo-agent.ts. It runs the summary and title through simplifyTelegramCitationDisplay (api/_utils/telegram-format.ts), appends labeled plain URLs (Agent: for the dashboard, PR: when known), and keeps DMs markdown-free. Web chat and polling still use the raw summary.

Tool schema highlights

launchApp requires internet-explorer launches to provide url and year together (or neither), and validates the year range.
ipodControl and karaokeControl schemas enforce action-specific arguments (e.g. addAndPlay requires id; playback-state actions must not include track identifiers).
memoryWrite / memoryRead are unified schemas using a type field:
- long_term (default): key-based memory operations
- daily: journal-style per-day operations
infiniteMacControl supports multimodal screen inspection by returning screen captures that can be converted into model-readable image content.
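
The action-specific rules above can be illustrated with plain validation functions. The real schemas are written in Zod; this sketch uses plain TypeScript so it stands alone. The input shapes, the numeric year, the MIN_YEAR/MAX_YEAR bounds, and the choice of which actions count as "playback-state" are all assumptions.

```typescript
// Hypothetical input shape; only url and year are named in the text above.
interface LaunchAppInput {
  id: string;
  url?: string;
  year?: number; // assumed numeric for this sketch
}

// Assumed bounds for illustration; the actual allowed range is not stated here.
const MIN_YEAR = 1000;
const MAX_YEAR = 3000;

// Returns an error message, or null when the input is acceptable.
function validateLaunchApp(input: LaunchAppInput): string | null {
  if (input.id !== "internet-explorer") return null;
  const hasUrl = input.url !== undefined;
  const hasYear = input.year !== undefined;
  if (hasUrl !== hasYear) return "url and year must be provided together";
  if (input.year !== undefined && (input.year < MIN_YEAR || input.year > MAX_YEAR)) {
    return `year must be between ${MIN_YEAR} and ${MAX_YEAR}`;
  }
  return null;
}

type IpodAction =
  | "toggle" | "play" | "pause" | "playKnown" | "addAndPlay" | "next" | "previous";

interface IpodControlInput {
  action: IpodAction;
  id?: string;
}

// Assumption: these are the playback-state actions that must not carry a track id.
const PLAYBACK_STATE_ACTIONS: readonly IpodAction[] = [
  "toggle", "play", "pause", "next", "previous",
];

function validateIpodControl(input: IpodControlInput): string | null {
  if (input.action === "addAndPlay" && !input.id) return "addAndPlay requires id";
  if (PLAYBACK_STATE_ACTIONS.includes(input.action) && input.id !== undefined) {
    return `${input.action} must not include a track identifier`;
  }
  return null;
}
```

In a Zod schema the same cross-field checks would typically live in a refinement on the object rather than on individual fields.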

System Prompts

Core prompt constants are defined in api/_utils/_aiPrompts.ts:

CORE_PRIORITY_INSTRUCTIONS - Priority and memory-override rules
RYO_PERSONA_INSTRUCTIONS - Ryo identity and background
ANSWER_STYLE_INSTRUCTIONS - Style and language behavior
CODE_GENERATION_INSTRUCTIONS - Applet generation constraints
CHAT_INSTRUCTIONS - Chats behavior and memory usage guidance
TELEGRAM_CHAT_INSTRUCTIONS - Telegram DM-specific behavior (plain text only, calendar/stickies/contacts/documents tool guidance)
TOOL_USAGE_INSTRUCTIONS - VFS and tool workflow rules
MEMORY_INSTRUCTIONS - Two-tier memory strategy and tool usage policy
IE_HTML_GENERATION_INSTRUCTIONS - Internet Explorer HTML generation rules

Channel-specific prompt composition (via ryo-conversation.ts):

Web chat (/api/chat): CORE_PRIORITY + ANSWER_STYLE + RYO_PERSONA + CHAT + TOOL_USAGE + MEMORY + CODE_GENERATION, then appends dynamic user/system state.
Telegram (/api/webhooks/telegram): CORE_PRIORITY + ANSWER_STYLE + RYO_PERSONA + TELEGRAM_CHAT + MEMORY, then appends dynamic user context.
/api/applet-ai uses a dedicated compact applet system prompt for embedded UI contexts.
/api/ie-generate splits prompts into static + dynamic sections for year/URL-aware generation.
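
The per-channel section order listed above can be expressed as data. This is a sketch only: the section bodies are placeholders, and PROMPT_SECTIONS, CHANNEL_SECTIONS, and composeSystemPrompt are hypothetical names, not the API of ryo-conversation.ts.

```typescript
// Placeholder bodies; the real constants live in api/_utils/_aiPrompts.ts.
const PROMPT_SECTIONS = {
  CORE_PRIORITY: "[core priority]",
  ANSWER_STYLE: "[answer style]",
  RYO_PERSONA: "[persona]",
  CHAT: "[chat]",
  TELEGRAM_CHAT: "[telegram chat]",
  TOOL_USAGE: "[tool usage]",
  MEMORY: "[memory]",
  CODE_GENERATION: "[code generation]",
} as const;

type PromptSectionKey = keyof typeof PROMPT_SECTIONS;
type PromptChannel = "web" | "telegram";

// Section order per channel, as listed in the composition rules above.
const CHANNEL_SECTIONS: Record<PromptChannel, readonly PromptSectionKey[]> = {
  web: [
    "CORE_PRIORITY", "ANSWER_STYLE", "RYO_PERSONA", "CHAT",
    "TOOL_USAGE", "MEMORY", "CODE_GENERATION",
  ],
  telegram: ["CORE_PRIORITY", "ANSWER_STYLE", "RYO_PERSONA", "TELEGRAM_CHAT", "MEMORY"],
};

// Joins the static sections, then appends the dynamic context if there is any.
function composeSystemPrompt(channel: PromptChannel, dynamicContext: string): string {
  const staticPart = CHANNEL_SECTIONS[channel]
    .map((key) => PROMPT_SECTIONS[key])
    .join("\n\n");
  return dynamicContext ? `${staticPart}\n\n${dynamicContext}` : staticPart;
}

const telegramPrompt = composeSystemPrompt("telegram", "user: ryo");
```

Keeping the static sections first and the dynamic context last leaves the leading part of the prompt identical across requests on the same channel.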

Memory System

ryOS uses a two-tier Redis-backed memory model:

Tier 1: Daily Notes (journal memory)

Append-only entries grouped by date (YYYY-MM-DD) in user timezone.
Each entry stores:
- Unix timestamp (timestamp)
- UTC ISO timestamp (isoTimestamp)
- Local date/time (localDate, localTime)
- Timezone (timeZone)
- Entry content
Daily notes auto-expire after 30 days (TTL).
Recent notes from the last 3 days are injected into chat prompt context.
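
A daily-note entry with the fields listed above might be built as follows. The field names come from the list; the millisecond timestamp, the HH:MM time format, and the buildDailyNoteEntry helper are assumptions made for this sketch.

```typescript
interface DailyNoteEntry {
  timestamp: number; // Unix timestamp (milliseconds assumed in this sketch)
  isoTimestamp: string; // UTC ISO 8601
  localDate: string; // YYYY-MM-DD in the user's timezone
  localTime: string; // HH:MM in the user's timezone (format assumed)
  timeZone: string; // IANA zone name, e.g. "America/Los_Angeles"
  content: string;
}

function buildDailyNoteEntry(content: string, timeZone: string, now: Date): DailyNoteEntry {
  // formatToParts avoids depending on any locale's date separator conventions.
  const parts = new Intl.DateTimeFormat("en-US", {
    timeZone,
    year: "numeric",
    month: "2-digit",
    day: "2-digit",
    hour: "2-digit",
    minute: "2-digit",
    hour12: false,
  }).formatToParts(now);

  const part = (type: string): string => {
    const found = parts.find((p) => p.type === type);
    return found ? found.value : "";
  };
  // Some engines report midnight as "24" when hour12 is false; normalize it.
  const hour = part("hour") === "24" ? "00" : part("hour");

  return {
    timestamp: now.getTime(),
    isoTimestamp: now.toISOString(),
    localDate: `${part("year")}-${part("month")}-${part("day")}`,
    localTime: `${hour}:${part("minute")}`,
    timeZone,
    content,
  };
}

const sampleEntry = buildDailyNoteEntry(
  "Talked about the karaoke app",
  "America/Los_Angeles",
  new Date("2025-01-01T02:30:00Z"),
);
```

Because localDate is derived in the user's timezone, an entry written shortly after midnight UTC can still land under the previous local day, which is the key the notes are grouped by.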

Tier 2: Long-Term Memories

Two-layer structure:
- Index: key + summary + updatedAt (always visible to the model)
- Detail: full content + createdAt + updatedAt

Capped at 50 memories per user.
Canonical key guidance (e.g. name, preferences, projects, instructions) is used by extraction pipelines.
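
The index/detail split and the 50-memory cap can be sketched with two in-memory maps. The actual storage is Redis; the class name, method names, and the choice to reject new keys once the cap is reached are assumptions here, since the text above does not say what happens at the limit.

```typescript
interface MemoryIndexEntry {
  key: string;
  summary: string;
  updatedAt: number;
}

interface MemoryDetail {
  key: string;
  content: string;
  createdAt: number;
  updatedAt: number;
}

const MAX_MEMORIES_PER_USER = 50;

// In-memory stand-in for one user's long-term memories.
class LongTermMemorySketch {
  private readonly index = new Map<string, MemoryIndexEntry>();
  private readonly details = new Map<string, MemoryDetail>();

  // Creates or updates a memory. Returns false when a new key would exceed the cap
  // (rejecting is an assumption of this sketch).
  upsert(key: string, summary: string, content: string, now: number): boolean {
    const existing = this.details.get(key);
    if (!existing && this.index.size >= MAX_MEMORIES_PER_USER) {
      return false;
    }
    this.index.set(key, { key, summary, updatedAt: now });
    this.details.set(key, {
      key,
      content,
      createdAt: existing ? existing.createdAt : now,
      updatedAt: now,
    });
    return true;
  }

  // The lightweight layer that is always placed in the prompt.
  listIndex(): MemoryIndexEntry[] {
    return Array.from(this.index.values());
  }

  // The full content, fetched on demand by key.
  getDetail(key: string): MemoryDetail | undefined {
    return this.details.get(key);
  }
}
```

The split keeps prompt size bounded: at most 50 short summaries are always visible, while full content is only read when the model asks for a specific key.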

Stale-memory cleanup

Long-term hygiene includes automatic cleanup of stale temporary memories.

Temporary context-like memories (e.g. short-lived travel/meeting context) are removed when old enough (default retention: 7 days) and heuristics identify them as transient.
Cleanup runs as part of the daily-notes processing cycle before new extraction.
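
The cleanup rule (old enough and judged transient) can be sketched as a filter. The 7-day default comes from the text above; the keyword-based looksTransient heuristic and all names are hypothetical stand-ins for whatever heuristics the real pipeline uses.

```typescript
interface StoredMemory {
  key: string;
  summary: string;
  updatedAt: number; // milliseconds since epoch (assumed)
}

const DAY_MS = 24 * 60 * 60 * 1000;
const DEFAULT_RETENTION_DAYS = 7;

// Hypothetical heuristic: treat memories mentioning these words as short-lived context.
const TRANSIENT_HINTS: readonly string[] = ["trip", "travel", "flight", "meeting"];

function looksTransient(memory: StoredMemory): boolean {
  const text = `${memory.key} ${memory.summary}`.toLowerCase();
  return TRANSIENT_HINTS.some((hint) => text.includes(hint));
}

// Drops a memory only when it is BOTH older than the retention window AND transient.
function removeStaleTemporaryMemories(
  memories: readonly StoredMemory[],
  now: number,
  retentionDays: number = DEFAULT_RETENTION_DAYS,
): StoredMemory[] {
  const cutoff = now - retentionDays * DAY_MS;
  return memories.filter((memory) => !(memory.updatedAt < cutoff && looksTransient(memory)));
}
```

Requiring both conditions means durable facts are never removed for being old, and fresh transient context is never removed for being transient.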

Daily Notes Processing Pipeline

flowchart TD
    A[Conversation / tool writes] --> B[/api/ai/extract-memories]
    B --> C[Append daily notes + optional long-term updates]
    C --> D[Unprocessed daily notes accumulate]
    D --> E[/api/chat proactive greeting trigger]
    E --> F[/api/ai/process-daily-notes]
    F --> G[Cleanup stale temporary memories]
    G --> H[Process past days only, oldest first]
    H --> I[Extract + consolidate long-term memories]
    I --> J[Mark daily note processed]

Pipeline behavior:

extract-memories performs single-pass extraction from chat history (daily notes + candidate long-term facts).
Daily notes continue collecting while a day is active.
process-daily-notes processes unprocessed past days (excludes today), consolidates overlaps, and marks each processed day.
Chat endpoint can trigger process-daily-notes in the background during proactive greeting flow.
Telegram heartbeat cron also triggers process-daily-notes and extracts memories from new Telegram chat messages since the last heartbeat.
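
The day-selection rule in process-daily-notes (unprocessed, past days only, oldest first) reduces to a filter and a sort. DailyNoteDay and selectDaysToProcess are hypothetical names; the sketch relies on YYYY-MM-DD keys sorting chronologically as plain strings.

```typescript
interface DailyNoteDay {
  date: string; // YYYY-MM-DD in the user's timezone
  processed: boolean;
}

// Returns the dates that still need processing: strictly before today, oldest first.
function selectDaysToProcess(days: readonly DailyNoteDay[], today: string): string[] {
  return days
    .filter((day) => !day.processed && day.date < today)
    .map((day) => day.date)
    .sort();
}

const pendingDays = selectDaysToProcess(
  [
    { date: "2025-03-04", processed: false },
    { date: "2025-03-02", processed: false },
    { date: "2025-03-03", processed: true },
    { date: "2025-03-05", processed: false }, // today: still collecting notes
  ],
  "2025-03-05",
);
```

Excluding today matches the rule that daily notes keep collecting while a day is active, so a day is only consolidated once it can no longer change.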

apiHandler Pattern

AI endpoints use a shared api/_utils/api-handler.ts wrapper for consistency:

CORS and allowed-origin checks
Method gating + automatic OPTIONS handling
Optional JSON body parsing
Shared per-request context injection (req, res, redis, logger, origin, user, body)
Unified auth modes (none, optional, required) with optional expired-token allowance
Unified top-level error handling and status logging

Common endpoint configurations in this AI stack:

/api/chat: auth: "optional", allowExpiredAuth: true, parseJsonBody: true, contentType: null
/api/applet-ai: auth: "optional", parseJsonBody: true, contentType: null
/api/ie-generate: auth: "none", parseJsonBody: true, contentType: null
/api/ai/extract-memories: auth: "required", parseJsonBody: true
/api/ai/process-daily-notes: auth: "required", parseJsonBody: true
/api/ai/ryo-reply: auth: "required", parseJsonBody: true
/api/webhooks/telegram: Custom handler (webhook secret validation, Telegram-specific auth via linked accounts)
/api/cron/telegram-heartbeat: Custom handler (cron secret via Authorization: Bearer header)
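
How the three auth modes and the expired-token allowance might interact is sketched below. The option names (auth, allowExpiredAuth, parseJsonBody, contentType) are taken from the configurations above; TokenCheck, AuthOutcome, resolveAuth, and the behavior of falling back to anonymous under "optional" are assumptions, not the wrapper's documented behavior.

```typescript
type AuthMode = "none" | "optional" | "required";

interface ApiHandlerOptions {
  auth: AuthMode;
  allowExpiredAuth?: boolean;
  parseJsonBody?: boolean;
  contentType?: string | null;
}

// Hypothetical result of looking up the request's token.
interface TokenCheck {
  username: string;
  expired: boolean;
}

type AuthOutcome =
  | { allowed: true; user: string | null }
  | { allowed: false; status: 401 };

function resolveAuth(options: ApiHandlerOptions, token: TokenCheck | null): AuthOutcome {
  if (options.auth === "none") {
    return { allowed: true, user: null };
  }
  if (token !== null && (!token.expired || options.allowExpiredAuth === true)) {
    return { allowed: true, user: token.username };
  }
  if (options.auth === "required") {
    return { allowed: false, status: 401 };
  }
  // "optional": continue as an anonymous request (assumed behavior).
  return { allowed: true, user: null };
}

const CHAT_OPTIONS: ApiHandlerOptions = {
  auth: "optional",
  allowExpiredAuth: true,
  parseJsonBody: true,
  contentType: null,
};
const EXTRACT_MEMORIES_OPTIONS: ApiHandlerOptions = { auth: "required", parseJsonBody: true };
```

Under these assumptions, /api/chat keeps recognizing a user whose token has expired, while /api/ai/extract-memories rejects the same request.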

Additional AI Capabilities

Proactive greetings: /api/chat supports a proactive greeting mode for logged-in users with memories. Uses gemini-3-flash-preview to generate a short, context-aware greeting referencing recent activity or memories. Triggers background daily-note processing on each greeting.
Telegram bot DM chat: /api/webhooks/telegram enables private Telegram DM conversations with Ryo. Supports image attachments (downloaded and injected as multimodal content), web search, and server-side tool execution (memory, calendar, stickies, contacts, documents). Users link accounts via /api/telegram/link/* endpoints. Includes per-user burst and account-window rate limiting.
Telegram heartbeat insights: /api/cron/telegram-heartbeat runs on a 30-minute cron schedule. Analyzes today's daily notes, recent Telegram conversation, and heartbeat history to decide whether to proactively message the user. Processes daily notes and extracts memories from new chat messages before each decision. Uses gating logic to avoid redundant or stale nudges.
Web search: Authenticated users get a search tool based on the selected model: web_search (OpenAI) for gpt-5.4 with geolocation context, or google_search (Google) for gemini-3-flash. Anonymous users do not get search tools.
Chat-room auto replies: /api/ai/ryo-reply generates room messages as ryo with dedicated rate limits.
Applet multimodal AI: /api/applet-ai supports text chat, image attachments in message history, and binary image generation responses.
Infinite Mac visual loop: infiniteMacControl can return screenshots for model-visible state inspection.
Internet Explorer caching: /api/ie-generate stores cleaned generated HTML snapshots in Redis for recent-history retrieval.
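
The web search rule in the list above (tool chosen by model, none for anonymous users) is small enough to show as a function. searchToolFor is a hypothetical helper name; the model ids and tool names are the ones given in the text.

```typescript
type SearchToolName = "web_search" | "google_search";

// Picks the provider search tool for a request, or null when none applies.
function searchToolFor(model: string, isAuthenticated: boolean): SearchToolName | null {
  if (!isAuthenticated) return null; // anonymous users never get search tools
  if (model === "gpt-5.4") return "web_search"; // OpenAI provider search
  if (model === "gemini-3-flash") return "google_search"; // Google provider search
  return null; // other models: no search tool
}
```

For example, searchToolFor("gpt-5.4", false) yields null, so an anonymous request on the default model runs without search.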