AI System
Multi-provider AI with streaming responses, tool-loop orchestration, and a two-tier memory pipeline.
Providers
| Provider | SDK | Models |
|---|---|---|
| OpenAI | @ai-sdk/openai | gpt-5.4 |
| Anthropic | @ai-sdk/anthropic | sonnet-4.6 |
| Google | @ai-sdk/google | gemini-3-flash, gemini-3.1-pro-preview |
Default model: gpt-5.4
Specialized models used by specific flows:
- gemini-3-flash-preview (proactive greeting, applet text mode, chat-room auto replies, memory extraction, and daily-notes processing)
- gemini-3.1-flash-image-preview (applet image generation)
graph TD
A[User Message] --> B[Chat API]
B --> C{Provider Selection}
C -->|OpenAI| D[gpt-5.4]
C -->|Anthropic| E[claude-sonnet-4-6]
C -->|Google| F[gemini-3-flash / gemini-3.1-pro-preview]
D --> G[AI SDK Stream]
E --> G
F --> G
G --> H[Tool Loop + Response Handler]
H --> I[UI Update]
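The provider-selection step in the diagram can be pictured as a lookup from model ID to provider. The following is a minimal sketch, not the actual ryOS implementation: `MODEL_PROVIDERS` and `resolveModel` are hypothetical names, the model IDs are copied from the provider table, and falling back to the default model for unknown IDs is an assumption.

```typescript
// Hypothetical sketch: map a requested model ID to its provider.
type Provider = "openai" | "anthropic" | "google";

// Model IDs as listed in the provider table.
const MODEL_PROVIDERS = new Map<string, Provider>([
  ["gpt-5.4", "openai"],
  ["sonnet-4.6", "anthropic"],
  ["gemini-3-flash", "google"],
  ["gemini-3.1-pro-preview", "google"],
]);

const DEFAULT_MODEL = "gpt-5.4";

function resolveModel(requested?: string | null): { model: string; provider: Provider } {
  const provider = requested ? MODEL_PROVIDERS.get(requested) : undefined;
  if (requested && provider) {
    return { model: requested, provider };
  }
  // Assumption: unknown or missing model IDs fall back to the default model.
  return { model: DEFAULT_MODEL, provider: "openai" };
}
```

Under these assumptions, `resolveModel("gemini-3-flash")` resolves to the Google provider, and `resolveModel(undefined)` resolves to gpt-5.4 on OpenAI.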
Available Tools
| Tool | Description |
|---|---|
| launchApp | Open applications (supports Internet Explorer URL + year time-travel launch) |
| closeApp | Close applications |
| ipodControl | iPod playback control: toggle/play/pause/playKnown/addAndPlay/next/previous (+ video/fullscreen/lyrics translation options) |
| karaokeControl | Karaoke playback control (shared music library with iPod, independent playback state) |
| generateHtml | Create HTML applets with title and emoji icon |
| aquarium | Render interactive emoji aquarium in chat |
| list | List VFS items: /Applets, /Documents (includes document names), /Applications, /Music, /Applets Store |
| open | Open files/apps/media from virtual file system |
| read | Read file contents (applets, documents, Applets Store items) |
| write | Create/modify markdown documents (overwrite/append/prepend modes) |
| edit | Edit existing files with precise text replacement |
| searchSongs | Search YouTube for songs (with API-key rotation and retry) |
| settings | Change language, theme, volume, speech, check-for-updates |
| stickiesControl | List/create/update/delete/clear sticky notes |
| infiniteMacControl | Control Infinite Mac emulator (launch system, screen read, mouse/keyboard actions, pause state) |
| calendarControl | Create, update, delete, and list calendar events; create, toggle, delete, and list todos |
| contactsControl | Search, create, update, and delete contacts (with Telegram field support) |
| documentsControl | List, read, write, and edit cloud-synced markdown documents |
| memoryWrite | Unified memory writer (long_term or daily) |
| memoryRead | Unified memory reader (long_term by key or daily by date) |
| memoryDelete | Delete long-term memory by key |
| songLibraryControl | Search ryOS song libraries and cached metadata from server (list/search/get/searchYoutube/add; scopes: user/global/any; Telegram/server-side) |
| web_search | OpenAI provider web search (GPT-5.4 only, authenticated users, with geolocation context) |
| google_search | Google provider web search (Gemini 3 Flash only, authenticated users) |
| webFetch | Server-side URL fetch with HTML-to-text extraction for Ryo (sanitized) |
API Endpoints
| Endpoint | Purpose |
|---|---|
| /api/chat | Main chat with streaming, tool-calling, and context-aware prompt assembly |
| /api/ai/extract-memories | Single-pass extraction of daily notes + long-term memories from chat history |
| /api/ai/process-daily-notes | Background processing of past daily notes into long-term memories |
| /api/ai/ryo-reply | Auto-reply generation for chat rooms |
| /api/applet-ai | Applet AI assistant (text + image mode, multimodal input) |
| /api/ie-generate | Internet Explorer time-travel page generation |
| /api/speech | Text-to-speech synthesis |
| /api/audio-transcribe | Audio transcription |
| /api/webhooks/telegram | Telegram bot webhook for DM chat (image support, web search, AI tool execution) |
| /api/cron/telegram-heartbeat | AI-powered proactive heartbeat messages via Telegram (cron-triggered) |
| /api/telegram/link/create | Generate Telegram account linking code |
| /api/telegram/link/status | Check Telegram link status |
| /api/telegram/link/disconnect | Disconnect linked Telegram account |
Architecture
sequenceDiagram
participant U as User
participant C as Chat UI
participant API as /api/chat
participant M as AI Model
participant T as Tool Runtime
participant S as ryOS State
U->>C: Send message
C->>API: POST messages + systemState
API->>M: streamText (static+dynamic prompts)
M-->>API: Tool call(s)
API->>T: Execute server tools / emit client tools
T->>S: Read or mutate state
S-->>T: Result
T-->>M: Tool result
M-->>API: Final tokens
API-->>C: UI message stream
C-->>U: Display response
Tool Handlers
Backend tool registry lives in api/chat/tools/:
- api/chat/tools/types.ts - Tool constants and TypeScript contracts
- api/chat/tools/schemas.ts - Zod input schemas and action-specific validation
- api/chat/tools/executors.ts - Server-side executors (generateHtml, searchSongs, memory tools, calendar, stickies, contacts, documents)
- api/chat/tools/index.ts - createChatTools() registry with profile-based filtering (all, memory, telegram)
Tool profiles control which tools are available per channel:
- all (web chat): Full tool set, covering all client-side and server-side tools
- telegram: Server-side subset (memory, calendar, stickies, contacts, documents, songLibraryControl), all executing server-side via Redis
- memory: Memory tools only
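Profile-based filtering can be sketched as an allow-list applied to the full tool registry. This is an illustrative sketch only: `filterToolsByProfile` and the two allow-list constants are hypothetical names, and the tool names are taken from the Available Tools table rather than from the real `createChatTools()` source.

```typescript
// Hypothetical sketch of profile-based tool filtering.
type ToolProfile = "all" | "memory" | "telegram";

// Tool names copied from the Available Tools table.
const MEMORY_TOOLS = ["memoryWrite", "memoryRead", "memoryDelete"];
const TELEGRAM_TOOLS = [
  ...MEMORY_TOOLS,
  "calendarControl",
  "stickiesControl",
  "contactsControl",
  "documentsControl",
  "songLibraryControl",
];

function filterToolsByProfile<T>(
  tools: Record<string, T>,
  profile: ToolProfile,
): Record<string, T> {
  // The "all" profile exposes every registered tool.
  if (profile === "all") return { ...tools };

  const allowed = new Set(profile === "memory" ? MEMORY_TOOLS : TELEGRAM_TOOLS);
  const filtered: Record<string, T> = {};
  for (const name of Object.keys(tools)) {
    if (allowed.has(name)) filtered[name] = tools[name];
  }
  return filtered;
}
```

With a registry containing launchApp, memoryRead, and calendarControl, the telegram profile would keep memoryRead and calendarControl, and the memory profile would keep only memoryRead.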
Client execution handlers remain in src/apps/chats/tools/:
- appHandlers.ts - Launch/close app execution
- ipodHandler.ts / karaokeHandler.ts - Media control execution
- calendarHandler.ts - Calendar event management execution
- contactsHandler.ts - Contact management execution
- settingsHandler.ts - System settings updates
- stickiesHandler.ts - Sticky note operations
- infiniteMacHandler.ts - Infinite Mac control bridge
Shared conversation preparation lives in api/_utils/ryo-conversation.ts:
- prepareRyoConversationModelInput() - Unified entry point for both web chat and Telegram channels
- Assembles static system prompt, dynamic context (memories, daily notes, system state), and tools
- Handles model selection, message enrichment, and OpenAI web search injection
Tool schema highlights
- launchApp now enforces that internet-explorer launches must provide both url and year together (or neither), with year-range validation.
- ipodControl and karaokeControl schemas enforce action-specific arguments (e.g. addAndPlay requires id; playback-state actions must not include track identifiers).
- memoryWrite / memoryRead are unified schemas using a type field:
  - long_term (default): key-based memory operations
  - daily: journal-style per-day operations
- infiniteMacControl supports multimodal screen inspection by returning screen captures that can be converted into model-readable image content.
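The cross-field rules above can be illustrated with plain validation functions. The real schemas are Zod schemas in api/chat/tools/schemas.ts; this sketch deliberately uses no library, and every name in it (`validateLaunchApp`, `validateIpodControl`, the `title` field, the exact set of playback-state actions) is an assumption made for illustration. Year-range validation is omitted because the allowed range is not stated here.

```typescript
// Hypothetical, library-free sketch of the action-specific rules.
type ValidationResult = { ok: true } | { ok: false; error: string };

interface LaunchAppInput {
  id: string;
  url?: string;
  year?: string;
}

function validateLaunchApp(input: LaunchAppInput): ValidationResult {
  const hasUrl = input.url !== undefined;
  const hasYear = input.year !== undefined;
  // internet-explorer launches need url and year together, or neither.
  if (input.id === "internet-explorer" && hasUrl !== hasYear) {
    return { ok: false, error: "url and year must be provided together" };
  }
  return { ok: true };
}

type IpodAction =
  | "toggle" | "play" | "pause" | "playKnown" | "addAndPlay" | "next" | "previous";

interface IpodControlInput {
  action: IpodAction;
  id?: string;
  title?: string; // assumed track identifier field
}

// Assumption: these are the actions that only change playback state.
const PLAYBACK_STATE_ACTIONS: IpodAction[] = ["toggle", "play", "pause", "next", "previous"];

function validateIpodControl(input: IpodControlInput): ValidationResult {
  const hasTrackIdentifier = input.id !== undefined || input.title !== undefined;
  if (input.action === "addAndPlay" && !input.id) {
    return { ok: false, error: "addAndPlay requires id" };
  }
  if (PLAYBACK_STATE_ACTIONS.includes(input.action) && hasTrackIdentifier) {
    return { ok: false, error: "playback-state actions must not include track identifiers" };
  }
  return { ok: true };
}
```

In a Zod schema, the same checks would typically live in a refinement on the object schema so that the error is reported alongside field-level validation.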
System Prompts
Core prompt constants are defined in api/_utils/_aiPrompts.ts:
- CORE_PRIORITY_INSTRUCTIONS - Priority and memory-override rules
- RYO_PERSONA_INSTRUCTIONS - Ryo identity and background
- ANSWER_STYLE_INSTRUCTIONS - Style and language behavior
- CODE_GENERATION_INSTRUCTIONS - Applet generation constraints
- CHAT_INSTRUCTIONS - Chats behavior and memory usage guidance
- TELEGRAM_CHAT_INSTRUCTIONS - Telegram DM-specific behavior (plain text only, calendar/stickies/contacts/documents tool guidance)
- TOOL_USAGE_INSTRUCTIONS - VFS and tool workflow rules
- MEMORY_INSTRUCTIONS - Two-tier memory strategy and tool usage policy
- IE_HTML_GENERATION_INSTRUCTIONS - Internet Explorer HTML generation rules
Channel-specific prompt composition (via ryo-conversation.ts):
- Web chat (/api/chat): CORE_PRIORITY + ANSWER_STYLE + RYO_PERSONA + CHAT + TOOL_USAGE + MEMORY + CODE_GENERATION, then appends dynamic user/system state.
- Telegram (/api/webhooks/telegram): CORE_PRIORITY + ANSWER_STYLE + RYO_PERSONA + TELEGRAM_CHAT + MEMORY, then appends dynamic user context.
- /api/applet-ai uses a dedicated compact applet system prompt for embedded UI contexts.
- /api/ie-generate splits prompts into static + dynamic sections for year/URL-aware generation.
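The channel-specific ordering for web chat and Telegram can be expressed as a table of section keys per channel. The sketch below is illustrative: the placeholder strings stand in for the real prompt constants, and `buildSystemPrompt` and `CHANNEL_SECTIONS` are hypothetical names, not exports of ryo-conversation.ts.

```typescript
// Hypothetical sketch of channel-specific prompt composition.
type Channel = "web" | "telegram";

// Placeholder text standing in for the real prompt constants.
const PROMPTS = {
  CORE_PRIORITY: "[core priority]",
  ANSWER_STYLE: "[answer style]",
  RYO_PERSONA: "[ryo persona]",
  CHAT: "[chat]",
  TELEGRAM_CHAT: "[telegram chat]",
  TOOL_USAGE: "[tool usage]",
  MEMORY: "[memory]",
  CODE_GENERATION: "[code generation]",
};

type PromptKey = keyof typeof PROMPTS;

// Section order per channel, as described in the list above.
const CHANNEL_SECTIONS: Record<Channel, PromptKey[]> = {
  web: ["CORE_PRIORITY", "ANSWER_STYLE", "RYO_PERSONA", "CHAT", "TOOL_USAGE", "MEMORY", "CODE_GENERATION"],
  telegram: ["CORE_PRIORITY", "ANSWER_STYLE", "RYO_PERSONA", "TELEGRAM_CHAT", "MEMORY"],
};

function buildSystemPrompt(channel: Channel, dynamicContext: string): string {
  const staticPart = CHANNEL_SECTIONS[channel].map((key) => PROMPTS[key]).join("\n\n");
  // Dynamic user/system context is appended after the static sections.
  return dynamicContext ? `${staticPart}\n\n${dynamicContext}` : staticPart;
}
```

Keeping the static sections first and the dynamic context last means the static prefix stays identical across requests on the same channel.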
Memory System
ryOS uses a two-tier Redis-backed memory model:
Tier 1: Daily Notes (journal memory)
- Append-only entries grouped by date (YYYY-MM-DD) in user timezone.
- Each entry stores:
  - Unix timestamp (timestamp)
  - UTC ISO timestamp (isoTimestamp)
  - Local date/time (localDate, localTime)
  - Timezone (timeZone)
  - Entry content
- Daily notes auto-expire after 30 days (TTL).
- Recent notes from the last 3 days are injected into chat prompt context.
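A daily-note entry with the fields listed above could be built as follows. This is a sketch under assumptions: `createDailyNoteEntry` is a hypothetical helper, the `content` field name is assumed, the Unix timestamp is assumed to be in milliseconds, and the local date and time are derived with the standard `Intl.DateTimeFormat` API rather than whatever the real implementation uses.

```typescript
// Hypothetical sketch of a daily-note entry builder.
interface DailyNoteEntry {
  timestamp: number;    // Unix timestamp (assumed milliseconds)
  isoTimestamp: string; // UTC ISO timestamp
  localDate: string;    // YYYY-MM-DD in the user's timezone (also the grouping key)
  localTime: string;    // HH:MM:SS in the user's timezone
  timeZone: string;
  content: string;      // assumed field name for the entry content
}

// 30-day expiry from the list above, expressed in seconds for a Redis-style TTL.
const DAILY_NOTE_TTL_SECONDS = 30 * 24 * 60 * 60;

function createDailyNoteEntry(
  content: string,
  timeZone: string,
  now: Date = new Date(),
): DailyNoteEntry {
  const parts = new Intl.DateTimeFormat("en-US", {
    timeZone,
    year: "numeric",
    month: "2-digit",
    day: "2-digit",
    hour: "2-digit",
    minute: "2-digit",
    second: "2-digit",
    hourCycle: "h23", // 00-23, avoids "24" at midnight
  }).formatToParts(now);

  const get = (type: string): string =>
    parts.find((part) => part.type === type)?.value ?? "";

  return {
    timestamp: now.getTime(),
    isoTimestamp: now.toISOString(),
    localDate: `${get("year")}-${get("month")}-${get("day")}`,
    localTime: `${get("hour")}:${get("minute")}:${get("second")}`,
    timeZone,
    content,
  };
}
```

Because the grouping key is the local date, a note written at 23:30 UTC by a user in Asia/Tokyo is filed under the next calendar day.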
Tier 2: Long-Term Memories
- Two-layer structure:
  - Index: key + summary + updatedAt (always visible to the model)
  - Detail: full content + createdAt + updatedAt
- Capped at 50 memories per user.
- Canonical key guidance (e.g. name, preferences, projects, instructions) is used by extraction pipelines.
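The index/detail split and the 50-memory cap can be sketched with an in-memory store. This is not the Redis-backed implementation: `MemoryStore` is a hypothetical class, Maps stand in for Redis, and rejecting new keys once the cap is reached is an assumption about how the cap is enforced.

```typescript
// Hypothetical in-memory sketch of the two-layer long-term memory model.
interface MemoryIndexEntry {
  key: string;
  summary: string;
  updatedAt: number;
}

interface MemoryDetail {
  key: string;
  content: string;
  createdAt: number;
  updatedAt: number;
}

const MAX_MEMORIES = 50;

class MemoryStore {
  private index = new Map<string, MemoryIndexEntry>();
  private details = new Map<string, MemoryDetail>();

  // Returns false when a new key would exceed the cap (assumed behavior).
  upsert(key: string, summary: string, content: string, now: number = Date.now()): boolean {
    const existing = this.details.get(key);
    if (!existing && this.index.size >= MAX_MEMORIES) return false;

    this.index.set(key, { key, summary, updatedAt: now });
    this.details.set(key, {
      key,
      content,
      createdAt: existing ? existing.createdAt : now, // preserved on update
      updatedAt: now,
    });
    return true;
  }

  // The lightweight index is what the model always sees.
  listIndex(): MemoryIndexEntry[] {
    return Array.from(this.index.values());
  }

  // Full content is fetched by key only when needed.
  getDetail(key: string): MemoryDetail | undefined {
    return this.details.get(key);
  }
}
```

Keeping summaries in the index and full content in the detail layer bounds the prompt cost of memory to at most 50 short entries.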
Stale-memory cleanup
- Long-term hygiene includes automatic cleanup of stale temporary memories.
- Temporary context-like memories (e.g. short-lived travel/meeting context) are removed when old enough (default retention: 7 days) and heuristics identify them as transient.
- Cleanup runs as part of the daily-notes processing cycle before new extraction.
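The cleanup rule combines an age check with a transience heuristic. The actual heuristics are not described here, so the keyword pattern below is purely a stand-in, and `findStaleTemporaryMemories` and `looksTransient` are hypothetical names. Only the 7-day default retention comes from the text above.

```typescript
// Hypothetical sketch of stale temporary-memory detection.
interface MemoryRecord {
  key: string;
  summary: string;
  updatedAt: number; // Unix timestamp in milliseconds (assumed)
}

const DEFAULT_RETENTION_DAYS = 7;
const DAY_MS = 24 * 60 * 60 * 1000;

// Stand-in heuristic: keywords that suggest short-lived context.
const TRANSIENT_HINTS = /\b(trip|travel|flight|meeting|today|tomorrow)\b/i;

function looksTransient(memory: MemoryRecord): boolean {
  return TRANSIENT_HINTS.test(memory.key) || TRANSIENT_HINTS.test(memory.summary);
}

function findStaleTemporaryMemories(
  memories: MemoryRecord[],
  now: number = Date.now(),
  retentionDays: number = DEFAULT_RETENTION_DAYS,
): string[] {
  const cutoff = now - retentionDays * DAY_MS;
  // A memory is removed only if it is both old enough and looks transient.
  return memories
    .filter((memory) => memory.updatedAt < cutoff && looksTransient(memory))
    .map((memory) => memory.key);
}
```

Requiring both conditions means durable facts such as a user's name survive regardless of age, and recent travel or meeting context survives until it is older than the retention window.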
Daily Notes Processing Pipeline
flowchart TD
A[Conversation / tool writes] --> B[/api/ai/extract-memories]
B --> C[Append daily notes + optional long-term updates]
C --> D[Unprocessed daily notes accumulate]
D --> E[/api/chat proactive greeting trigger]
E --> F[/api/ai/process-daily-notes]
F --> G[Cleanup stale temporary memories]
G --> H[Process past days only, oldest first]
H --> I[Extract + consolidate long-term memories]
I --> J[Mark daily note processed]
Pipeline behavior:
- extract-memories performs single-pass extraction from chat history (daily notes + candidate long-term facts).
- Daily notes continue collecting while a day is active.
- process-daily-notes processes unprocessed past days (excludes today), consolidates overlaps, and marks each processed day.
- Chat endpoint can trigger process-daily-notes in the background during proactive greeting flow.
- Telegram heartbeat cron also triggers process-daily-notes and extracts memories from new Telegram chat messages since the last heartbeat.
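The day-selection rule in process-daily-notes (unprocessed, past days only, oldest first) reduces to a filter and a sort over date keys. In this sketch `selectDaysToProcess` and the `DailyNoteDay` shape are hypothetical; `today` is assumed to be the current date in the user's timezone, formatted as YYYY-MM-DD.

```typescript
// Hypothetical sketch of selecting which daily notes to process.
interface DailyNoteDay {
  date: string;       // YYYY-MM-DD
  processed: boolean;
}

function selectDaysToProcess(days: DailyNoteDay[], today: string): string[] {
  return days
    // Today is excluded because its notes are still being collected.
    .filter((day) => !day.processed && day.date < today)
    .map((day) => day.date)
    // YYYY-MM-DD strings sort chronologically under plain string ordering.
    .sort();
}
```

For example, given unprocessed notes for 2024-03-03 and 2024-03-01, a processed note for 2024-03-02, and today being 2024-03-04, the function returns 2024-03-01 followed by 2024-03-03.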
apiHandler Pattern
AI endpoints use a shared api/_utils/api-handler.ts wrapper for consistency:
- CORS and allowed-origin checks
- Method gating + automatic OPTIONS handling
- Optional JSON body parsing
- Shared per-request context injection (req, res, redis, logger, origin, user, body)
- Unified auth modes (none, optional, required) with optional expired-token allowance
- Unified top-level error handling and status logging
Common endpoint configurations in this AI stack:
- /api/chat: auth: "optional", allowExpiredAuth: true, parseJsonBody: true, contentType: null
- /api/applet-ai: auth: "optional", parseJsonBody: true, contentType: null
- /api/ie-generate: auth: "none", parseJsonBody: true, contentType: null
- /api/ai/extract-memories: auth: "required", parseJsonBody: true
- /api/ai/process-daily-notes: auth: "required", parseJsonBody: true
- /api/ai/ryo-reply: auth: "required", parseJsonBody: true
- /api/webhooks/telegram: Custom handler (webhook secret validation, Telegram-specific auth via linked accounts)
- /api/cron/telegram-heartbeat: Custom handler (cron secret via Authorization: Bearer header)
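The three auth modes and the expired-token allowance can be modeled as a small decision function. This sketch does not reflect the real api-handler.ts internals: `resolveAuth`, the `TokenInfo` shape, the 401 status, and treating a rejected token as anonymous in optional mode are all assumptions.

```typescript
// Hypothetical sketch of auth-mode resolution.
type AuthMode = "none" | "optional" | "required";

interface TokenInfo {
  username: string;
  status: "valid" | "expired" | "invalid";
}

type AuthDecision =
  | { allowed: true; user: string | null }
  | { allowed: false; status: number };

function resolveAuth(
  mode: AuthMode,
  token: TokenInfo | null,
  allowExpiredAuth: boolean = false,
): AuthDecision {
  // "none": the endpoint never looks at credentials.
  if (mode === "none") return { allowed: true, user: null };

  if (
    token !== null &&
    (token.status === "valid" || (token.status === "expired" && allowExpiredAuth))
  ) {
    return { allowed: true, user: token.username };
  }

  // "required": reject when no acceptable token was presented (assumed 401).
  if (mode === "required") return { allowed: false, status: 401 };

  // "optional": continue as an anonymous request (assumed behavior).
  return { allowed: true, user: null };
}
```

Under these assumptions, /api/chat (optional auth with allowExpiredAuth) would still identify a user whose token has expired, while /api/ai/extract-memories (required auth) would reject the same request.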
Additional AI Capabilities
- Proactive greetings: /api/chat supports a proactive greeting mode for logged-in users with memories. Uses gemini-3-flash-preview to generate a short, context-aware greeting referencing recent activity or memories. Triggers background daily-note processing on each greeting.
- Telegram bot DM chat: /api/webhooks/telegram enables private Telegram DM conversations with Ryo. Supports image attachments (downloaded and injected as multimodal content), web search, and server-side tool execution (memory, calendar, stickies, contacts, documents). Users link accounts via /api/telegram/link/* endpoints. Includes per-user burst and account-window rate limiting.
- Telegram heartbeat insights: /api/cron/telegram-heartbeat runs on a 30-minute cron schedule. Analyzes today's daily notes, recent Telegram conversation, and heartbeat history to decide whether to proactively message the user. Processes daily notes and extracts memories from new chat messages before each decision. Uses gating logic to avoid redundant or stale nudges.
- Web search: Authenticated users get a search tool based on the selected model: web_search (OpenAI) for gpt-5.4 with geolocation context, or google_search (Google) for gemini-3-flash. Anonymous users do not get search tools.
- Chat-room auto replies: /api/ai/ryo-reply generates room messages as ryo with dedicated rate limits.
- Applet multimodal AI: /api/applet-ai supports text chat, image attachments in message history, and binary image generation responses.
- Infinite Mac visual loop: infiniteMacControl can return screenshots for model-visible state inspection.
- Internet Explorer caching: /api/ie-generate stores cleaned generated HTML snapshots in Redis for recent-history retrieval.
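The web search rule in the list above (tool chosen by model, only for authenticated users) can be sketched as a single function. `selectSearchTool` is a hypothetical name; returning no tool for other models is an assumption based on the "only" qualifiers in the tool table.

```typescript
// Hypothetical sketch of search-tool selection.
type SearchTool = "web_search" | "google_search" | null;

function selectSearchTool(model: string, isAuthenticated: boolean): SearchTool {
  // Anonymous users do not get search tools.
  if (!isAuthenticated) return null;

  if (model === "gpt-5.4") return "web_search";          // OpenAI provider search
  if (model === "gemini-3-flash") return "google_search"; // Google provider search

  // Assumption: other models get no provider search tool.
  return null;
}
```

For instance, an authenticated request using gemini-3-flash would receive google_search, while the same request from an anonymous user would receive no search tool.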