mirror of https://github.com/The-Lab-by-Ordinary-Company/utsuwa.git synced 2026-06-23 01:03:19 -06:00

Utsuwa is an open-source alternative to Grok Companion. This is a platform where you can have a virtual AI waifu that learns and grows with you, bundled with optional mechanics inspired by Japanese da https://www.utsuwa.ai

ai ai-companion artificial-intelligence grok-companion v-tuber vrm

Svelte 68%
TypeScript 29.7%
CSS 1.8%
HTML 0.2%
Rust 0.2%
Other 0.1%


CJ Dyas 7091c124a6 fix: serve the docs and app subdomains from their own root (#23) The landing page was prerendered, so a static / got served on every host and docs.utsuwa.ai / app.utsuwa.ai showed the landing page instead of the docs hub and the app. Subpaths worked because they fall through to the host rewrites, but the root collided with the static landing. Stop prerendering / (it's SSR now and stays host-aware via the reroute hook) and map each subdomain root explicitly, so docs.utsuwa.ai -> /docs and app.utsuwa.ai -> /app.		2026-06-18 21:28:51 -04:00
.github	feat: Tauri desktop app, docs site revamp, and blog (#8)	2026-02-11 20:28:15 -05:00
docs/plans	feat: landing page, blog restyle, SEO, and production hardening (#10)	2026-02-17 20:39:41 -05:00
src	fix: serve the docs and app subdomains from their own root (#23)	2026-06-18 21:28:51 -04:00
src-tauri	chore: prepare v0.2.5 release (#18)	2026-05-29 12:59:35 -04:00
static	feat: landing page, blog restyle, SEO, and production hardening (#10)	2026-02-17 20:39:41 -05:00
.env.example	Initial commit	2026-01-25 19:30:50 -05:00
.gitignore	feat: Tauri desktop app, docs site revamp, and blog (#8)	2026-02-11 20:28:15 -05:00
.npmrc	Initial commit	2026-01-25 19:30:50 -05:00
.nvmrc	feat: documentation hub with search, syntax highlighting, and navigation (#2)	2026-01-28 22:10:42 -05:00
CHANGELOG.md	chore: prepare v0.2.5 release (#18)	2026-05-29 12:59:35 -04:00
CODE_OF_CONDUCT.md	Initial commit	2026-01-25 19:30:50 -05:00
CONTRIBUTING.md	feat: landing page, blog restyle, SEO, and production hardening (#10)	2026-02-17 20:39:41 -05:00
LICENSE	feat: landing page, blog restyle, SEO, and production hardening (#10)	2026-02-17 20:39:41 -05:00
package.json	chore: prepare v0.2.5 release (#18)	2026-05-29 12:59:35 -04:00
pnpm-lock.yaml	feat: Tauri desktop app, docs site revamp, and blog (#8)	2026-02-11 20:28:15 -05:00
README.md	Add star history chart to README (#21)	2026-05-30 14:07:58 -04:00
SECURITY.md	feat: Tauri desktop app, docs site revamp, and blog (#8)	2026-02-11 20:28:15 -05:00
svelte.config.js	Fix Ollama local model discovery and browser-origin setup (#17)	2026-05-29 12:34:51 -04:00
tsconfig.json	Initial commit	2026-01-25 19:30:50 -05:00
vercel.json	fix: serve the docs and app subdomains from their own root (#23)	2026-06-18 21:28:51 -04:00
vite.config.ts	feat: semantic memory search with local embeddings (#1)	2026-01-26 21:55:15 -05:00

README.md

Utsuwa (器)

Warning

Utsuwa and The Lab by Ordinary Company have not minted, launched, endorsed, or authorized any cryptocurrency, token, coin, NFT, or blockchain project. We never will. If you see crypto associated with Utsuwa or The Lab, it is a scam. This repository is the only authentic Utsuwa project repository.

Utsuwa is an open-source AI companion with 3D VRM avatars: a platform where you can have a virtual companion that learns and grows with you, bundled with optional mechanics inspired by Japanese dating sim games. Utsuwa is privacy-focused: your data is stored locally and never leaves your device.

"Utsuwa" means "vessel" in Japanese - a container for AI to inhabit visually.

Features

VRM Model Viewer: Load and display VRM 3D avatar models with orbit controls
Model-Centric UI: Full-screen 3D model with unobtrusive overlay controls
3D Speech Bubbles: Chat responses appear as bubbles that track the model's head in 3D space
Chat Interface: Bottom-centered input bar with streaming responses
Voice Input: Speech-to-text via Groq (Whisper) or Web Speech API with real-time audio visualization
LLM Integration: Supports 7 LLM providers: OpenAI, Anthropic, Google, xAI, DeepSeek, Ollama, and LM Studio
Local Model Discovery: Ollama and LM Studio discover installed local models directly from your device
Text-to-Speech: Support for ElevenLabs and OpenAI TTS
Lip-sync: Audio-driven mouth animation synced to TTS playback
Animations: VRMA-based idle and talking animations with automatic blinking
Character Customization: Customize your companion's name, personality, and system prompt
Companion System: Multi-axis relationship tracking with mood, events, and semantic memory
Semantic Memory: Local AI-powered memory search using Transformers.js - finds memories by meaning, not just keywords
Memory Graph: Interactive visualization showing how memories connect semantically
Data Export/Import: Download your data as a save file, restore anytime
Theming: Light and dark mode support with system preference detection
Desktop App (beta, macOS only): Native desktop app with transparent overlay mode — your companion floats on your desktop
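
As an illustration of the audio-driven lip-sync feature listed above, here is a minimal sketch of one common approach: take the RMS loudness of the current audio frame and map it to a mouth-open weight. This is not Utsuwa's actual implementation. The function names, the `gain` value, and the smoothing factor are all assumptions made for the example.

```typescript
// Hypothetical helper (not from the Utsuwa codebase): converts one frame of
// time-domain audio samples (floats in [-1, 1], such as those produced by the
// Web Audio AnalyserNode) into a mouth-open weight in [0, 1].
export function mouthOpenFromSamples(samples: Float32Array, gain = 4): number {
  if (samples.length === 0) return 0;
  // Root-mean-square loudness of the frame.
  const sumSquares = samples.reduce((acc, s) => acc + s * s, 0);
  const rms = Math.sqrt(sumSquares / samples.length);
  // `gain` is an assumed sensitivity constant; clamp so the weight stays valid.
  return Math.min(1, rms * gain);
}

// Hypothetical smoothing step: move part of the way toward the target each
// frame so the mouth does not flicker between frames.
export function smoothWeight(previous: number, target: number, factor = 0.5): number {
  return previous + (target - previous) * factor;
}
```

With @pixiv/three-vrm, a weight like this would typically be applied to a mouth expression each frame, for example through the model's expression manager. Which expression Utsuwa drives is not stated in this README.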

Local-First Storage

All your data is stored locally on your device using IndexedDB:

No database setup required
Works offline after initial load
Export/import save files to back up or transfer your data
Manage your save files under Settings > Data

Companion System

Build a meaningful relationship with your AI companion through a dating sim-inspired progression system:

Multi-axis Relationships: Track affection, trust, intimacy, comfort, and respect separately
8 Relationship Stages: Progress from Stranger → Acquaintance → Friend → Close Friend → Romantic Interest → Dating → Committed → Soulmate
Dynamic Mood: Real-time emotions with causality tracking (she remembers why she feels a certain way)
Visual Novel Events: Milestone moments, romantic scenes, and choices that matter - with custom dialogue and branching responses
Semantic Memory: Facts are indexed with vector embeddings for meaning-based retrieval - "outdoor activities" finds memories about hiking. Runs locally using Transformers.js, no API calls
Natural Progression: Hybrid system combining app heuristics + LLM suggestions for believable relationship growth
Time-Aware: Your companion notices when you've been away and reacts accordingly

See the Companion System Architecture for full details.
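
To make the semantic memory idea concrete, the sketch below ranks stored memories by cosine similarity between embedding vectors, which is how meaning-based retrieval is commonly done. It assumes the embeddings have already been computed (in Utsuwa, by Transformers.js). The `MemoryRecord` shape and function names are hypothetical, and the toy three-dimensional vectors stand in for real embeddings.

```typescript
// Hypothetical record shape for illustration; the real schema may differ.
interface MemoryRecord {
  text: string;
  embedding: number[];
}

// Cosine similarity: 1 for vectors pointing the same way, 0 for orthogonal ones.
export function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("Embedding dimensions differ");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((x, i) => {
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  });
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the `topK` memories whose embeddings are closest to the query embedding.
export function searchMemories(
  queryEmbedding: number[],
  memories: MemoryRecord[],
  topK = 3,
): MemoryRecord[] {
  return memories
    .map((memory) => ({ memory, score: cosineSimilarity(queryEmbedding, memory.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK)
    .map((ranked) => ranked.memory);
}
```

Because ranking depends only on vector direction, a query such as "outdoor activities" can match a memory about hiking even when the two share no keywords, provided the embedding model places them close together.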

Desktop Application (Beta)

A native desktop app built with Tauri that includes all web features plus:

Overlay Mode: Your companion floats on your desktop with a transparent background
Always-on-Top: The overlay stays visible over all other windows
Draggable Positioning: Click and drag the character to reposition anywhere on screen
Floating Chat: Expandable chat input that appears when you click the chat icon
Window Switching: Seamlessly switch between the full app and overlay mode
Global Hotkeys: Push-to-talk, toggle overlay, and focus chat with keyboard shortcuts

The desktop app uses the same codebase as the web version — your save files are compatible between both.

Supported Providers

LLM Providers (7)

Category	Providers
Cloud	OpenAI, Anthropic, Google Gemini, DeepSeek, xAI (Grok)
Local	Ollama, LM Studio

TTS Providers (2)

Category	Providers
Cloud	ElevenLabs, OpenAI TTS

STT Providers (2)

Category	Providers
Cloud	Groq (Whisper)
Browser	Web Speech API (no API key required)

Voice input is accessed via the microphone button in the chat bar. Groq STT uses Whisper for accurate transcription on any platform (including desktop). Web Speech API works without an API key in Chrome, Edge, and Safari. If a Groq API key is configured, it takes priority automatically.
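
The provider priority described above can be sketched as a small selection function. This is an illustrative sketch only: the `SttEnvironment` shape, the `"none"` fallback, and the function name are assumptions, not Utsuwa's actual code.

```typescript
type SttProvider = "groq" | "web-speech" | "none";

// Hypothetical description of what is available at runtime.
interface SttEnvironment {
  groqApiKey?: string;
  webSpeechAvailable: boolean;
}

// A configured Groq key takes priority; otherwise fall back to the browser's
// Web Speech API when it exists; otherwise no voice input is available.
export function chooseSttProvider(env: SttEnvironment): SttProvider {
  if (env.groqApiKey !== undefined && env.groqApiKey.trim() !== "") return "groq";
  if (env.webSpeechAvailable) return "web-speech";
  return "none";
}
```

In a browser, `webSpeechAvailable` would usually be derived by checking whether `SpeechRecognition` or `webkitSpeechRecognition` exists on `window`.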

Getting Started

Note

Utsuwa is in its very early development stages. If you're using the app, save your data often. Early versions may not have backwards-compatible save states and could require manual reformatting.

Try it Online

Use Utsuwa directly at utsuwa.ai — no installation required. Or download the macOS desktop app from GitHub Releases.

Self-Hosting

If you prefer to run Utsuwa locally or host your own instance:

Prerequisites

Node.js 22+
pnpm (recommended) or npm
A modern browser (Chrome, Firefox, Safari, Edge) — for the web version

Installation

# Clone the repository
git clone https://github.com/The-Lab-by-Ordinary-Company/utsuwa.git
cd utsuwa

# Install dependencies
pnpm install

# Start development server
pnpm dev

The app will be available at http://localhost:5173

Running the Desktop App (Beta)

To run the desktop app from source, you'll need the Rust toolchain in addition to the web prerequisites:

# Install Rust (if not already installed)
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

# Run the desktop app
pnpm tauri dev

Configuration

Click Settings (gear icon) in the sidebar
Navigate to Settings > Character to configure your chat provider:
- Enable Chat (LLM)
- Select a cloud provider and enter your API key
- Or select a local server like Ollama or LM Studio and choose an installed model from the discovered model dropdown
Configure text-to-speech in the same settings area (optional):
- Select a TTS provider
- Enter your API key
- Configure voice settings

All API keys are stored locally on your device and are never sent to any server except the respective API providers.
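
For readers curious how local model discovery can work, the sketch below queries each local server's model-listing endpoint and extracts the model names. It assumes the servers run on their default ports: Ollama's `/api/tags` on 11434 and LM Studio's OpenAI-compatible `/v1/models` on 1234. The function names and the injected `fetchJson` parameter are hypothetical, and Utsuwa's own discovery code may differ.

```typescript
// Default local endpoints (assumed; both servers let you change the port).
const OLLAMA_TAGS_URL = "http://localhost:11434/api/tags";
const LM_STUDIO_MODELS_URL = "http://localhost:1234/v1/models";

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null;
}

// Ollama responds with { models: [{ name: "..." }, ...] }.
export function parseOllamaModels(payload: unknown): string[] {
  if (!isRecord(payload)) return [];
  const models = payload["models"];
  if (!Array.isArray(models)) return [];
  return models
    .filter(isRecord)
    .map((model) => model["name"])
    .filter((name): name is string => typeof name === "string");
}

// LM Studio responds in the OpenAI list format: { data: [{ id: "..." }, ...] }.
export function parseLmStudioModels(payload: unknown): string[] {
  if (!isRecord(payload)) return [];
  const data = payload["data"];
  if (!Array.isArray(data)) return [];
  return data
    .filter(isRecord)
    .map((model) => model["id"])
    .filter((id): id is string => typeof id === "string");
}

// `fetchJson` is injected so the sketch does not depend on a particular runtime;
// in a browser it could be (url) => fetch(url).then((r) => r.json()).
export async function discoverLocalModels(
  fetchJson: (url: string) => Promise<unknown>,
): Promise<{ ollama: string[]; lmStudio: string[] }> {
  // A server that is not running simply yields an empty list.
  const safe = async (url: string): Promise<unknown> => {
    try {
      return await fetchJson(url);
    } catch {
      return null;
    }
  };
  const [ollama, lmStudio] = await Promise.all([
    safe(OLLAMA_TAGS_URL),
    safe(LM_STUDIO_MODELS_URL),
  ]);
  return { ollama: parseOllamaModels(ollama), lmStudio: parseLmStudioModels(lmStudio) };
}
```

Treating an unreachable server as "no models" keeps the settings dropdown usable when only one of the two local servers is installed.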

Loading a VRM Model

Go to Settings > Avatar
Click "Load VRM" to select a local .vrm file
Or enter a URL to load a VRM model from the web

Data Management

Your companion data is stored locally on your device. To back up or transfer your data:

Go to Settings > Data
Click Export Save to download a JSON file with all your data
To restore, click Import Save and select your save file
Choose Replace (wipe and restore) or Merge (add to existing)
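
The difference between Replace and Merge can be sketched as follows. The save-file shape here (table names mapped to rows with an `id`) is hypothetical, and the rule that existing rows win when ids collide is an assumption based on "add to existing". The actual import logic may behave differently.

```typescript
// Hypothetical save-file shape: table name -> rows, each row keyed by `id`.
interface SaveRecord {
  id: string;
  [key: string]: unknown;
}
type SaveFile = Record<string, SaveRecord[]>;

export function importSave(
  existing: SaveFile,
  incoming: SaveFile,
  mode: "replace" | "merge",
): SaveFile {
  if (mode === "replace") {
    // Wipe and restore: the result is a deep copy of the imported file.
    return JSON.parse(JSON.stringify(incoming)) as SaveFile;
  }
  // Merge: keep every existing row and add only incoming rows with new ids
  // (assumed conflict rule: existing data wins).
  const result: SaveFile = {};
  for (const table of Object.keys(existing).concat(Object.keys(incoming))) {
    if (table in result) continue; // table already merged
    const current = existing[table] ?? [];
    const knownIds = new Set(current.map((row) => row.id));
    const added = (incoming[table] ?? []).filter((row) => !knownIds.has(row.id));
    result[table] = [...current, ...added];
  }
  return result;
}
```

Exporting a save before importing is a sensible precaution, since Replace discards the current data.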

Project Structure

utsuwa/
├── src/
│   ├── lib/
│   │   ├── ai/             # LLM response parsing and prompt building
│   │   ├── assets/         # Static assets
│   │   ├── components/     # Svelte components
│   │   ├── config/         # App and docs configuration
│   │   ├── data/           # Event definitions and static data
│   │   ├── db/             # IndexedDB database (Dexie)
│   │   ├── engine/         # Companion engine (state, memory, events)
│   │   ├── services/       # LLM, TTS, STT, storage services
│   │   ├── stores/         # Svelte 5 stores (state management)
│   │   ├── styles/         # Shared CSS (prose, etc.)
│   │   ├── types/          # TypeScript types
│   │   └── utils/          # Utility functions
│   ├── content/
│   │   ├── blog/           # Blog post markdown content
│   │   └── docs/           # Documentation site markdown content
│   └── routes/
│       ├── app/            # Main application routes
│       ├── api/            # API routes
│       ├── blog/           # Blog routes
│       ├── docs/           # Documentation site routes
│       └── overlay/        # Desktop overlay route
├── src-tauri/               # Tauri desktop app (Rust)
├── static/
│   └── models/             # Place default VRM models here
└── package.json

Scripts

pnpm dev          # Start web development server
pnpm build        # Build web app for production
pnpm preview      # Preview production build
pnpm lint         # Type-check the project (svelte-check)
pnpm check        # Same as lint (alias)
pnpm check:watch  # Type-check in watch mode
pnpm tauri dev    # Run the desktop app in development mode
pnpm tauri build  # Build desktop app installer

Roadmap

Completed

VRM model loading and display with orbit controls
3D speech bubbles tracking model head position
Multi-provider LLM support (7 providers)
Multi-provider TTS support (2 providers)
Audio-driven lip-sync
VRMA-based animations (idle, talking, blinking)
Companion system with multi-axis relationships
8-stage relationship progression (Stranger → Soulmate)
Visual novel event system with choices
Semantic memory system with local embeddings (Transformers.js)
Time-based mood and relationship decay/recovery
Local-first IndexedDB storage with export/import
Theme system with light/dark modes
Voice input via Groq STT (Whisper) and Web Speech API
Desktop application with transparent overlay mode (macOS only, Windows/Linux planned)

In Progress / Planned

File, Image, and Video Uploads - Add support for attaching files, images, and videos for multimodal LLM workflows and providers that can use richer context or web-aware tools
OpenAI-Compatible Models - Add a configurable option for OpenAI-compatible model endpoints beyond the currently listed providers
Multi-provider STT - Support for additional speech-to-text providers beyond Groq and Web Speech API
Live2D Support - Alternative to VRM for 2D animated avatars
Windows and Linux Desktop Apps - Expand desktop builds beyond the current macOS beta

Contributing

Contributions are welcome! Please read our Contributing Guide for details on how to submit pull requests, report issues, and contribute to the project.

Security

For information about security considerations and how to report vulnerabilities, please see our Security Policy.

Acknowledgments

Utsuwa is built on the shoulders of these excellent projects:

Inspiration

Airi - The original inspiration for this project. A beautiful AI companion with VRM avatar support.
Amica - Open-source AI companion with VRM support and emotional expressions.
Riko Project by JustRyan - AI waifu project showcasing VRM avatar interactions.

Core Technologies

@pixiv/three-vrm - VRM model loading and rendering for Three.js
xsAI - Unified LLM and TTS provider integration
Three.js - 3D graphics engine
Threlte - Svelte components for Three.js
SvelteKit - Web application framework
Tauri - Desktop application framework
Tailwind CSS - Utility-first CSS framework
Transformers.js - In-browser ML for semantic memory embeddings

UI & Data

bits-ui - Headless UI components for Svelte
Dexie.js - IndexedDB wrapper for local storage
force-graph - Force-directed graph visualization for memory graph
simple-icons - SVG icons for provider logos

3D Effects

n8ao - Ambient occlusion for Three.js
postprocessing - Post-processing effects

License

This project is licensed under the MIT License - see the LICENSE file for details.