112 lines
3.3 KiB
Markdown
112 lines
3.3 KiB
Markdown
|
|
# Project Riko
|
|||
|
|
|
|||
|
|
#### **Patreon Version:** *Windows version 1.1 — 2025-11-06*
|
|||
|
|
|
|||
|
|
Project Riko is an anime-focused LLM project by **Just Rayen**. She listens, remembers, and speaks like your favorite snarky anime companion.
|
|||
|
|
It combines **OpenAI GPT**, **GPT-SoVITS** voice synthesis, and **Faster-Whisper / Groq ASR** into a fully configurable conversational pipeline with real-time streaming responses.
|
|||
|
|
|
|||
|
|
**Tested with Python 3.10 (Windows 10 or higher)**
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## ✨ Features
|
|||
|
|
|
|||
|
|
* 💬 **LLM-based dialogue** using OpenAI-compatible streaming (real-time responses)
|
|||
|
|
* 🧠 **Persistent conversation memory** with context tracking
|
|||
|
|
* 🔊 **Voice generation** powered by GPT-SoVITS
|
|||
|
|
* 🎧 **Speech recognition** using Faster-Whisper or Groq ASR (free API)
|
|||
|
|
* 🧍♀️ **VRM animated avatar** powered by Three-VRM
|
|||
|
|
* ⚙️ **Simple YAML personality config** for easy customization
|
|||
|
|
* 🚀 **Convenient launch script** (`start_servers.bat`) for quick setup
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## 🆕 2025-11-06 Update (Windows Version)
|
|||
|
|
|
|||
|
|
**New:**
|
|||
|
|
|
|||
|
|
* 🧩 Added OpenAI-compatible streaming for smoother, real-time conversation
|
|||
|
|
* 🎙️ Integrated **Groq API** for faster and more accurate ASR transcription (free!)
|
|||
|
|
* 🐞 Fixed bug where audio would not play in the client after server launch
|
|||
|
|
* ⚡ Added `start_servers.bat` for easy one-click startup
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## ⚙️ Configuration
|
|||
|
|
|
|||
|
|
All prompts and parameters are stored in `config.yaml`.
|
|||
|
|
You can define personalities by editing this file.
|
|||
|
|
|
|||
|
|
```yaml
|
|||
|
|
waifu_name: riko
|
|||
|
|
gpu_acceleration: cpu
|
|||
|
|
history_file: chat_history.json
|
|||
|
|
model: "gpt-4.1-mini"
|
|||
|
|
presets:
|
|||
|
|
default:
|
|||
|
|
system_prompt: |
|
|||
|
|
You are a helpful assistant named Riko.
|
|||
|
|
You speak like a snarky anime girl.
|
|||
|
|
Always refer to the user as "senpai."
|
|||
|
|
|
|||
|
|
asr_context: The following is a conversation between Rayen and Riko
|
|||
|
|
sovits_ping_config:
|
|||
|
|
text_lang: en
|
|||
|
|
prompt_lang: en
|
|||
|
|
ref_audio_path: D:\PyProjects\waifu_project\riko_project_patreon\character_files\main_sample.wav
|
|||
|
|
prompt_text: This is a sample voice for you to get started with. It sounds kind of cute, but make sure there aren’t long silences.
|
|||
|
|
|
|||
|
|
# THE FOLLOWING IS FOR SOVITS V2, V2PRO, V2PROPLUS
|
|||
|
|
# additional_aud:
|
|||
|
|
# - additional_audio1
|
|||
|
|
# - additional_audio2
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## 🛠️ Setup
|
|||
|
|
|
|||
|
|
For setup, see SETUP_GUIDE.md!
|
|||
|
|
|
|||
|
|
### 💡 Conversation Flow
|
|||
|
|
|
|||
|
|
1. Riko listens to your voice via microphone
|
|||
|
|
2. Transcribes it using Groq ASR (or Faster-Whisper)
|
|||
|
|
3. Sends it to GPT (with conversation memory)
|
|||
|
|
4. Generates a reply in real time (streaming)
|
|||
|
|
5. Synthesizes Riko’s voice using GPT-SoVITS
|
|||
|
|
6. Plays back the audio
|
|||
|
|
7. Animates the VRM avatar
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## 📌 TODO / Future Improvements
|
|||
|
|
|
|||
|
|
* [x] Live microphone input
|
|||
|
|
* [x] VRM model frontend
|
|||
|
|
* [ ] Emotion/tone control in TTS
|
|||
|
|
* [ ] GUI / full web interface
|
|||
|
|
* [ ] Multi-language support
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## 🧑🎤 Credits
|
|||
|
|
|
|||
|
|
* **Voice synthesis:** [GPT-SoVITS](https://github.com/RVC-Boss/GPT-SoVITS)
|
|||
|
|
* **ASR:** [Faster-Whisper](https://github.com/SYSTRAN/faster-whisper) & [Groq API](https://console.groq.com/)
|
|||
|
|
* **LLM:** [OpenAI GPT](https://platform.openai.com)
|
|||
|
|
* **Avatar animation:** [Three-VRM](https://github.com/pixiv/three-vrm)
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## ⚠️ License Notice
|
|||
|
|
|
|||
|
|
This version is for **personal use only.**
|
|||
|
|
Do **not redistribute, sell, or share** the code — it’s under a **custom early access license.**
|
|||
|
|
A public open-source release will come later.
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
Enjoy~
|
|||
|
|
— **Rayen 💻✨**
|