Setup Guide
Prerequisites
Before starting, ensure you have the following installed:
- Python 3.10 - Download from Microsoft Store or python.org
- VS Code - Download here
- Node.js and npm - Download here (includes npx)
- GPT-SoVITS - One-click installer
1. Project Setup
Create Virtual Environment
- Open VS Code
- File → New Window
- File → Open Folder (select your project directory)
- Press Ctrl+Shift+P to open the command palette
- Type "Python: Create Environment" and select it
  Note: If you don't see this option, install the Python extension:
  - Go to the Extensions sidebar (Ctrl+Shift+X)
  - Search for "Python" and install it
  - Close and reopen VS Code, then try again
- Select Venv and choose Python 3.10
- Uncheck "Install dependencies from requirements.txt" (we'll do this manually)
- Click OK
Install Dependencies
- Open a new terminal: Terminal → New Terminal
- Verify your virtual environment is active (you should see .venv in the prompt):
    (.venv) F:\your_project_path>
- Install dependencies using uv (faster) or pip:
    # Option 1: Using uv (recommended - faster)
    pip install uv
    uv pip install -r requirements.txt
    # Option 2: Using pip
    pip install -r requirements.txt
  This should take 30 seconds to 1 minute depending on your system.
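If the prompt does not make it obvious whether the virtual environment is active, the check can also be done from Python. The following is a minimal sketch, not part of the project; the function names are hypothetical, and it relies only on the standard `sys` module.

```python
import sys


def in_virtual_environment() -> bool:
    """Return True when the interpreter runs inside a venv.

    Inside a venv, sys.prefix points at the environment directory while
    sys.base_prefix still points at the base Python installation.
    """
    return sys.prefix != sys.base_prefix


def python_version_matches(major: int = 3, minor: int = 10,
                           version_info=sys.version_info) -> bool:
    """Return True when the interpreter is the expected major.minor version."""
    return (version_info[0], version_info[1]) == (major, minor)
```

Running `python -c "import sys; print(sys.prefix != sys.base_prefix)"` in the VS Code terminal gives the same answer as `in_virtual_environment()` without saving a file.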
2. API Configuration
Create .env File
Create a .env file in the root directory with the following content:
OPENAI_API_KEY="sk-proj-YOUR_API_KEY"
GROQ_API_KEY="YOUR_GROQ_API_KEY"
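To confirm that both keys are present and no longer placeholders, a small standalone check can help. This is a sketch under assumptions: it uses its own tiny parser rather than whatever loader the project uses, and the helper names (`parse_env_text`, `missing_keys`, `load_env_file`) are hypothetical.

```python
from pathlib import Path

# The two keys this guide asks you to set.
REQUIRED_KEYS = ("OPENAI_API_KEY", "GROQ_API_KEY")


def parse_env_text(text: str) -> dict[str, str]:
    """Parse KEY="value" lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values: dict[str, str]) -> list[str]:
    """Return required keys that are absent, empty, or still placeholders."""
    return [
        key for key in REQUIRED_KEYS
        if not values.get(key) or "YOUR_" in values[key]
    ]


def load_env_file(path: str = ".env") -> dict[str, str]:
    """Read and parse an env file from disk."""
    return parse_env_text(Path(path).read_text(encoding="utf-8"))
```

Calling `missing_keys(load_env_file())` from the project root returns an empty list once both keys are filled in.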
Get API Keys
- OpenAI API Key:
  - Sign up at OpenAI Platform
  - Add $5 credit (should last 1-2 months for typical usage)
  - Copy your API key to the .env file
  Note: You can customize this to use a local AI model if preferred (the streaming code doesn't support this yet, but local model support is planned)
- Groq API Key (Free):
  - Sign up at Groq Console
  - Copy your API key to the .env file
3. Configuration
Character Configuration
There are two main configuration files:
A. character_config.yaml
- Set the AI prompt
- Configure ASR (Automatic Speech Recognition) context
- Add reference audio sample (must be 3-10 seconds long)
- Enter the text spoken in the audio file
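The 3-10 second requirement on the reference audio can be checked before starting the servers. The sketch below assumes the sample is an uncompressed WAV file, which the guide does not state; the function names are hypothetical and only the standard `wave` module is used.

```python
import wave

# Length limits for the reference sample, taken from the requirement above.
MIN_SECONDS = 3.0
MAX_SECONDS = 10.0


def wav_duration_seconds(path: str) -> float:
    """Return the duration of a WAV file in seconds (frames / sample rate)."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def is_valid_reference_length(path: str) -> bool:
    """Return True when the clip length falls within the 3-10 second window."""
    return MIN_SECONDS <= wav_duration_seconds(path) <= MAX_SECONDS
```

Other formats such as MP3 or FLAC would need a different reader, since `wave` only handles WAV.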
B. client/config.js
- Change the 3D model
- Adjust mouth audio threshold
- Place model files in the client/models/ directory
- Update the filename in config
- Important: Model must be in VRM 1.0 format (export setting in VRoid Studio)
4. Starting the Servers
Option A: Automatic Start (Recommended)
- Edit start_server.bat
- Change the following line to match your GPT-SoVITS installation path:
    set SOVITS_PATH=D:\PyProjects\GPT-SoVITS-v3lora-20250228\GPT-SoVITS-v3lora-20250228
- Run the script:
  - In terminal: start_server.bat
  - Or double-click the file in File Explorer
- Do not close any of the terminal windows that open
Option B: Manual Start
If automatic start doesn't work:
- Start the Python server:
    cd server
    python server.py
- Start the animation server (open a second terminal):
    cd client
    npx vite
- Open your browser and go to: http://localhost:5173
You should see a 3D model floating on screen.
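If the page does not load, it helps to confirm that something is actually listening on port 5173 before debugging further. This is a minimal sketch with a hypothetical helper name, using only the standard `socket` module. It checks the animation server only, because the guide does not state which port the Python server uses.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True when a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False
```

For example, `port_is_open("localhost", 5173)` should return True while `npx vite` is running in the second terminal.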
5. Running the Chat
- Run the main chat script:
    python main_chat.py
- Troubleshooting: If you encounter issues, run the setup check script:
    cd server
    python check_setup.py
6. Customization
Facial Expressions
Currently, the model's face defaults to "smug". You can change this or implement your own emotion classification.
To change the expression, edit main_chat.py:
for chunk in stream_text_chunks(messages):
    print("[chunk]", chunk)
    # Accumulate final text
    full_assistant_text += (chunk + " ")
    # Prepare TTS text and emotion
    tts_read_text = clean_llm_output(chunk)
    # Option 1: Use emotion detection (plug in your own model)
    # emotion = get_emotion(chunk, None, None)
    # expression = map_emotion_to_expression(emotion)
    # Option 2: Set manually (current implementation)
    emotion = "relaxed"
    expression = "relaxed"
Supported VRM 1.0 expressions:
- happy
- angry
- sad
- relaxed
- surprised
- neutral
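If you enable Option 1, the detected emotion has to be mapped onto one of the expressions listed above. The sketch below is one possible shape for `map_emotion_to_expression`, not the project's implementation: the emotion labels in the table ("joy", "anger", and so on) are hypothetical and depend on the classifier you plug in.

```python
# The expressions listed in this guide as supported for VRM 1.0.
SUPPORTED_EXPRESSIONS = {"happy", "angry", "sad", "relaxed", "surprised", "neutral"}

# Hypothetical classifier labels mapped to expressions; adjust to your model.
EMOTION_TO_EXPRESSION = {
    "joy": "happy",
    "anger": "angry",
    "sadness": "sad",
    "calm": "relaxed",
    "surprise": "surprised",
}


def map_emotion_to_expression(emotion: str, default: str = "neutral") -> str:
    """Map an emotion label to a supported expression, else fall back to default."""
    label = emotion.strip().lower()
    if label in SUPPORTED_EXPRESSIONS:
        # The classifier already produced a valid expression name.
        return label
    return EMOTION_TO_EXPRESSION.get(label, default)
```

Falling back to "neutral" for unknown labels keeps an unexpected classifier output from sending an expression name the model does not define.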
Summary
- ✅ Install prerequisites (Python 3.10, VS Code, Node.js, GPT-SoVITS)
- ✅ Create virtual environment and install dependencies
- ✅ Configure API keys in .env
- ✅ Customize character_config.yaml and client/config.js
- ✅ Start servers (automatic or manual)
- ✅ Run main_chat.py
- ✅ (Optional) Customize facial expressions
For issues, run server/check_setup.py to diagnose problems.