Emotional AI Voice Chat

Fast conversation with emotional AI

This project implements a real-time AI conversation system with emotional text-to-speech (TTS) capabilities. It uses a large language model (LLM) to generate responses and a TTS engine with voice cloning for voice output.

Features

Real-time speech-to-text input
Conversation generation powered by Ollama, LMStudio, OpenAI, Anthropic, or the llama.cpp web server
Emotion-aware real-time text-to-speech output
Configurable system and user personas

Requirements

Python <=3.10 (3.10.9 is recommended)
CUDA-enabled GPU
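
The version constraint above can be checked before installing anything. The following is a minimal sketch, assuming it is run with the interpreter you plan to use for the project; the helper name `python_version_ok` is hypothetical and not part of the repository.

```python
import sys

MAX_SUPPORTED = (3, 10)  # from the Requirements list: Python <=3.10


def python_version_ok(version_info=None):
    """Return True if the version is Python 3 and at most 3.10."""
    # Hypothetical helper; defaults to the running interpreter's version.
    major, minor = (version_info or sys.version_info)[:2]
    return major == 3 and (major, minor) <= MAX_SUPPORTED


if __name__ == "__main__":
    status = "supported" if python_version_ok() else "not supported"
    print(f"Python {sys.version.split()[0]} is {status}")
```

Passing a tuple such as `(3, 10, 9)` checks a specific version instead of the running interpreter.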

Installation

Clone the repository
Open _install_win.bat and change the path after PYTHON_EXE to the path of your Python 3.10.9 executable

Run _install_win.bat

Select your LLM provider:

Open main.py and set llm_provider in class Config to your desired LLM provider ("llamacpp", "ollama", "lmstudio", "openai", or "anthropic").
llama.cpp:
- Run "install_win.bat" in the llm_llamacpp folder to install the llama.cpp web server.
- Run "download_model.bat" in the llm_llamacpp folder to download the openhermes-2.5-mistral-7b.Q5_K_M.gguf model used for inference.
- Open start_llamacpp_server.bat in the llm_llamacpp folder and adjust its settings, especially --n_gpu_layers 25, to your environment and GPU capabilities.
- Run "start_llamacpp_server.bat" in the main or the llm_llamacpp folder to start the server.
ollama:
- Run "install_win.bat" in the llm_ollama folder to install Ollama.
- Run "start_ollama_server.bat" in the main or the llm_ollama folder to start the server.
lmstudio:
- Install and start LMStudio, load a model, and start the local server.
openai:
- Run "install_win.bat" in the llm_openai folder to install the OpenAI Python library.
- Put your OpenAI API key in the environment variable "OPENAI_API_KEY".
anthropic:
- Run "install_win.bat" in the llm_anthropic folder to install the Anthropic Python library.
- Put your Anthropic API key in the environment variable "ANTHROPIC_API_KEY".
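
The provider setting and the API key requirements above can be sanity-checked before starting the application. This is only a sketch: the `Config` class below is a hypothetical stand-in for the one in main.py (its real contents and default value are not shown here), and `check_provider` is an illustrative helper, not a function from the repository.

```python
import os

# Provider names as listed in the installation steps.
SUPPORTED_PROVIDERS = ("llamacpp", "ollama", "lmstudio", "openai", "anthropic")

# Environment variables the steps name for the hosted providers.
REQUIRED_ENV = {"openai": "OPENAI_API_KEY", "anthropic": "ANTHROPIC_API_KEY"}


class Config:
    # Hypothetical stand-in for the Config class in main.py.
    llm_provider = "llamacpp"  # assumed default, for illustration only


def check_provider(provider, env=None):
    """Return a list of problems with the provider setting (empty if none)."""
    env = os.environ if env is None else env
    problems = []
    if provider not in SUPPORTED_PROVIDERS:
        problems.append(f"unknown llm_provider: {provider!r}")
    key = REQUIRED_ENV.get(provider)
    if key and not env.get(key):
        problems.append(f"environment variable {key} is not set")
    return problems


if __name__ == "__main__":
    for problem in check_provider(Config.llm_provider):
        print(problem)
```

An empty list means the provider name is one of the five listed values and, for openai or anthropic, the matching key variable is set.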

Download the Lasinya XTTS voice model from Hugging Face: run download_tts_model.py, which downloads the required files. Then open tts_config.json and enter the path to the model files there.

CUDA Installation

These steps are recommended for those who require better performance and have a compatible NVIDIA GPU.

Note: To check if your NVIDIA GPU supports CUDA, visit the official CUDA GPUs list.

Install NVIDIA CUDA Toolkit:
- Visit NVIDIA CUDA Downloads for the latest version, or NVIDIA CUDA Toolkit Archive for version 11.8.
- Select your operating system, system architecture, and OS version.
- Download and install the software.
Install NVIDIA cuDNN:
- Visit NVIDIA cuDNN Archive.
- Download and install the appropriate version for your CUDA Toolkit.

Install ffmpeg:

Download from the ffmpeg website, or use a package manager:

# Ubuntu or Debian
sudo apt update && sudo apt install ffmpeg

# Arch Linux
sudo pacman -S ffmpeg

# macOS (using Homebrew)
brew install ffmpeg

# Windows (using Chocolatey)
choco install ffmpeg

# Windows (using Scoop)
scoop install ffmpeg
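
After installing, it can help to confirm that ffmpeg is reachable from the environment that will run the project. A minimal sketch using only the Python standard library; the function names are illustrative and not part of the repository.

```python
import shutil
import subprocess


def find_ffmpeg():
    """Return the full path of the ffmpeg executable, or None if not on PATH."""
    return shutil.which("ffmpeg")


def ffmpeg_version_line(path):
    """Return the first line printed by `ffmpeg -version` for the given path."""
    result = subprocess.run(
        [path, "-version"], capture_output=True, text=True, check=True
    )
    return result.stdout.splitlines()[0]


if __name__ == "__main__":
    path = find_ffmpeg()
    if path is None:
        print("ffmpeg was not found on PATH")
    else:
        print(ffmpeg_version_line(path))
```

If the lookup returns None right after installation, opening a new terminal usually picks up the updated PATH.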

Install PyTorch with CUDA support: Choose the appropriate command based on your CUDA version:
- For CUDA 11.8:
```
pip install torch==2.3.1+cu118 torchaudio==2.3.1 --index-url https://download.pytorch.org/whl/cu118
```
- For CUDA 12.X:
```
pip install torch==2.3.1+cu121 torchaudio==2.3.1 --index-url https://download.pytorch.org/whl/cu121
```
Replace 2.3.1 with the version of PyTorch that matches your system and requirements.
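
To confirm that the installed PyTorch build can actually use the GPU, a short check like the following may help. It is a sketch, not part of the repository; `cuda_report` is a hypothetical helper that relies only on `torch.__version__`, `torch.version.cuda`, and `torch.cuda.is_available()`.

```python
def cuda_report():
    """Summarize the installed PyTorch build and whether it can use CUDA."""
    try:
        import torch  # imported lazily so the check also runs before installation
    except ImportError:
        return {"torch_installed": False, "cuda_available": False}
    return {
        "torch_installed": True,
        "torch_version": torch.__version__,
        "built_for_cuda": torch.version.cuda,  # None for CPU-only builds
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    for name, value in cuda_report().items():
        print(f"{name}: {value}")
```

If `built_for_cuda` is None, a CPU-only wheel was installed and one of the pip commands above should be rerun with the CUDA index URL.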

Usage

Run the main script:

python main.py

The system will start a conversation based on the configured scenario. Speak into your microphone to interact with the AI character.

Note: When starting the application, you may see warnings similar to:

[ctranslate2] [warning] The compute type inferred from the saved model is float16, but the target device or backend do not support efficient float16 computation. The model weights have been automatically converted to use the float32 compute type instead.

FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.

These warnings are expected and do not affect the functionality of the system.
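
If the FutureWarning clutters the console, it can optionally be hidden with Python's standard `warnings` module. This is a sketch under the assumption that the filter is installed before the library emitting the warning runs; `silence_future_warnings` is a hypothetical helper. The ctranslate2 message is written by that library's own logging rather than raised as a Python warning, so this filter most likely does not affect it.

```python
import warnings


def silence_future_warnings():
    """Ignore FutureWarning messages, e.g. the `resume_download` notice."""
    # Other warning categories are left untouched.
    warnings.filterwarnings("ignore", category=FutureWarning)


if __name__ == "__main__":
    silence_future_warnings()
    warnings.warn("this message is hidden", FutureWarning)
    print("FutureWarning messages are now suppressed")
```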

Configuration

Adjust chat_params.json to modify the character and user descriptions and the conversation scenario
Adjust llm_xxx/completion_params.json to modify the LLM completion parameters, where llm_xxx is the folder of your provider (for example llm_ollama)
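
Both files are plain JSON, so they can be edited by hand or from a small script. The sketch below assumes each file holds a single JSON object at the top level; the helper names are hypothetical, and the key names in these files are not documented here, so any key you pass is your own choice.

```python
import json
from pathlib import Path


def load_params(path):
    """Read a JSON parameter file and return its contents as a dict."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def update_params(path, **overrides):
    """Merge keyword overrides into a JSON parameter file and save it."""
    params = load_params(path)
    params.update(overrides)  # existing keys are replaced, new keys are added
    Path(path).write_text(json.dumps(params, indent=2), encoding="utf-8")
    return params
```

For example, `update_params("llm_ollama/completion_params.json", temperature=0.7)` would set a `temperature` value, assuming that key exists in your provider's file; check the file first with `load_params`.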

KoljaB/LocalEmotionalAIVoiceChat

Folders and files

Latest commit

History

Repository files navigation

Emotional AI Voice Chat

Features

Requirements

Installation

CUDA Installation

Usage

Configuration

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages