AYITA System Deployment Guide
AYITA can be deployed as a local AI assistant in a privacy-preserving environment. This guide provides step-by-step instructions for setting up AYITA either with Docker or via manual installation.
System Requirements
AYITA requires sufficient computing resources to run LLMs, RAG processing, and fine-tuning workflows efficiently.
| Component | Minimum Requirements | Recommended Configuration |
|---|---|---|
| CPU | 4 cores (x86-64) | 8+ cores (x86-64 or ARM64) |
| GPU | Optional (CPU-only inference supported) | NVIDIA GPU (RTX 3090/4090, A100) with CUDA support |
| RAM | 16 GB | 32 GB+ |
| VRAM (GPU memory) | 8 GB | 24 GB+ |
| Storage | 50 GB SSD | 200 GB+ SSD/NVMe (for large LLMs) |
| OS | Linux / macOS / Windows (WSL2) | Ubuntu 22.04 / macOS 13+ / Windows Server (with WSL2) |
| Docker Support | Required for containerized deployment | Recommended for isolated environments |
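If you intend to use GPU inference, it is worth confirming up front that the NVIDIA driver sees your hardware and reports enough VRAM. A quick check, assuming the driver is already installed:

```bash
# List detected GPUs, driver version, and current utilization
nvidia-smi

# Report each GPU's name and total VRAM
nvidia-smi --query-gpu=name,memory.total --format=csv
```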
Deployment via Docker
The easiest way to deploy AYITA is by using Docker, which isolates dependencies and simplifies installation.
Step 1: Install Docker & Docker Compose
Linux (Debian/Ubuntu):

```bash
sudo apt update && sudo apt install -y docker.io docker-compose
```

macOS / Windows: Download Docker Desktop (which bundles Docker Compose) from https://www.docker.com/products/docker-desktop/
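Before continuing, verify that Docker is installed and the daemon is running:

```bash
docker --version
docker compose version   # use `docker-compose --version` on older installs
docker run --rm hello-world
```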
Step 2: Pull the AYITA Docker Image
```bash
docker pull realmdata/ayita:latest
```
Step 3: Run AYITA as a Container
```bash
docker run --gpus all -p 8000:8000 -d realmdata/ayita
```

Note that `--gpus all` requires the NVIDIA Container Toolkit on the host; drop the flag for CPU-only inference.

Once running, AYITA will be available at http://localhost:8000.
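To confirm the container came up and the web interface is responding (this simply probes the root URL, since no dedicated health endpoint is documented here):

```bash
# Check that the container is running and the port is mapped
docker ps --filter "ancestor=realmdata/ayita"

# Expect an HTTP response once startup completes
curl -I http://localhost:8000
```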
Manual Installation
For users who prefer to run AYITA without Docker, a manual setup is required.
Step 1: Install Dependencies
AYITA depends on Python, Haystack, and various AI libraries. Run the following to install requirements:
```bash
sudo apt update && sudo apt install -y python3 python3-venv
python3 -m venv ayita-env
source ayita-env/bin/activate
pip install --upgrade pip
pip install farm-haystack transformers torch sentence-transformers
```
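Before moving on, it helps to confirm that the core libraries import cleanly and that PyTorch can see a CUDA device (this prints `False` on CPU-only machines):

```bash
python3 -c "import torch; print('CUDA available:', torch.cuda.is_available())"
python3 -c "import haystack, transformers, sentence_transformers; print('imports OK')"
```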
Step 2: Download AYITA Source Code
Clone the AYITA repository and install it:
```bash
git clone https://github.com/realmdata/ayita.git
cd ayita
pip install -r requirements.txt
```
Step 3: Launch AYITA
```bash
python run.py --gpu
```
AYITA will start at http://localhost:8000.
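If you want the manually launched instance to outlive your terminal session, a minimal approach is to background the process and capture its logs (this assumes `run.py` logs to stdout):

```bash
# Start AYITA in the background, redirecting output to a log file
nohup python run.py --gpu > ayita.log 2>&1 &

# Follow the log to confirm the server has started
tail -f ayita.log
```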
Configuring Local Models
By default, AYITA supports local LLMs, allowing fully private execution.
Supported Model Backends:

- GPT-J / GPT-NeoX / Llama 2 (via `transformers`)
- Mistral / Falcon (optimized for small VRAM usage)
- Fine-tuned models using PEFT (Parameter-Efficient Fine-Tuning)
To specify a custom model, use:
```bash
python run.py --model llama-2-13b --gpu
```
For embedding-based retrieval (RAG):
```bash
python run.py --use-rag
```
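The flags above can plausibly be combined into a single invocation; the model identifier used here and whether `run.py` accepts these flags together are assumptions, so check `python run.py --help` first:

```bash
# GPU-backed Mistral model with retrieval enabled (flag combination assumed)
python run.py --model mistral-7b --use-rag --gpu
```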
Managing Memory & Optimization
AYITA supports memory-efficient deployment, including:
- LoRA / PEFT fine-tuning for adapting LLMs with minimal VRAM.
- Quantization (via bitsandbytes) to run large models on consumer GPUs.
- Streaming RAG responses for lower perceived latency.
To enable quantization:
```bash
python run.py --quantization 8bit
```
To enable RAG response caching for faster retrieval:

```bash
python run.py --rag-cache
```
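Stacking these optimizations is how a 13B-class model can be made to fit on a 24 GB consumer GPU; the following sketch assumes the flags compose in one invocation:

```bash
# 8-bit quantized Llama 2 13B with RAG caching (flag combination assumed)
python run.py --model llama-2-13b --quantization 8bit --rag-cache --gpu
```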
Running AYITA in a Private Network
For enterprise users, AYITA can be deployed on an internal network with authentication:
```bash
docker run -p 8000:8000 -e AUTH=true realmdata/ayita
```
Access will require a username/password, preventing unauthorized usage.
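For a stricter network posture, publish the port only on an internal interface and pass credentials via environment variables; note that `AYITA_USER` and `AYITA_PASSWORD` are hypothetical names, since this guide only documents `AUTH=true`:

```bash
# Bind to the loopback interface only (swap in an internal IP as needed);
# AYITA_USER / AYITA_PASSWORD are hypothetical variable names
docker run -d -p 127.0.0.1:8000:8000 \
  -e AUTH=true \
  -e AYITA_USER=admin \
  -e AYITA_PASSWORD=change-me \
  realmdata/ayita
```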
Next Steps
AYITA is now ready to use! 🎉 For further details:
- [Developer Guide](developer_guide.html) – Learn about APIs and external integrations.
- [AYITA Use Cases](use_cases.html) – Explore real-world applications.
- [Fine-Tuning Guide](fine_tuning.html) – Customize AYITA for your needs.