> For the complete documentation index, see [llms.txt](https://docs.vergeos-demo.com/llms.txt). Markdown versions of documentation pages are available by appending `.md` to page URLs; this page is available as [Markdown](https://docs.vergeos-demo.com/automate-protect-and-extend/private-ai/overview.md).

# Private AI Overview

VergeOS Private AI enables you to deploy and run large language models (LLMs) locally within your VergeOS environment. Models run entirely on your infrastructure with no external API calls or data transmission.

## Architecture

Private AI consists of five core components:

```mermaid
graph LR
    A[Client Application] --> B[OpenAI-Compatible API]
    B --> C[Assistant]
    C --> D[Model]
    D --> E[Worker]
```

### Models

Models are the LLM files that provide AI capabilities. VergeOS supports:

* **Curated models**: Pre-configured models available for one-click installation (Llama, Gemma, Phi, Qwen, and others)
* **Custom models**: Any GGUF-format model from Hugging Face or other sources

Models define resource requirements (CPU cores, RAM, GPU allocation) and inference parameters (context size, parallel requests).

### Assistants

Assistants are configured instances that define how users and applications interact with a model. Each assistant specifies:

* Which model to use
* System prompt (behavioral instructions)
* Temperature and other generation parameters
* Context scoring for RAG scenarios
* Workspace files for document context

Multiple assistants can use the same underlying model with different configurations.

### Workers

Workers are the inference engines that run models. The system manages two types:

* **AI-Helper Worker**: Handles API requests and routing (starts automatically)
* **Model Workers**: Execute inference for each running model (scale automatically based on Min/Max Workers settings)

### Chat Sessions

Chat sessions maintain conversation history and context. Sessions can be:

* Created through the VergeOS UI for interactive testing
* Managed programmatically via the API for application integration

### OpenAI-Compatible API

The API provides standard OpenAI endpoints at `https://<your-vergeos-url>/v1`, enabling integration with:

* Any OpenAI client library (Python, JavaScript, Go, etc.)
* IDEs and development tools
* Existing applications built for OpenAI/Ollama

See [OpenAI-Compatible API](/automate-protect-and-extend/private-ai/open-ai-router.md) for endpoint documentation.

## Prerequisites

* VergeOS 26.0 or later
* Sufficient RAM for model files (varies by model, typically 5-50GB)
* Storage for model downloads

### GPU Support

GPU acceleration is recommended for production workloads. VergeOS supports GPUs from any vendor (NVIDIA, AMD, Intel) through Resource Groups. Models can also run on CPU-only systems, though inference will be slower.

## Quick Start

1. Navigate to **AI → Models**
2. Click **Click to Install** on a curated model, or click **New Model** for custom models
3. Configure resource allocation (cores, RAM, GPU)
4. An assistant is created automatically with the new model
5. Test via **AI → Assistants → \[Your Assistant] → Chat**

For detailed setup instructions, see the [Configuration Guide](/automate-protect-and-extend/private-ai/configuration.md).

## Related Documentation

* [Configuration](/automate-protect-and-extend/private-ai/configuration.md) - Model and assistant setup
* [OpenAI-Compatible API](/automate-protect-and-extend/private-ai/open-ai-router.md) - API endpoints and integration
* [Chat Sessions](/automate-protect-and-extend/private-ai/chat-sessions.md) - Interactive UI usage


---

# Agent Instructions
This documentation is published with GitBook. GitBook is the documentation platform designed so that both humans and AI agents can read, navigate, and reason over technical content effectively. Learn more at gitbook.com.

## Querying This Documentation
If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://docs.vergeos-demo.com/automate-protect-and-extend/private-ai/overview.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
