# Resource Allocation & Scaling

## Tenant Nodes: Virtual Hosts for Virtual Data Centers

Every VergeOS tenant runs on one or more **tenant nodes** — virtual servers that simulate physical VergeOS hosts. Each tenant node provides dedicated compute (CPU cores), memory (RAM), and networking to the tenant's workloads, while maintaining full isolation through the tenant's encapsulated network.

Understanding how tenant nodes work is the key to right-sizing tenant deployments and scaling them over time.

### Tenant Node Characteristics

| Characteristic                      | Description                                                                                                              |
| ----------------------------------- | ------------------------------------------------------------------------------------------------------------------------ |
| **Simulated hosts**                 | Tenant nodes replicate the functionality of physical VergeOS nodes inside the tenant's Virtual Data Center               |
| **Secure inter-host communication** | Tenant nodes communicate over the tenant's protected encapsulated network, even when running on different physical hosts |
| **Mobility**                        | Tenant nodes live-migrate between physical hosts for maintenance, load balancing, and automatic failover                 |
| **Matched resource allocation**     | Tenant nodes can target different clusters with different hardware profiles (standard, vGPU, high-memory)                |
| **Non-disruptive scaling**          | Cores and RAM can be increased or decreased on a running tenant node without restarting it                               |

### Tenant Node Limits

Per-tenant-node defaults and maximums:

| Resource | Default | Maximum             |
| -------- | ------- | ------------------- |
| Cores    | 4       | 1,048,576           |
| RAM      | 16 GB   | 5,242,880 MB (5 TB) |

Cluster `Max RAM per machine` and `Max cores per machine` settings can further constrain what a given tenant node can actually consume.
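The way cluster settings narrow the platform maximums can be sketched in a few lines of Python. This is an illustration only, not a VergeOS API: the function names `effective_node_limits` and `fits` are hypothetical, and the constants are simply the maximums from the table above.

```python
# Platform-wide per-node maximums, copied from the limits table.
PLATFORM_MAX_CORES = 1_048_576
PLATFORM_MAX_RAM_MB = 5_242_880  # 5 TB

def effective_node_limits(cluster_max_cores, cluster_max_ram_mb):
    """Return the (cores, ram_mb) ceiling for a tenant node on one cluster.

    Hypothetical helper: the cluster's 'Max cores per machine' and
    'Max RAM per machine' values can lower the platform maximums,
    never raise them.
    """
    return (
        min(PLATFORM_MAX_CORES, cluster_max_cores),
        min(PLATFORM_MAX_RAM_MB, cluster_max_ram_mb),
    )

def fits(cores, ram_mb, cluster_max_cores, cluster_max_ram_mb):
    """True if the requested node size is within the effective limits."""
    max_cores, max_ram_mb = effective_node_limits(cluster_max_cores, cluster_max_ram_mb)
    return cores <= max_cores and ram_mb <= max_ram_mb

# A cluster capped at 16 cores / 64 GB per machine.
print(effective_node_limits(16, 64 * 1024))
print(fits(8, 16 * 1024, 16, 64 * 1024))
```

On a cluster capped at 16 cores and 64 GB, the cluster values are the effective ceiling because they are far below the platform maximums.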

Networking also has a tenant-level cap: a maximum of **28 host network segments** can be extended into a single tenant as Layer 2 connections (eligible types: internal, external, BGP, VPN, and physical bridged).

### No Manual Overhead Calculation

VergeOS automatically accounts for hypervisor and storage overhead. The memory you assign to a tenant node is **fully available** to that tenant for distributing among its own workloads — no manual overhead calculation required.

## Single-Node vs Multi-Node Tenants

The first planning decision is whether a tenant needs one node or several.

### Single-Node Tenants (Preferred Default)

A single tenant node is the simplest and most common configuration. It is the recommended starting point whenever a tenant's compute and memory requirements fit within a single node.

Single-node tenants still provide redundancy through VergeOS's built-in **watchdog mechanism**:

* If the physical host running the tenant node fails, the watchdog automatically restarts the tenant node on another physical host
* During planned maintenance, a temporary tenant node is created to live-migrate workloads with no service interruption
* Additional tenant nodes can be added later, non-disruptively, as needs grow

{% hint style="success" %}
**Start Simple**

If RAM and core requirements can be met with a single tenant node and there are no network or device needs requiring multiple physical hosts, a single node is preferable for simplicity.
{% endhint %}

### When Multi-Node Tenants Are Needed

Multiple tenant nodes become necessary in specific scenarios:

1. **Compute exceeds cluster maximums:** The number of cores and the amount of RAM assignable to a single tenant node are limited by cluster settings (*Max RAM per machine* and *Max cores per machine*). When a tenant needs more than a single node can provide, add nodes.
2. **Clustered applications:** Web farms, Hadoop clusters, database primary/replica pairs, and other distributed applications that require workloads to run on different physical hosts for HA, load balancing, or parallel processing.
3. **Mixed hardware capabilities:** When a tenant needs both standard compute and specialized hardware (vGPU, PCI passthrough, USB devices), deploy tenant nodes on different clusters with the appropriate hardware.
4. **Regulatory separation:** Compliance requirements may mandate that certain workloads run on physically separate hosts.
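For the first scenario, the minimum node count follows from simple ceiling arithmetic. The sketch below is a planning aid under stated assumptions, not a VergeOS tool: `min_nodes_required` is a hypothetical name, and it considers capacity only. HA, hardware, or regulatory reasons may call for more nodes than it returns.

```python
import math

def min_nodes_required(total_cores, total_ram_gb, max_cores_per_node, max_ram_gb_per_node):
    """Smallest node count whose combined per-node limits cover the request.

    Hypothetical helper: whichever resource (cores or RAM) needs more
    nodes determines the result, and a tenant always has at least one node.
    """
    by_cores = math.ceil(total_cores / max_cores_per_node)
    by_ram = math.ceil(total_ram_gb / max_ram_gb_per_node)
    return max(1, by_cores, by_ram)

# 24 cores and 128 GB on a cluster capped at 16 cores / 128 GB per machine:
# RAM fits in one node, but the cores need two.
print(min_nodes_required(24, 128, 16, 128))  # 2
```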

## Right-Sizing Strategy

VergeOS tenants support **non-disruptive resource scaling**: you can add cores, RAM, nodes, and storage to a running tenant without affecting workloads. This means you should:

* **Provision for current and near-term needs**, not speculative future growth
* **Scale organically** as actual demand increases
* **Avoid over-provisioning** — unused resources allocated to one tenant cannot serve others

```mermaid
flowchart TD
    A["Assess Workload Requirements"] --> B{"Requirements fit\nsingle node?"}
    B -- Yes --> C["Deploy Single-Node Tenant"]
    B -- No --> D{"Multi-node reason?"}
    D -- "Exceeds cluster max" --> E["Add nodes to\nsame cluster"]
    D -- "HA / clustered apps" --> F["Add nodes with\nHA Group anti-affinity"]
    D -- "Mixed hardware" --> G["Add nodes on\ndifferent clusters"]
    D -- "Regulatory" --> H["Add nodes on\nseparate physical hosts"]
    C --> I["Monitor & Scale\nVertically First"]
    E --> I
    F --> I
    G --> I
    H --> I
    I --> J{"Need more\nresources?"}
    J -- "Fits in existing node" --> K["Increase Cores/RAM\non existing node"]
    J -- "Exceeds node capacity" --> L["Add another\ntenant node"]
    K --> I
    L --> I

    style A fill:#e8f5e9,stroke:#2e7d32
    style C fill:#e3f2fd,stroke:#1565c0
    style I fill:#fff3e0,stroke:#e65100
```
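The lower half of the flowchart (the monitor-and-scale loop) reduces to one comparison against the cluster's per-machine limits. The following Python sketch illustrates that decision; the function name and return strings are hypothetical and mirror the diagram labels rather than any VergeOS interface.

```python
def next_scaling_action(node_cores, node_ram_gb, extra_cores, extra_ram_gb,
                        max_cores, max_ram_gb):
    """Choose between vertical and horizontal scaling for one tenant node.

    Hypothetical helper: if the node can absorb the extra cores and RAM
    without exceeding the cluster's per-machine limits, scale it in place;
    otherwise add another tenant node.
    """
    fits_cores = node_cores + extra_cores <= max_cores
    fits_ram = node_ram_gb + extra_ram_gb <= max_ram_gb
    if fits_cores and fits_ram:
        return "increase cores/RAM on existing node"
    return "add another tenant node"

# A node with 8 cores / 16 GB on a cluster capped at 16 cores / 64 GB.
print(next_scaling_action(8, 16, 4, 16, 16, 64))
print(next_scaling_action(8, 16, 12, 16, 16, 64))
```

The first call stays within the limits and scales vertically; the second would need 20 cores on one node, so it falls through to adding a node.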

## Example Configurations

The following examples illustrate real-world tenant node planning decisions.

### Example 1: Small Single-Node Tenant

**Scenario:** 3 VMs, no special requirements. Host cluster allows Max RAM 64 GB, Max Cores 16.

| Setting      | Value                                                        |
| ------------ | ------------------------------------------------------------ |
| Tenant Nodes | 1                                                            |
| Cores        | 8                                                            |
| RAM          | 16 GB                                                        |
| Scaling Path | Add cores/RAM up to 64 GB / 16 cores, then add a second node |

**Rationale:** A single node provides sufficient resources. Watchdog failover ensures redundancy without additional complexity.

### Example 2: Mid-Sized HA Web Applications

**Scenario:** Customer-facing web apps requiring multi-instance HA. Host cluster allows Max RAM 128 GB, Max Cores 16.

| Setting      | Value                                                                       |
| ------------ | --------------------------------------------------------------------------- |
| Tenant Nodes | 2                                                                           |
| Node 1       | 64 GB RAM, 12 cores (2 web servers + DB primary)                            |
| Node 2       | 64 GB RAM, 12 cores (2 web servers + DB replica)                            |
| HA Groups    | Anti-affinity rules ensure web/DB instances stay on separate physical hosts |

**Rationale:** The combined RAM (128 GB) would fit within the cluster's per-machine limit, but the combined 24 cores exceed the 16-core maximum. Two nodes also ensure web servers and database components run on different physical hosts for application-level HA.

### Example 3: Mixed Workload with GPU

**Scenario:** Standard compute, high-performance video rendering, and GPU-accelerated processing. Three host clusters available: Standard (64 GB max), vGPU (64 GB max), Premium (128 GB max).

| Setting      | Value                                                   |
| ------------ | ------------------------------------------------------- |
| Tenant Nodes | 4                                                       |
| Node 1       | 64 GB, 8 cores — Standard Cluster (file servers)        |
| Node 2       | 64 GB, 8 cores — Standard Cluster (management tools)    |
| Node 3       | 64 GB, 16 cores — vGPU Cluster (video rendering)        |
| Node 4       | 48 GB, 8 cores — Premium Cluster (editing workstations) |

**Rationale:** Multiple nodes allow placement on clusters with matching hardware capabilities. Each tenant node targets the cluster that best fits its workload.

### Example 4: Enterprise Distributed Analytics

**Scenario:** Distributed analytics platform requiring multi-host deployment for load balancing and redundancy. Host cluster allows Max RAM 96 GB, Max Cores 16.

| Setting      | Value                                                           |
| ------------ | --------------------------------------------------------------- |
| Tenant Nodes | 4                                                               |
| Nodes 1–3    | 64 GB, 12 cores each (1 app server + 1 DB server per node)      |
| Node 4       | 32 GB, 8 cores (2 data processing servers)                      |
| HA Groups    | Anti-affinity ensures application instances span physical hosts |

**Rationale:** Four tenant nodes guarantee application instances run across multiple physical hosts while maintaining the ability to run all services within the tenant.

## Increasing Tenant Resources

VergeOS provides three non-disruptive methods for adding resources to a running tenant.

### Add Cores/RAM to an Existing Node

Changes take effect **immediately** on the tenant node — no restart required.

1. Navigate to the **Tenant Dashboard** → **Nodes**
2. Double-click the target node → click **Edit**
3. Modify the **Cores** and/or **RAM** fields
4. Click **Submit**

{% hint style="info" %}
**Cluster Limits**

The maximum cores and RAM per tenant node are determined by the cluster's *Max RAM per machine* and *Max cores per machine* settings. Max out existing nodes before adding new ones, unless workload balance requires otherwise.
{% endhint %}

{% hint style="info" %}
**Resource Change Validation**

When cores or RAM are changed on a tenant node, VergeOS runs `validatecluster.gcs` against the primary cluster (and `validateclusterfailover.gcs` against the failover cluster, if one is configured). Changes that would exceed either cluster's per-machine limits — or that the running host cannot satisfy with free RAM — are rejected, preventing over-commitment.
{% endhint %}
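The checks described above can be modeled roughly as follows. This Python sketch is not the logic of `validatecluster.gcs` or `validateclusterfailover.gcs`, whose internals are not documented here. The names `ClusterLimits` and `validate_change` are hypothetical, and the free-RAM rule (only the increase must fit in the host's free RAM) is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class ClusterLimits:
    """Per-machine limits of one cluster (hypothetical model)."""
    max_cores: int
    max_ram_mb: int

def validate_change(new_cores: int, new_ram_mb: int, current_ram_mb: int,
                    host_free_ram_mb: int, primary: ClusterLimits,
                    failover: Optional[ClusterLimits] = None) -> List[str]:
    """Return rejection reasons; an empty list means the change is accepted."""
    problems = []
    clusters = [("primary", primary)]
    if failover is not None:
        clusters.append(("failover", failover))
    for name, limits in clusters:
        if new_cores > limits.max_cores:
            problems.append(f"{name}: cores exceed per-machine limit")
        if new_ram_mb > limits.max_ram_mb:
            problems.append(f"{name}: RAM exceeds per-machine limit")
    # Assumption: only the RAM increase has to fit in the host's free RAM.
    if new_ram_mb - current_ram_mb > host_free_ram_mb:
        problems.append("host: not enough free RAM")
    return problems

primary = ClusterLimits(max_cores=16, max_ram_mb=65536)
failover = ClusterLimits(max_cores=8, max_ram_mb=32768)
print(validate_change(12, 32768, 16384, 20000, primary, failover))
```

In this example the change passes the primary cluster's limits but is rejected because 12 cores exceed the failover cluster's 8-core limit.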

### Add a New Tenant Node

1. Navigate to the **Tenant Dashboard** → **Nodes** → **New**
2. Configure **Cores**, **RAM**, **Cluster**, and **Failover Cluster**
3. Select **On Power Loss** behavior (Last State, Leave Off, or Power On)
4. Click **Submit**

{% hint style="warning" %}
**Preferred Node**

Setting a *Preferred node* is **not recommended** for tenant nodes. Incorrect configuration can adversely affect built-in redundancy. Consult VergeOS Support if needed.
{% endhint %}

### Provision Additional Storage

**New storage tier:**

1. Tenant Dashboard → **Add Storage** → select **Tier** → enter **Provisioned** amount → **Submit**

**Expand existing tier:**

1. Tenant Dashboard → scroll to **Storage** section → click **Edit** on the desired tier
2. Enter the new **total** provisioned amount (e.g., change 50 GB to 75 GB to add 25 GB)

## Reducing Tenant Resources

### Reduce Cores/RAM

Cores and RAM can be reduced on a running tenant node without powering it off. However, if those resources are currently in use by tenant VMs, the **actual reclaim is deferred** until the VMs release them (for example, when they are shut down).

**Example:** You reduce a tenant node's RAM from 32 GB to 28 GB, but VMs are currently using all 32 GB. The setting changes immediately, but the 4 GB difference is not reclaimed until VMs release that memory.
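The arithmetic behind deferred reclaim can be sketched as below. This is a simplified model, not VergeOS behavior as implemented: `reclaim_status` is a hypothetical name, and the assumption that unused RAM is reclaimed immediately while in-use RAM waits is made only to illustrate the example above.

```python
def reclaim_status(old_ram_gb, new_ram_gb, ram_in_use_gb):
    """Split a RAM reduction into (immediate_gb, deferred_gb).

    Simplified, hypothetical model: RAM the tenant's VMs are not using
    can be reclaimed right away; the rest waits until VMs release it.
    """
    reduction = old_ram_gb - new_ram_gb
    if reduction < 0:
        raise ValueError("new_ram_gb must not exceed old_ram_gb")
    unused = max(old_ram_gb - ram_in_use_gb, 0)
    immediate = min(reduction, unused)
    return immediate, reduction - immediate

# Reducing 32 GB to 28 GB while VMs use all 32 GB: nothing is free yet.
print(reclaim_status(32, 28, 32))  # (0, 4)
```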

### Delete a Tenant Node

1. Power off or migrate all VMs from the node
2. Power off the tenant node
3. Navigate to **Tenant Dashboard** → **Nodes** → select the node → **Delete**

{% hint style="warning" %}
**Minimum Node Requirement**

A tenant must always have at least one node. Before deleting any tenant node, ensure at least one other node remains and that all workloads have been migrated off the node being removed.
{% endhint %}

## Scaling Paths: Vertical First, Then Horizontal

The recommended scaling strategy for tenants follows a clear progression:

### Step 1: Scale Vertically

Increase cores and RAM on existing tenant nodes up to the cluster maximum. This is the simplest path with zero disruption.

### Step 2: Scale Horizontally

When existing nodes are maxed out, add new tenant nodes. Place them on the same cluster for general expansion or on different clusters for specialized hardware.

### Step 3: Add Storage

Expand provisioned storage independently of compute. Add capacity to an existing tier or provision a new storage tier.

### Step 4: Rebalance

If resource distribution becomes uneven across nodes, balance RAM/cores between nodes rather than maxing one and minimally provisioning another.
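An even split can be computed with integer division, as in this Python sketch. The function `balanced_allocation` is hypothetical and only does the arithmetic. Applying the result still means editing each node's cores and RAM as described earlier, within the cluster's per-machine limits.

```python
def balanced_allocation(total_cores, total_ram_gb, node_count):
    """Spread totals as evenly as possible; returns [(cores, ram_gb), ...].

    Hypothetical helper: any remainder is handed out one unit at a time
    to the first nodes, so no two nodes differ by more than one core
    or one GB.
    """
    if node_count < 1:
        raise ValueError("a tenant needs at least one node")
    base_cores, extra_cores = divmod(total_cores, node_count)
    base_ram, extra_ram = divmod(total_ram_gb, node_count)
    return [
        (base_cores + (1 if i < extra_cores else 0),
         base_ram + (1 if i < extra_ram else 0))
        for i in range(node_count)
    ]

# 24 cores and 128 GB across two nodes.
print(balanced_allocation(24, 128, 2))  # [(12, 64), (12, 64)]
```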

{% hint style="info" %}
**Coming from VMware or Nutanix?**

On VMware and Nutanix, "scaling a tenant" usually means resizing a quota and trusting the scheduler. In VergeOS, you size dedicated tenant nodes directly.
{% endhint %}

| Platform | Isolation model                                                                                    | Scaling action                                                                                  |
| -------- | -------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------- |
| VergeOS  | Each tenant is a VDC with dedicated tenant nodes (encapsulated network + isolated storage volumes) | Edit a tenant node's cores/RAM live, or add a node — system accounts for overhead automatically |

## Best Practices

| Practice                        | Guidance                                                                                                           |
| ------------------------------- | ------------------------------------------------------------------------------------------------------------------ |
| **Start with one node**         | Use single-node tenants by default; add nodes only when required                                                   |
| **Right-size for now**          | Provision for current/near-term needs, not speculative future growth                                               |
| **Max out before adding**       | Increase existing node resources before adding new nodes (unless workload balance requires otherwise)              |
| **Balance resources**           | When two nodes are needed, distribute resources evenly rather than maxing one and minimizing the other             |
| **Use HA Groups**               | For multi-node tenants with HA requirements, configure anti-affinity rules so VMs distribute across physical hosts |
| **Match clusters to workloads** | Place tenant nodes on clusters with hardware that matches the workload (GPU, high-memory, standard)                |
| **Monitor and adjust**          | Use tenant dashboards and usage reports to identify when scaling is needed                                         |

