Virtual GPU clusters provide GPU computing resources through virtualization, offering flexibility in configuration and resource allocation. Each cluster consists of one or more virtual machines with dedicated GPU access. For an overview of GPU cluster types and their differences, see About GPU Cloud.

Cluster architecture

Each Virtual GPU cluster consists of one or more virtual machine nodes. All nodes are created from an identical template (image, network settings, disk configuration). After creation, individual nodes can have their disk and network configurations modified independently. For flavors with InfiniBand support, high-speed inter-node networking is configured automatically. This enables efficient distributed training across multiple nodes without manual network configuration. Each node has:
  • A network boot disk (required). At least one network disk serves as the boot volume for the operating system.
  • A local data disk added by default. This non-replicated disk is dedicated to temporary storage.
  • Optional network data disks that can be attached during creation or added later. Network disks persist independently of node state.
  • Optional Vast fileshare integration for shared storage across nodes.
The local data disk is a non-replicated volume that comes with every Virtual GPU instance. This disk:
  • Cannot be modified, detached, or used as a boot volume
  • Is strictly bound to the specific virtual machine and its configuration
  • Is wiped when the node is reconfigured, powered off (shelved), or deleted
Use this disk only for temporary data. Store all important data on network disks or NFS storage to prevent data loss.
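To tell these disks apart from inside a node, list the block devices after logging in. The device names in the comments below are assumptions; actual names depend on the image and flavor.
lsblk -o NAME,SIZE,TYPE,MOUNTPOINT
# The boot disk is the device mounted at /; network data disks usually appear as additional devices (for example, /dev/vdb).
# Confirm which device is the local data disk before writing anything to it that you need to keep.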

Create a Virtual GPU cluster

To create a Virtual GPU cluster, complete the following steps in the Gcore Customer Portal.
  1. In the Gcore Customer Portal, navigate to GPU Cloud.
  2. In the sidebar, expand GPU Clusters and select Virtual Clusters.
  3. Click Create Cluster.

Step 1. Select region

In the Region section, select the data center location for the cluster. Regions are grouped by geography (Asia-Pacific, EMEA). Each region card shows its availability status.
GPU model availability and pricing vary by region. Check other regions if a required GPU model is not available. For help with availability, contact the sales team.

Step 2. Select cluster type

In the Cluster type section, select Virtual to create a Virtual GPU cluster. Selecting Virtual displays flavors and options specific to Virtual GPU clusters. Selecting Physical switches to Bare Metal GPU cluster creation.

Step 3. Configure cluster capacity

Cluster capacity determines the hardware specifications for each node in the cluster.
  1. Select the GPU Model. Available models depend on the region.
  2. Enable or disable Show out of stock to filter available flavors.
  3. Select a flavor. Each flavor card displays GPU configuration, vCPU count, RAM capacity, and pricing.

Step 4. Set the number of nodes

In the Number of Nodes section, specify how many virtual machines to provision in the cluster.
(Screenshot: Number of Instances selector with increment and decrement buttons)
Each node is a separate virtual machine with the selected flavor configuration. All nodes have identical configurations at creation. The minimum is 1 node and the maximum is 999 nodes per cluster.

Step 5. Select image

The image defines the operating system and pre-installed software for cluster nodes.
(Screenshot: Image selector with Public and Custom tabs and an image dropdown)
  1. In the Image section, choose the operating system:
    • Public: Pre-configured images with NVIDIA drivers and CUDA toolkit
    • Custom: Custom images uploaded to the account
The default Ubuntu images include pre-installed NVIDIA drivers and CUDA toolkit. Images with the eni suffix in the name are configured for InfiniBand interconnect.
  2. Note the default login credentials displayed below the image selector.

Step 6. Configure disks

Each Virtual GPU cluster node has the following storage:
  • Network boot disk (required): At least one network disk serves as the boot volume for the operating system. This disk persists across power cycles.
  • Local data disk: Added automatically to every node. This non-replicated disk is dedicated to temporary storage only. It cannot be edited, removed, or used as a boot volume. Data on this disk is deleted when the node is shelved, reconfigured, or deleted.
  • Network data disks (optional): Additional persistent storage that survives power cycles. Up to 8 network disks of different sizes and types can be attached.
  • Vast fileshare (optional): Shared storage integration for data accessible across multiple nodes.
(Screenshot: Volumes configuration section with disk type, size, and the Your plan summary)
To configure disks:
  1. Configure the boot disk:
    • Disk type: Select from available storage types in the region
    • Size: Minimum size depends on the selected image
  2. To add additional network data disks, click Add Disk.
  3. For each additional disk, select:
    • Disk type: Select from available storage types in the region
    • Size: Minimum 120 GiB
  4. Repeat to add more disks (maximum 8 disks total).
All configured network disks are attached to every node in the cluster at creation.
Store important data on network disks or NFS storage, not on the local data disk. The local disk is wiped when the node is shelved, reconfigured, or deleted, resulting in permanent data loss.
After cluster creation, network disks can be managed individually per node. Additional disks can be added to specific nodes, and existing disks can be detached from individual nodes.
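A newly attached network data disk is raw block storage, so it has to be formatted and mounted inside each node before use. The sketch below assumes the disk shows up as /dev/vdb and uses /mnt/data as the mount point; confirm the device name with lsblk first.
sudo mkfs.ext4 /dev/vdb     # format the disk; this erases anything already on it
sudo mkdir -p /mnt/data
sudo mount /dev/vdb /mnt/data
echo '/dev/vdb /mnt/data ext4 defaults,nofail 0 2' | sudo tee -a /etc/fstab     # remount automatically after reboot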

Step 7. Configure network settings

Network settings define how the cluster communicates with external services and other resources. At least one interface is required.
  1. In the Network settings section, configure the network interface:
    • Public: Direct internet access with a dynamic public IP. Suited to development, testing, and quick access to the cluster.
    • Private: Internal network only, with no external access. Suited to production workloads and security-sensitive environments.
    • Dedicated public: Reserved static public IP. Suited to production APIs and services that require stable endpoints.
To add additional interfaces, click Add Interface. For detailed networking configuration, see Create and manage a network.

Step 8. Configure SSH key

In the SSH key section, select an existing key from the dropdown or create a new one. Keys can be uploaded or generated directly in the portal.
(Screenshot: SSH key selector and Firewall settings section)
If generating a new key pair, save the private key immediately as it cannot be retrieved later.
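To generate the key pair locally instead and upload only the public key, a standard OpenSSH key works. The file name below is just an example.
ssh-keygen -t ed25519 -f ~/.ssh/gcore_gpu -C "gpu-cluster"
cat ~/.ssh/gcore_gpu.pub     # paste this public key into the portal when uploading a key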

Step 9. Set additional options

The Additional options section provides optional settings: user data scripts for automated configuration and metadata tags for resource organization.
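User data on the Ubuntu images is typically processed by cloud-init on first boot, so a plain shell script is enough for basic setup. The packages below are only an illustration of the pattern, not a required configuration.
#!/bin/bash
# Example user data script: runs once on first boot.
set -e
apt-get update
apt-get install -y nfs-common rsync     # example packages; adjust for your workload
mkdir -p /mnt/data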

Step 10. Name and create the cluster

The final step assigns a name to the cluster and initiates provisioning.
  1. In the Cluster Name section, enter a name or use the auto-generated one.
  2. Review the Estimated cost panel on the right.
  3. Click Create Cluster.
Once all nodes reach Power on status, the cluster is ready for use.
Cluster-level settings (image, default networks) cannot be changed after creation. New nodes added via scaling inherit the original configuration. To change these settings, create a new cluster.

Connect to the cluster

After the cluster is created, use SSH to access the nodes. The default username is ubuntu.
ssh ubuntu@<node-ip-address>
Replace <node-ip-address> with the public or floating IP shown in the cluster details. For nodes with only private interfaces, connect through a bastion host or VPN, or use the Gcore Customer Portal console.
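For private-only nodes, one common pattern is SSH ProxyJump through a bastion host that has both public and private connectivity. The addresses below are placeholders.
ssh -J ubuntu@<bastion-ip> ubuntu@<private-node-ip>
# Or keep it in ~/.ssh/config:
# Host gpu-node
#     HostName <private-node-ip>
#     User ubuntu
#     ProxyJump ubuntu@<bastion-ip>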

Verify cluster status

After connecting, verify that GPUs are available and drivers are loaded:
nvidia-smi
The output displays all available GPUs, driver version, and CUDA version. If no GPUs appear, check that the image includes the correct NVIDIA drivers for the GPU model.
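For a scripted check of GPU count and versions, nvidia-smi also supports a query mode. On InfiniBand-enabled flavors with an eni image, ibstat (from the infiniband-diags package, if installed) shows the state of the fabric ports.
nvidia-smi --query-gpu=index,name,driver_version,memory.total --format=csv
ibstat     # optional, InfiniBand flavors only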

Manage cluster power state

Virtual GPU clusters support power management operations at the cluster level. Unlike a Bare Metal cluster, a Virtual GPU cluster releases its compute resources when powered off (shelving), which stops billing but also removes data from local disks.
(Screenshot: Power tab with Power on, Power off, and Soft reboot options)

Power off (shelve) a cluster

Powering off a Virtual GPU cluster shelves all nodes. Shelving releases CPU, RAM, GPU, and local disk resources. No charges apply while nodes are shelved.
  1. In the cluster list or cluster details page, locate the cluster.
  2. Click the Power Off action.
  3. Confirm the operation.
When a node is shelved, the local data disk is deleted and all data on it is lost. Only network disks persist. Save important data to network disks or external storage before powering off.
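Before powering off, copy anything still needed from the local data disk to a network disk or fileshare. The mount points below are assumptions about where those disks are mounted on your nodes.
rsync -a --info=progress2 /mnt/local-scratch/ /mnt/data/pre-shelve-backup/
# /mnt/local-scratch and /mnt/data are example paths; adjust them to your layout.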

Power on a cluster

Powering on a shelved cluster attempts to allocate resources for all nodes.
  1. In the cluster list or cluster details page, locate the powered-off cluster.
  2. Click the Power On action.
Restart is not guaranteed after shelving. If the requested resources (GPU model, flavor) are not available in the region at the time of power on, the operation fails. For workloads requiring guaranteed availability, use Bare Metal clusters.

Change cluster flavor

Virtual GPU clusters support changing the flavor after creation, allowing adjustment of GPU, CPU, and RAM allocation without recreating the cluster. This is not available for Bare Metal clusters.
  1. Power off (shelve) the cluster. Flavor changes require the cluster to be in the shelved state.
  2. In the cluster details page, click Change Flavor.
  3. Select the new flavor from available options.
  4. Confirm the change.
  5. Power on the cluster.
Flavor availability depends on the region and current capacity. If the desired flavor is not available, the change cannot be applied.

Manage node disks

After cluster creation, disks can be managed individually for each node. This allows adding storage to specific nodes or removing unused disks.
(Screenshot: Cluster details page with Overview, Power, Volumes, Networking, Tags, and Delete tabs)

Add a disk to a node

  1. Navigate to the cluster details page.
  2. Open the Volumes tab.
  3. Select the node to attach a disk to.
  4. Click Add Volume and configure the disk type and size.
  5. Click Create to attach the new disk.
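After attaching, confirm that the node sees the new device, then format and mount it as described in Step 6. The device name is an assumption.
lsblk     # the new volume should appear as an additional device, for example /dev/vdc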

Remove a disk from a node

  1. Navigate to the cluster details page.
  2. Open the Volumes tab.
  3. Locate the disk to remove and click the detach action.
  4. Confirm the operation.
Detaching a disk does not delete it. The disk remains in the account and continues to incur storage charges until explicitly deleted.
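Before detaching a disk that is in use, unmount it inside the node to avoid filesystem corruption. The mount point below is an assumption.
sudo umount /mnt/data     # unmount before detaching the volume
# If the disk has an /etc/fstab entry, remove or comment it out as well.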

Delete a cluster

When deleting a Virtual GPU cluster, disks attached to nodes can optionally be preserved.
(Screenshot: Delete Cluster dialog with options to select which volumes to delete)
  1. In the cluster list or cluster details page, click Delete.
  2. In the confirmation dialog:
    • To delete all disks along with the cluster, leave the Delete disks checkbox selected.
    • To preserve disks for later use, clear the Delete disks checkbox.
  3. Confirm the deletion.
Preserved disks remain in the account and continue to incur storage charges. They can be attached to other instances or clusters.

Automating cluster management

The Gcore Customer Portal is suitable for creating and managing individual clusters. For automated workflows—such as CI/CD pipelines, infrastructure-as-code, or batch provisioning—use the GPU Virtual API. The API allows:
  • Creating and deleting clusters programmatically
  • Starting, stopping, and rebooting clusters
  • Changing cluster flavor
  • Managing volumes attached to cluster servers
  • Querying available GPU flavors and regions
For authentication, request formats, and code examples, see the GPU Virtual API reference.
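As a rough illustration of the workflow, the request below lists clusters with curl. The endpoint path and authorization scheme shown here are assumptions for illustration only; confirm both, along with the project and region identifiers, in the GPU Virtual API reference.
curl -s \
  -H "Authorization: APIKey <your-api-key>" \
  "https://api.gcore.com/cloud/v3/gpu/virtual/<project_id>/<region_id>/clusters"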