mCloud GPU - NVIDIA A100 (40GB)

GPU Compute

NVIDIA A100 Tensor Core GPU

Cost Effective Resources & Infrastructure

Reduce your cloud costs without sacrificing performance or reliability. mCloud delivers enterprise-grade cloud infrastructure at a fraction of the cost of AWS, Google Cloud, or Azure.

Fault-Tolerant Tier IV Data Centre

Micron21 operates Australia’s first Tier IV-certified data centre, offering 100% uptime, redundant power, and a high-availability architecture.

24/7 Australian-based Expert Support

Our cloud specialists provide 24/7 Australian-based support, ensuring seamless deployments and efficient troubleshooting.

NVIDIA A100 (40 GB)

High-Performance Computing

$1,618 AUD / month

Minimum Specifications

Provided as a High-Availability mCloud Virtual Cloud Server

GPU: NVIDIA A100 (40 GB)
GPU Compute: Dedicated
vCPU: 12 Cores - Xeon Gold
RAM: 64 GB - DDR4
Storage: 500 GB - NVMe SSD
Bandwidth: 2 TB per month
IP Address: Included
DDoS Protection: Shield

Overview

Enterprise Acceleration, Right-Sized

The A100 40GB pairs NVIDIA's Ampere architecture and third-generation Tensor Cores with 40GB of high-bandwidth HBM2 memory. It supports the full range of math precisions, from FP64 for HPC to INT8 for inference, so a single accelerator adapts to almost any data-centre workload. Multi-Instance GPU (MIG) lets one card serve up to seven isolated jobs at once.

40GB HBM2

GPU memory

1,555 GB/s

Memory bandwidth

7

MIG instances @ 5GB

2,000+

Accelerated applications

Capabilities

What Makes the A100 40GB Different

Third-Gen Tensor Cores

Up to 312 TFLOPS of deep-learning performance, with up to 20× the Tensor throughput of the previous Volta generation for training and inference.

Multi-Instance GPU (MIG)

Partition a single card into as many as seven fully isolated 5GB instances, each with its own memory, cache, and compute. Ideal for multi-tenant serving.
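
As a rough planning sketch, the seven-instance limit can be turned into a card count for a multi-tenant deployment. The function name `plan_mig_cards` and the tenant footprints are hypothetical, and the sketch assumes every tenant takes exactly one 5GB instance. Usable memory per instance is typically a little below the nominal figure, so leave some headroom.

```python
MAX_INSTANCES_PER_CARD = 7   # from the text: up to seven instances per card
INSTANCE_MEM_GB = 5          # from the text: 5GB per instance (nominal)


def plan_mig_cards(tenants):
    """Assign each tenant to one 5GB MIG instance and group them into cards.

    `tenants` maps a tenant name to its estimated GPU memory footprint in GB.
    Returns a list of cards, each a list of at most seven tenant names.
    """
    too_big = [name for name, gb in tenants.items() if gb > INSTANCE_MEM_GB]
    if too_big:
        raise ValueError(
            f"does not fit in a {INSTANCE_MEM_GB}GB instance: {too_big}"
        )

    names = list(tenants)
    return [
        names[i:i + MAX_INSTANCES_PER_CARD]
        for i in range(0, len(names), MAX_INSTANCES_PER_CARD)
    ]


if __name__ == "__main__":
    # Hypothetical workload: nine small models of about 3.5GB each.
    demo = {f"model-{n}": 3.5 for n in range(9)}
    for index, card in enumerate(plan_mig_cards(demo)):
        print(f"card {index}: {len(card)} instances")
```

A tenant that needs more than 5GB would instead use a larger MIG profile or a whole card, which this sketch deliberately does not model.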

Structural Sparsity

Tensor Cores exploit sparsity in AI models to deliver up to 2× higher performance, most notably for inference but also during training.
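
To make "structural sparsity" concrete: on Ampere, the Tensor Cores accelerate a 2:4 pattern, in which at most two of every four consecutive weights are non-zero. The sketch below is an illustration only, not NVIDIA's pruning tooling. It applies that pattern to a flat list of weights by keeping the two largest-magnitude values in each group of four. The function name `prune_2_of_4` is hypothetical.

```python
def prune_2_of_4(weights):
    """Zero the two smallest-magnitude values in every group of four weights."""
    if len(weights) % 4 != 0:
        raise ValueError("length must be a multiple of 4")

    pruned = []
    for start in range(0, len(weights), 4):
        group = weights[start:start + 4]
        # Indices of the two largest magnitudes; ties keep the earlier index.
        keep = sorted(range(4), key=lambda i: abs(group[i]), reverse=True)[:2]
        pruned.extend(
            value if index in keep else 0.0
            for index, value in enumerate(group)
        )
    return pruned


if __name__ == "__main__":
    print(prune_2_of_4([1.0, -3.0, 2.0, 0.5, 0.1, 0.2, -0.9, 0.4]))
```

In practice a model is usually fine-tuned after pruning to recover accuracy; the up-to-2× figure applies only to layers that actually follow the pattern.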

Next-Gen NVLink

Connect two GPUs over an NVLink bridge at 600 GB/s, double the previous generation's throughput, for workloads that outgrow a single card.

40GB HBM2 Memory

1,555 GB/s of memory bandwidth keeps the Tensor Cores fed, with enough capacity for production inference, fine-tuning, and mid-sized training runs.
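
One way to read "keeps the Tensor Cores fed" is through a simple roofline estimate. Dividing peak compute by memory bandwidth gives the arithmetic intensity (operations per byte moved) a kernel needs before it stops being limited by memory. The sketch below uses the peak figures quoted on this page; the function names are illustrative, and real kernels rarely reach either peak.

```python
MEM_BANDWIDTH_GBPS = 1555  # GB/s, as quoted on this page

# Dense (non-sparse) peak rates in TFLOPS, as quoted on this page.
PEAK_TFLOPS = {
    "FP64": 9.7,
    "TF32 Tensor Core": 156,
    "FP16 Tensor Core": 312,
}


def break_even_intensity(peak_tflops, bandwidth_gbps=MEM_BANDWIDTH_GBPS):
    """Operations per byte at which compute, not memory, becomes the limit."""
    return (peak_tflops * 1e12) / (bandwidth_gbps * 1e9)


def attainable_tflops(intensity, peak_tflops, bandwidth_gbps=MEM_BANDWIDTH_GBPS):
    """Roofline bound: the lower of peak compute and bandwidth * intensity."""
    memory_bound_tflops = intensity * bandwidth_gbps * 1e9 / 1e12
    return min(peak_tflops, memory_bound_tflops)


if __name__ == "__main__":
    for name, peak in PEAK_TFLOPS.items():
        print(f"{name}: {break_even_intensity(peak):.1f} ops/byte to saturate")
```

At FP16 Tensor Core rates the break-even point is roughly 200 operations per byte, while FP64 needs only about 6, which is why large matrix multiplications benefit most from the Tensor Cores and low-intensity kernels are governed by the 1,555 GB/s figure.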

Every Math Precision

One accelerator for every job: FP64 and TF32 for HPC and training, BF16/FP16 for deep learning, and INT8 for high-throughput inference.

Specifications

Technical Specifications

Compute & Tensor Cores

FP64: 9.7 TFLOPS
FP64 Tensor Core: 19.5 TFLOPS
FP32: 19.5 TFLOPS
TF32 Tensor Core: 156 TFLOPS (312 TFLOPS with sparsity)
BFLOAT16 Tensor Core: 312 TFLOPS (624 TFLOPS with sparsity)
FP16 Tensor Core: 312 TFLOPS (624 TFLOPS with sparsity)
INT8 Tensor Core: 624 TOPS (1,248 TOPS with sparsity)

Memory, Platform & Form Factor

GPU memory: 40GB HBM2
Memory bandwidth: 1,555 GB/s
Max thermal design power: 250W
Multi-Instance GPU: Up to 7 @ 5GB
Interconnect: NVLink 600 GB/s; PCIe Gen4 64 GB/s
Form factor: PCIe
GPU architecture: NVIDIA Ampere

Specifications per the NVIDIA A100 Tensor Core GPU datasheet (r4). Peak rates marked “with sparsity” require structural-sparsity-enabled models.

Performance

Built to Accelerate

Each figure compares the A100 against a different reference point, as published by NVIDIA. Note the baseline beneath each number.

20×

Higher performance

vs the prior NVIDIA Volta generation, across AI training and inference.

245×

AI inference throughput

BERT-Large inference vs a CPU-only server (INT8 with sparsity).

11×

HPC throughput

Across top HPC applications vs the NVIDIA P100, a gain achieved over four years.

2×

Sparse-model speed-up

From structural sparsity in Tensor Cores, primarily for inference.

Where It Fits

The Right Card for the Job

Ideal workloads

What the A100 40GB runs best

Production AI inference: high-throughput serving with INT8 and structural sparsity.
Fine-tuning & mid-sized training: accelerate models that fit comfortably within 40GB.
HPC & scientific computing: full FP64 and TF32 precision for simulation and research.
Data analytics: speed up large ETL, SQL, and machine-learning pipelines.
Multi-tenant serving: partition into seven MIG instances to maximise utilisation.
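
Whether a model "fits comfortably within 40GB" can be estimated from its parameter count and storage precision. The sketch below counts weights only; activations, KV cache, and optimizer state come on top and can dominate during training. The 20% headroom default and the function names are assumptions for illustration, not vendor guidance.

```python
# Bytes used to store one parameter. TF32 is a compute mode; weights stay FP32.
BYTES_PER_PARAM = {"FP32": 4, "TF32": 4, "BF16": 2, "FP16": 2, "INT8": 1}

GPU_MEM_GB = 40  # capacity of the A100 40GB


def weights_gb(num_params, precision):
    """Memory needed for the weights alone, in decimal GB."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


def fits(num_params, precision, headroom=0.2, capacity_gb=GPU_MEM_GB):
    """True if the weights fit while leaving `headroom` of capacity free.

    The 20% default headroom is an assumed rule of thumb.
    """
    return weights_gb(num_params, precision) <= capacity_gb * (1 - headroom)


if __name__ == "__main__":
    for params, precision in [(7e9, "FP16"), (30e9, "FP16"), (30e9, "INT8")]:
        size = weights_gb(params, precision)
        verdict = "fits" if fits(params, precision) else "too large"
        print(f"{params / 1e9:.0f}B @ {precision}: {size:.0f} GB weights, {verdict}")
```

By this estimate, a 7-billion-parameter model in FP16 needs about 14 GB for weights, while a 30-billion-parameter model exceeds the card in FP16 but comes within reach at INT8.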

Why run it on mCloud

Australian, on-demand, supported

Tier IV data centre: Australia's first, with redundant power and high availability.
Fast underlying platform: NVMe storage and 100Gbps networking feed the GPU.
24/7 Australian support: local cloud specialists for deployment and troubleshooting.
Pay for what you use: scale instances up or down to match demand.
OpenStack & IaC ready: automate provisioning with the API and Terraform.
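
As an illustration of API-driven provisioning, the sketch below uses the open-source `openstacksdk` Python library to boot a server. It is an assumption-laden outline, not mCloud's documented procedure: the cloud name, image, flavor, and network names are hypothetical placeholders, and credentials are assumed to be defined in a local `clouds.yaml`.

```python
def build_server_spec(name, image_id, flavor_id, network_id):
    """Assemble the attributes passed to the compute API's create call."""
    return {
        "name": name,
        "image_id": image_id,
        "flavor_id": flavor_id,
        "networks": [{"uuid": network_id}],
    }


def launch_gpu_server(cloud_name, name, image_name, flavor_name, network_name):
    """Look up resources by name, create the server, and wait until it is active."""
    import openstack  # third-party: pip install openstacksdk

    conn = openstack.connect(cloud=cloud_name)  # reads the entry from clouds.yaml
    image = conn.image.find_image(image_name, ignore_missing=False)
    flavor = conn.compute.find_flavor(flavor_name, ignore_missing=False)
    network = conn.network.find_network(network_name, ignore_missing=False)

    spec = build_server_spec(name, image.id, flavor.id, network.id)
    server = conn.compute.create_server(**spec)
    return conn.compute.wait_for_server(server)


if __name__ == "__main__":
    # Every name below is a hypothetical placeholder; substitute real values.
    server = launch_gpu_server(
        cloud_name="example-cloud",
        name="a100-worker-1",
        image_name="example-ubuntu-image",
        flavor_name="example-a100-40gb-flavor",
        network_name="example-network",
    )
    print(server.status)
```

The same attributes map onto a Terraform resource definition, so teams already using infrastructure as code can keep GPU servers in the same workflow as the rest of their stack.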

Deploy the A100 40GB today

Spin up GPU compute on our Tier IV Australian cloud, or talk to our specialists about sizing the right configuration for your workload.

NVIDIA A100 40GB

Data-centre acceleration for AI inference, fine-tuning, HPC, and analytics - with up to 20× the performance of the previous generation, available on demand from our Tier IV Australian cloud.