Dedicated GPU vs Contended GPU - Get the benefits of GPU compute without the associated costs

22 Sep 2025, by Slade Baylis

With the power and adaptability of Artificial Intelligence (AI) and Machine Learning (ML), it’s little wonder that many companies are looking to develop their own AI tools, whether for internal use or for use by the public.  These range from tools that simplify monotonous, repetitive tasks that would otherwise have to be performed painstakingly by hand, to attempts at creating the next big AI must-have app to take the internet by storm.  Whatever the goal, the demand for GPU compute has increased in lock-step with this interest in developing AI tools.

Necessary for the efficient training of AI models - which include but are not limited to Large Language Models (LLMs) like ChatGPT and Grok - GPU compute greatly speeds up the processing of the massive datasets required for training.  The problem, however, is that GPU compute doesn’t come cheap.  As reported by the Australian Financial Review¹ earlier this month, a new Australian artificial intelligence venture called “Sovereign Australia AI” launched with the goal of providing an Australian-based and locally developed alternative to the global AI models already out there.  In this endeavour, they invested in 256 Nvidia Blackwell B200 GPUs - with the retail rate of these cards landing somewhere between $30,000 and $70,000 each!

That’s why this month we’ll be covering one of the newest offerings on our mCloud platform: “GPU Cloud Compute”.  The good news is that - through GPU passthrough technology - we’re now able to offer both “dedicated” and “contended” GPU compute on our mCloud platform, both at much more affordable rates than the hyperscalers!

What is GPU Cloud Compute?

Put simply, GPU Cloud Compute is the ability to utilise GPUs (Graphics Processing Units) within cloud infrastructure to perform certain types of tasks more efficiently and quickly.  Working alongside the CPU (Central Processing Unit), GPU compute can dramatically increase the performance of certain types of workloads, such as rendering, video editing, training AI and ML models, mining cryptocurrencies, and gaming.
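As a minimal sketch of that speed-up - assuming a machine with PyTorch installed and a CUDA-capable GPU, and noting that real gains vary enormously by hardware and workload - the same matrix multiplication can be timed on both processors:

    # Time one large matrix multiply on CPU, then on GPU.
    # Assumes PyTorch is installed and a CUDA-capable GPU is present.
    import time
    import torch

    a = torch.randn(4096, 4096)
    b = torch.randn(4096, 4096)

    start = time.perf_counter()
    _ = a @ b                             # CPU matrix multiply
    cpu_time = time.perf_counter() - start

    if torch.cuda.is_available():
        a_gpu, b_gpu = a.cuda(), b.cuda()
        _ = a_gpu @ b_gpu                 # warm-up (first CUDA call is slow)
        torch.cuda.synchronize()          # wait for the GPU before timing
        start = time.perf_counter()
        _ = a_gpu @ b_gpu                 # GPU matrix multiply
        torch.cuda.synchronize()
        gpu_time = time.perf_counter() - start
        print(f"CPU: {cpu_time:.3f}s  GPU: {gpu_time:.3f}s")

Highly parallel operations like this are exactly the kind of workload where a GPU’s thousands of cores pull ahead of a CPU’s handful.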

Originally, GPU compute couldn’t support virtualised environments such as those utilised within public cloud infrastructure, so it was only available within your own physical dedicated servers.  However, GPU manufacturers realised the opportunities that would open up if they developed virtualisation capabilities and allowed their GPUs to be shared simultaneously across multiple users.  Doing so would allow GPUs to be hosted directly within data centres around the world and would empower many new types of services, such as high-performance computing for remote cloud-hosted desktop environments, cloud-based gaming services, and eventually the efficient training of AI models.

With this realisation, different technologies were developed to allow for GPU compute on cloud infrastructure.  The first necessary development was GPU passthrough technology, which allows a virtual machine to access the full capabilities of a physical GPU as if it were directly attached to the system, bypassing the hypervisor’s emulation layer and providing near-native performance.  Afterwards, different approaches to providing “shared” or “contended” access to GPUs were developed, allowing a single piece of GPU hardware to serve multiple tenants on demand.  One such approach is Nvidia’s Multi-Instance GPU (MIG) technology, wherein a GPU is partitioned into up to seven separate instances that can be allocated to separate virtual workloads as required.
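From inside a virtual machine, a passed-through GPU looks just like a locally installed card, so the usual Nvidia management tooling can see it.  As a small sketch - assuming the Nvidia driver and the nvidia-ml-py (pynvml) Python package are installed in the guest - the following lists the visible GPUs and reports whether MIG partitioning is enabled on each:

    # Enumerate visible GPUs and check MIG mode via NVML.
    # Assumes the Nvidia driver and the nvidia-ml-py package are installed.
    import pynvml

    pynvml.nvmlInit()
    try:
        for i in range(pynvml.nvmlDeviceGetCount()):
            handle = pynvml.nvmlDeviceGetHandleByIndex(i)
            name = pynvml.nvmlDeviceGetName(handle)
            if isinstance(name, bytes):   # older pynvml versions return bytes
                name = name.decode()
            mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
            print(f"GPU {i}: {name}, {mem.total / 1024**3:.0f} GiB")
            try:
                # Returns (current, pending) mode on MIG-capable cards.
                current, _pending = pynvml.nvmlDeviceGetMigMode(handle)
                print("  MIG enabled:", bool(current))
            except pynvml.NVMLError:
                print("  MIG not supported on this card")
    finally:
        pynvml.nvmlShutdown()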

GPU compute was first made available on cloud infrastructure back in 2010, when AWS became the first to offer GPUs on demand on their public cloud.  Since then, it has become more and more common, with Google also making GPU Cloud Compute available on their own cloud platform in 2017.  However, even with the amount of time that’s passed since those early days of cloud GPU, the cost of access has unfortunately remained incredibly high - which is why we’ve aimed to rectify that with our mCloud platform.

Contended & Dedicated GPU vs Hourly-based Hyperscaler Approaches

It shouldn’t be a surprise that many of the hyperscalers - such as AWS, Azure, and Google Cloud - now offer GPU compute to meet these new demands, as there is money to be made from the increasing demand for it.  Though it also shouldn’t be a surprise that they are incredibly expensive for all but enterprise-level organisations - who are likely trying to get their own share of the billions being invested in new AI ventures.  In the first half of 2025 alone, a total of $104 billion was invested in AI ventures according to CNBC²!

Each of these hyperscalers bills on an hourly cost model, with the cost per hour varying depending on the instance chosen; the CPU, memory, and storage available to that instance; the generation of GPU that instance has access to; and finally, the length of time you wish to commit to.  For AWS’s standard GPU G4dn EC2 instances, which are powered by NVIDIA T4 GPUs, the cost ranges from $0.3785³ USD per hour to $1.204 USD per hour (as of 17/09/2025).  Over a single year, that comes to between $4,966.37 AUD and $15,796.69 AUD - for a GPU that retails for around $2,820 AUD to buy outright, before any specials or deals are applied!  Unfortunately, the story is much the same regardless of which hyperscaler you choose.
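To see where those annual figures come from, here’s the arithmetic spelled out - a quick sketch assuming 24/7 usage and the USD-to-AUD exchange rate of roughly 1.498 implied by the figures above (exchange rates move daily, so treat this as an approximation):

    # Annual cost of an hourly-billed cloud GPU, run around the clock.
    HOURS_PER_YEAR = 24 * 365            # 8,760 hours
    USD_TO_AUD = 1.498                   # approximate rate as of 17/09/2025

    for hourly_usd in (0.3785, 1.204):
        annual_usd = hourly_usd * HOURS_PER_YEAR
        annual_aud = annual_usd * USD_TO_AUD
        print(f"${hourly_usd}/hr -> ${annual_usd:,.2f} USD "
              f"(~${annual_aud:,.2f} AUD) per year")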

That’s why our previously recommended approach was to utilise dedicated cards in your own physical hardware.  That was before the launch of our own GPU Cloud Servers, however - with this service we’re able to offer savings similar to those you’d see from using your own dedicated GPU hardware, but without the upfront investment!

Contended GPU vs Dedicated GPU

With our “dedicated” and “contended” GPU cloud server offerings, customers can now choose whichever suits them best.  They can either get the full “dedicated” power of one or more GPUs, or - in circumstances where they need to reduce their cloud costs for less time-sensitive workloads - they can instead acquire a portion of a physical GPU via our “contended” GPU platform.

This “contended compute” offering is a new one, allowing users to set their guaranteed minimum compute to whatever they require while still having access to more processing power when it’s available.  With this approach you set a desired minimum - starting at 10% and increasing in increments of 10% - and you’re able to utilise up to the full processing power of the GPU whenever other users on the same platform aren’t using those resources.  This is achieved through time-sliced access to the GPU, based on those guaranteed minimums, ensuring that each user gets at least their minimum guaranteed compute.
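To make the idea concrete, here’s a toy illustration of time-sliced scheduling with guaranteed minimums - purely a sketch for this article, not Micron21’s actual scheduler, with tenant names and the 100-slice epoch invented for the example.  Each active tenant first receives its guaranteed share of an epoch’s time slices, and slices left idle by inactive tenants are redistributed to whoever has work queued:

    # Toy model of time-sliced GPU sharing with guaranteed minimums.
    # Illustrative only - not a real scheduler implementation.

    def allocate_slices(guaranteed, active, total_slices=100):
        """Split one epoch of GPU time slices among the active tenants."""
        if not active:
            return {}
        # Each active tenant first receives its guaranteed minimum.
        alloc = {t: int(guaranteed[t] * total_slices) for t in active}
        # Slices reserved by idle tenants (plus any unsold capacity)
        # are redistributed round-robin to tenants with queued work.
        spare = total_slices - sum(alloc.values())
        order = sorted(active)
        for i in range(spare):
            alloc[order[i % len(order)]] += 1
        return alloc

    # Three tenants bought 30%, 20% and 10% shares; only two are busy.
    shares = {"tenant_a": 0.3, "tenant_b": 0.2, "tenant_c": 0.1}
    print(allocate_slices(shares, active={"tenant_a", "tenant_b"}))
    # tenant_a gets 55 slices and tenant_b gets 45 - both well above
    # their guaranteed minimums while tenant_c is idle.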

It should also be noted that there is a tipping point with this approach: a point where the performance you require from a single card would cost more on a contended platform than it would on a dedicated one.  For our own contended platform that point is around the 60% mark, so if you find yourself requiring a minimum of 70% of a card’s total processing power, it becomes more cost-effective to look at securing your own dedicated GPUs.
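As a worked illustration of that tipping point - using invented prices, since real mCloud pricing varies and should be confirmed with us - the crossover falls wherever the per-slice price multiplied by the number of guaranteed slices exceeds the dedicated price:

    # Hypothetical contended-vs-dedicated crossover.  Both prices below
    # are invented for illustration and are not real mCloud pricing.
    DEDICATED_MONTHLY = 1000.0       # assumed cost of a whole dedicated GPU
    PER_10PCT_SLICE = 160.0          # assumed cost per 10% guaranteed share

    for pct in range(10, 101, 10):
        contended = PER_10PCT_SLICE * pct / 10
        cheaper = "dedicated" if contended > DEDICATED_MONTHLY else "contended"
        print(f"{pct:3d}% guaranteed: ${contended:7.2f}/mo -> {cheaper} wins")

With these example numbers the contended option wins up to a 60% guaranteed share and loses from 70% upwards, matching the rule of thumb above.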

As one final note, for any workloads that require more than a single GPU’s worth of processing power, utilising your own dedicated cards rather than a contended GPU service is likely going to be the better option.

Have any questions about GPU cloud compute?

If you have any questions about GPU cloud compute, or would like to discuss what would be involved in deploying it on your infrastructure, let us know!  We’re happy to have a chat about your unique environment and help you plan a deployment.

You can reach us by phone on 1300 769 972 (Option #1) or via email at sales@micron21.com.

Sources

1. Australian Financial Review - report on the launch of “Sovereign Australia AI” (September 2025)
2. CNBC - report on AI venture funding in the first half of 2025
3. AWS - EC2 G4dn instance pricing (as of 17/09/2025)
