

22 Sep 2025, by Slade Baylis
With the power and adaptability of Artificial Intelligence (AI) and Machine Learning (ML), it’s little wonder that many companies are looking to develop their own AI tools, whether for internal use or for release to the public. These tools range from ones that simplify monotonous, repetitive tasks that would otherwise have to be performed painstakingly by hand, through to attempts to take the internet by storm with the next big must-have AI app! Whatever the goal, this huge interest in developing AI tools is driving a lock-step increase in demand for "GPU compute".
"GPU compute" is necessary for the efficient training of AI models - including, but not limited to, Large Language Models (LLMs) like ChatGPT and Grok - as it greatly speeds up the processing of the massive datasets required. However, this efficiency does come with a downside: GPU compute is not cheap. As highlighted in a report by the Australian Financial Review1 earlier this month, a new Australian artificial intelligence venture called “Sovereign Australia AI” launched with the goal of providing an Australian-based and locally developed alternative to the global AI models already out there. In this endeavour, it invested in 256 Nvidia Blackwell B200 GPUs, with the retail rate of these cards landing somewhere between $30,000 and $70,000 each. That's right, each!
That’s why this month we’ll be covering one of the newest offerings on our mCloud platform: “GPU Cloud Compute”. The good news is that this offering puts the technology and its benefits within your reach now, so organisations no longer have to miss out. By utilising GPU passthrough technology on our mCloud platform, we’re now able to offer both “dedicated” and “contended” GPU compute options - and at a much more affordable rate than what the hyperscalers out there charge!
Put simply, GPU Cloud Compute is the ability to utilise GPUs (Graphics Processing Units) within cloud infrastructure to perform certain types of tasks more quickly and efficiently. Working alongside the CPU (Central Processing Unit), a GPU can dramatically accelerate certain types of workloads, such as rendering, video editing, training AI and ML models, mining cryptocurrencies, gaming, and much more.
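To make that concrete, here's a minimal sketch of offloading one such workload to a GPU - assuming Python with the PyTorch library installed and an NVIDIA GPU with CUDA drivers available (neither of which is specific to our platform):

```python
# A minimal sketch of offloading work to a GPU, assuming the PyTorch
# library is installed and an NVIDIA GPU with CUDA drivers is present.
import torch

# Fall back to the CPU if no GPU is available.
device = "cuda" if torch.cuda.is_available() else "cpu"

# A large matrix multiplication - the kind of highly parallel workload
# that GPUs accelerate dramatically compared to CPUs.
a = torch.randn(4096, 4096, device=device)
b = torch.randn(4096, 4096, device=device)
c = a @ b

print(f"Computed a 4096x4096 matrix product on: {device}")
```

The same highly parallel pattern - thousands of independent multiply-and-add operations - is what makes GPUs so effective for rendering, model training, and the other workloads listed above.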
Originally, GPU compute couldn't be used within virtualised environments such as those utilised within public cloud infrastructure, so it was only available within one's own physical dedicated servers. Over time, however, GPU manufacturers realised the opportunities that would open up if they could develop virtualisation capabilities and allow a single GPU to be shared simultaneously across multiple users. Doing so would allow GPUs to be hosted directly within data centres around the world, empowering many different types of new services, such as high-performance computing for remote cloud-hosted desktop environments, cloud-based gaming services, and eventually the efficient training of AI models.
With this realisation, different technologies were developed to enable GPU compute on cloud infrastructure. The first necessary development was "GPU passthrough" technology, which allows a virtual machine to access the full capabilities of a physical GPU as if it were directly attached to the system, bypassing the hypervisor’s emulation layer and providing near-native performance. Afterwards, different approaches to providing “shared” or “contended” access to GPUs were developed, allowing dedicated GPU hardware to be provided on-demand. Nvidia’s "Multi-Instance GPU" (MIG) technology is one such approach, whereby a GPU is partitioned into up to seven separate instances that can be allocated to separate virtual workloads as required.
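As a rough illustration of what MIG partitioning looks like from software, here's a minimal sketch using NVIDIA's NVML bindings for Python (the nvidia-ml-py package) - it assumes a MIG-capable GPU is present, and it only inspects existing partitions rather than creating them:

```python
# A minimal sketch of inspecting MIG partitions via NVML, assuming the
# nvidia-ml-py bindings are installed and a MIG-capable GPU is present.
import pynvml

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)

# A MIG-capable GPU reports whether MIG mode is currently enabled.
current_mode, pending_mode = pynvml.nvmlDeviceGetMigMode(handle)
print(f"MIG enabled: {current_mode == pynvml.NVML_DEVICE_MIG_ENABLE}")

# List the MIG instances carved out of this physical GPU (up to seven).
max_count = pynvml.nvmlDeviceGetMaxMigDeviceCount(handle)
for i in range(max_count):
    try:
        mig = pynvml.nvmlDeviceGetMigDeviceHandleByIndex(handle, i)
        print(f"MIG instance {i}: {pynvml.nvmlDeviceGetName(mig)}")
    except pynvml.NVMLError:
        break  # no further MIG instances are allocated

pynvml.nvmlShutdown()
```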
GPU compute was first made available on cloud infrastructure back in 2010, when AWS became the first to offer GPUs on demand on their public cloud. Since then it has become more and more common, with Google also making GPU Cloud Compute available on their own cloud platform in 2017. However, even with the considerable time that has passed since those early days, the incredibly high cost of access continues to be a barrier for many organisations. Fortunately, this is where our mCloud platform comes in - by removing that cost barrier, mCloud enables organisations of all sizes to keep up with this technology, and thus to access and benefit from these new tools.
It shouldn’t be a surprise that many of the hyperscaler companies, such as AWS, Azure, and Google Cloud, now offer GPU compute in order to meet these new demands. After all, with the ever-increasing demand for it, there is a huge amount of money to be made by these hyperscalers, who are all vying for their own share of the billions being invested into new AI ventures - in the first half of 2025 alone, a total of $104 billion was invested in AI ventures, according to CNBC2! And so it also shouldn’t be a surprise that their GPU offerings are incredibly expensive for all but big enterprise-level organisations.
Each of these hyperscalers bills on an hourly cost model, with the cost per hour varying depending on the instance chosen; the CPU, memory, and storage available to that instance; the generation of GPU that the instance has access to; and finally, the length of time you wish to commit to. As an example, for their standard GPU G4dn EC2 instances, which are powered by NVIDIA T4 GPUs, the cost ranges from $0.3785 USD per hour to $1.204 USD per hour3 (as of 17/09/2025). Over a single year, that totals between $4,966.37 AUD and $15,796.69 AUD - and this is for a GPU that retails for around $2,820 AUD to buy outright, before any specials or deals are applied! Unfortunately, this very costly scenario is much the same regardless of which hyperscaler you choose.
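For those who want to check the maths, here's the back-of-envelope calculation behind those annual figures - note that the USD-to-AUD exchange rate of 1.50 is an assumption for illustration, as rates vary daily:

```python
# A quick sanity check of the annual figures quoted above, assuming a
# USD-to-AUD exchange rate of roughly 1.50 (rates vary daily).
HOURS_PER_YEAR = 24 * 365           # 8,760 hours
USD_TO_AUD = 1.50                   # assumed conversion rate

for hourly_usd in (0.3785, 1.204):  # G4dn hourly price range (USD)
    annual_aud = hourly_usd * HOURS_PER_YEAR * USD_TO_AUD
    print(f"${hourly_usd}/hr -> ${annual_aud:,.2f} AUD per year")

# Prints roughly $4,973 and $15,821 AUD - in line with the figures
# quoted above, and several times the ~$2,820 AUD retail price of a T4.
```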
That’s why, previously, our recommended approach was to look into utilising dedicated cards in one's own physical hardware. However, that was before the launch of our own GPU Cloud Servers - with this service, we’re now able to offer similar savings to those you would see from using your own dedicated GPU hardware, but without the required upfront investment.
With our “dedicated” and “contended” GPU cloud server offerings, customers can now choose what suits them best. They can either get the full “dedicated” power of one or more GPUs, or, where they need to reduce their cloud costs for less time-sensitive workloads, they can instead acquire a portion of a physical GPU via our “contended” GPU platform.
This “contended compute” offering is a unique one, allowing users to set a guaranteed minimum level of compute while still having access to more processing power when it’s available. With this approach you set a desired minimum - starting at 10% and increasing in increments of 10% - and the good news is that you can utilise up to the full processing power of the GPU whenever other users on the same platform aren't using those resources. This is achieved through time-sliced access to the GPU based on those guaranteed minimums, ensuring that each user always receives at least their minimum guaranteed compute.
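To illustrate the idea, here's a simplified sketch of how time slices might be divided when some users go idle - this is an illustration of the general technique, not our actual scheduler, and the tenant names and shares are hypothetical:

```python
# A simplified sketch of time-sliced GPU sharing that honours guaranteed
# minimums while letting active tenants absorb idle capacity.
# Tenant names and shares below are hypothetical illustrations.
def allocate_timeslices(guaranteed: dict[str, float],
                        active: set[str]) -> dict[str, float]:
    """Split 100% of GPU time among active tenants, pro rata to their
    guaranteed minimum shares; idle tenants' time is redistributed."""
    active_total = sum(guaranteed[t] for t in active)
    return {t: guaranteed[t] / active_total for t in sorted(active)}

# Three tenants with 10%-increment guaranteed minimums (summing to 100%).
tenants = {"tenant_a": 0.5, "tenant_b": 0.3, "tenant_c": 0.2}

# If everyone is busy, each gets exactly their guaranteed share ...
print(allocate_timeslices(tenants, {"tenant_a", "tenant_b", "tenant_c"}))
# ... but if tenant_c goes idle, the others scale up proportionally.
print(allocate_timeslices(tenants, {"tenant_a", "tenant_b"}))
# -> {'tenant_a': 0.625, 'tenant_b': 0.375}
```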
It should also be noted that there is a tipping point with this approach, where the performance you require from a single card would cost more on a contended platform than it would on a dedicated one. For our own contended platform that point is around the 60% mark - so if you find yourself requiring a minimum of 70% or more of a card's total processing power, it becomes more cost-effective to look at securing your own dedicated GPUs.
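To see why such a tipping point exists, consider a hedged example with entirely hypothetical prices (these are not our actual rates): if each guaranteed 10% slice costs a flat amount, the contended total grows linearly until it overtakes the flat price of a dedicated card:

```python
# A hedged illustration of the tipping point, using hypothetical prices:
# suppose each guaranteed 10% slice of a contended GPU costs $50/month
# and a dedicated GPU costs $320/month (figures invented for illustration).
SLICE_PRICE = 50       # $/month per guaranteed 10% slice (hypothetical)
DEDICATED_PRICE = 320  # $/month for a whole dedicated GPU (hypothetical)

for slices in range(1, 11):
    contended = slices * SLICE_PRICE
    marker = "  <- contended now costs more" if contended > DEDICATED_PRICE else ""
    print(f"{slices * 10:>3}% guaranteed: ${contended}/month{marker}")

# With these invented numbers the crossover lands at 70%, mirroring the
# ~60% mark described above for our actual platform.
```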
As one final note, for any workloads that require more than a single GPU’s worth of processing power, utilising your own dedicated cards rather than a contended GPU service is likely going to be the better option.
If you would like to discuss what would be involved in deploying GPU compute onto your infrastructure to suit your needs and budget, or have any general questions about GPU cloud compute, please let us know! This is our area of expertise and we are more than happy to answer any questions that you may have. With a local team based here, and being 100% Australian owned and operated, we're best equipped to understand the unique circumstances and environment in which your organisation operates.
You can reach out and chat to us by phone on 1300 769 972 (Option #1) or via email at sales@micron21.com.