
Google Distributed Cloud can be deployed in customer-controlled environments, including installations that are disconnected from the Internet, which is a key requirement for some government and critical-infrastructure users.
One of the big challenges is that these models are incredibly valuable and must be delivered in a trusted, secure environment, Driggers said. “That’s what’s really the most important thing to Google, is this model. So they need to be delivered in a confidential compute manner,” he said.
The model is held in memory rather than stored on a hard drive. If there is any intrusion into the machine, it shuts itself down and the model is gone, so it cannot be stolen, according to Cirrascale.
Cirrascale said it will provide the hardware configurations, performance tuning and support needed to run Gemini inference at scale as part of its Cirrascale Inference Platform.
The company said the service is aimed at customers that want a production environment without rebuilding existing infrastructure. It includes ongoing operational support and what the company described as optimized systems for Gemini inference.
“It’s Google’s model. Our secret sauce is being a trusted partner to be able to deliver that model to the clients,” said Driggers. “It’s part of our inference as a service offering. So for our customers, we have a software layer on top of the model that allows them to tailor how they use it, so they can set user queues up and set user limitations.”

