AI/ML on premise and in cloud
AI and Machine Learning (ML) can be a real boost for many companies, especially those with a lot of accumulated data from years of operation. AI platforms are getting easier to use without having data scientists on staff, while at the same time data lifecycle patterns are well established. Most of these AI platforms target cloud based use cases, which leads to the question: What do you do if the data needs to stay on premise?
The illustration below shows a few common approaches to using AI. Something that may be less obvious in the illustration is that data science and AI are a heavy lift, both in terms of cost and intellectual property (IP). The AI platforms offered by cloud providers represent significant investments in both, which translate into cost savings and a shorter time to market for AI consumers who can use them on demand.
- Yellow: This has been a common practice for years and focuses on hiring data scientists and other experts to design AI in house. This also often includes large capital expenditures for accelerators, such as GPUs, to train and run machine learning models.
- Green: This has been the most common emerging model as cloud services mature. Data is copied from on premise to the cloud and processed there. This can often be done by a software engineer without hiring data science expertise.
- Purple: This model sends data to the cloud only to be processed, but not persisted. This approach typically excludes training, since training a model requires continual access to a large collection of data. The next section looks closer at what is possible in this model.
What AI is possible without persisting data in cloud?
A growing set of Automated Machine Learning offerings provide services including Vision, Natural Language Processing, Translation and Speech to Text. One reason these are well suited to a general offering is that they can be developed using a large set of common data that isn’t unique to any one person, company or even geography. Language, spoken or written, is one example of a problem that accommodates a reusable model that can be packaged as an automated offering.
Use cases that may be able to take advantage of these offerings include transcription of customer service calls or identification of objects in images. Another solution that is growing in popularity is AI based customer service bots. There is no need to persist source data in the cloud in order to leverage the benefit of the AI service.
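As a rough illustration of the process-but-do-not-persist pattern, the sketch below sends each payload to a stateless service call and keeps only the result on premise. Everything named here is an assumption made for illustration: `OnPremStore`, `process_without_persisting` and `fake_transcribe` are hypothetical, and `fake_transcribe` is a local placeholder standing in for a request to a real speech-to-text service. No vendor API is implied.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, Tuple


@dataclass
class OnPremStore:
    """Hypothetical stand-in for an on-premise database."""
    results: Dict[str, str] = field(default_factory=dict)

    def save(self, record_id: str, result: str) -> None:
        self.results[record_id] = result


def process_without_persisting(
    records: Iterable[Tuple[str, bytes]],
    cloud_service: Callable[[bytes], str],
    store: OnPremStore,
) -> int:
    """Pass each payload to a stateless call and keep only the result locally."""
    count = 0
    for record_id, payload in records:
        # The payload travels in the request itself; this code never
        # uploads it to cloud storage or writes it anywhere remote.
        result = cloud_service(payload)
        store.save(record_id, result)
        count += 1
    return count


def fake_transcribe(audio: bytes) -> str:
    """Placeholder (assumption) for a real speech-to-text request."""
    return f"<transcript of {len(audio)} bytes>"


store = OnPremStore()
processed = process_without_persisting(
    [("call-001", b"\x00" * 16), ("call-002", b"\x00" * 32)],
    fake_transcribe,
    store,
)
```

Passing the service in as a plain callable keeps the source data and the results on the on-premise side of the boundary, and makes it easy to swap the placeholder for a real client. Whether a given provider retains request data is a matter of its terms and configuration, which this sketch does not address.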
It’s important to note that most scenarios where the problem and solution are specific to a company (differentiating in a market) won’t fit in this model. This is because the data required to train the model and the design of the model are not common (shared). In fact, they are often closely guarded trade secrets. As such, cloud providers have no way to invest in and train these models for a consumption based usage model.
Hybrid: Cloud managed on premise
Some cloud vendors have identified a growing number of use cases where enterprises want or need to keep data on premise, but still want the benefit of cloud managed AI. The pattern here looks a little different, with the cloud platform coming into the datacenter. This makes it possible to keep all data and compute on premise, but still benefit from advanced AI and other managed services.
Keep in mind that the above pattern also works when “Datacenter (on premise)” changes to “other cloud”. Some vendor platforms make it as easy to operate against data in another cloud as it would be in their own cloud or on premise.
Cloud security and governance
The primary motivation to do AI on premise usually centers on the (perceived) sensitivity of the data that will be used to train and run the model. In many cases this sensitivity is justified, such as with health or financial information. In other cases, the sensitivity may result from improper or missing classification or other metadata. In all of these cases, cloud providers are making big investments to ensure their platforms are safe and reliable places to host sensitive data by providing strong controls.
In practice, this extends from the published core AI principles, which aim to make AI easily available for any socially beneficial purpose, while maintaining a strong commitment to privacy. In addition to AI governance, cloud providers are ever more committed to data safety in general, and user trust is at the forefront of cloud security design.
Cloud is a great place to build an AI practice. Managed AI solutions reduce time to value and increase availability to software engineers and business users. For common problems, like language and some image related tasks, pretrained models make it possible to leverage fully automated offerings. When data sensitivity is important, cloud AI can be run on premise, or the strong controls available can make it easier to move those workloads safely into the cloud.