Amazon SageMaker is an end-to-end machine learning (ML) platform with wide-ranging features to ingest, transform, and measure bias in data, and to train, deploy, and manage models in production with best-in-class compute and services such as Amazon SageMaker Data Wrangler, Amazon SageMaker Studio, Amazon SageMaker Canvas, Amazon SageMaker Model Registry, Amazon SageMaker Feature Store, Amazon SageMaker Pipelines, Amazon SageMaker Model Monitor, and Amazon SageMaker Clarify. Many organizations choose SageMaker as their ML platform because it provides a common set of tools for developers and data scientists. A number of AWS independent software vendor (ISV) partners have already built integrations for users of their software as a service (SaaS) platforms to utilize SageMaker and its various features, including training, deployment, and the model registry.
In this post, we cover the benefits for SaaS platforms of integrating with SageMaker, the range of possible integrations, and the process for developing these integrations. We also deep dive into the most common architectures and AWS resources that facilitate these integrations. This is intended to accelerate time-to-market for ISV partners and other SaaS providers building similar integrations, and to inspire customers who are users of SaaS platforms to partner with SaaS providers on these integrations.
Benefits of integrating with SageMaker
There are a number of benefits for SaaS providers to integrate their SaaS platforms with SageMaker:
Users of the SaaS platform can take advantage of a comprehensive ML platform in SageMaker
Users can build ML models with data that’s in or outside of the SaaS platform and use these ML models
It provides users with a seamless experience between the SaaS platform and SageMaker
Users can utilize foundation models available in Amazon SageMaker JumpStart to build generative AI applications
Organizations can standardize on SageMaker
SaaS providers can focus on their core functionality and offer SageMaker for ML model development
It equips SaaS providers with a basis to build joint solutions and go to market with AWS
SageMaker overview and integration options
SageMaker has tools for every step of the ML lifecycle. SaaS platforms can integrate with SageMaker across the ML lifecycle, from data labeling and preparation to model training, hosting, monitoring, and managing models with various components, as shown in the following figure. Depending on the needs, any and all parts of the ML lifecycle can be run in either the customer AWS account or the SaaS AWS account, and data and models can be shared across accounts using AWS Identity and Access Management (IAM) policies or third-party user-based access tools. This flexibility in the integration makes SageMaker an ideal platform for customers and SaaS providers to standardize on.
Integration process and architectures
In this section, we break the integration process into four main stages and cover the common architectures. Note that there can be other integration points in addition to these, but those are less common.
Data access – How data that is in the SaaS platform is accessed from SageMaker
Model training – How the model is trained
Model deployment and artifacts – Where the model is deployed and what artifacts are produced
Model inference – How the inference happens in the SaaS platform
The diagrams in the following sections assume SageMaker is running in the customer AWS account. Most of the options explained are also applicable if SageMaker is running in the SaaS AWS account. In some cases, an ISV may deploy their software in the customer AWS account. This is usually in a dedicated customer AWS account, meaning there still needs to be cross-account access to the customer AWS account where SageMaker is running.
There are a few different ways in which authentication across AWS accounts can be achieved when data in the SaaS platform is accessed from SageMaker and when the ML model is invoked from the SaaS platform. The recommended method is to use IAM roles. An alternative is to use AWS access keys consisting of an access key ID and secret access key.
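As a rough illustration of the role-based approach, the following sketch exchanges the caller's identity for temporary credentials in another account through AWS STS. It assumes the client passed in behaves like `boto3.client("sts")`; the helper name, the role ARN, and the session name are hypothetical and not taken from any specific integration.

```python
from typing import Any, Dict, Optional


def assume_cross_account_role(
    sts_client: Any,
    role_arn: str,
    session_name: str,
    external_id: Optional[str] = None,
) -> Dict[str, str]:
    """Return temporary credentials for a role in another AWS account.

    sts_client is assumed to behave like boto3.client("sts").
    """
    params = {"RoleArn": role_arn, "RoleSessionName": session_name}
    if external_id is not None:
        # An external ID is commonly required when a third party
        # (here, the SaaS provider) assumes a role in a customer account.
        params["ExternalId"] = external_id

    response = sts_client.assume_role(**params)
    creds = response["Credentials"]

    # Keys are named to match the credential keyword arguments
    # accepted by boto3.client(...) and boto3.Session(...).
    return {
        "aws_access_key_id": creds["AccessKeyId"],
        "aws_secret_access_key": creds["SecretAccessKey"],
        "aws_session_token": creds["SessionToken"],
    }
```

With boto3 installed, the returned dictionary can be unpacked into a new client, for example `boto3.client("sagemaker-runtime", **creds)`, so that later calls run with the permissions of the assumed role rather than long-lived access keys.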
Data access
There are multiple options for how data that is in the SaaS platform can be accessed from SageMaker. Data can be accessed from a SageMaker notebook, from SageMaker Data Wrangler, where users can prepare data for ML, or from SageMaker Canvas. The most common data access options are:
SageMaker Data Wrangler built-in connector – The SageMaker Data Wrangler connector enables data to be imported from a SaaS platform to be prepared for ML model training. The connector is developed jointly by AWS and the SaaS provider. Current SaaS platform connectors include Databricks and Snowflake.
Amazon Athena Federated Query for the SaaS platform – Federated queries enable users to query the platform from a SageMaker notebook via Amazon Athena using a custom connector that is developed by the SaaS provider.
Amazon AppFlow – With Amazon AppFlow, you can use a custom connector to extract data into Amazon Simple Storage Service (Amazon S3), which can subsequently be accessed from SageMaker. The connector for a SaaS platform can be developed by AWS or the SaaS provider. The open-source Custom Connector SDK enables the development of a private, shared, or public connector using Python or Java.
SaaS platform SDK – If the SaaS platform has an SDK (Software Development Kit), such as a Python SDK, this can be used to access data directly from a SageMaker notebook.
Other options – In addition to these, there can be other options depending on whether the SaaS provider exposes their data via APIs, files, or an agent. The agent can be installed on Amazon Elastic Compute Cloud (Amazon EC2) or AWS Lambda. Alternatively, a service such as AWS Glue or a third-party extract, transform, and load (ETL) tool can be used for data transfer.
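To make the Athena Federated Query option above more concrete, here is a minimal sketch of submitting a query from a notebook and waiting for it to finish. It assumes the client behaves like `boto3.client("athena")` and that the SaaS provider's connector has already been registered as an Athena data catalog; the function name and its parameters are hypothetical.

```python
import time
from typing import Any


def run_athena_query(
    athena_client: Any,
    sql: str,
    catalog: str,
    database: str,
    output_location: str,
    poll_seconds: float = 1.0,
    max_polls: int = 60,
) -> str:
    """Start an Athena query and block until it succeeds.

    athena_client is assumed to behave like boto3.client("athena").
    Returns the query execution ID once the query has succeeded.
    """
    start = athena_client.start_query_execution(
        QueryString=sql,
        # The catalog is the data source backed by the (assumed) custom connector.
        QueryExecutionContext={"Catalog": catalog, "Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    query_id = start["QueryExecutionId"]

    for _ in range(max_polls):
        status = athena_client.get_query_execution(QueryExecutionId=query_id)
        state = status["QueryExecution"]["Status"]["State"]
        if state == "SUCCEEDED":
            return query_id
        if state in ("FAILED", "CANCELLED"):
            raise RuntimeError(f"Athena query {query_id} ended in state {state}")
        time.sleep(poll_seconds)

    raise TimeoutError(f"Athena query {query_id} did not finish in time")
```

Once the query has succeeded, the results can be read from the S3 output location or fetched with the Athena `get_query_results` call, and then loaded into a DataFrame for preparation and training.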
The following diagram illustrates the architecture for data access options.
Model training
The model can be trained in SageMaker Studio by a data scientist, using Amazon SageMaker Autopilot by a non-data scientist, or in SageMaker Canvas by a business analyst. SageMaker Autopilot takes away the heavy lifting of building ML models, including feature engineering, algorithm selection, and hyperparameter settings, and it is also relatively straightforward to integrate directly into a SaaS platform. SageMaker Canvas provides a no-code visual interface for training ML models.
In addition, data scientists can use pre-trained models available in SageMaker JumpStart, including foundation models from sources such as Alexa, AI21 Labs, Hugging Face, and Stability AI, and tune them for their own generative AI use cases.
Alternatively, the model can be trained in a third-party or partner-provided tool, service, and infrastructure, including on-premises resources, provided the model artifacts are accessible and readable.
The following diagram illustrates these options.
Model deployment and artifacts
After you have trained and tested the model, you can either deploy it to a SageMaker model endpoint in the customer account, or export it from SageMaker and import it into the SaaS platform storage. The model can be stored and imported in standard formats supported by the common ML frameworks, such as pickle, joblib, and ONNX (Open Neural Network Exchange).
If the ML model is deployed to a SageMaker model endpoint, additional model metadata can be stored in the SageMaker Model Registry, SageMaker Model Cards, or in a file in an S3 bucket. This can include the model version, model inputs and outputs, model metrics, model creation date, inference specification, data lineage information, and more. Where there isn’t a property available in the model package, the data can be stored as custom metadata or in an S3 file.
Creating such metadata can help SaaS providers manage the end-to-end lifecycle of the ML model more effectively. This information can be synced to the model log in the SaaS platform and used to track changes and updates to the ML model. Subsequently, this log can be used to determine whether to refresh downstream data and applications that use that ML model in the SaaS platform.
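One way to attach such metadata is through the custom metadata properties of a model package in the Model Registry. The sketch below only builds the request; it assumes the result is later passed to the `create_model_package` call of a `boto3.client("sagemaker")` client. The helper names, the metadata keys, and the choice of JSON content types are illustrative assumptions.

```python
import json
from typing import Any, Dict


def to_metadata_properties(metadata: Dict[str, Any]) -> Dict[str, str]:
    """Convert arbitrary metadata values to strings.

    Custom metadata on a model package is a string-to-string map,
    so non-string values are serialized as JSON here.
    """
    return {
        key: value if isinstance(value, str) else json.dumps(value, sort_keys=True)
        for key, value in metadata.items()
    }


def build_model_package_request(
    group_name: str,
    image_uri: str,
    model_data_url: str,
    metadata: Dict[str, Any],
) -> Dict[str, Any]:
    """Build keyword arguments for a create_model_package call."""
    return {
        "ModelPackageGroupName": group_name,
        "InferenceSpecification": {
            "Containers": [{"Image": image_uri, "ModelDataUrl": model_data_url}],
            # Assumption: the model container speaks JSON in both directions.
            "SupportedContentTypes": ["application/json"],
            "SupportedResponseMIMETypes": ["application/json"],
        },
        "ModelApprovalStatus": "PendingManualApproval",
        "CustomerMetadataProperties": to_metadata_properties(metadata),
    }
```

A SaaS platform could read the same properties back when syncing its model log, for example to compare the registered model version with the one its downstream datasets were built against.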
The following diagram illustrates this architecture.
Model inference
SageMaker offers four options for ML model inference: real-time inference, serverless inference, asynchronous inference, and batch transform. For the first three, the model is deployed to a SageMaker model endpoint and the SaaS platform invokes the model using the AWS SDKs. The recommended option is to use the Python SDK. The inference pattern for each of these is similar in that the predictor’s predict() or predict_async() methods are used. Cross-account access can be achieved using role-based access.
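For platforms that prefer not to depend on the SageMaker Python SDK, a lower-level alternative to predict() is the `invoke_endpoint` call of the SageMaker runtime client. The following is a minimal sketch, assuming the client behaves like `boto3.client("sagemaker-runtime")` and that the deployed model accepts and returns JSON; the function name and payload shape are hypothetical.

```python
import json
from typing import Any


def invoke_realtime_endpoint(runtime_client: Any, endpoint_name: str, payload: Any) -> Any:
    """Send a JSON payload to a SageMaker endpoint and parse the JSON reply.

    runtime_client is assumed to behave like boto3.client("sagemaker-runtime").
    """
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Accept="application/json",
        Body=json.dumps(payload).encode("utf-8"),
    )
    # The response body is a stream; read it fully before decoding.
    return json.loads(response["Body"].read())
```

For cross-account use, the client passed in would typically be created from temporary credentials obtained by assuming a role in the account that hosts the endpoint.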
It’s also possible to shield the backend behind Amazon API Gateway, which calls the endpoint via a Lambda function that runs in a protected private network.
For batch transform, data from the SaaS platform first needs to be exported in batch into an S3 bucket in the customer AWS account, then the inference is done on this data in batch. The inference is done by first creating a transformer job or object, and then calling the transform() method with the S3 location of the data. Results are imported back into the SaaS platform in batch as a dataset, and joined to other datasets in the platform as part of a batch pipeline job.
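The batch flow described above can also be expressed with the lower-level `create_transform_job` API, which is roughly what a transformer object's transform() call results in. The sketch below assumes a client that behaves like `boto3.client("sagemaker")`, a model already created in SageMaker, and line-delimited CSV input; the helper names, default instance type, and S3 locations are illustrative assumptions.

```python
from typing import Any, Dict


def build_transform_job_request(
    job_name: str,
    model_name: str,
    input_s3_uri: str,
    output_s3_uri: str,
    instance_type: str = "ml.m5.xlarge",
    instance_count: int = 1,
    content_type: str = "text/csv",
) -> Dict[str, Any]:
    """Build keyword arguments for a create_transform_job call."""
    return {
        "TransformJobName": job_name,
        "ModelName": model_name,
        "TransformInput": {
            # Data exported from the SaaS platform is assumed to sit under this prefix.
            "DataSource": {
                "S3DataSource": {"S3DataType": "S3Prefix", "S3Uri": input_s3_uri}
            },
            "ContentType": content_type,
            "SplitType": "Line",  # one record per line
        },
        "TransformOutput": {"S3OutputPath": output_s3_uri, "AssembleWith": "Line"},
        "TransformResources": {
            "InstanceType": instance_type,
            "InstanceCount": instance_count,
        },
    }


def start_batch_transform(sagemaker_client: Any, request: Dict[str, Any]) -> str:
    """Submit the job and return its ARN.

    sagemaker_client is assumed to behave like boto3.client("sagemaker").
    """
    response = sagemaker_client.create_transform_job(**request)
    return response["TransformJobArn"]
```

When the job completes, the predictions are written under the output S3 path, from where the SaaS platform can import them as a dataset.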
Another option for inference is to do it directly in the SaaS account compute cluster. This would be the case when the model has been imported into the SaaS platform. In this case, SaaS providers can choose from a range of EC2 instances that are optimized for ML inference.
The following diagram illustrates these options.
Example integrations
Several ISVs have built integrations between their SaaS platforms and SageMaker. To learn more about some example integrations, refer to the following:
Conclusion
In this post, we explained why and how SaaS providers should integrate SageMaker with their SaaS platforms by breaking the process into four parts and covering the common integration architectures. SaaS providers looking to build an integration with SageMaker can utilize these architectures. If there are any custom requirements beyond what has been covered in this post, including with other SageMaker components, get in touch with your AWS account teams. Once the integration has been built and validated, ISV partners can join the AWS Service Ready Program for SageMaker and unlock a variety of benefits.
We also ask customers who are users of SaaS platforms to register their interest in an integration with Amazon SageMaker with their AWS account teams, as this can help inspire and progress the development for SaaS providers.
About the Authors
Mehmet Bakkaloglu is a Principal Solutions Architect at AWS, focusing on Data Analytics, AI/ML and ISV partners.
Raj Kadiyala is a Principal AI/ML Evangelist at AWS.