Amazon SageMaker is a fully managed service that allows developers and data scientists to build, train, and deploy machine learning models.
Amazon SageMaker includes three modules: Build, Train, and Deploy. The Build module provides a hosted environment to work with your data, experiment with algorithms, and visualize your output. The Train module allows for one-click model training and tuning at high scale and low cost. The Deploy module provides a managed environment for you to easily host and test models for inference securely and with low latency.
Build
Build highly accurate training datasets
Amazon SageMaker Ground Truth helps customers build highly accurate training datasets quickly using machine learning and reduce data labeling costs by up to 70%. Successful machine learning models are trained using data that has been labeled to teach the model how to make correct decisions. This process can often take months and large teams of people to complete. SageMaker Ground Truth provides an innovative solution to reduce cost and complexity, while also increasing the accuracy of data labeling, by combining machine learning with human labeling in a process called active learning.
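One way to picture how a model and human labelers can share the work is confidence-based routing: items the model labels with high confidence are accepted automatically, and the rest go to people. The sketch below is only an illustration of that idea, not Ground Truth's actual implementation; the function name `route_for_labeling`, the item ids, and the 0.9 threshold are all hypothetical.

```python
from typing import Dict, List, Tuple


def route_for_labeling(
    predictions: Dict[str, Tuple[str, float]],
    threshold: float = 0.9,  # hypothetical confidence cutoff
) -> Tuple[Dict[str, str], List[str]]:
    """Split items into auto-labeled ones and ones sent to human labelers.

    `predictions` maps an item id to (predicted_label, confidence).
    """
    auto_labeled: Dict[str, str] = {}
    needs_human: List[str] = []
    for item_id, (label, confidence) in predictions.items():
        if confidence >= threshold:
            auto_labeled[item_id] = label   # the model is confident enough
        else:
            needs_human.append(item_id)     # a person labels this item
    return auto_labeled, needs_human


# Made-up model outputs for three images.
predictions = {
    "img-001": ("cat", 0.98),
    "img-002": ("dog", 0.55),
    "img-003": ("cat", 0.91),
}
auto_labeled, needs_human = route_for_labeling(predictions)
```

In an active learning loop, the human-provided labels would then be fed back to retrain the model so that more items clear the threshold in the next round.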
Managed Notebooks for Authoring Models
Amazon SageMaker provides fully managed instances running Jupyter notebooks for training data exploration and preprocessing. These notebooks are pre-loaded with CUDA and cuDNN drivers for popular deep learning platforms, Anaconda packages, and libraries for TensorFlow, Apache MXNet, PyTorch, and Chainer.
In just one click, you can access a fully managed machine learning notebook environment using the popular Jupyter open source notebook format.
These notebook workspaces let you explore and visualize your data and document your findings in re-usable workflows using virtually all popular libraries, frameworks, and interfaces. From within the notebook, you can bring in your data already stored in Amazon S3. You can also use AWS Glue to easily move data from Amazon RDS, Amazon DynamoDB, and Amazon Redshift into S3 for analysis. You can write or import your notebook or use one of many pre-built notebooks that are pre-loaded into Amazon SageMaker. Pre-built notebooks are available for all of the built-in machine learning algorithms. Also, notebook templates are available to help you get started with common ML applications and more advanced Amazon SageMaker functionality.
Built-in, High Performance Algorithms
Amazon SageMaker provides high-performance, scalable machine learning algorithms optimized for speed, scale, and accuracy. These algorithms can perform training on petabyte-scale datasets and provide up to 10x the performance of other implementations. You can choose from supervised algorithms where the correct answers are known during training and you can instruct the model where it made mistakes. Amazon SageMaker includes supervised algorithms such as XGBoost and linear/logistic regression or classification, to address recommendation and time series prediction problems. Amazon SageMaker also includes support for unsupervised learning (that is, the algorithms must discover the correct answers on their own), such as with k-means clustering and principal component analysis (PCA), to solve problems like identifying customer groupings based on purchasing behavior.
Amazon SageMaker makes the most common machine learning algorithms automatically available to you. You simply specify your data source, and you can start running k-means clustering for data segmentation, factorization machines for recommendations, time-series forecasting, linear regression, or principal component analysis, right away.
BlazingText Word2Vec | BlazingText implementation of the Word2Vec algorithm for scaling and accelerating the generation of word embeddings from a large number of documents. |
DeepAR | An algorithm that generates accurate forecasts by learning patterns from many related time-series using recurrent neural networks (RNN). |
Factorization Machines | A model with the ability to estimate all of the interactions between features even with a very small amount of data. |
Gradient Boosted Trees (XGBoost) | Short for “Extreme Gradient Boosting”, XGBoost is an optimized distributed gradient boosting library. |
Image Classification (ResNet) | A popular neural network for developing image classification systems. |
IP Insights | An algorithm to detect malicious users or learn the usage patterns of IP addresses. |
K-Means Clustering | One of the simplest ML algorithms. It’s used to find groups within unlabeled data. |
K-Nearest Neighbor (k-NN) | An index-based algorithm to address classification and regression problems. |
Latent Dirichlet Allocation (LDA) | A model that is well suited to automatically discovering the main topics present in a set of text files. |
Linear Learner (Classification) | Linear classification uses an object’s characteristics to identify the appropriate group that it belongs to. |
Linear Learner (Regression) | Linear regression is used to predict the linear relationship between two variables. |
Neural Topic Modelling (NTM) | A neural network based approach for learning topics from text and image datasets. |
Object2Vec | A neural-embedding algorithm to compute nearest neighbors and to visualize natural clusters. |
Object Detection | Detects, classifies, and places bounding boxes around multiple objects in an image. |
Principal Component Analysis (PCA) | Often used in data pre-processing, this algorithm takes a table or matrix of many features and reduces it to a smaller number of representative features. |
Random Cut Forest | An unsupervised machine learning algorithm for anomaly detection. |
Semantic Segmentation | Partitions an image to identify places of interest by assigning a label to the individual pixels of the image. |
Sequence2Sequence | A general-purpose encoder-decoder for text that is often used for machine translation, text summarization, etc. |
These algorithms have been optimized so that their performance is up to 10x faster than what you’d achieve in traditional implementations. One of the ways we’ve done this is to implement these algorithms so that they don’t need to go back and look at data they’ve already seen. Traditionally, algorithms often pass back through your data set multiple times to reference earlier data. This is fine with small data sets, but the performance hit with large data sets can significantly slow down training. Because these algorithms are engineered for a single pass, you’re able to efficiently and cost-effectively train on petabyte-scale data sets.
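To make the single-pass idea concrete, here is a minimal sketch of a streaming computation that sees each value exactly once and never revisits earlier data. It uses Welford's online algorithm for mean and variance, chosen purely as an illustration; it is not taken from the SageMaker built-in algorithms, and the function name is an assumption.

```python
from typing import Iterable, Tuple


def streaming_mean_variance(values: Iterable[float]) -> Tuple[int, float, float]:
    """Compute count, mean, and population variance in a single pass.

    Each value is consumed once; only three running numbers are kept in
    memory, regardless of how large the stream is.
    """
    count = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for x in values:
        count += 1
        delta = x - mean
        mean += delta / count
        m2 += delta * (x - mean)
    variance = m2 / count if count else 0.0
    return count, mean, variance


# An iterator can only be read once, which enforces the single pass.
stream = iter([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
count, mean, variance = streaming_mean_variance(stream)
```

A two-pass version would first compute the mean and then loop again for the variance, which is exactly the kind of re-reading that becomes expensive on very large data sets.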
Broad Framework Support
Amazon SageMaker automatically configures and optimizes TensorFlow, Apache MXNet, Chainer, PyTorch, Scikit-learn, and SparkML so you don’t have to do any setup to start using these frameworks, and we’ll add other major frameworks in the coming months. However, you can always bring any framework you like to Amazon SageMaker by building it into a Docker container that you store in Amazon Elastic Container Registry (Amazon ECR).
Reinforcement Learning Support with Amazon SageMaker RL
Amazon SageMaker supports reinforcement learning in addition to traditional supervised and unsupervised learning. SageMaker now has built-in, fully managed reinforcement learning algorithms, including some of the newest and best performing in the academic literature. SageMaker supports RL in multiple frameworks, including TensorFlow and MXNet, as well as newer frameworks designed from the ground up for reinforcement learning, such as Intel Coach and Ray RLlib. Multiple 2D and 3D physics simulation environments are supported, including environments based on the open source OpenAI Gym interface. Additionally, SageMaker RL will allow you to train using virtual 3D environments built in Amazon Sumerian and AWS RoboMaker. To help you get started, SageMaker also provides a range of example notebooks and tutorials.
Most machine learning falls into a category called supervised learning. This method requires a lot of labeled training data, but the models you build are able to make sophisticated decisions. It’s the common approach with computer vision, speech, and language models. Another common, but less used, category of machine learning is called unsupervised learning. Here, algorithms try to identify a hidden structure in unlabeled data. The bar to train an unsupervised model is much lower, but the tradeoff is that the model makes less sophisticated decisions. Unsupervised models are often used to identify anomalies in data, such as abnormal fluctuations in temperature or signs of network intrusion.
Reinforcement learning (RL) has emerged as a third, complementary approach to machine learning. RL takes a very different approach to training models. It needs virtually no labeled training data, but it can still meet (and in some cases exceed) human levels of sophistication. The best thing about RL is that it can learn to model a complex series of behaviors to arrive at a desired outcome, rather than simply making a single decision. One of the most common applications today for RL is training autonomous vehicles to navigate to a destination.
An easy way to understand how RL works is to think of a simple video game where a character needs to navigate a maze collecting flags and avoiding enemies. Instead of a human playing, the algorithm controls the character and plays millions of games. All it needs to know to get started is that the character can move up, down, left, and right, and that it will be rewarded by scoring points. The algorithm will then learn how to play to get the highest score possible. It will learn behaviors that improve the score (such as picking up flags or taking advantage of score multipliers) and minimize penalties (such as being hit by an enemy). Over time, RL algorithms can learn advanced strategies to master the game, such as clearing the lower part of the maze first, how and when to use power-ups, and how to exploit enemy behaviors.
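The game analogy can be reduced to a few lines of code. The sketch below uses tabular Q-learning, one of the simplest RL algorithms, on a made-up one-dimensional "maze": a corridor of five cells with a flag in the last one. The environment, the hyperparameter values, and all names are assumptions chosen for illustration; this is not one of the SageMaker RL toolkits.

```python
import random

N_STATES = 5          # corridor cells 0..4; the flag (goal) is in cell 4
ACTIONS = (-1, +1)    # index 0 moves left, index 1 moves right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.2  # illustrative hyperparameters


def step(state, action):
    """Move within the corridor; reaching the last cell scores a point."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def choose_action(q, state, rng):
    """Epsilon-greedy choice, breaking ties between equal values randomly."""
    if rng.random() < EPSILON or q[state][0] == q[state][1]:
        return rng.randrange(len(ACTIONS))
    return 0 if q[state][0] > q[state][1] else 1


def train(episodes=500, max_steps=200, seed=0):
    """Play many games and learn the value of each action in each cell."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            a = choose_action(q, state, rng)
            next_state, reward, done = step(state, ACTIONS[a])
            # Q-learning update: move the estimate toward reward + discounted future value.
            target = reward + (0.0 if done else GAMMA * max(q[next_state]))
            q[state][a] += ALPHA * (target - q[state][a])
            state = next_state
            if done:
                break
    return q


q_table = train()
# Greedy policy for every non-goal cell.
policy = ["left" if q[0] > q[1] else "right" for q in q_table[:-1]]
```

The agent is never told that the flag is on the right. It only receives a point when it gets there, and the learned values end up favoring the moves that lead to the reward.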
RL can be a force multiplier on traditional machine learning techniques. For example, RL and supervised learning have been combined to create personalized treatment regimens in health care, optimize manufacturing supply chains, improve wind turbine performance, drive autonomous cars, operate robots safely, and even create personalized classes and learning plans for students.
Test and Prototype Locally
The open source Apache MXNet and TensorFlow Docker containers used in Amazon SageMaker are available on GitHub. You can download these containers to your local environment and use the Amazon SageMaker Python SDK to test your scripts before deploying to Amazon SageMaker training or hosting environments. When you’re ready to go from local testing to production training and hosting, a change to a single line of code is all that’s needed.
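As a rough sketch of what that single-line change can look like, the snippet below builds the keyword arguments you might pass to a framework estimator in the Amazon SageMaker Python SDK. It assumes the SDK's local mode, where `instance_type="local"` runs the container on your own machine through Docker. The script name, role ARN, and helper function are hypothetical placeholders, and the snippet only constructs the settings; it does not import the SDK or start a job.

```python
def estimator_settings(run_locally: bool) -> dict:
    """Keyword arguments for a SageMaker Python SDK framework estimator."""
    return {
        "entry_point": "train.py",  # hypothetical training script
        "role": "arn:aws:iam::123456789012:role/ExampleSageMakerRole",  # placeholder ARN
        "instance_count": 1,
        # The one setting that differs between local testing and managed training:
        "instance_type": "local" if run_locally else "ml.p3.2xlarge",
    }


local_settings = estimator_settings(run_locally=True)
cloud_settings = estimator_settings(run_locally=False)

# Collect the keys whose values differ between the two configurations.
changed = {key for key in local_settings if local_settings[key] != cloud_settings[key]}
```

With the SDK installed, these settings would typically be passed to an estimator such as `sagemaker.tensorflow.TensorFlow(...)` together with framework-specific arguments like `framework_version` and `py_version`.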
Train
One-click Training
When you’re ready to train in Amazon SageMaker, simply specify the location of your data in Amazon S3, and indicate the type and quantity of Amazon SageMaker ML instances you need, and get started with a single click in the console. Amazon SageMaker sets up a distributed compute cluster, performs the training, outputs the result to Amazon S3, and tears down the cluster when complete.
Training models is easy with Amazon SageMaker; just specify the location of your data in S3, and Amazon SageMaker will take your algorithm and run it on a training cluster isolated within its own software-defined network, configured to your needs. Just choose the instance type – including P3 GPU instances, which are ideal for fast and efficient training – and Amazon SageMaker will create your cluster in an auto-scaling group; attach EBS volumes to each node; set up the data pipelines; and start training with your TensorFlow, MXNet, Chainer or PyTorch script, Amazon’s own algorithms, or your algorithms provided by your own container. Once finished, it will output the results to S3 and automatically tear down the cluster.
To make it easy to conduct training at scale, we’ve optimized how training data streams from S3. Through the API, you can specify if you’d like all of the data to be sent to each node in the cluster, or if you’d like Amazon SageMaker to manage the distribution of data across the nodes depending on the needs of your algorithm.
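The choices described above (data location, instance type and count, and how data is distributed across nodes) map onto fields of the `CreateTrainingJob` API request. The following is a minimal sketch that only builds such a request as a dictionary. The helper name, bucket paths, image URI, role ARN, volume size, and runtime limit are placeholder assumptions, not recommended values.

```python
def build_training_request(job_name, image_uri, role_arn, train_s3_uri, output_s3_uri,
                           instance_type="ml.p3.2xlarge", instance_count=2,
                           shard_data=False):
    """Build a CreateTrainingJob request body as a plain dictionary."""
    # "FullyReplicated" sends all of the data to every node;
    # "ShardedByS3Key" gives each node a subset of the S3 objects.
    distribution = "ShardedByS3Key" if shard_data else "FullyReplicated"
    return {
        "TrainingJobName": job_name,
        "AlgorithmSpecification": {
            "TrainingImage": image_uri,
            "TrainingInputMode": "File",
        },
        "RoleArn": role_arn,
        "InputDataConfig": [{
            "ChannelName": "train",
            "DataSource": {"S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": train_s3_uri,
                "S3DataDistributionType": distribution,
            }},
        }],
        "OutputDataConfig": {"S3OutputPath": output_s3_uri},
        "ResourceConfig": {
            "InstanceType": instance_type,
            "InstanceCount": instance_count,
            "VolumeSizeInGB": 50,  # assumed EBS volume size per node
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": 3600},  # assumed limit
    }


request = build_training_request(
    job_name="example-training-job",
    image_uri="123456789012.dkr.ecr.us-east-1.amazonaws.com/example-algorithm:latest",
    role_arn="arn:aws:iam::123456789012:role/ExampleSageMakerRole",
    train_s3_uri="s3://example-bucket/train/",
    output_s3_uri="s3://example-bucket/output/",
    shard_data=True,
)
```

With AWS credentials configured, a request like this could be submitted with `boto3.client("sagemaker").create_training_job(**request)`.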
Combined with the built-in algorithms, the scalability of training that’s possible with Amazon SageMaker can dramatically reduce the time and cost of training runs.
Managed Spot Training
You can optimize the costs of training your machine learning models and save up to 90% using Managed Spot Training.
Managed Spot Training uses Amazon EC2 Spot instances, which are spare AWS compute capacity that can be used to manage costs and save up to 90%. This option is ideal when you have flexibility in when your training jobs can run. With Managed Spot Training, Amazon SageMaker manages the Spot capacity, so your training jobs run reliably at up to 90% reduced costs compared to on-demand instances. Training jobs are run as and when compute capacity becomes available so you don’t have to poll continuously for capacity and there is no need to build additional tooling. Managed Spot Training works with Automatic Model Tuning, the built-in algorithms and frameworks that come with Amazon SageMaker, and custom algorithms.
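In API terms, opting in to Managed Spot Training comes down to a few extra fields on a `CreateTrainingJob` request: `EnableManagedSpotTraining`, a `MaxWaitTimeInSeconds` that covers time spent waiting for capacity, and optionally a `CheckpointConfig` so interrupted jobs can resume. The sketch below adds those fields to an existing request dictionary and computes the savings from the training and billable seconds reported for a finished job. The helper names, the base request, and the S3 path are illustrative assumptions.

```python
import copy


def with_managed_spot(request, max_wait_seconds, checkpoint_s3_uri):
    """Return a copy of a training request with Managed Spot Training enabled."""
    spot = copy.deepcopy(request)  # leave the caller's request untouched
    spot["EnableManagedSpotTraining"] = True
    stopping = spot.setdefault("StoppingCondition", {})
    stopping.setdefault("MaxRuntimeInSeconds", 3600)
    # The wait time includes the run time, so it must not be smaller than it.
    stopping["MaxWaitTimeInSeconds"] = max(max_wait_seconds, stopping["MaxRuntimeInSeconds"])
    # Checkpoints let a job resume after a Spot interruption.
    spot["CheckpointConfig"] = {"S3Uri": checkpoint_s3_uri}
    return spot


def spot_savings_percent(training_seconds, billable_seconds):
    """Savings relative to paying for every second the job trained."""
    return (1 - billable_seconds / training_seconds) * 100


base_request = {
    "TrainingJobName": "example-training-job",
    "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
}
spot_request = with_managed_spot(
    base_request,
    max_wait_seconds=7200,
    checkpoint_s3_uri="s3://example-bucket/checkpoints/",
)
savings = spot_savings_percent(training_seconds=1000, billable_seconds=250)
```

The two inputs to `spot_savings_percent` correspond to the `TrainingTimeInSeconds` and `BillableTimeInSeconds` values returned when you describe a completed training job.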
Automatic Model Tuning
Amazon SageMaker can automatically tune your model by adjusting thousands of different combinations of algorithm parameters, to arrive at the most accurate predictions the model is capable of producing.
When you’re tuning your model to be more accurate, you have two big levers to pull: modifying the data inputs you provide the model (for example, taking the log of a number), and adjusting the parameters of the algorithm. The latter are called hyperparameters, and finding the right values can be tough. Typically, you’ll start with something random and iterate through adjustments as you begin to see what effect the changes have. It can be a long cycle depending on how many hyperparameters your model has.
Amazon SageMaker simplifies this by offering automatic model tuning as an option during training. Amazon SageMaker will use machine learning to tune your machine learning model. It works by learning what effects different types of data have on a model and applying that knowledge across many copies of the model to quickly seek out the best possible outcome. As a developer or data scientist, this means you only really need to be concerned with the adjustments you want to make to the data you feed the model, which greatly reduces the number of things to worry about during training.
When initiating automatic model tuning, you simply specify the number of training jobs through the API and Amazon SageMaker handles the rest.
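For a sense of what this looks like through the API, the sketch below builds the tuning configuration portion of a `CreateHyperParameterTuningJob` request: the number of training jobs, the metric to optimize, and the ranges to search. The helper name, the XGBoost-style hyperparameter names (`eta`, `max_depth`), their ranges, and the metric are assumptions picked for illustration.

```python
def build_tuning_config(max_jobs, max_parallel_jobs, objective_metric):
    """Build the HyperParameterTuningJobConfig part of a tuning request."""
    return {
        "Strategy": "Bayesian",
        "HyperParameterTuningJobObjective": {
            "Type": "Maximize",
            "MetricName": objective_metric,
        },
        "ResourceLimits": {
            "MaxNumberOfTrainingJobs": max_jobs,           # total training jobs to run
            "MaxParallelTrainingJobs": max_parallel_jobs,  # how many run at once
        },
        "ParameterRanges": {
            # The API expects range bounds as strings.
            "ContinuousParameterRanges": [
                {"Name": "eta", "MinValue": "0.01", "MaxValue": "0.5"},
            ],
            "IntegerParameterRanges": [
                {"Name": "max_depth", "MinValue": "3", "MaxValue": "10"},
            ],
        },
    }


tuning_config = build_tuning_config(
    max_jobs=20,
    max_parallel_jobs=2,
    objective_metric="validation:auc",
)
```

This dictionary would be passed as `HyperParameterTuningJobConfig`, alongside a `TrainingJobDefinition` describing the algorithm and data, when calling `create_hyper_parameter_tuning_job` on the boto3 SageMaker client.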
Train Once, Run Anywhere
Amazon SageMaker Neo allows machine learning models to train once and run anywhere in the cloud and at the edge. Ordinarily, optimizing machine learning models to run on multiple platforms is extremely difficult because developers need to hand-tune models for the specific hardware and software configuration of each platform. Neo eliminates the time and effort required to do this by automatically optimizing TensorFlow, MXNet, PyTorch, ONNX, and XGBoost models for deployment on ARM, Intel, and Nvidia processors today, with support for Cadence, Qualcomm, and Xilinx hardware coming soon. You can access SageMaker Neo from the SageMaker console, and with just a few clicks, produce a model optimized for your cloud instance or edge device. Optimized models run up to two times faster and consume less than one-hundredth of the storage space of traditional models.
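Besides the console, a Neo compilation can be requested through the `CreateCompilationJob` API. The sketch below builds such a request as a dictionary: where the trained model lives, which framework produced it, the shape of its input, and the target hardware. The helper name, S3 paths, role ARN, input shape, runtime limit, and chosen target device are placeholder assumptions.

```python
import json


def build_compilation_request(job_name, role_arn, model_s3_uri, output_s3_uri,
                              framework, input_shape, target_device):
    """Build a CreateCompilationJob request body as a plain dictionary."""
    return {
        "CompilationJobName": job_name,
        "RoleArn": role_arn,
        "InputConfig": {
            "S3Uri": model_s3_uri,
            # The input shape is passed as a JSON string.
            "DataInputConfig": json.dumps(input_shape),
            "Framework": framework,
        },
        "OutputConfig": {
            "S3OutputLocation": output_s3_uri,
            "TargetDevice": target_device,
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": 900},  # assumed limit
    }


compile_request = build_compilation_request(
    job_name="example-neo-job",
    role_arn="arn:aws:iam::123456789012:role/ExampleSageMakerRole",
    model_s3_uri="s3://example-bucket/model/model.tar.gz",
    output_s3_uri="s3://example-bucket/compiled/",
    framework="MXNET",
    input_shape={"data": [1, 3, 224, 224]},
    target_device="jetson_tx2",
)
```

Compiling the same model for a different platform would only require changing `target_device`, for example to a cloud instance family such as `ml_c5`.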
Model Tracking Capability
Amazon SageMaker model tracking helps you organize, find, and evaluate machine learning model experiments, before landing the best model for your use case.
Developing a machine learning model requires continuous experimentation involving different datasets, algorithms, and parameter values, all the while evaluating the impact of small, incremental changes on performance and accuracy. This iterative exercise often leads to an explosion of hundreds or even thousands of model training experiments and model versions, slowing down the convergence on and discovery of the winning model. In addition, the information explosion makes it very hard to trace back the lineage of a model version, that is, the unique combination of datasets, algorithms, and parameters that produced that model in the first place.
With Amazon SageMaker’s model tracking capabilities, you can now find the best models for your use case by searching on key model attributes, such as the algorithm used, parameter values, and any custom tags. Using custom tags lets you find the models trained for a specific project or created by a specific data science team, helping you categorize and catalog your work. You can also quickly compare and rank your training runs based on their performance metrics, such as training loss and validation accuracy. Finally, with the model tracking capabilities, you can quickly trace back the complete lineage of a model deployed in live environments, right back to the dataset used in training or validating the model.
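Searching and ranking training runs can also be done programmatically with the SageMaker `Search` API. The sketch below builds the parameters for a query that filters training jobs by a custom tag and sorts them by a metric. The helper name, the `Project` tag key, the project value, and the metric name are hypothetical; use whatever tags and metrics your own training jobs actually emit.

```python
def build_search_params(project_tag_value, metric_name, max_results=10):
    """Build parameters for searching training jobs by tag, ranked by a metric."""
    return {
        "Resource": "TrainingJob",
        "SearchExpression": {
            "Filters": [{
                "Name": "Tags.Project",   # assumes jobs were tagged with key "Project"
                "Operator": "Equals",
                "Value": project_tag_value,
            }],
        },
        "SortBy": f"Metrics.{metric_name}",
        "SortOrder": "Descending",        # best-scoring runs first
        "MaxResults": max_results,
    }


search_params = build_search_params(
    project_tag_value="example-churn-model",
    metric_name="validation:accuracy",
)
```

With AWS credentials configured, these parameters could be passed to `boto3.client("sagemaker").search(**search_params)` to retrieve the matching training jobs.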
Deploy
One-click Deployment
You can one-click deploy your model onto auto-scaling Amazon SageMaker ML instances across multiple Availability Zones for high redundancy. Just specify the type of instance, and the maximum and minimum number desired, and Amazon SageMaker takes care of the rest. It will launch the instances, deploy your model, and set up the secure HTTPS endpoint for your application. Your application simply needs to include an API call to this endpoint to achieve low-latency, high-throughput inference. This architecture allows you to integrate your new models into your application in minutes because model changes no longer require application code changes.
Fully-managed Hosting with Auto Scaling
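The instance type and the minimum and maximum instance counts chosen at deployment correspond to two API requests: an endpoint configuration for SageMaker and a scalable target for Application Auto Scaling. The sketch below builds both as dictionaries. The helper names, endpoint and model names, and the instance type are placeholder assumptions.

```python
def build_endpoint_config(config_name, model_name, instance_type, initial_count):
    """Build a CreateEndpointConfig request body with a single production variant."""
    return {
        "EndpointConfigName": config_name,
        "ProductionVariants": [{
            "VariantName": "AllTraffic",
            "ModelName": model_name,
            "InstanceType": instance_type,
            "InitialInstanceCount": initial_count,
        }],
    }


def build_scaling_target(endpoint_name, variant_name, min_instances, max_instances):
    """Build a RegisterScalableTarget request for an endpoint variant."""
    return {
        "ServiceNamespace": "sagemaker",
        "ResourceId": f"endpoint/{endpoint_name}/variant/{variant_name}",
        "ScalableDimension": "sagemaker:variant:DesiredInstanceCount",
        "MinCapacity": min_instances,
        "MaxCapacity": max_instances,
    }


endpoint_config = build_endpoint_config(
    config_name="example-endpoint-config",
    model_name="example-model",
    instance_type="ml.m5.large",
    initial_count=2,
)
scaling_target = build_scaling_target(
    endpoint_name="example-endpoint",
    variant_name="AllTraffic",
    min_instances=2,
    max_instances=6,
)
```

The first dictionary would go to `create_endpoint_config` on the boto3 `sagemaker` client and the second to `register_scalable_target` on the `application-autoscaling` client. Once the endpoint is running, an application calls it through `invoke_endpoint` on the `sagemaker-runtime` client.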
Amazon SageMaker manages your production compute infrastructure on your behalf to perform health checks, apply security patches, and conduct other routine maintenance, all with built-in Amazon CloudWatch monitoring and logging.
Batch Transform
Batch Transform enables you to run predictions on large or small batch data. There is no need to break down the data set into multiple chunks or manage real-time endpoints. With a simple API, you can request predictions for a large number of data records and transform the data quickly and easily.
Inference Pipelines
Amazon SageMaker enables you to deploy Inference Pipelines so you can pass raw input data and execute pre-processing, predictions, and post-processing on real-time and batch inference requests. Inference Pipelines can be composed of any machine learning framework, built-in algorithm, or custom containers usable on Amazon SageMaker. You can build feature data processing and feature engineering pipelines with a suite of feature transformers available in the SparkML and Scikit-learn framework containers in Amazon SageMaker, and deploy these as part of the Inference Pipelines to reuse data processing code and simplify the management of machine learning processes.
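Conceptually, an inference pipeline is a chain of stages in which the output of one becomes the input of the next. The sketch below shows that flow in plain Python with three stages: pre-processing, prediction, and post-processing. It is a conceptual illustration only. In Amazon SageMaker each stage would run in its own container, and the feature scaling constants, model weights, and labels here are made up.

```python
import math
from typing import Any, Callable, Sequence

Stage = Callable[[Any], Any]


def make_pipeline(stages: Sequence[Stage]) -> Stage:
    """Chain stages so each one receives the previous stage's output."""
    def run(payload: Any) -> Any:
        for stage in stages:
            payload = stage(payload)
        return payload
    return run


def preprocess(raw: dict) -> list:
    """Turn a raw request into scaled numeric features (hypothetical scaling)."""
    return [(raw["age"] - 40.0) / 10.0, (raw["income"] - 50000.0) / 20000.0]


def predict(features: list) -> float:
    """Score the features with a made-up logistic regression model."""
    weights, bias = [0.8, 1.2], -0.1
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-score))


def postprocess(probability: float) -> dict:
    """Convert the raw probability into a response for the caller."""
    label = "approve" if probability >= 0.5 else "decline"
    return {"label": label, "probability": round(probability, 3)}


pipeline = make_pipeline([preprocess, predict, postprocess])
result = pipeline({"age": 50, "income": 70000})
```

Because the pre-processing stage is part of the pipeline, the same transformation code serves both real-time requests and batch jobs, which is the reuse benefit described above.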