Proving the Business Case for the Internet of Things

AWS accelerates machine learning

Steve Rogerson
September 26, 2019
Amazon Web Services (AWS) has announced the general availability of G4 instances, a GPU-powered Amazon Elastic Compute Cloud (EC2) instance type designed to accelerate machine-learning inference and graphics-intensive workloads, both of which are computationally demanding and benefit from GPU acceleration.
G4 instances deliver machine-learning inference for applications such as adding metadata to an image, object detection, recommender systems, automated speech recognition and language translation. They also provide a platform for building and running graphics-intensive applications, such as remote graphics workstations, video transcoding, photo-realistic design and game streaming in the cloud.
Machine learning involves two processes that require compute – training and inference. Training entails using labelled data to create a model that can make predictions, a compute-intensive task that requires powerful processors and high-speed networking. Inference is the process of using a trained machine-learning model to make predictions, which typically requires processing many small compute jobs simultaneously, a task that can be handled most cost-effectively by accelerated computing with energy-efficient NVidia GPUs.
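The split between the two phases can be illustrated with a minimal, framework-free sketch (the model, data and hyperparameters here are illustrative, not part of AWS's announcement): training is an iterative, compute-heavy fit over labelled data, while inference is a single cheap evaluation of the finished model.

```python
# Minimal illustration of the two compute phases in machine learning.
# Training: iteratively fit a model to labelled data (compute-intensive at scale).
# Inference: evaluate the trained model on a new input (small, cheap, frequent).

def train(samples, epochs=2000, lr=0.01):
    """Fit y = w*x + b to labelled (x, y) pairs by gradient descent."""
    w, b = 0.0, 0.0
    n = len(samples)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in samples) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in samples) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def infer(model, x):
    """Use the trained model to make a prediction on a new input."""
    w, b = model
    return w * x + b

# Labelled data drawn from the line y = 3x + 1
data = [(0.0, 1.0), (1.0, 4.0), (2.0, 7.0), (3.0, 10.0)]
model = train(data)
prediction = infer(model, 10.0)  # close to 31, the true value of 3*10 + 1
```

In production the training step runs on large accelerators (the role of P3 instances), while the inference step is repeated millions of times per day, which is why AWS positions G4 as an inference-oriented instance family.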
With the launch of P3 instances in 2017, AWS was the first to introduce instances optimised for machine-learning training in the cloud with NVidia V100 Tensor Core GPUs, allowing users to reduce machine-learning training from days to hours. However, inference is what actually accounts for the vast majority of machine learning’s cost. According to users, machine-learning inference can represent up to 90% of overall operational costs for running machine-learning workloads.
The G4 instances feature the latest NVidia T4 GPUs, custom second-generation Intel Xeon Scalable processors, up to 100Gbit/s of networking throughput, and up to 1.8Tbyte of local NVMe storage, to deliver cost-effective GPU instances for machine-learning inference.
And with up to 65Tflops of mixed-precision performance, G4 instances are said not only to deliver better price/performance for inference, but also to be cost-effective for small-scale and entry-level machine-learning training jobs that are less sensitive to time to train.
G4 instances also provide a compute engine for graphics-intensive workloads, offering up to a 1.8x increase in graphics performance and up to 2x the video transcoding capability of the previous-generation G3 instances. These performance enhancements enable the use of remote workstations in the cloud for running graphics-intensive applications such as Autodesk Maya or 3D Studio Max, as well as the creation of photo-realistic and high-resolution 3D content for movies and games.
“We focus on solving the toughest challenges that hold our customers back from taking advantage of compute intensive applications,” said Matt Garman, vice president at AWS. “AWS offers the most comprehensive portfolio to build, train and deploy machine-learning models powered by Amazon EC2’s broad selection of instance types optimised for different machine-learning use cases. With new G4 instances, we’re making it more affordable to put machine learning in the hands of every developer. And with support for the latest video decode protocols, customers running graphics applications on G4 instances get superior graphics performance over G3 instances at the same cost.”
Those with machine-learning workloads can launch G4 instances using Amazon SageMaker or AWS Deep Learning AMIs, which include machine-learning frameworks such as TensorFlow, TensorRT, MXNet, PyTorch, Caffe2, CNTK and Chainer. G4 instances will also support Amazon Elastic Inference in the coming weeks, which will allow developers to reduce the cost of inference by up to 75% by provisioning the right amount of GPU performance.
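As a rough sketch, launching a G4 instance from the AWS command line might look like the following. The AMI ID and key-pair name are placeholders, and g4dn.xlarge is assumed here as an entry-level G4 size; the exact sizes and Deep Learning AMI IDs available in a given region should be checked against the EC2 documentation.

```shell
# Launch one G4 instance running a Deep Learning AMI.
# ami-0123456789abcdef0 and my-key-pair are placeholders, not real values.
aws ec2 run-instances \
    --image-id ami-0123456789abcdef0 \
    --instance-type g4dn.xlarge \
    --key-name my-key-pair \
    --count 1
```

SageMaker users would instead select a G4-backed instance type when creating a notebook instance or training job, without launching EC2 instances directly.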
Users with graphics and streaming applications can launch G4 instances using Windows, Linux or AWS Marketplace AMIs from NVidia with NVidia Quadro Virtual Workstation software preinstalled. A bare metal version will be available in the coming months.
G4 instances are available in the US East (Northern Virginia and Ohio), US West (Oregon and Northern California), Europe (Frankfurt, Ireland and London), and Asia Pacific (Seoul and Tokyo) regions, with availability in additional regions planned in the coming months. G4 instances can be purchased as on-demand, reserved or spot instances.
Clarifai is an artificial intelligence company that specialises in visual recognition to solve real-world problems.
“We apply machine learning to image and video recognition, helping customers better understand their media assets and apply it across a broad set of applications, such as providing personalised online shopping experience or measuring in-store shopper behaviour,” said Robert Wen, head of engineering at Clarifai. “We provide our customers with a full-featured API that allows them to utilise our pre-trained machine-learning models and make predictions on their data. G4 instances offer a highly cost-effective solution that will enable us to make it more economical for our customers to use AI across a broader set of use cases.”