Intel reveals AI accelerator at Hot Chips

Steve Rogerson
August 22, 2019



At this week’s Hot Chips conference at Stanford University, Intel revealed details of its upcoming high-performance artificial intelligence (AI) accelerators – the Intel Nervana neural network processors, with the NNP-T for training and the NNP-I for inference.
 
Intel engineers also presented technical details on hybrid chip packaging technology, Intel Optane DC persistent memory, and chiplet technology for optical IO.
 
"To get to a future state of AI everywhere, we'll need to address the crush of data being generated and ensure enterprises are empowered to make efficient use of their data, processing it where it's collected when it makes sense and making smarter use of their upstream resources,” said Naveen Rao, Intel vice president. “Data centres and the cloud need to have access to performant and scalable general-purpose computing and specialised acceleration for complex AI applications. In this future vision of AI everywhere, a holistic approach is needed from hardware to software to applications."
 
Turning data into information and then into knowledge requires hardware architectures and complementary packaging, memory, storage and interconnect technologies that can evolve to support emerging, increasingly complex use cases and AI techniques. Dedicated accelerators such as the Nervana NNPs are built from the ground up with a focus on AI, to provide the right intelligence at the right time.
 
Built from the ground up to train deep learning models at scale, the Nervana NNP-T prioritises two key real-world considerations: training a network as fast as possible and doing so within a given power budget. This deep learning training processor is built with flexibility in mind, striking a balance among computing, communication and memory.
 
While Intel Xeon scalable processors bring AI-specific instructions and provide a foundation for AI, the NNP-T is architected from scratch, building in the features needed to support large models without the overhead required to carry legacy technology.
 
To account for future deep learning needs, the NNP-T is built with flexibility and programmability, so it can be tailored to accelerate a wide variety of workloads, both those that exist today and new ones that will emerge. The NNP-T is code-named Spring Crest.
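
As an illustration of the balance between computing and communication described above, the minimal sketch below shows a generic deep learning training loop in PyTorch, with synthetic data standing in for a real dataset. It is not Intel NNP-T code; the comments mark where gradient communication would enter in multi-device training.

    import torch
    import torch.nn as nn

    # Toy model and optimiser; synthetic data stands in for a real dataset.
    model = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 10))
    optimiser = torch.optim.SGD(model.parameters(), lr=0.01)
    loss_fn = nn.CrossEntropyLoss()

    for step in range(10):
        x = torch.randn(64, 512)              # one training batch
        y = torch.randint(0, 10, (64,))
        optimiser.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()                       # compute: gradients on this device
        # In multi-device training, an all-reduce of gradients happens here;
        # overlapping that communication with backward compute is the
        # computing/communication/memory balance the NNP-T is built around.
        optimiser.step()

A training accelerator's throughput at scale is set largely by how well these compute and communication phases overlap, which is why that balance matters.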
 
The Nervana NNP-I is purpose-built for deep learning inference and is designed to accelerate deployment at scale across major data centre workloads, leveraging Intel's 10nm process technology with Ice Lake cores to offer what the company claims is industry-leading performance per watt.
 
Additionally, the NNP-I offers a high degree of programmability without compromising performance or power efficiency. As AI becomes pervasive across every workload, a dedicated inference accelerator that is easy to programme, delivers low latency, allows fast code porting and supports all major deep learning frameworks lets companies harness the full potential of their data as actionable insights. The NNP-I is code-named Spring Hill.
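
To illustrate the kind of framework-level portability described here, the sketch below runs a pre-exported model through ONNX Runtime, a generic inference runtime chosen purely for illustration; it is not Intel's NNP-I toolchain, and the model file name is a placeholder.

    import numpy as np
    import onnxruntime as ort

    # "model.onnx" is a hypothetical model exported from any major framework.
    session = ort.InferenceSession("model.onnx")

    # Feed a random batch shaped like a typical image input.
    input_name = session.get_inputs()[0].name
    batch = np.random.rand(1, 3, 224, 224).astype(np.float32)

    outputs = session.run(None, {input_name: batch})
    print(outputs[0].shape)

The point of an inference part with broad framework support is that a model trained anywhere can be dropped into a deployment pipeline like this without being rewritten for the accelerator.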
 
Lakefield hybrid cores in a three-dimensional package introduce 3D stacking and a hybrid computing architecture for mobile devices. Leveraging Intel's 10nm process and Foveros packaging technology, Lakefield achieves reductions in standby power, core area and package height over previous generations.
 
TeraPHY is an in-package optical IO chiplet for high-bandwidth, low-power communications. Intel and Ayar Labs demonstrated the integration of monolithic in-package optics with a high-performance system-on-chip. The Ayar Labs TeraPHY optical IO chiplet is co-packaged with the Intel Stratix 10 FPGA using embedded multi-die interconnect bridge (EMIB) technology, offering high-bandwidth, low-power data communications from the chip package with deterministic latency for distances up to 2km.
 
This collaboration will enable different approaches to architecting computing systems for the next phase of Moore's Law by removing the traditional performance, power and cost bottlenecks in moving data.
 
Intel Optane DC persistent memory, now shipping in volume, is the first product in a new tier of the memory and storage hierarchy called persistent memory. Based on Intel 3D XPoint technology and packaged in a memory-module form factor, it can deliver large capacity at near-memory speeds, with latencies measured in nanoseconds, while natively delivering the persistence of storage. Its two operational modes (memory mode and app direct mode), along with performance examples, show how this tier can support a complete re-architecting of the data supply subsystem to enable new and faster workloads.
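
As a rough illustration of app direct mode, in which persistent memory is exposed to applications as directly mapped files on a DAX-aware file system, the sketch below maps such a file and makes a write durable. The mount path is an assumption for illustration, and production code would normally go through a library such as PMDK rather than raw mmap.

    import mmap
    import os

    # Hypothetical file on a DAX-mounted persistent-memory namespace.
    PMEM_PATH = "/mnt/pmem0/example.dat"
    SIZE = 4096

    fd = os.open(PMEM_PATH, os.O_CREAT | os.O_RDWR, 0o600)
    os.ftruncate(fd, SIZE)

    with mmap.mmap(fd, SIZE) as buf:
        record = b"hello, persistent world"
        buf[0:len(record)] = record   # stores land directly in the mapped medium
        buf.flush()                   # msync: make the update durable
    os.close(fd)

In memory mode, by contrast, the module appears to the operating system as ordinary volatile memory, with DRAM acting as a cache in front of it, so applications need no changes at all.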