AI has garnered headlines as a potential danger, with robots taking over the world, but most often it is viewed as a powerful new technology that can automate processes and make our workplaces, and our world, more efficient, productive and intelligent.
AI has the potential to transform nearly every industry, enabling complex tasks like computer vision, image recognition, machine learning, natural language processing and more.
Empowering machines to learn
The amount of data we generate, consume and analyse has grown exponentially over the years, and it shows no signs of slowing down. According to an IDC report from 2017, annual data creation is forecast to reach 180 zettabytes in 2025, which translates to an astounding 180 trillion gigabytes.
The rapid, accumulative growth of data is happening everywhere, from our personal devices to industrial-size facilities. As the amount of data increases at astronomical rates, the industry needs to identify new, innovative ways to analyse the ever-increasing flow of information, which is where AI comes into play.
Deep learning research dates back to the 1980s, but after decades of slow progress it reached a breakthrough in the early 2000s, when engineers combined distributed computing with neural network research. Computing power, meanwhile, has followed a gradual but consistent path of performance enhancements and faster clock speeds over the past several decades.
However, the industry is approaching the limits of material physics, and the gains available from ever-higher clock speeds are tailing off. So how are we going to push performance further?
Parallel processing / multi-core architecture / acceleration
One solution to increased performance for processing power lies in parallel processing — that is, multi-core architectures that split a task into parts that are executed simultaneously on separate processors in the same system.
With parallel processing, deep learning has grown faster, enabling a wide-reaching set of applications, from intelligent personal assistants, to smart speakers, to language translation and AI photo filters.
Parallel processing is the opposite of sequential processing, in which instructions are executed one after another in a fixed order. By splitting a job into separate tasks and executing them simultaneously, a significant boost in performance can be achieved. Having made its debut in scientific high-performance computing (HPC) decades ago, parallel processing, together with hardware-driven acceleration, has become the key enabler of modern AI.
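The split-and-execute idea above can be sketched in a few lines of Python. The function names and the sum-of-squares workload are illustrative assumptions, not taken from any particular AI framework.

```python
# A minimal sketch of parallel processing: one job is split into chunks
# that run simultaneously on separate CPU cores.
from concurrent.futures import ProcessPoolExecutor

def sum_of_squares(chunk):
    """Work performed independently on one slice of the data."""
    return sum(x * x for x in chunk)

def parallel_sum_of_squares(data, workers=4):
    # Split the job into roughly equal parts...
    size = max(1, len(data) // workers)
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    # ...execute them in parallel, then combine the partial results.
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(sum_of_squares, chunks))

if __name__ == "__main__":
    numbers = list(range(1_000_000))
    print(parallel_sum_of_squares(numbers))
```

A process pool is used here rather than threads because the workload is CPU-bound; each worker runs on its own core, which is exactly the multi-core execution model described above.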
Multi-core chips also allow higher performance at lower energy, which can be a big factor for mobile devices that run on batteries. A multi-core CPU is generally more energy-efficient than a single large monolithic core, because several simpler cores running at moderate clock speeds generate less heat than one big core driven to its frequency limit; electrical resistance increases as circuits shrink, so small circuits pushed hard create more heat. (This thermal constraint is distinct from Moore’s Law, which observes that the number of transistors on a chip doubles roughly every two years.) If less heat is generated, less energy is spent on cooling the system, which helps preserve battery life.
Increasingly, mobile processors also feature advanced neural processing units (NPUs) for more powerful and efficient on-device AI, comparable to the graphics processing units (GPUs) used in data centre environments to accelerate machine learning at big-data scale.
Although it’s not new, another key enabling technology for modern AI is high-bandwidth memory (HBM). HBM is a premium-performance memory interface built on 3D-stacked SDRAM (synchronous dynamic random-access memory), maximising data transfer rates in a small form factor while using less power and offering a substantially wider bus than other DRAM solutions.
It works by vertically stacking memory chips on top of one another, connected through through-silicon vias (TSVs) and micro-bumps. The reduced footprint means the stacks can be placed right next to a CPU or GPU, greatly accelerating data processing and analysis across a host of applications such as accelerators, high-end graphics cards and high-performance networking.
Why is this such a big deal? AI has emerged as one of the main sectors set to benefit from the production of more advanced memory solutions, with studies revealing that the scope of deep learning can be significantly enhanced with accelerated processing speed and increases in the amount of memory capacity and bandwidth that a system contains.
Accelerating the storage layer
Over the last ten years, the speed of memory and networks has increased 20- to 100-fold, but the server side is lagging because of the low-level input/output (I/O) performance of its disk-based storage technology. That is a major obstacle given the massive influx of data we are currently experiencing, and are forecast to keep experiencing.
With flash replacing spinning storage media at scale, the industry has come up with multiple ways to bridge this gap. Modern protocols such as NVMe over Fabrics (NVMe-oF) disaggregate compute and storage, allowing capacity to scale separately to suit data-hungry applications such as AI, where the need for more data often grows faster than the need for more clock speed on the processing side.
And there’s more out there than just faster storage devices. For instance, smart SSDs allow compute tasks to be offloaded from the CPU to the storage device, significantly accelerating routine tasks with linear scalability. Then there are novel approaches such as key-value storage, zoned namespaces and Single Root I/O Virtualisation (SR-IOV), offering alternative routes to the same goal: provisioning data efficiently and, most importantly for AI, at scale.
As we roll out bandwidth-intensive AI and deep learning technologies, another vector to removing the data bottleneck in deep learning stacks is in-memory computing.
In-memory computing uses a type of middleware software that stores data in RAM across a cluster of computers and processes it in parallel. Such software stores data in a distributed fashion: the dataset is partitioned across the memory of the individual computers, each holding only a portion of the whole. As the demand to process ever-growing datasets in real time becomes more prevalent, the importance of in-memory computing will only continue to grow.
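The partition-and-process pattern described above can be simulated on a single machine. Everything here is a hypothetical sketch: the Node class, the cluster helpers and the readings dataset are illustrative, not part of any real in-memory computing product.

```python
# A minimal single-machine simulation of in-memory computing: a dataset
# is partitioned across the "RAM" of several nodes, each holding only a
# slice, and a query is processed in parallel across all of them.
from concurrent.futures import ThreadPoolExecutor

class Node:
    """Simulates one cluster member holding its partition in memory."""
    def __init__(self, partition):
        self.partition = partition  # this node's slice of the dataset

    def local_count(self, predicate):
        # Each node scans only its own in-memory slice.
        return sum(1 for record in self.partition if predicate(record))

def build_cluster(dataset, num_nodes):
    """Divide the dataset so each node stores only a portion of it."""
    return [Node(dataset[i::num_nodes]) for i in range(num_nodes)]

def cluster_count(nodes, predicate):
    # Fan the query out to every node in parallel, then merge the results.
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        partials = pool.map(lambda n: n.local_count(predicate), nodes)
    return sum(partials)

readings = list(range(100))
cluster = build_cluster(readings, num_nodes=4)
print(cluster_count(cluster, lambda r: r % 2 == 0))  # → 50 even readings
```

A real in-memory data grid adds replication, fault tolerance and network transport, but the underlying pattern of partitioned RAM-resident data scanned in parallel is the same.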
The goal — systems that interpret the world around them
Perhaps the simplest way to think of AI is as technology that enables devices to perform tasks that require human-like cognition. Image and speech recognition are clear markers of such intelligence, and two areas where AI is rapidly advancing. Increasingly, we are finding these technologies in the devices we use every day, so what is the key to developing AI technology in the years to come and ensuring we keep progressing?
Advancements in on-device AI will play a key part in making connected devices faster and more efficient. Rapid improvements in AI algorithms, hardware and software are making it possible to shift AI services away from the cloud and onto our devices themselves. Localizing these services on mobile devices, appliances, cars and more presents exciting benefits in terms of reliability, privacy and performance. Not only does on-device AI resolve issues related to network connectivity, it’s also much faster than the cloud because it doesn’t require data to be transmitted to and from a server, and it enables biometric and other sensitive data to be safely confined to the user’s device.
Enabling the foundation for this future depends on advanced memory technologies and on ensuring our devices have the storage capacity and processing power to complete tasks simultaneously and efficiently. Our demands of AI and its on-device capabilities are ever-growing, and the technologies involved must follow suit to meet them.
Richard Walsh has been working within the memory division of Samsung Semiconductor Europe for the past 25 years, covering DRAM, NAND and NOR flash, among other technologies. He holds a Bachelor of Engineering degree in Electronics, Computer Hardware and Software from the University of Limerick.