
AI at the Edge: Future memory and storage in accelerating intelligence

Wil Florentino | November 2024


The expanding use of AI in industry is accelerating the adoption of more complex approaches, including machine learning (ML), deep learning and even large language models. These advancements offer a glimpse of the massive amounts of data expected to be used at the edge. Although the industry's focus has been on accelerating neural network computation, Micron is focused on making memory and storage that is refined for AI at the edge.
 

What is synthetic data? 
 

IDC1 predicts that, by 2025, 175 zettabytes (1 zettabyte = 1 billion terabytes) of new data will be generated worldwide. These quantities are hard to fathom, yet the advancement of AI will continue to push the envelope for data-starved systems.

In fact, ever-larger AI models have been stifled by the amount of real, physical data that can be obtained from direct measurements or physical images. It’s easy to identify an orange if you have 10,000 readily available images of oranges. But if you need to compare specific scenes, for example, a random crowd vs. an organized march, or anomalies in a baked cookie vs. a perfect cookie, accurate results are difficult to achieve unless you have all the variant samples needed to build your baseline model.

The industry is increasingly turning to synthetic data.2 Synthetic data is artificially generated from simulation models that, for example, reproduce the statistical properties of real images. This approach is especially valuable in industrial vision systems, where baseline physical images are unique and not enough sample “widgets” can be found on the web to build a representative model.
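To make the idea concrete, here is a minimal, hypothetical sketch (using only NumPy; the “cookie” rendering and defect model are invented purely for illustration) that synthesizes anomaly samples from a simulated baseline image:

```python
import numpy as np

rng = np.random.default_rng(seed=42)

def perfect_cookie(size=64):
    """Render a grayscale image of an idealized round 'cookie'."""
    y, x = np.mgrid[:size, :size]
    center, radius = size / 2, size * 0.4
    disc = (x - center) ** 2 + (y - center) ** 2 <= radius ** 2
    img = np.where(disc, 0.8, 0.1)                    # bright cookie, dark belt
    return img + rng.normal(0.0, 0.02, (size, size))  # simulated sensor noise

def inject_defect(img):
    """Copy the image and stamp a randomized dark blob as a baking anomaly."""
    out = img.copy()
    size = out.shape[0]
    cy, cx = rng.integers(int(size * 0.3), int(size * 0.7), size=2)
    r = rng.integers(2, 6)
    y, x = np.mgrid[:size, :size]
    out[(x - cx) ** 2 + (y - cy) ** 2 <= r ** 2] = 0.0
    return out

# A balanced synthetic training set of variants that would be rare on a real line.
normals = [perfect_cookie() for _ in range(1000)]
anomalies = [inject_defect(img) for img in normals]
```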

 

 

Source: “Forget About Your Real Data – Synthetic Data Is the Future of AI,” Maverick Research, 2021, via “What Is Synthetic Data,” Gerard Andrews, NVIDIA, 2021.


Of course, the challenge is where these new forms of data will reside. Any new datasets that are created must be stored either in the cloud or, for more unique representations, closer to where the data needs to be analyzed: at the edge.
 

Model complexity and the memory wall

 

Finding the optimal balance between algorithmic efficiency and AI model performance is a complex task, as it depends on factors such as data characteristics and volume, resource availability, power consumption, workload requirements and more.

AI models are complex algorithms that can be characterized by their number of parameters: The greater the number of parameters, the more accurate the results. The industry started with a common baseline model, ResNet-50, which was easy to implement and became the reference for network performance. But that model was focused on limited datasets and limited applications. As transformer models have evolved, their parameter counts have grown far faster than available memory bandwidth.3 The strain is obvious: Regardless of how much data the model can handle, we are limited by the bandwidth of the memory and storage available for the model and its parameters.

Evolution of the number of parameters of state-of-the-art (SOTA) models over the years, along with the AI accelerator memory capacity (green dots). Source: “AI and Memory Wall,” Amir Gholami, Medium, 2021.
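As a rough illustration of that strain, the back-of-the-envelope sketch below assumes ResNet-50’s roughly 25.6 million parameters quantized to INT8 (1 byte per weight), with weights streamed from memory on every inference; the frame rate is a hypothetical target, and real traffic is far higher once activations are included:

```python
# Back-of-the-envelope: why growing parameter counts stress memory bandwidth.
# Assumptions (illustrative only): ~25.6M ResNet-50 parameters, INT8 weights,
# and all weights streamed from memory on every inference.

PARAMS = 25.6e6        # approximate ResNet-50 parameter count
BYTES_PER_WEIGHT = 1   # INT8 quantization
FPS = 240              # hypothetical target inferences per second

weight_bytes = PARAMS * BYTES_PER_WEIGHT
bandwidth = weight_bytes * FPS  # bytes/s for weight traffic alone

print(f"Weight footprint: {weight_bytes / 1e6:.1f} MB")
print(f"Weight streaming at {FPS} fps: {bandwidth / 1e9:.1f} GB/s "
      "(activations and feature maps add substantially more)")
```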


For a quick comparison, we can look at an embedded AI system’s performance in tera operations per second (TOPS). Here we see that AI edge devices delivering up to 100 TOPS may need around 225 GB/s of memory bandwidth, and those above 100 TOPS may require 451 GB/s (Table 1).

|                           | Intelligent endpoints | Customer premise edge | Infrastructure edge |
|---------------------------|-----------------------|-----------------------|---------------------|
| INT8 TOPS                 | <20                   | <100                  | ~100-200            |
| Memory BW required*       | 90 GB/s               | 225 GB/s              | 451 GB/s            |
| IO width requirements     | x16, x32              | x64, x128             | x256                |
| Compute DRAM solution     | LPDDR4                | LPDDR5 / LPDDR5X      | LPDDR5 / LPDDR5X    |
| Max transfer rate per pin | 4.2 GT/s              | 6.4 GT/s / 8.5 GT/s   | 6.4 GT/s / 8.5 GT/s |
| Max device BW (x32)       | 13 GB/s               | 26 GB/s / 34 GB/s     | 26 GB/s / 34 GB/s   |
Table 1 – Comparing AI system memory bandwidth requirements with memory technology device bandwidth. (* Estimated bandwidth required to saturate a DLA for an INT8 ResNet-50 model.) Source: Micron.
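One way to read Table 1 is through arithmetic intensity, the number of operations performed per byte of memory traffic. The sketch below back-calculates that implied intensity from the table’s own rows (an illustrative simplification; real intensity varies widely with the model, batch size and on-chip caching):

```python
# Implied arithmetic intensity (INT8 ops per byte of memory traffic),
# back-calculated from Table 1. Illustrative only: real workloads differ
# depending on how much data stays resident in on-chip caches.

rows = [
    ("Intelligent endpoints",  20,  90),   # tier, TOPS, required GB/s
    ("Customer premise edge", 100, 225),
    ("Infrastructure edge",   200, 451),
]

for tier, tops, bw_gbps in rows:
    ops_per_byte = (tops * 1e12) / (bw_gbps * 1e9)
    print(f"{tier:>22}: {tops:>3} TOPS / {bw_gbps} GB/s "
          f"-> ~{ops_per_byte:.0f} ops per byte")
```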


So, one way to optimize the model pipeline is to choose higher-performing memory that also offers the lowest power consumption.

Memory is keeping pace with AI-accelerated solutions by evolving through new standards. For example, LPDDR4/4X (low-power DDR4 DRAM) and LPDDR5/5X (low-power DDR5 DRAM) solutions offer significant performance improvements over prior technologies.

Micron delivers industry leadership and longevity as a leading provider of best-in-class low-power DRAM (LPDRAM):

- LPDDR5: industry leadership, first to market with functional safety (FuSa) certification
- LPDDR5X: delivering the highest performance at the lowest power
- LPDDR4: industry's first on the 1α node, delivering cost reduction and longevity
- A dedicated longevity fab for continued legacy product support

Micron also offers a broad industrial and multimarket portfolio:

- LPDDR4 and LPDDR5 densities up to 128Gb
- Broad engagement and compatibility with all major chipset vendor (CSV) partners, making LPDDR4/LPDDR5 the solution of choice for the AI chipset ecosystem
- A higher temperature range than older technologies (e.g., Ti = 100°C for industrial)

LPDDR4 can run at up to 4.2 GT/s (gigatransfers per second) per pin and supports bus widths up to x64. LPDDR5X roughly doubles that performance, to as much as 8.5 GT/s per pin. In addition, LPDDR5 offers 20% better power efficiency than LPDDR4X (source: Micron). These are significant developments that can support widening AI edge use cases.
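These per-pin rates translate into device bandwidth as rate × bus width ÷ 8. The quick check below (peak theoretical figures, ignoring protocol overhead and derating) reproduces the LPDDR5/LPDDR5X numbers from Table 1:

```python
# Peak device bandwidth = per-pin rate (GT/s) x bus width (bits) / 8.
# Peak theoretical numbers only; sustained bandwidth is lower in practice.

def device_bw_gbps(gt_per_s: float, bus_width_bits: int) -> float:
    """Peak bandwidth in GB/s for a given per-pin rate and bus width."""
    return gt_per_s * bus_width_bits / 8

for name, rate in [("LPDDR5", 6.4), ("LPDDR5X", 8.5)]:
    print(f"{name} x32: {device_bw_gbps(rate, 32):.0f} GB/s")  # 26 / 34
    print(f"{name} x64: {device_bw_gbps(rate, 64):.0f} GB/s")  # 51 / 68
```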
 

What are the storage considerations? 
 

It’s not enough to think that compute resources are limited only by the raw TOPS of the processing unit or by the bandwidth of the memory architecture. As ML models become more sophisticated, the number of parameters in each model is expanding exponentially as well.

As machine learning models and datasets expand to achieve better model efficiency, higher-performing embedded storage is needed as well. Typical managed NAND solutions such as e.MMC 5.1 at 3.2 Gb/s are ideal not only for code bring-up but also for remote data storage. Newer solutions such as UFS 3.1 run seven times faster, at 23.2 Gb/s, to allow for more complex models.
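The practical impact shows up in model load times. The sketch below compares the two interfaces for a hypothetical 2 GB model image, assuming ideal transfers at each interface’s peak rate with no file-system or protocol overhead:

```python
# Time to load a model image from embedded storage at interface peak rates.
# MODEL_GB is a hypothetical footprint; ideal transfers are assumed, so
# real-world load times will be longer.

MODEL_GB = 2.0

for iface, gbit_per_s in [("e.MMC 5.1", 3.2), ("UFS 3.1", 23.2)]:
    seconds = MODEL_GB * 8 / gbit_per_s   # GB -> Gb, divided by Gb/s
    print(f"{iface}: {seconds:.1f} s to load a {MODEL_GB:.0f} GB model")
```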


New architectures are also pushing functions to the edge that were typically relegated to cloud or IT infrastructure. For example, edge solutions implement a secure layer that offers an air gap between restricted operational data and the IT/cloud domain. AI at the edge also supports intelligent automation such as categorizing, tagging and retrieving stored data.

Storage developments such as NVMe™ SSDs that support 3D TLC NAND offer high performance for various edge workloads. For example, Micron’s 7450 NVMe SSD uses 176-layer NAND technology that’s ideal for most edge and data center workloads. With 2 ms quality-of-service (QoS) latency, it meets the performance requirements of SQL Server platforms. It also offers FIPS 140-3 Level 2 and TAA compliance for U.S. federal government procurement requirements.
 

The growing ecosystem of AI edge processors

 

Allied Market Research estimates the AI edge processor market will grow to $9.6 billion by 2030.4 Interestingly, though, this new cohort of AI processor start-ups is developing ASICs and proprietary ASSPs geared for more space- and power-constrained edge applications. These new chipsets also need memory and storage solutions that balance performance against power.

In addition, AI chipset vendors have developed enterprise and data center standard form factor (EDSFF) accelerator cards that can be installed in a 1U chassis and colocated with storage servers, making it possible to accelerate any workload, from AI/ML inference to video processing, using the same module.
 

How do you find the right memory and storage partner?

 

AI is no longer hype but a reality being implemented across all verticals. In one study, 89% of industrial companies either already have a strategy for AI at the edge or will have one within the next two years.5

But implementing AI is not a trivial task, and the right technologies and components will make all the difference. Micron’s portfolio of the latest memory and storage technologies leads the way for industrial customers with our IQ value proposition. If you are designing an AI edge system, let Micron help get your product to market faster than ever. Contact your local Micron representative or distributor of Micron products (www.micron.com).


 

1 Source: “The Digitization of the World – From Edge to Core,” IDC/Seagate, 2018.

2 Source: “Forget About Your Real Data – Synthetic Data Is the Future of AI,” Maverick Research, 2021, via “What Is Synthetic Data,” Gerard Andrews, NVIDIA, 2021.

3 Source: “AI and Memory Wall,” Amir Gholami, Medium, 2021.

4 Source: “Edge AI Processor Market Research, 2030,” Allied Market Research, June 2022.

5 Source: “Mastering Digital Transformation in Manufacturing,” Jash Bansidhar, Advantech Connect, 2023.

Wil Florentino

Sr. Segment Marketing Manager

Wil Florentino is a Sr. Segment Marketing Manager for the Industrial Business Unit at Micron Technology. His role includes providing market intelligence and subject matter expertise in Industrial segments such as IIoT and industrial edge computing in support of new product roadmap memory solutions. Mr. Florentino has over 20 years of experience in embedded semiconductor technologies including SoCs, FPGAs, microcontrollers, and memory, primarily focused on industrial applications.