Transform Your Empty M.2 Slot into a 20B LLM Processor Using this AI Module: Featuring 32 GB Memory and 60 TOPS

Unigen has unveiled its latest innovation, the Amaretti AI module, designed to fit within a standard M.2 slot. This compact module offers an impressive performance of up to 60 TOPS and 32 GB of memory, making it compatible with large language models (LLMs) containing up to 20 billion parameters.

Unigen AI Module: Powerful Performance with Minimal Power Consumption

As local AI agents gain traction, a wave of new AI hardware is reaching the market. Among the newcomers is Unigen's Amaretti E1.S AI module, which resembles a conventional SSD yet offers substantial AI processing capability.

This module is powered by the SAKURA-II AI accelerator from EdgeCortix. Originally developed for low-power AI applications, the chip brings AI acceleration to devices like the Raspberry Pi 5 and other ARM-based platforms. The SAKURA-II features an NPU capable of delivering 60 TOPS of INT8 performance and 30 TFLOPS of BF16 computation. It has a dual 64-bit LPDDR4x memory controller and 20 MB of on-chip SRAM, all within a compact 19×19 mm BGA package that consumes approximately 8-10 watts of power.

An EdgeCortix SAKURA-II chip mounted on an S2M2 Rev-C board.

Unigen has integrated the SAKURA-II AI accelerator onto an E1.S board and paired it with up to 32 GB of memory. The module is available in 16 GB and 32 GB versions, with memory bandwidth of up to 68 GB/s. At a rated 10 watts and 60 TOPS, that works out to 6 TOPS per watt.
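The efficiency figure is simple division of the quoted peak throughput by the quoted power rating. A minimal Python sketch of that arithmetic follows; the function and constant names are illustrative, and the inputs are the article's headline numbers rather than measured values.

```python
def tops_per_watt(tops: float, watts: float) -> float:
    """Return compute efficiency in TOPS per watt."""
    if watts <= 0:
        raise ValueError("watts must be positive")
    return tops / watts


# Headline figures quoted for the Amaretti E1.S module (peak, not measured).
MODULE_TOPS = 60.0   # peak INT8 throughput
MODULE_WATTS = 10.0  # rated module power

efficiency = tops_per_watt(MODULE_TOPS, MODULE_WATTS)
print(f"{efficiency:.1f} TOPS/W")  # prints "6.0 TOPS/W"
```

Because this uses peak INT8 throughput, it is a best-case number; sustained efficiency on a real workload would likely be lower.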

With 32 GB of memory, the module can handle LLMs of up to 20 billion parameters, making it a good fit for low-power systems that run generative AI and agentic AI workflows. Multiple modules can also be installed across several M.2 slots to scale up processing capability. For more demanding applications, EdgeCortix offers a higher-end PCIe card with dual chips and extended functionality, but the M.2 solution stands out as a compelling option.
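To see why 32 GB and 20 billion parameters go together, it helps to estimate the weight footprint at different precisions. The sketch below is a rough calculation, not Unigen's sizing method: it counts weights only (no KV cache or activations), treats 1 GB as 1e9 bytes, and the precision Unigen assumes for its 20B claim is not stated in the article. The decode-rate function is a common rule of thumb that assumes every generated token streams all weights from memory once, so it gives a ceiling, not a prediction.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"BF16": 2.0, "INT8": 1.0, "INT4": 0.5}

MODULE_MEMORY_GB = 32.0       # largest Amaretti option, per the article
MODULE_BANDWIDTH_GBPS = 68.0  # quoted memory bandwidth in GB/s


def weight_footprint_gb(params_billion: float, precision: str) -> float:
    """Weights-only size in GB (1 GB = 1e9 bytes)."""
    return params_billion * BYTES_PER_PARAM[precision]


def fits_in_memory(params_billion: float, precision: str,
                   memory_gb: float = MODULE_MEMORY_GB) -> bool:
    """True if the weights alone fit in the given memory capacity."""
    return weight_footprint_gb(params_billion, precision) <= memory_gb


def decode_rate_upper_bound(params_billion: float, precision: str,
                            bandwidth_gbps: float = MODULE_BANDWIDTH_GBPS) -> float:
    """Ceiling on tokens/s if each token reads all weights from memory once."""
    return bandwidth_gbps / weight_footprint_gb(params_billion, precision)


for precision in BYTES_PER_PARAM:
    size = weight_footprint_gb(20, precision)
    verdict = "fits" if fits_in_memory(20, precision) else "does not fit"
    rate = decode_rate_upper_bound(20, precision)
    print(f"20B @ {precision}: {size:.0f} GB, {verdict}, <= {rate:.1f} tok/s")
```

Under these assumptions, a 20B model needs about 40 GB at BF16, which exceeds the module's 32 GB, but about 20 GB at INT8 or 10 GB at INT4, both of which fit with room left for runtime buffers.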

Unigen Amaretti E1.S AI module (product photo).

Many desktops and laptops have unused M.2 slots that could be put to work for local AI. For users who want on-device AI acceleration, the Amaretti AI modules are an attractive option.

According to Unigen, the AI module is compatible with all leading AI frameworks, including TensorFlow, PyTorch, ONNX, and Hugging Face. Key features of this module include:

E1.S AI Module
AI Accelerator: SAKURA-II
Up to 1920 TOPS of inference performance when used in air-cooled dual-CPU servers
Power efficiency, with just 20% of the wattage of training GPUs
Support for Generative AI LLMs of up to 20 billion parameters
Lead times of approximately 14 weeks, significantly reducing the wait associated with GPU servers
Memory options up to 32 GB per module
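
The 1920 TOPS server figure can be related back to the 60 TOPS per-module rating. The sketch below assumes the aggregate number is a plain sum of per-module peak throughput, which the article does not confirm; the module count, total power, and total memory it derives are therefore implied estimates, not published specifications.

```python
import math

# Per-module figures quoted in the article.
TOPS_PER_MODULE = 60.0
WATTS_PER_MODULE = 10.0
MEMORY_GB_PER_MODULE = 32.0


def modules_for_target(target_tops: float,
                       tops_per_module: float = TOPS_PER_MODULE) -> int:
    """Smallest number of modules whose summed peak TOPS reaches the target."""
    return math.ceil(target_tops / tops_per_module)


count = modules_for_target(1920)
total_watts = count * WATTS_PER_MODULE
total_memory_gb = count * MEMORY_GB_PER_MODULE

print(f"modules needed: {count}")
print(f"implied accelerator power: {total_watts:.0f} W")
print(f"implied accelerator memory: {total_memory_gb:.0f} GB")
```

If the simple-sum assumption holds, 1920 TOPS corresponds to 32 modules, drawing roughly 320 W and carrying 1,024 GB of memory in total when the 32 GB option is used.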

Unigen ships the Amaretti E1.S AI module with a heatsink pre-installed to sustain performance. Pricing has not been disclosed, though the memory capacity is likely to be a major factor in the final cost.
