Nota and FuriosaAI Partner for RNGD NPU Model Optimization

Source: Nota AI

Nota and FuriosaAI collaborate to optimize AI models for the RNGD NPU, reducing Llama 3.1 70B memory usage to 35GB via INT4 quantization.

by Philip Lee

Seoul, South Korea — Nota AI, a provider of artificial intelligence model optimization technology, and FuriosaAI, an AI semiconductor company, said Tuesday they would collaborate on integrating their technologies.

The partnership will integrate Nota's model quantization and hardware-aware optimization platform, NetsPresso, with FuriosaAI's second-generation neural processing unit, RNGD, or Renegade.

Nota will provide quantization technology to reduce the memory footprint and computational demands of AI models running on RNGD hardware.
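
For readers unfamiliar with the technique, the following is a minimal sketch of symmetric per-tensor INT4 quantization in plain Python. It is a generic textbook illustration, not Nota's or NetsPresso's actual method, which the companies did not detail; the function names and sample weights are hypothetical.

```python
def quantize_int4(weights):
    """Map floats to signed 4-bit integers in [-8, 7] using one shared scale.

    Generic symmetric per-tensor scheme (an assumption for illustration,
    not the NetsPresso implementation).
    """
    max_abs = max(abs(w) for w in weights)
    # Scale so the largest magnitude lands on 7; avoid dividing by zero.
    scale = max_abs / 7 if max_abs > 0 else 1.0
    q = [max(-8, min(7, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate float values from the 4-bit integers."""
    return [v * scale for v in q]


# Hypothetical sample weights, chosen only to show the round trip.
weights = [0.42, -1.05, 0.07, 0.91, -0.33]
q, scale = quantize_int4(weights)
restored = dequantize(q, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print(q)          # integers that each fit in 4 bits
print(max_error)  # rounding error, at most half of one scale step
```

Each stored value needs only 4 bits plus one shared scale, which is where the memory saving comes from; the cost is the rounding error measured in the last line.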

Converting models to 4-bit (INT4) precision cuts the memory required by large models such as Meta's Llama 3.1 70B from approximately 140GB to between 35GB and 40GB.
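
The figures are consistent with simple arithmetic on weight storage, sketched below. The sketch assumes a nominal 70 billion parameters and a 16-bit baseline, neither of which the companies stated explicitly; the helper name is hypothetical, and the calculation counts weights only.

```python
PARAMS = 70e9  # nominal parameter count for a "70B" model (assumption)


def weight_memory_gb(num_params, bits_per_weight):
    """Weight storage in decimal gigabytes, ignoring all runtime overhead."""
    return num_params * bits_per_weight / 8 / 1e9


baseline_gb = weight_memory_gb(PARAMS, 16)  # assumed 16-bit baseline
int4_gb = weight_memory_gb(PARAMS, 4)

print(baseline_gb)  # 140.0
print(int4_gb)      # 35.0
```

The 35GB result is a floor for the weights alone. The upper end of the quoted range may reflect extra data such as quantization scales or layers kept at higher precision, though the article does not itemize it.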

Nota said the NetsPresso platform can reduce model sizes by up to 90 percent and increase inference speeds by up to 42x, though results vary by architecture.

Technical benchmarks provided by Nota show that inference speed for VGG19 increased from 5.28 frames per second to 222.22 frames per second following optimization, while accuracy declined by 1.14 percentage points from 72.28 percent to 71.14 percent.

For MobileNetV1, speed increased from 28.08 frames per second to 480.77 frames per second, while accuracy decreased by 0.57 percentage points, from 66.68 percent to 66.11 percent.
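
The speedup factors implied by those benchmarks can be checked directly. The short sketch below uses only the numbers Nota reported; the dictionary layout and function name are illustrative.

```python
# Figures as reported by Nota: frames per second and accuracy in percent.
benchmarks = {
    "VGG19": {"fps_before": 5.28, "fps_after": 222.22,
              "acc_before": 72.28, "acc_after": 71.14},
    "MobileNetV1": {"fps_before": 28.08, "fps_after": 480.77,
                    "acc_before": 66.68, "acc_after": 66.11},
}


def summarize(result):
    """Return (speedup factor, accuracy drop in percentage points)."""
    speedup = result["fps_after"] / result["fps_before"]
    acc_drop = result["acc_before"] - result["acc_after"]
    return round(speedup, 1), round(acc_drop, 2)


for name, result in benchmarks.items():
    speedup, acc_drop = summarize(result)
    print(f"{name}: {speedup}x faster, {acc_drop} points less accurate")
```

VGG19 works out to roughly 42x, matching the "up to 42x" headline figure, while MobileNetV1 comes to roughly 17x.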

Nota's optimization pipeline is designed to automate model tailoring for specific chip architectures.

RNGD will be added to Nota's supported hardware list, which includes chips from Arm, Qualcomm, Nvidia, and Renesas.

The companies disclosed a packaged solution combining FuriosaAI's RNGD with the Nota Vision Agent, a Vision-Language Model-based solution that performs real-time monitoring, context-based event summarization, and natural-language queries for video searches.

The integration targets security, medical, retail, and smart building sectors.

The package is configured to operate as a standalone AI appliance for industrial environments.

FuriosaAI said last week it had received an initial production batch of 4,000 RNGD cards manufactured by TSMC and assembled by Asus.

The RNGD chip is designed for inference workloads and is available as a single PCIe card with 180-watt thermal design power or as part of the NXT-RNGD server, which holds eight cards.
