Start United States USA — IT Startup claims to boost LLM performance using standard memory instead of GPU...

Startup claims to boost LLM performance using standard memory instead of GPU HBM — but experts remain unconvinced by the numbers despite promising CXL technology

Von

March 25, 2024

120

MemVerge claims its Memory Machine software drives GPU performance
MemVerge, a provider of software designed to accelerate and optimize data-intensive applications, has partnered with Micron to boost the performance of LLMs using Compute Express Link (CXL) technology.
The company’s Memory Machine software uses CXL to reduce idle time in GPUs caused by memory loading.
The technology was demonstrated at Micron’s booth at Nvidia GTC 2024 and Charles Fan, CEO and Co-founder of MemVerge said, “Scaling LLM performance cost-effectively means keeping the GPUs fed with data. Our demo at GTC demonstrates that pools of tiered memory not only drive performance higher but also maximize the utilization of precious GPU resources.”Impressive results
The demo utilized a high-throughput FlexGen generation engine and an OPT-66B large language model. This was performed on a Supermicro Petascale Server, equipped with an AMD Genoa CPU, Nvidia A10 GPU, Micron DDR5-4800 DIMMs, CZ120 CXL memory modules, and MemVerge Memory Machine X intelligent tiering software.