Kioxia and Pliops Unveil Storage Plans for Nvidia's 2025 GTC Event
In the world of AI, data storage and memory play a pivotal role in powering training and inference. Ahead of the upcoming Nvidia GPU Technology Conference (GTC), several storage and memory companies are unveiling innovations aimed at ever-growing AI workloads. This article spotlights two notable pre-event announcements, from Kioxia and Pliops. Stay tuned for more insights from the GTC.
Data centers often leverage various digital storage technologies to strike a balance between performance requirements and cost efficiency. For AI training, fast solid-state drives (SSDs) are commonly employed to feed data to dynamic random-access memory (DRAM), which provides the rapid data access GPUs require.
Over the past few months, many major SSD manufacturers have rolled out high-capacity quad-level cell (QLC) SSDs aimed at warm and hot data tiers, potentially displacing some hard disk drive (HDD) secondary storage, especially where storage density within racks is critical. Kioxia has now joined the high-capacity QLC SSD trend.
Kioxia recently unveiled its 122.88TB LC9 series NVMe SSD, designed for AI applications. Housed in a 2.5-inch form factor, the drive employs the company's 8th-generation 3D QLC 2-terabit die with CBA (CMOS directly Bonded to Array) technology, which increases memory die density. Its PCIe 5.0 interface offers dual-port capability for greater fault tolerance and access from multiple host systems, and the drive can deliver up to 128 gigatransfers per second (GT/s).
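That 128 GT/s figure is consistent with the aggregate raw rate of a four-lane PCIe 5.0 link (32 GT/s per lane); a rough back-of-the-envelope calculation, assuming the typical x4 NVMe link width, shows what that means in usable bandwidth:

```python
# Back-of-the-envelope PCIe 5.0 throughput for an x4 NVMe SSD.
# PCIe 5.0 runs at 32 GT/s per lane with 128b/130b line encoding,
# so a four-lane link moves 4 * 32 = 128 GT/s in aggregate.

GT_PER_LANE = 32          # PCIe 5.0 raw rate, gigatransfers/s per lane
LANES = 4                 # assumed link width (x4 is typical for NVMe)
ENCODING = 128 / 130      # 128b/130b encoding overhead

aggregate_gt = GT_PER_LANE * LANES           # 128 GT/s
usable_gbytes = aggregate_gt * ENCODING / 8  # 1 bit per transfer per lane

print(f"Aggregate link rate: {aggregate_gt} GT/s")
print(f"Theoretical usable bandwidth: {usable_gbytes:.2f} GB/s")  # ~15.75 GB/s
```

Real-world drive throughput will land below this theoretical link ceiling once protocol and controller overheads are accounted for.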

Kioxia asserts that high-capacity drives are essential for certain AI workloads, particularly large language model training, which involves storing substantial data sets and rapid data retrieval for inference and model fine-tuning. The new SSD also works with Kioxia's AiSAQ technology, which improves scalable Retrieval Augmented Generation (RAG) performance by storing vector database elements on SSDs instead of in DRAM.
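Kioxia has not detailed AiSAQ's internals in this announcement, but the core idea, keeping the vector index on the SSD and paging it in on demand so DRAM holds only the working set, can be sketched in a few lines of Python. Everything below (file name, dimensions, the brute-force scan) is illustrative, not Kioxia's implementation:

```python
import numpy as np

DIM, N = 768, 10_000   # assumed embedding width and corpus size

# Create an SSD-resident vector file once (stand-in for a real corpus).
rng = np.random.default_rng(0)
vecs = rng.standard_normal((N, DIM)).astype(np.float32)
vecs /= np.linalg.norm(vecs, axis=1, keepdims=True)
vecs.tofile("vectors.f32")

# Memory-map the file: the OS pages vectors in from the SSD on demand,
# so DRAM holds only the pages a query actually touches.
corpus = np.memmap("vectors.f32", dtype=np.float32, mode="r", shape=(N, DIM))

def top_k(query, k=5):
    """Brute-force cosine search over the mapped file. A production
    system would use an SSD-optimized ANN index, not a full scan."""
    scores = corpus @ query            # vectors are pre-normalized
    return np.argpartition(scores, -k)[-k:]

hits = top_k(vecs[0])   # querying a known vector; its own id should appear
print(hits)
```

The point of the pattern is that the DRAM footprint stays roughly constant as the corpus grows, which is exactly the trade-off Kioxia is pitching for scalable RAG.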
Pliops, a supplier of solid-state storage and accelerator products, announced a strategic collaboration with the vLLM Production Stack, developed at the LMCache Lab at the University of Chicago. This stack is designed to significantly improve LLM inference performance.
Pliops contributes shared storage and efficient vLLM cache offloading, while LMCache Lab provides a scalable framework for running multiple instances; the combined solution can recover failed instances from Pliops' KV storage backend. The collaboration introduces a petabyte tier of memory below High-Bandwidth Memory (HBM) for GPU compute applications. Using disaggregated smart storage, previously computed KV caches are retrieved efficiently rather than recomputed, significantly accelerating vLLM inference.
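Pliops has not published API details here, so the following is only a minimal sketch of the cache-offload pattern the announcement describes: key the expensive prefill KV tensors by a hash of the token prefix, and serve repeat requests from a shared store (a plain dict stands in for the disaggregated KV backend) instead of recomputing on the GPU:

```python
import hashlib
import numpy as np

shared_kv_store = {}   # stand-in for a disaggregated key-value backend

def prefix_key(tokens: list[int]) -> str:
    """Derive a lookup key from the token prefix."""
    return hashlib.sha256(str(tokens).encode("utf-8")).hexdigest()

def prefill(tokens: list[int]) -> np.ndarray:
    """Stand-in for expensive GPU prefill that produces KV tensors."""
    rng = np.random.default_rng(len(tokens))
    return rng.standard_normal((len(tokens), 64)).astype(np.float32)

def get_kv_cache(tokens: list[int]) -> np.ndarray:
    key = prefix_key(tokens)
    cached = shared_kv_store.get(key)
    if cached is not None:
        return cached                  # hit: skip GPU recomputation
    kv = prefill(tokens)               # miss: compute, then offload
    shared_kv_store[key] = kv
    return kv

# A second request with the same prompt prefix is served from the store.
kv1 = get_kv_cache([1, 2, 3, 4])
kv2 = get_kv_cache([1, 2, 3, 4])
assert kv2 is kv1
```

Because the store sits outside any single GPU server, a replacement vLLM instance can fetch the same caches, which is what makes the failure-recovery behavior described above possible.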
Data centers require storage and memory to furnish data for AI training and inference. At the upcoming GTC, Kioxia will showcase its high-capacity PCIe 5.0 SSD, while Pliops will exhibit its shared storage designed to enhance LLM inference performance.
These developments by Kioxia and Pliops demonstrate how digital storage technologies, from high-capacity QLC SSDs to KV caching solutions, are evolving to meet the performance and cost-efficiency demands of data centers supporting AI training and inference.