MinIO has released MemKV, a dedicated context memory store built to resolve a critical bottleneck within large-scale AI inference pipelines. Serving as MinIO’s second flagship solution alongside AIStor, MemKV expands the firm’s data infrastructure into the memory tier. It is engineered to deliver persistent, shared contextual data for agentic AI workloads running on distributed GPU clusters.
As AI systems advance from one-off replies to multi-turn reasoning and automated task execution, sustaining continuous context across inference cycles has grown increasingly essential. Under existing architectures, context data is often discarded because GPU-adjacent memory tiers such as HBM and DRAM have limited capacity. This forces GPUs to recompute existing context repeatedly, driving up latency, compute usage and power draw. MinIO calls this redundant workload the "recompute tax", an inefficiency that compounds rapidly in hyperscale cloud environments.
MemKV is engineered to relieve this bottleneck with a shared, persistent memory layer that offers petabyte-scale capacity at microsecond-level access latency. By retaining contextual data throughout inference workflows, the platform cuts redundant computation and improves overall infrastructure efficiency. MinIO's internal benchmarks show improved time-to-first-token latency under production-grade concurrency: in a representative deployment with 128 GPUs and 128K-token context windows, GPU utilization reportedly rose from roughly 50% to over 90%, which MinIO says translates into substantial annual compute-cost savings.
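The "recompute tax" can be made concrete with a toy calculation. The sketch below is illustrative only (the turn lengths and the function are hypothetical, not MemKV benchmarks): without a persistent context store, each conversation turn must re-process the entire prompt history; with cached context, only the new tokens need prefill work.

```python
# Toy illustration of the "recompute tax". All numbers are
# illustrative assumptions, not MemKV benchmark figures.

def tokens_processed(turn_lengths, cached=False):
    """Total prompt tokens a GPU must process across a conversation."""
    total, history = 0, 0
    for new_tokens in turn_lengths:
        if cached:
            total += new_tokens             # history's context is reused
        else:
            total += history + new_tokens   # history is recomputed every turn
        history += new_tokens
    return total

turns = [4096] * 8  # eight turns of 4K new tokens each
without_cache = tokens_processed(turns)             # 147,456 tokens
with_cache = tokens_processed(turns, cached=True)   # 32,768 tokens
print(f"recompute tax: {without_cache / with_cache:.1f}x more prefill work")
```

In this toy example the uncached path does 4.5x the prefill work, and the gap grows with every additional turn, which is why the tax is negligible for short sessions but dominates at scale.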
MinIO’s executives stated that recompute overhead goes largely unnoticed in small deployments but becomes a structural bottleneck at enterprise scale. As GPU clusters expand, repeated context regeneration drives up power consumption and infrastructure expenses, making specialized memory systems indispensable for sustainable AI operation.
Addressing the Memory-Scale Tradeoff
Legacy AI infrastructure forces developers to compromise between access speed and storage capacity. High-performance memory tiers such as HBM and DRAM deliver microsecond latency but come with tight capacity limits and high costs. In contrast, conventional storage systems offer massive scalability but suffer from millisecond-level latency, making them incompatible with real-time inference and long-context reasoning tasks.
MemKV bridges this gap by introducing an intermediate shared memory tier that combines ultra-low latency with large-scale capacity. Natively compatible with NVIDIA BlueField-4 STX and integrated with NVIDIA Dynamo and the NIXL tools, the solution lets entire GPU clusters access a unified contextual data pool at speeds matched to inference requirements. This design eliminates frequent context-data migration between isolated memory and storage layers, lowering latency and raising system throughput.
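MemKV's actual API has not been published in this announcement, but the role such a tier plays can be sketched conceptually: any GPU worker in the cluster can fetch a context block that another worker produced, instead of recomputing it. The class and method names below are hypothetical, and an in-process dict stands in for the real RDMA/NVMe data path.

```python
# Conceptual sketch only: MemKV's real interface is not public.
# An in-process dict stands in for the shared RDMA/NVMe-backed tier.

class SharedContextStore:
    """Cluster-wide cache of context blocks, keyed per session."""

    def __init__(self):
        self._blocks = {}  # (session_id, block_idx) -> bytes

    def put(self, session_id: str, block_idx: int, kv_block: bytes) -> None:
        """Persist one context block so any worker can reuse it later."""
        self._blocks[(session_id, block_idx)] = kv_block

    def get(self, session_id: str, block_idx: int):
        """Return the cached block, or None on a miss (forcing recompute)."""
        return self._blocks.get((session_id, block_idx))

store = SharedContextStore()
store.put("chat-42", 0, b"\x00" * 2 * 1024 * 1024)  # a 2 MB context block
hit = store.get("chat-42", 0)
print("hit" if hit is not None else "miss")  # -> hit
```

The essential property is that the store outlives any single inference request and is addressable by every GPU, which is what distinguishes a shared memory tier from per-GPU HBM or DRAM caches.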
Architecture Optimized for Inference Workloads
Tailored exclusively for inference data pipelines, MemKV fits into the G3.5 layer of MinIO’s GPU memory hierarchy framework. Built on NVMe storage infrastructure, it achieves petabyte-level capacity while retaining microsecond access latency, successfully decoupling memory scalability from GPU compute resources.
The system drops traditional storage abstractions, moving data directly from NVMe drives into AI data pipelines over end-to-end RDMA. This removes the overhead introduced by HTTP protocols, file-system translation and intermediate storage servers, which are common bottlenecks in object- and file-based storage architectures.
Key architectural optimizations include native ARM64 binary execution on NVIDIA BlueField-4 STX, embedded directly within the storage layer to reduce dependence on external x86 storage nodes. All data transfers between GPU memory and NVMe storage use RDMA, bypassing the conventional storage stack. MemKV also adopts larger block sizes of 2 MB to 16 MB, tuned to GPU throughput characteristics rather than the legacy 4 KB storage block, and supports high-speed interconnect fabrics such as NVIDIA Spectrum-X Ethernet and PCIe Gen6 for near wire-speed data movement across clusters.
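The block-size choice is easy to motivate with back-of-the-envelope arithmetic. The per-token context footprint below is an assumed figure for a large model, not a MemKV specification; the interesting quantity is the ratio of I/O requests, which does not depend on it.

```python
# Back-of-the-envelope I/O arithmetic for large vs. legacy block sizes.
# KV_BYTES_PER_TOKEN is a hypothetical per-token context footprint.

KV_BYTES_PER_TOKEN = 160 * 1024                       # assumed footprint
context_bytes = 128 * 1024 * KV_BYTES_PER_TOKEN       # 128K-token context, ~20 GiB

io_ops_4k = context_bytes // (4 * 1024)               # legacy 4 KB blocks
io_ops_16m = -(-context_bytes // (16 * 1024 * 1024))  # 16 MB blocks, rounded up

print(f"{io_ops_4k:,} requests at 4 KB vs {io_ops_16m:,} at 16 MB "
      f"({io_ops_4k // io_ops_16m}x fewer requests)")
```

Fetching the same context in 16 MB blocks issues orders of magnitude fewer requests than 4 KB blocks, keeping per-request overhead from dominating and letting transfers run closer to NVMe and fabric line rate.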
Availability
MinIO MemKV is now commercially available for enterprise deployment.
Beijing Qianxing Jietong Technology Co., Ltd.
Sandy Yang/Global Strategy Director
WhatsApp / WeChat: +86 13426366826
Email: yangyd@qianxingdata.com
Website: www.qianxingdata.com / www.storagesserver.com
Business Focus:
ICT Product Distribution/System Integration & Services/Infrastructure Solutions
With 20+ years of IT distribution experience, we partner with leading global brands to deliver reliable products and professional services.
“Using Technology to Build an Intelligent World.” Your Trusted ICT Product Service Provider!