
GF Securities: AI Inference RAG Vector Databases Drive SSD Demand Growth; Focus Recommended on Core Beneficiary Targets in the Industry Chain

Zhitong Finance · 12/31/2025 01:41:04

The Zhitong Finance App learned that GF Securities released a research report saying that the RAG architecture provides long-term memory for large models, and that enterprise and individual needs are driving growth in demand for RAG storage. In AI inference, RAG vector database storage media are transitioning from "memory-resident retrieval" to "full SSD storage architectures," driving continued growth in demand for high-bandwidth, high-capacity SSDs. The firm recommends focusing on the core beneficiary targets in the industry chain.

The main views of GF Securities are as follows:

RAG provides "long-term memory" for large models, and enterprise and individual needs drive the growth of RAG demand

In the RAG (Retrieval-Augmented Generation) architecture, the LLM (large language model) queries a vector database before generating a response. The vector database acts as the key hub connecting user queries to external knowledge, and is responsible for efficiently storing, managing, and retrieving high-dimensional vectorized knowledge representations, thereby improving the accuracy and timeliness of generated results. From an enterprise perspective, RAG is gradually penetrating online scenarios (e-commerce, web search, etc.) and offline scenarios (enterprise, legal, engineering research, etc.). From an individual perspective, personalized RAG preserves users' long-term memories, preferences, and contextual information, forming a "user-level vector space," which significantly boosts the growth of RAG demand.
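The retrieve-then-generate flow described above can be sketched in a few lines. This is a minimal illustration, not any vendor's implementation: a toy in-memory list with hand-made 3-d embeddings stands in for the vector database, and the "generation" step is just prompt assembly before a hypothetical LLM call.

```python
import math

# Toy in-memory store standing in for a vector database: each entry is
# (embedding, source text). Real systems use learned high-dimensional embeddings.
KNOWLEDGE = [
    ([0.9, 0.1, 0.0], "SSDs offer higher capacity per dollar than DRAM."),
    ([0.1, 0.9, 0.0], "RAG retrieves external knowledge before generation."),
    ([0.0, 0.1, 0.9], "Vector indexes support approximate nearest-neighbor search."),
]

def cosine(a, b):
    # Cosine similarity between two vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def retrieve(query_vec, k=1):
    # Similarity search: rank stored entries by cosine similarity to the query.
    ranked = sorted(KNOWLEDGE, key=lambda e: cosine(query_vec, e[0]), reverse=True)
    return [text for _, text in ranked[:k]]

def build_prompt(question, query_vec):
    # Retrieved passages are prepended as context before the LLM is invoked.
    context = "\n".join(retrieve(query_vec))
    return f"Context:\n{context}\n\nQuestion: {question}"

prompt = build_prompt("What does RAG do?", [0.2, 0.95, 0.1])
print(prompt)
```

The point of the hub role the report describes is visible here: the quality of the final answer depends on what the similarity search returns, which is why storage media that can serve these lookups at high throughput and low latency matter.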

AI inference RAG vector databases drive SSD demand growth

Vector database storage media must carry large-scale vector data and index structures, and must support high throughput and low latency to meet similarity-retrieval requirements in high-concurrency scenarios. Currently, vector database storage media are moving from "memory-resident retrieval" to "full SSD storage architectures." According to "All-in-Storage ANNS Algorithms Optimize Vector DB Usability within a RAG System," taking Kioxia AiSAQ as an example, vectors, PQ (product quantization) results, and indexes are stored uniformly on SSD. The required SSD capacity at a 10-billion-vector scale is 11.2 TB, of which PQ vectors account for 1.28 TB and the index accounts for 10 TB. Using TLC/QLC SSDs, AiSAQ has a 4-7x media-cost advantage over DiskANN. In addition, all AiSAQ tenants remain active and can begin querying directly, with no "cold start" delay from having to load indexes from SSD to DRAM before queries start, improving the large-scale scalability and economic viability of RAG systems.
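The capacity figures quoted above can be sanity-checked with back-of-envelope arithmetic. The snippet below derives the per-vector footprint implied by the report's numbers (10 billion vectors, 1.28 TB of PQ codes, 10 TB of index); the implied total of 11.28 TB matches the report's rounded 11.2 TB figure.

```python
# Back-of-envelope check of the per-vector SSD footprint implied by the
# report's AiSAQ figures: 10B vectors, 1.28 TB of PQ codes, 10 TB of index.
N_VECTORS = 10 * 10**9  # 10 billion vectors
TB = 10**12             # decimal terabyte, as storage vendors count

pq_total = 1.28 * TB    # total bytes of product-quantized codes
index_total = 10 * TB   # total bytes of index structures

pq_per_vector = pq_total / N_VECTORS        # PQ code bytes per vector
index_per_vector = index_total / N_VECTORS  # index bytes per vector
total_tb = (pq_total + index_total) / TB    # total SSD capacity required

print(f"PQ code per vector: {pq_per_vector:.0f} bytes")   # 128 bytes
print(f"Index per vector:   {index_per_vector:.0f} bytes") # 1000 bytes
print(f"Total SSD capacity: {total_tb:.2f} TB")            # 11.28 TB
```

The asymmetry is the economic point: the index dominates the footprint by roughly 8x, so moving it from DRAM to cheaper TLC/QLC SSD is where the cited 4-7x media-cost advantage comes from.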

Volcano Engine TOS Vectors opens a new paradigm of vector storage, increasing SSD demand

According to the Volcano Engine developer community account, TOS has launched Vector Bucket. The architecture uses ByteDance's self-developed cloud-native vector index library Kiwi together with a cooperative multi-level local cache architecture (spanning DRAM, SSD, and remote object storage). In large-scale, long-retention, low-frequency-query scenarios, this architecture not only satisfies the tiering requirements of high- and low-frequency data, but also significantly lowers the threshold for enterprises to use vector data at scale. TOS Vectors collaborates deeply with products such as Volcano Engine's high-performance vector database and Volcano AI agents: high-frequency-access memories (such as users' core preferences and recent task execution results) are stored in the vector database for millisecond-level retrieval, while low-frequency-access memories (such as interaction records or execution results from half a year ago) are deposited into TOS Vectors, tolerating second-level latency in exchange for lower storage costs and larger memory space. In agent scenarios that handle complex tasks, TOS Vectors can both carry massive semantic vector storage and guarantee the sustainable accumulation of long-term data.
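The hot/cold split described above amounts to a tiering policy. The sketch below illustrates one such policy under loudly stated assumptions: the 30-day recency threshold, the `Memory` record, and the `TieredStore` class are all hypothetical, and the two lists merely stand in for a millisecond-latency vector database and a second-latency object-storage vector bucket.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta

# Assumed threshold for "high-frequency access"; not specified by the report.
HOT_WINDOW = timedelta(days=30)

@dataclass
class Memory:
    text: str
    last_access: datetime

@dataclass
class TieredStore:
    hot: list = field(default_factory=list)   # stands in for the vector database
    cold: list = field(default_factory=list)  # stands in for the vector bucket

    def put(self, memory: Memory, now: datetime) -> str:
        # Route by recency: recently touched memories go to the fast tier,
        # stale ones to the cheap tier.
        if now - memory.last_access <= HOT_WINDOW:
            self.hot.append(memory)
            return "hot"
        self.cold.append(memory)
        return "cold"

now = datetime(2025, 12, 31)
store = TieredStore()
store.put(Memory("user prefers concise answers", now - timedelta(days=2)), now)
store.put(Memory("task log from six months ago", now - timedelta(days=182)), now)
print(len(store.hot), len(store.cold))  # one memory in each tier
```

The design trade-off mirrors the report's framing: the cold tier accepts seconds of latency to buy lower cost per byte, which is what makes long-term accumulation of agent memory economically sustainable.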

Risk Alerts

Development of and demand in the AI industry may fall short of expectations; AI server shipments may fall short of expectations; and the technology and product progress of domestic manufacturers may fall short of expectations.