Vector Databases: Storage Considerations and Solutions

This post was originally published on Pure Storage

This blog on vector databases and storage was originally published on Medium.com. It has been republished with the author’s credit and consent.

At this year’s GTC in San Jose, one slide from an NVIDIA session caught my eye, and when I talked about it with my colleagues, they all got excited. In the session, the speaker claimed that when data is embedded and then stored uncompressed for optimal RAG, storage usage can increase by up to 10x.

A 10x increase is a lot: 100TB becomes 1PB, and 1PB becomes 10PB. No wonder people at storage companies got really excited. But is it true that generative AI and RAG really expand data storage usage that much? And if so, why? Let me try to test and confirm this.
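Before any real testing, a back-of-envelope calculation shows how a figure of this order could arise. The sketch below is only an illustration under assumed numbers: the chunk size, overlap, embedding dimension, and the `expansion_ratio` helper are all hypothetical, not taken from the NVIDIA session or from any measurement. It counts only the raw float32 vectors and ignores the stored copy of the text, index structures, and metadata.

```python
# Back-of-envelope estimate of embedding storage relative to source text.
# Every number used here is an illustrative assumption, not a measurement.

def expansion_ratio(chunk_chars: int, overlap_chars: int, dims: int,
                    bytes_per_value: int = 4) -> float:
    """Return bytes of embedding stored per byte of original text.

    Assumes 1 byte per character (plain ASCII text), one embedding vector
    per chunk, and consecutive chunks that overlap by overlap_chars.
    """
    if not 0 <= overlap_chars < chunk_chars:
        raise ValueError("overlap_chars must be in [0, chunk_chars)")
    # Bytes of new source text that each additional chunk covers.
    new_text_per_chunk = chunk_chars - overlap_chars
    # Uncompressed vector size, e.g. float32 = 4 bytes per dimension.
    vector_bytes = dims * bytes_per_value
    return vector_bytes / new_text_per_chunk


if __name__ == "__main__":
    # Hypothetical setup: 512-character chunks, no overlap, 1024-dim float32.
    ratio = expansion_ratio(chunk_chars=512, overlap_chars=0, dims=1024)
    print(f"{ratio:.1f}x")  # 4096 bytes of vector per 512 bytes of text
```

Under these assumed values the vectors alone take several times the space of the text they represent, and smaller chunks or larger overlap push the ratio higher. Whether real deployments land near 10x depends on the actual chunking, model, and database, which is what needs to be tested.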

In my previous blog, I briefly wrote about how RAG works and its data infrastructure. RAG encodes external data so that it can easily retrieve the relevant parts of that data at query time. The best option for storing and retrieving external data for RAG is a vector database, because it supports similarity search, which lets RAG quickly retrieve the data most relevant to a user's query. To understand its storage usage, we need

Read the rest of this post, which was originally published on Pure Storage.
