David Szigeti

RAGs to RIChes

When it comes to generative AI for code, RAG (Retrieval-Augmented Generation) is like the internet for computers. However, the common RAG approach of chunking files and indexing them isn't ideal for working with git projects. It degrades contextual awareness and creates a mismatch between code and natural language. We've developed a novel technique called RIC: Retrieval Input Compression. RIC leverages the strength of LLMs to find the correct files to answer your prompts. Here's how it works (sketched in code after the list):

1. We index the git history and files statically.
2. We compress the information at the retrieval stage, retaining semantic meaning.
3. This gives the LLM context-fitting data to infer the exact appropriate files.
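
Here is a minimal sketch of what that index-compress-ask flow can look like in practice. The function names and the crude truncation-based "compression" are assumptions made up for this post, not machtiani's actual implementation; only the git commands are real.

```python
# Hypothetical sketch of the RIC flow, not machtiani's real code:
# 1) statically index file paths and git history,
# 2) compress the index so it fits the model's context window,
# 3) ask the LLM to name the exact files relevant to the prompt.
import subprocess


def index_repo(repo: str) -> dict:
    """Index tracked file paths and recent commit messages."""
    files = subprocess.run(
        ["git", "-C", repo, "ls-files"],
        capture_output=True, text=True, check=True,
    ).stdout.splitlines()
    history = subprocess.run(
        ["git", "-C", repo, "log", "--oneline", "-n", "200"],
        capture_output=True, text=True, check=True,
    ).stdout
    return {"files": files, "history": history}


def compress(index: dict, max_chars: int = 8000) -> str:
    """Toy compression: keep every file path, truncate history to fit the budget."""
    listing = "\n".join(index["files"])
    budget = max(max_chars - len(listing), 0)
    return listing + "\n\n" + index["history"][:budget]


def retrieval_prompt(question: str, compressed_index: str) -> str:
    """Ask the LLM which files are needed to answer the question."""
    return (
        "Given this repository index, list the files needed to answer the "
        f"question.\n\nIndex:\n{compressed_index}\n\nQuestion: {question}"
    )
```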

The results are astounding.

Chatting with OpenStack's Keystone project, which has nearly 1,500 files.

By compressing the index at retrieval time rather than chunking files, RIC solves the apples-to-oranges mismatch between code and natural language. We can chat with git projects at scale.

Here is a video where I chat with Algorand's indexer project, which ingests Algorand transactions to make them easy to discover. It has nearly 400 files checked into git, and it works unreasonably well.

The other key advantage over traditional RAG is that it indexes across many more dimensions because it is tightly integrated with git. It understands the code not only through the commit history but also through the static state of the files checked into git. As a result, it can find the right file(s) like a needle in a haystack.
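
As a loose illustration of what "many dimensions" per file might mean, here is a sketch that pairs a file's static state with its git metadata. The field names are assumptions for the example, not machtiani's schema.

```python
# Hypothetical per-file 'dimensions': static content plus commit history,
# so retrieval can rank files on more signals than chunked text alone.
import subprocess
from pathlib import Path


def file_dimensions(repo: str, path: str) -> dict:
    """Combine a file's current contents with its git change history."""
    commits = subprocess.run(
        ["git", "-C", repo, "log", "--oneline", "--follow", "--", path],
        capture_output=True, text=True, check=True,
    ).stdout.splitlines()
    text = (Path(repo) / path).read_text(errors="replace")
    return {
        "path": path,                   # static: where it lives
        "preview": text[:500],          # static: what it contains
        "change_count": len(commits),   # history: how often it changes
        "recent_commits": commits[:5],  # history: why it changed recently
    }
```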

Unix philosophy

The final point is more of a personal qualm. I hate the locked-in nature of current chat services. Your chats are locked away unless you want to bother copying, pasting, and parsing the conversations. Also, these web apps are not very unixy. I want to be able to prompt right in the git directory where I'm working. Why not pipe the output of machtiani into some other tool? Why not version control the chats, so you can go back a few exchanges and take the discussion down a different branch without losing what you already did? So I built a CLI for it that outputs cleanly and saves the conversation in a chat folder. It names the file in a human-understandable way, like current chat services do with your chats. Now switching between models, from closed API services to free and open LLMs, is a reality.
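
As a rough illustration of the "version control your chats" idea (this is a layout I'm assuming for the example, not the machtiani CLI's actual on-disk format), each exchange can be appended to a plainly named file inside the repo, and ordinary git branching then covers going back and taking a discussion in a different direction.

```python
# Hypothetical chat layout: one readable markdown file per conversation,
# appended to on every exchange, so git can diff, branch, and revert chats
# just like code.
import re
from datetime import datetime
from pathlib import Path


def save_exchange(chat_dir: str, title: str, prompt: str, answer: str) -> Path:
    """Append a prompt/answer pair to a human-readably named file."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    path = Path(chat_dir) / f"{slug}.md"
    path.parent.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now().isoformat(timespec="seconds")
    with path.open("a") as f:
        f.write(f"\n## {stamp}\n\nPrompt: {prompt}\n\n{answer}\n")
    return path
```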

Hallucinations

It's garbage in, garbage out; that's fundamentally why LLMs hallucinate. Without high-quality context to guide them, they hallucinate a lot. Nothing mitigates hallucinations better than answering from real files and data in context. In the rare hallucination, never in my experience has it talked about a non-existent file or code that doesn't really exist; the errors are more in the line of mistakes, like reimplementing a struct it had already seen defined. But it doesn't just make things up.

Current limitations

The main limitation we've experienced is that latency grows with the size of the project. A few steps require checking files and diffs in git. These can be made more concurrent, batch-processed, or simply sped up by brute force with a compiled language. Currently, the service is written in Python without leveraging C/C++. We aim to improve on those fronts, which should make arbitrarily large projects only a few seconds slower than the inference itself.
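
To show the kind of optimization meant here, this sketch runs the per-file git checks through a thread pool instead of one after another. The specific check and the pool size are assumptions for the example.

```python
# Illustrative only: batching per-file git round-trips with a thread pool so
# latency stops scaling linearly with the number of files.
import subprocess
from concurrent.futures import ThreadPoolExecutor


def diff_stat(repo: str, path: str) -> str:
    """One git round-trip that would otherwise run serially."""
    return subprocess.run(
        ["git", "-C", repo, "diff", "--stat", "HEAD~1", "HEAD", "--", path],
        capture_output=True, text=True,
    ).stdout


def diff_stats(repo: str, paths: list[str]) -> dict[str, str]:
    """Run the checks concurrently and collect the results per path."""
    with ThreadPoolExecutor(max_workers=16) as pool:
        results = pool.map(lambda p: diff_stat(repo, p), paths)
        return dict(zip(paths, results))
```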

We coupled it to OpenAI simply to get a useful service out the door. It's a high priority to make it work with any model just by modifying your configs, without having to wait for the project to support it.
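
For what that config-only switch can look like today, here is the common pattern of pointing an OpenAI-compatible client at a different backend. The local endpoint and model name below are just examples (an Ollama server, in this case), not machtiani's configuration format.

```python
# Swapping backends by changing only the endpoint and model name: any server
# that speaks the OpenAI-compatible API (Ollama, vLLM, etc.) can stand in.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11434/v1",  # local Ollama endpoint, for example
    api_key="not-needed",                  # local servers usually ignore the key
)

response = client.chat.completions.create(
    model="llama3",  # whatever model the local server is serving
    messages=[{"role": "user", "content": "Which files handle token auth?"}],
)
print(response.choices[0].message.content)
```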

Open source dedication

First, let me say right away that it's not currently open source. But believe me, I have no interest in clutching this like Gollum in The Lord of the Rings. I'm acutely aware that machtiani is built completely on open source, not to mention LLMs trained on public data under fair use, at best. And worrying about scaling, security, and marketing takes time away from development, not to mention forgoing the network effect of a contributor community and the feedback loop that open source allows. There are no plans for dual licensing, and we are morally against it. It will be MIT licensed!

The plan is to raise some cash so I can afford to spend more time getting the codebase more modular and cleaned up for contributors. Please afford us that. Also, many people, myself included, would prefer to use a reasonably priced hosted service, but with the peace of mind that it is open source. We're a small company with only two full-timers working on it, and I'm the sole developer.

Plans

Service-first. I'm hoping that other services, like Khoj or other upstarts, will use the strategy, or just drop in machtiani's file retrieval (it can be used without the CLI), to take things to the next level. I'd like to see (neo)vim or Emacs plugins. I'm sort of against these aspiring god-like products or services, like VS Code or GitHub, that roll everything in, gatekeep upstarts, and wall off competition.

Tests. Adding more features without tests will just lead to regressions. So I'm hesitant to add features, lest it turn into bloat and technical debt.

Optimize. File operations are synchronous where they shouldn't be. We need to leverage batch processing and be smarter about doing things in memory to avoid long round trips. The data structures, too: the ones in use were chosen for developer ease, not speed. Finally, the whole thing should be rewritten in Go, maybe Rust, but most likely Go.

Web. I'd like to add a --web flag so that it can include context found on the web.

Open source. Staying closed source is a nightmare scenario for me. I'd rather be using machtiani to build things efficiently and quickly than spending my life on it. Hosting at scale is a whole other class of issues unrelated to the core project. There is no reason users shouldn't be able to run this as a local service on their desktop (or mobile device), unless they want it to work against a private LLM with tens of billions of parameters.

Execution feedback loop. Thanks to machtiani, I figured out how to customize SWE-agent, an open source service similar to Devin, to use machtiani's code retrieval prowess. SWE-agent can close nearly 13% of GitHub issues, and it relies on really terrible code search and discovery tools. The idea is that, by swapping those out, it should perform more efficiently and effectively.

Fine-tuned Open LLM. The new coding data that LLMs will be trained on will be mostly synthetic. As more and more developers use LLMs to generate code, there will be less and less entropy. Some are predicting it's like digital mad cow disease, where the LLM is fed its own byproduct. If we can train on execution results instead, whether the code compiled, ran, or passed its tests, that creates a supernova of entropy and high-quality data. And machtiani works natively with git, so if an execution feedback loop is implemented, it will be able to produce fine-grained training data at every commit of the code base. It also empowers individuals and small organizations: you only need a few thousand examples to fine-tune, and each commit can be an example. Hopefully, users can be incentivized to contribute to an open source LLM that I think will be more effective than a giant, mega-corpo zombie LLM fed on unreliable 'thumbs up' or 'thumbs down' chat histories.
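
Here is a rough sketch of what harvesting commit-level examples with an execution signal could look like. The test command, the example schema, and the assumption that the working tree is clean enough to check out past commits are all simplifications for illustration.

```python
# Hypothetical pipeline: for each recent commit, record the diff and whether
# the project's tests pass at that commit, yielding (change, outcome) pairs
# that could serve as fine-tuning examples.
import subprocess


def commit_examples(repo: str, test_cmd: list[str], limit: int = 50) -> list[dict]:
    """Walk recent commits, capture each diff, and run the test suite."""
    original = subprocess.run(
        ["git", "-C", repo, "rev-parse", "--abbrev-ref", "HEAD"],
        capture_output=True, text=True, check=True,
    ).stdout.strip()
    shas = subprocess.run(
        ["git", "-C", repo, "rev-list", f"--max-count={limit}", "HEAD"],
        capture_output=True, text=True, check=True,
    ).stdout.split()
    examples = []
    try:
        for sha in shas:
            diff = subprocess.run(
                ["git", "-C", repo, "show", "--patch", sha],
                capture_output=True, text=True,
            ).stdout
            subprocess.run(["git", "-C", repo, "checkout", "--quiet", sha], check=True)
            passed = subprocess.run(test_cmd, cwd=repo).returncode == 0
            examples.append({"commit": sha, "diff": diff, "tests_passed": passed})
    finally:
        subprocess.run(["git", "-C", repo, "checkout", "--quiet", original])
    return examples
```

Something like commit_examples(".", ["pytest", "-q"]) would then yield a few dozen labeled examples from the current repo, one per commit.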

What does Machtiani mean?

It's a mashed-up and slightly shuffled name based on Latin and Nahuatl (the language of the Aztec Empire). It's intended to convey the idea of 'machine', marrying two worlds, or ways of doing things: generative AI and the git project, which preceded generative AI but will live on in a new way. Mostly, it just looks and sounds good, so don't take it too seriously.
