RAGs to RIChes

Oct 15

Video of a chat with three different git projects:
- Large (1500 files)
- Small (50 files)
- Medium (600 files)

I'm pleased to announce that we came up with a novel technique that leverages the strengths of LLMs to find the right files for answering your prompts. We're calling the strategy RIC: Retrieval Input Compression. We statically index the git history and the files. Then, at the retrieval stage, we compress that information in a way that retains its semantic meaning. This gives the LLM data that fits within its context window, from which it can infer exactly which files are relevant. It solves the apples-to-oranges problem and eliminates the need for file chunking. The results are astounding: we can chat with git projects at scale.
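To make the idea concrete, here is a rough sketch of how a compress-then-ask retrieval step could look. It is not our actual implementation: the names `compress_file`, `build_retrieval_prompt`, and `select_files` are hypothetical, the "compression" is a deliberately naive heuristic (file path plus a few symbol names found with a regex), the git history index is left out, and `llm` is any callable you supply that takes a prompt string and returns text.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class FileDigest:
    path: str
    digest: str  # short, human-readable stand-in for the file's contents


def compress_file(path: str, text: str, max_symbols: int = 8) -> FileDigest:
    """Shrink a file to its path plus a few declared symbol names.

    Assumption: a regex over def/class/func/type declarations is a good
    enough summary for illustration; a real system would do much more.
    """
    symbols = re.findall(
        r"^\s*(?:def|class|func|type)\s+(\w+)", text, flags=re.MULTILINE
    )
    return FileDigest(path, ", ".join(symbols[:max_symbols]) or "(no symbols found)")


def build_retrieval_prompt(
    question: str, digests: List[FileDigest], max_chars: int = 4000
) -> str:
    """Pack as many digests as fit in a character budget, then ask for paths."""
    lines: List[str] = []
    used = 0
    for d in digests:
        line = f"{d.path}: {d.digest}"
        if used + len(line) + 1 > max_chars:
            break  # budget exhausted; remaining files are left out
        lines.append(line)
        used += len(line) + 1
    listing = "\n".join(lines)
    return (
        "Project files (compressed):\n"
        f"{listing}\n\n"
        f"Question: {question}\n"
        "Reply with the relevant file paths, one per line."
    )


def select_files(
    question: str, files: Dict[str, str], llm: Callable[[str], str]
) -> List[str]:
    """Ask the LLM which files matter; keep only paths that really exist."""
    digests = [compress_file(p, t) for p, t in sorted(files.items())]
    answer = llm(build_retrieval_prompt(question, digests))
    known = set(files)
    return [ln.strip() for ln in answer.splitlines() if ln.strip() in known]
```

The point of the sketch is the shape of the pipeline: whole files are never chunked or embedded; the model sees one compact line per file and names the files it wants, and anything it names that is not in the index is discarded.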

Here is a video where I chat with Algorand's indexer project, which ingests Algorand transactions to make them easier to discover. It has nearly 400 files checked into git, and it works unreasonably well.

David Szigeti