Local Retrieval Augmented Generation (RAG)

With Semantic Kernel

Posted by admin on August 09, 2024

Large Language Models (LLMs) are a hot topic in the world of software development.

I personally think they're somewhat magical.

I would love to "chat" with my own data to gain insights, etc.

But I don't want to share my data with anyone.

Enter local Retrieval Augmented Generation (RAG).

What is RAG?

Retrieval Augmented Generation is a way to include your own data when chatting with a Large Language Model (LLM) chatbot.

Including your own data gives the LLM the context it needs to provide more personalized or domain-specific information not otherwise found in the large corpus of text used to train these language models.

Additionally, you can have the chatbot provide citations that reference your own documents for your users' consideration.
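To make the idea concrete, here is a minimal sketch of the two steps that happen before the LLM is ever called: retrieving the relevant documents and augmenting the prompt with them. Everything in it is an assumption for illustration: the `Doc` record, the `Rag` helper class, the sample documents, and the file names are hypothetical, and the word-overlap score is a simple stand-in for the embedding similarity a real RAG setup would normally use. It does not use Semantic Kernel or call a model.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical documents standing in for "your own data".
var documents = new[]
{
    new Doc("handbook.md", "Employees may bring dogs to the office on Fridays."),
    new Doc("travel.md", "Flights must be booked two weeks in advance."),
    new Doc("benefits.md", "The gym membership is reimbursed every quarter.")
};

var question = "Can I bring my dog to the office?";

// Retrieval: keep the documents that share the most words with the question.
var relevant = Rag.Retrieve(documents, question, top: 1);

// Augmentation: place the retrieved text, labeled by source, ahead of the question.
Console.WriteLine(Rag.BuildPrompt(relevant, question));

// Illustrative type: a piece of text plus the name of the document it came from.
record Doc(string Source, string Text);

static class Rag
{
    // Lowercase the text and split it into a set of distinct words.
    static HashSet<string> Words(string text) =>
        text.ToLowerInvariant()
            .Split(new[] { ' ', '.', ',', '?', '!' }, StringSplitOptions.RemoveEmptyEntries)
            .ToHashSet();

    // Naive relevance score: the number of distinct words shared with the question.
    // A real pipeline would typically compare embedding vectors instead.
    public static List<Doc> Retrieve(IEnumerable<Doc> docs, string question, int top)
    {
        var queryWords = Words(question);
        return docs
            .Select(d => (Document: d, Score: Words(d.Text).Count(w => queryWords.Contains(w))))
            .Where(x => x.Score > 0)
            .OrderByDescending(x => x.Score)
            .Take(top)
            .Select(x => x.Document)
            .ToList();
    }

    // Build the augmented prompt; the bracketed source names are what make citations possible.
    public static string BuildPrompt(IReadOnlyList<Doc> context, string question)
    {
        var sources = string.Join("\n", context.Select(d => $"[{d.Source}] {d.Text}"));
        return "Answer using only the sources below and cite the source name in brackets.\n\n"
             + $"Sources:\n{sources}\n\nQuestion: {question}";
    }
}
```

The resulting string is what you would send to the chatbot in place of the bare question. Because each snippet carries its source name, the model has something concrete to cite.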

What is Semantic Kernel?

Semantic Kernel is Microsoft's framework for orchestrating artificial intelligence (AI) services.

You can learn more by checking out my series on it:

I used Semantic Kernel because I love programming in .NET and C#.

A Contrived Example

Here is a basic list of facts that simulates "my data". I'll use it to chat with a local LLM and see what I get.

Credit: This example was inspired by someone else's blog post, but I cannot find it. I'll update my post if I find the original article I used when doing this.

            var facts = new OrgFact[]
                {
                    new("Our headquarters is located in Sydney, Australia.", "Headquarters", "City: Sydney"),
                    new("We have been in business for 25 years.", "Years in Operation", "Years: 25"),
                    new("Our corporate sponsor is the Melbourne Football Club.", "Corporate Sponsorship", "Team: Melbourne Football Club"),
                    new("We have 2 major departments.", "Departments", "Number: 2"),
                    new("Our team includes developers among other professionals.", "Occupation", "Job Title: Developer"),
                    new("Our team enjoys outdoor activities such as bushwalking.", "Team Activities", "Activity: Bushwalking"),
                    new("We have a company pet policy that allows dogs.", "Company Pet Policy", "Type: Dog"),
                    new("We prefer catering options featuring Australian cuisine.", "Catering Preferences", "Cuisine: Australian"),
                    new("We have expanded our operations to 5 countries.", "International Presence", "Countries: 5"),
                    new("Our staff includes graduates from the University of Sydney.", "Education", "University: Sydney"),
                    new("Our team is multilingual, speaking 3 languages.", "Languages Spoken", "Number: 3"),
                    new("We have a strict allergen policy, including precautions for peanuts.", "Allergen Policy", "Allergen: Peanuts"),
                    new("We support athletic achievements, such as participating in marathons.", "Athletic Support", "Event: Marathon"),
                    new("We have a company-wide collection of Australian art.", "Company Initiatives", "Item: Australian Art"),
                    new("Our team enjoys the Australian spring season for company events.", "Seasonal Preferences", "Season: Spring"),
                    new("Our corporate book club's favorite book is 'The Book Thief'.", "Corporate Book Club", "Book: The Book Thief"),
                    new("We offer vegetarian, vegan, gluten free and halal options in our corporate diet policy.", "Dietary Policies", "Diet: Vegetarian"),
                    new("We actively support volunteering in local community projects.", "Community Engagement", "Place: Local Community Projects"),
                    new("We aim to expand our presence to every continent.", "Expansion Goals", "Goal: Every Continent"),
                    new("Many of our staff members hold advanced degrees, including in Computer Science.", "Advanced Education", "Degree: Master's in Computer Science")
                };
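The array above relies on an `OrgFact` type that isn't shown here. One possible shape, assuming it is a positional record of three strings, is sketched below. The property names `Text`, `Category`, and `Detail` are illustrative placeholders and may not match the real type. The facts are a shortened copy of the list above so the sketch runs on its own.

```csharp
using System;
using System.Linq;

// A shortened copy of the facts array, so this sketch runs on its own.
var facts = new OrgFact[]
{
    new("Our headquarters is located in Sydney, Australia.", "Headquarters", "City: Sydney"),
    new("We have been in business for 25 years.", "Years in Operation", "Years: 25"),
    new("We have a company pet policy that allows dogs.", "Company Pet Policy", "Type: Dog")
};

// Look a fact up by its (assumed) category label, ignoring case.
var pets = facts.First(f =>
    f.Category.Equals("company pet policy", StringComparison.OrdinalIgnoreCase));

Console.WriteLine($"{pets.Category}: {pets.Text} ({pets.Detail})");

// Hypothetical shape: the fact sentence, a short category label, and a key/value detail.
record OrgFact(string Text, string Category, string Detail);
```

In a RAG setup, the fact sentence is the part you would most likely embed and search against, while the category and detail can travel along as metadata for citations.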

When loaded into my local chatbot, I get the following results:

Conclusion

Retrieval Augmented Generation is a great way to chat with your data.

And Semantic Kernel is a C# developer's way to create some great AI experiences.

If you found value in this post, consider following me on X @davidpuplava for more valuable information about Game Dev, OrchardCore, C#/.NET and other topics.