
Evolving the Advisor Interface in Our Spring AI Application

In the journey of developing our first Generative AI-powered application, we faced numerous challenges that pushed us to evolve our approach significantly. Central to this evolution was the development of our AI Advisor Interface, which began as a simple extension of the RequestResponseAdvisor.

How It Began

Initial Concept and Challenges

Our AI Advisor was created to encapsulate the AI prompt and logic in a single advisor that could be called from the service, and that seemed like a great idea. At the beginning, it was!

A simple class extending RequestResponseAdvisor, with one dedicated system-message prompt, instructed our AI to be polite and concise in its answers to the user prompt. As advertised, it chained perfectly with the already available SimpleLoggerAdvisor, so far, so good. Of course, to get some extra value from it, we needed the AI to be aware of some articles, so we could chat about them.
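Conceptually, that first setup boils down to a chain of small prompt transformations. The sketch below is a minimal, framework-free illustration of the idea, not the real Spring AI interfaces: `PoliteAdvisor`-style behavior is modeled as a function that prepends the system instruction, and a logging step stands in for SimpleLoggerAdvisor.

```java
import java.util.List;
import java.util.function.UnaryOperator;

// Minimal sketch: an "advisor" here is just a function that rewrites the prompt
// before it reaches the model. Names and shapes are illustrative, not Spring AI API.
public class AdvisorChainSketch {

    static final String SYSTEM_INSTRUCTION =
        "You are a polite assistant. Answer concisely.";

    // Prepends the system instruction, like our first advisor did.
    static UnaryOperator<String> politeAdvisor() {
        return userPrompt -> SYSTEM_INSTRUCTION + "\n\n" + userPrompt;
    }

    // Passes the prompt through unchanged, logging it on the way
    // (in the spirit of SimpleLoggerAdvisor).
    static UnaryOperator<String> loggingAdvisor() {
        return prompt -> {
            System.out.println("PROMPT >>> " + prompt);
            return prompt;
        };
    }

    // Applies the advisors left to right, producing the final prompt.
    static String advise(String userPrompt, List<UnaryOperator<String>> advisors) {
        String prompt = userPrompt;
        for (UnaryOperator<String> advisor : advisors) {
            prompt = advisor.apply(prompt);
        }
        return prompt;
    }
}
```

Adding a capability then means adding one more transformation to the list, which is exactly why the chaining felt so natural at first.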

However, as the complexity of user interactions increased, particularly with the need to integrate knowledge from various articles, it became evident that our simple advisor needed enhancement.

You can probably guess: we started reading about RAG (Retrieval-Augmented Generation) and how we could use it to achieve what we wanted.

Transition to VectorStoreDataAdvisor

After some investigation, we transitioned to Neo4j as our vector database. We chose Neo4j because, in the future, we can use it to build our own knowledge graph and make our simple RAG much smarter. This shift allowed us to store article content, metadata, and embeddings more effectively, turning our advisor into the VectorStoreDataAdvisor. Request building now included fetching data from the vector store via embeddings and similarity search, and incorporating those articles into a String template that spells out what the context is, what the user's question is, and how to answer if the needed data is not present in the provided context.
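The core of that retrieval step can be sketched as follows. This is a self-contained illustration, assuming a toy in-memory store and a plain cosine-similarity helper in place of Neo4j and the real embedding model; the template wording is representative, not our exact prompt.

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Illustrative sketch of what VectorStoreDataAdvisor does conceptually: rank
// stored chunks by similarity to the question embedding, take the top K, and
// fill a template with context, question, and a fallback instruction.
public class RetrievalSketch {

    static final String TEMPLATE = """
        Answer the question using only the context below.
        If the context does not contain the answer, say you do not know.

        Context:
        %s

        Question:
        %s
        """;

    record Chunk(String text, double[] embedding) {}

    static double cosineSimilarity(double[] a, double[] b) {
        double dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            na += a[i] * a[i];
            nb += b[i] * b[i];
        }
        return dot / (Math.sqrt(na) * Math.sqrt(nb));
    }

    static String buildPrompt(String question, double[] questionEmbedding,
                              List<Chunk> store, int topK) {
        String context = store.stream()
            .sorted(Comparator.comparingDouble(
                (Chunk c) -> cosineSimilarity(questionEmbedding, c.embedding())).reversed())
            .limit(topK)
            .map(Chunk::text)
            .collect(Collectors.joining("\n---\n"));
        return TEMPLATE.formatted(context, question);
    }
}
```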

This new form was not just a name change but a fundamental expansion in capability, incorporating advanced data fetching from the vector store based on embeddings and similarity searches. The system could now contextualize the information around the user’s questions and provide more accurate responses.

Enhancements

As we moved on to more intensive testing, we found that our AI app still had problems both with finding the proper articles and with giving proper responses.

We initially thought our prompt-engineering skills were on a coffee break 🙂. So we buckled down to craft a better prompt template, but no matter how hard we tried, the issues stuck around like a stubborn software bug refusing to be debugged.

Back at the drawing board, we decided to look into Advanced RAG principles in order to improve our search accuracy, both in the vector store and later on when sending context with the question to the LLM.
There we found different approaches we could use to make this work. The two most promising are Pre-Retrieval and Post-Retrieval, so we started with Pre-Retrieval and Query Transformation.

To understand how we should approach the issue, we played a bit with similarity search via our chatbot.
Using ‘Vice President Kamala Harris’ (and ‘Kamala Harris’) with a date-picker example, we realized that similarity search is far from full-text search: even if a phrase is present in the text, that doesn't mean similarity search will find it at all. Since we don't know how the embeddings are generated, the chatbot was our source of truth. By playing with the topK and similarityThreshold options, we realized that in such cases we shouldn't let the similarity threshold drop below 0.7. With 0.5, and especially 0.3, we got results that did not contain any of the words in the phrase ‘Kamala Harris’! What we think could increase search accuracy and data relevance inside the data store is improvement on both the ingestion and the retrieval side (using co-reference resolution or summarization during ingestion, and full-text search in addition to similarity search).
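The two knobs we were turning can be shown in a few lines. This sketch uses made-up scores purely for illustration; in the real system they come back from the vector store's similarity search.

```java
import java.util.Comparator;
import java.util.List;

// Sketch of the retrieval knobs: topK caps how many hits come back, and
// similarityThreshold drops everything scored below it.
public class ThresholdSketch {

    record Hit(String text, double score) {}

    static List<Hit> search(List<Hit> allHits, int topK, double similarityThreshold) {
        return allHits.stream()
            .filter(h -> h.score() >= similarityThreshold)
            .sorted(Comparator.comparingDouble(Hit::score).reversed())
            .limit(topK)
            .toList();
    }
}
```

With a 0.7 floor, only genuinely close hits survive; lowering the floor to 0.3 is exactly what let unrelated articles slip into our context.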

During our testing, incorporating timeframes into queries did not yield any useful results. To address this, we decided to adopt a Rewrite-Retrieve-Read approach in the Pre-Retrieval and Query Transformation phase. This involves improving the queries themselves and enabling the detection and understanding of timeframes.

Enabling the Detection and Understanding of Timeframes

Initially, our model lacked awareness of real-world context. For example, when asked about the current date, it responded with:

  • “I’m sorry, but I can’t provide real-time information such as today’s date. However, you can easily check the date on your device or calendar.”
    or
  • “Today’s date is November 1, 2023.”

Disappointing… How we overcame this is explained in “When Code Gets Creative: The Birth of Our First AI Agentic Workflow”.

This functionality is implemented using an AI-driven approach, leveraging smaller, chained calls to a large language model (LLM) before reaching out to retrieve from the VectorStore.
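The rewrite step of that Rewrite-Retrieve-Read flow can be sketched like this. In the real flow the rewrite is a small LLM call; the regex below is only a stand-in so the sketch stays self-contained, and the `RewrittenQuery` type is hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch of the rewrite step: pull a timeframe out of the question and strip it
// from the search text, so the vector search runs on the semantic part only and
// the timeframe can be handled separately (e.g. as a metadata filter).
public class RewriteSketch {

    record RewrittenQuery(String searchText, String timeframe) {}

    static RewrittenQuery rewrite(String question) {
        Matcher m = Pattern.compile("(last week|yesterday|today|in \\d{4})",
                Pattern.CASE_INSENSITIVE).matcher(question);
        String timeframe = m.find() ? m.group(1) : null;
        String searchText = (timeframe == null)
            ? question
            : question.replace(timeframe, "").replaceAll("\\s+", " ").trim();
        return new RewrittenQuery(searchText, timeframe);
    }
}
```

The retrieve step then takes `searchText` to the vector store, and the timeframe constrains which articles qualify.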

The result worked like a miracle: we gained improvements both in the performance of the fetch phase (actually of the whole flow, even though we had one more AI call) and in the accuracy of the retrieve step.

With those results in mind, we started thinking, “What more can we do to help the main AI call?” There were multiple answers, the most prominent being:

  • ask the AI to extract only the semantically relevant parts of the user question for the actual vector database search;
  • generate n additional variations of the user question and run the search for all of them;
  • add one more data source (OpenSearch) to support a full-text search option as well.
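The second idea, multi-query retrieval, can be sketched as follows. Here the variation generator is hard-wired purely for illustration; in the real system it is another LLM call, and the `searcher` function stands in for whatever store is being queried.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

// Sketch of multi-query retrieval: run the search for each phrasing of the
// question and union the results, so a document matched by any variation
// makes it into the candidate set.
public class MultiQuerySketch {

    // Stand-in for the LLM-backed variation generator.
    static List<String> variations(String question) {
        return List.of(
            question,
            "News about " + question,
            "Summary: " + question);
    }

    // searcher maps a query string to the ids of matching documents.
    static Set<String> multiSearch(String question, Function<String, Set<String>> searcher) {
        Set<String> merged = new LinkedHashSet<>();
        for (String v : variations(question)) {
            merged.addAll(searcher.apply(v));
        }
        return merged;
    }
}
```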

We started working on each of those, re-running our tests after implementing each change to ensure that every new feature not only meets but exceeds our expectations.

Optimizing AI Interactions

Our VectorStoreDataAdvisor soon got support for optimizing the search request. By leveraging AI, we honed our ability to dissect user questions, extracting only the essential elements. This distilled input is then used to generate multiple query variations, enriching the database queries and, by extension, the AI’s responses.

Growing in Capability and Complexity

Each new feature has expanded the advisor’s capabilities, enhancing the quality of context it provides. However, these improvements also increased the complexity of the system. While each adjustment proved effective in some scenarios, they often required tailored solutions, making our progress as case-dependent as it was innovative.

Navigating Challenges with Re-Ranking and Decision Making

Faced with a multitude of potential enhancements, our team tackled the challenge of use case identification. How could we ensure the AI consistently delivers the best outcomes without succumbing to pitfalls? Initially, we explored letting the AI autonomously determine the appropriate use case. However, the unique nature of our application made this approach impractical.

Instead, we turned to a custom re-ranking strategy. We decided to execute all available processes, gather results from every possible source, and then evaluate and grade these outcomes, so that only the highest-quality, top-ranked data would be used for further AI interactions. Developing a custom re-ranking solution was our answer to streamlining this process. Testing it proved to be a formidable challenge, primarily due to uncertainties about how to effectively rank results, particularly for specific scenarios. However, through a meticulous process of trial and error, adjusting each part, tweaking every formula, and fine-tuning weights, we achieved a re-ranking solution that met our standards of satisfaction.
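The shape of such a re-ranker can be sketched in a few lines. The per-source weights here are illustrative stand-ins for the ones we tuned by trial and error, and the accumulate-by-sum scoring is one reasonable choice, not necessarily our exact formula.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of the re-ranking step: each source (vector search, full-text search, ...)
// returns scored hits, a per-source weight reflects how much we trust it, and only
// the top-ranked documents survive into the context.
public class ReRankSketch {

    record Hit(String docId, double score, String source) {}

    static List<String> reRank(List<Hit> hits, Map<String, Double> sourceWeights, int keep) {
        Map<String, Double> combined = new HashMap<>();
        for (Hit h : hits) {
            double weighted = h.score() * sourceWeights.getOrDefault(h.source(), 1.0);
            // A document found by several sources accumulates score.
            combined.merge(h.docId(), weighted, Double::sum);
        }
        return combined.entrySet().stream()
            .sorted(Map.Entry.<String, Double>comparingByValue().reversed())
            .limit(keep)
            .map(Map.Entry::getKey)
            .toList();
    }
}
```

A nice property of this shape is that agreement between sources is rewarded automatically: a document found by both vector and full-text search outranks one found by either alone.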

The Advisor was now equipped with advanced capabilities including preprocessing user queries, extracting structured data, expanding semantically relevant parts of queries, fetching data from multiple sources, and re-ranking this information before constructing the context.

The Quest for Optimal Structure


Despite these advances, the growing complexity of our advisor posed a new challenge: maintaining clean code and good programming practices. With most of the logic tightly packed into a single agent class, we began considering a modular, chain-based approach to simplify management and enhance scalability.

Would it be better if we went with a chain approach?
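Such a chain could look roughly like this. The stage functions below are trivial stand-ins for the real rewrite, retrieve, and prompt-building steps, just to show the shape we have in mind.

```java
import java.util.function.Function;

// Sketch of a modular, chain-based advisor: each stage is its own small step,
// and the advisor reduces to a composition. Adding a stage is one more andThen.
public class PipelineSketch {

    // Stand-ins for the real stages (names hypothetical).
    static final Function<String, String> REWRITE = q -> q.toLowerCase();
    static final Function<String, String> RETRIEVE = q -> "context-for(" + q + ")";
    static final Function<String, String> BUILD_PROMPT = ctx -> "PROMPT[" + ctx + "]";

    static String run(String question) {
        return REWRITE.andThen(RETRIEVE).andThen(BUILD_PROMPT).apply(question);
    }
}
```

Each stage could then be tested, swapped, and tuned in isolation, which is exactly what the single monolithic agent class makes hard today.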


Looking Forward

As we continue to refine our VectorStoreDataAdvisor, we remain committed to pushing the boundaries of what our AI can achieve. The path hasn’t always been straightforward, but with each challenge comes greater insight and a clearer direction for future enhancements.

Stay tuned to our blog for more updates on our AI development journey and the innovative solutions we’re exploring at TN-Tech.
